1. For the Assistant you want to red team, click on the three dots next to it.#
2. Click on Red Team to open the red teaming tab.#
3. Here you will be able to see the list of red team attacks with statuses Pending, Running, and Completed#
4. Let’s create a new Attack by clicking on Create Attack.#
5. Type a Name for the attack that reflects the scenario or risk area you want to test.#
6. In the red teaming tab, choose the Attack Type you want to run#
7. From the list of available Probes, select the probe set that matches your chosen attack type or scenario.#
Each probe, converter, and detector will include a description outlining its use case to help you select the appropriate option for your scenario. For a complete guide on the different types of probes, converters, and detectors, view Assistant Red Teaming 9. Choose the Scorer that will evaluate the assistant’s responses.#
10. Review your configuration: Attack Type, Probe, Converter, and Scorer to ensure they match your testing goals.#
12. A new job will appear in the jobs table with status Pending. After a short time, the status will update to Running as the attack is executed.#
14. Once the attack completes, the job status will change to Completed.#
15. Click on the completed job to open the detailed results view.#
16. Here you can inspect the prompts, model responses, and Scorer outputs for each red team example.#
17. You can also download the complete html report to view the vulnerable prompts (if any).#
Use these results to identify failures, unsafe behaviour, or other issues, and decide on follow-up fixes for your assistant.