
Copilot Studio risk evaluations with the AI Red Teaming Agent


In today’s rapidly evolving AI landscape, ensuring the safety and reliability of AI systems is vital. One effective approach to achieving this is AI red teaming, a best practice in the responsible development of systems and features that use Large Language Models (LLMs).

In this article, I will briefly showcase the AI Red Teaming Agent and provide an example of how it can be used to evaluate agents created with Copilot Studio.

Why AI Red Teaming?

AI red teaming is crucial for uncovering potential harms in AI systems. While it is not a replacement for systematic measurement and mitigation work, AI red teamers play a vital role in enabling measurement strategies that validate and amplify the effectiveness of mitigations.

AI red teaming aims to identify potential risks, assess the risk surface, and develop a list of issues that can inform what should be measured and addressed.

AI red teams also help ensure that AI-based solutions adhere to Responsible AI (RAI) principles, such as Fairness, Reliability and Safety, Privacy and Security, and Inclusiveness.

The AI Red Teaming Agent

The AI Red Teaming Agent leverages Microsoft’s open-source framework Python Risk Identification Tool (PyRIT), along with Azure AI Foundry’s Risk and Safety Evaluations. You may remember PyRIT from my previous post.

This combination helps assess safety issues in three ways:

  1. Scanning automatically for content risks by simulating adversarial probing.
  2. Evaluating and scoring each attack-response pair to generate metrics such as Attack Success Rate (ASR).
  3. Generating a scorecard across attack probing techniques and risk categories to help decide whether the system is ready for deployment.

The AI Red Teaming Agent includes built-in probing prompts for multiple attack strategies and risk categories but also lets you use custom prompts to test your AI system.
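To make the Attack Success Rate idea concrete, here is a tiny, purely illustrative sketch of how such a rate is derived from attack-response results; the field names are made up, since the agent's scorecard computes and reports ASR for you:

```python
# Purely illustrative: how an Attack Success Rate (ASR) is derived.
# The field names below are hypothetical; the AI Red Teaming Agent's
# scorecard computes and reports ASR for you.
attack_results = [
    {"risk_category": "violence",        "strategy": "base64", "attack_successful": False},
    {"risk_category": "violence",        "strategy": "flip",   "attack_successful": True},
    {"risk_category": "hate_unfairness", "strategy": "base64", "attack_successful": False},
    {"risk_category": "self_harm",       "strategy": "rot13",  "attack_successful": False},
]

successful = sum(1 for r in attack_results if r["attack_successful"])
asr = successful / len(attack_results)
print(f"Attack Success Rate: {asr:.0%}")  # 25% in this toy example
```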

The AI Red Teaming Agent is explained very well in this video by Minsoo Thigpen with Seth Juarez. I strongly recommend watching it; it’s well worth the time.

Evaluating Copilot Studio agents

To illustrate the practical application of the AI Red Teaming Agent and its customization capabilities, I have built an example that uses it to test an agent created and published with Copilot Studio. This example demonstrates how to perform a risk evaluation and shows how effective the AI Red Teaming Agent is at identifying potential risks.

You can find the example here. Please download it, test it with your own implementation, and come back to us with feedback!

The notebook starts with instructions on installing prerequisites and dependencies, and on providing the required secrets in your environment to connect to the pre-existing components (AI Foundry, Storage Account, Copilot Studio agent, …).
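To give a rough idea of what that setup looks like, here is a sketch that loads the connection details from environment variables; the variable names are placeholders I made up for illustration, not the exact names the notebook expects:

```python
# Illustrative only: load connection settings from a .env file.
# The variable names are placeholders, not the exact ones used in the notebook.
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()

azure_ai_project  = os.environ["AZURE_AI_PROJECT"]           # AI Foundry project (endpoint or identifier)
storage_account   = os.environ["STORAGE_ACCOUNT_NAME"]       # Storage Account for evaluation outputs
copilot_agent_url = os.environ["COPILOT_STUDIO_AGENT_URL"]   # published Copilot Studio agent endpoint
```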

Then there is a cell that lets you quickly test whether the connectivity is working correctly and, hopefully, get a fun joke out of Copilot 😊
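Conceptually, that connectivity test boils down to something like the snippet below; ask_copilot_studio_agent is a hypothetical stand-in for the call the notebook makes through the Agents-for-Python library:

```python
# Hypothetical helper standing in for the notebook's Agents-for-Python call:
# send one message to the published Copilot Studio agent and return its reply.
def ask_copilot_studio_agent(message: str) -> str:
    # In the notebook, this is where the client connects to copilot_agent_url
    # and exchanges the message with the agent.
    ...

reply = ask_copilot_studio_agent("Tell me a joke about AI red teaming.")
print(reply)  # getting a joke back means the connectivity is working
```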

After this first test, you configure the RedTeam class with the risk categories you want to test and the number of objectives per category (I suggest choosing a small number while testing, and increasing it when running real evaluations).
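As a rough sketch of that configuration, assuming the preview azure-ai-evaluation package (exact names may differ in the version you install):

```python
# Sketch based on the azure-ai-evaluation preview (pip install "azure-ai-evaluation[redteam]").
# Class and enum names follow the preview SDK and may change.
from azure.identity import DefaultAzureCredential
from azure.ai.evaluation.red_team import RedTeam, RiskCategory

red_team = RedTeam(
    azure_ai_project=azure_ai_project,   # the AI Foundry project loaded earlier
    credential=DefaultAzureCredential(),
    risk_categories=[
        RiskCategory.Violence,
        RiskCategory.HateUnfairness,
        RiskCategory.Sexual,
        RiskCategory.SelfHarm,
    ],
    num_objectives=2,  # keep this small while testing, increase for real evaluations
)
```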

Using this class, the specific evaluations can then be run, specifying the attack strategies, which can be selected individually, in groups, or combined.
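Again as an approximation of the notebook's cell, and assuming the preview API, a scan might look like this, reusing the connectivity helper from above as the attack target:

```python
# Run the scan against the Copilot Studio agent, mixing attack strategies.
# AttackStrategy names follow the preview SDK and may change.
from azure.ai.evaluation.red_team import AttackStrategy

result = await red_team.scan(                # top-level await works in a notebook cell
    target=ask_copilot_studio_agent,         # callback that forwards each prompt to the agent
    scan_name="copilot-studio-risk-eval",
    attack_strategies=[
        AttackStrategy.Base64,               # an individual strategy
        AttackStrategy.EASY,                 # a predefined group of strategies
        AttackStrategy.Compose(              # two strategies combined
            [AttackStrategy.Base64, AttackStrategy.ROT13]
        ),
    ],
)
print(result)  # includes the scorecard with Attack Success Rate per category
```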

The notebook also includes instructions on how to use your own prompts, but that’s too much for this article 😊

NOTE: This example uses the Agents-for-Python library, which is currently in early preview. The AI Red Teaming Agent is also in preview. However, both components can already be integrated and used effectively in their current state to produce valuable results.

Conclusion

AI red teaming is an essential practice in the responsible development of AI systems. By leveraging the AI Red Teaming Agent to evaluate our Copilot Studio agents, we can ensure that our AI systems are safe, reliable, and ready for deployment.

Learn More

For more information about the tools and ideas leveraged by the AI Red Teaming Agent, you can refer to the following resources:

Published first on LinkedIn; leave me a comment there to share your thoughts!
