Evaluate your AI application's manners with PyRIT [ENG]

Wed, Jul 23, 2025
3-minute read

Leer en Español

If you are working with AI lately (and you likely are, according to your LinkedIn profile 😜), you might have missed a hidden gem in terms of making your AI solutions safer and, by extension, more responsible. This is a topic that motivates me a lot lately, so I will come back to it in next articles too.

I have been playing with this lite tool quite extensively in the last months for my bachelor’s thesis, and I wanted to share some words to unhide it for you and hopefully motivate every AI development team to include it in their lifecycle procedures.

Python Risk Identification Tool for generative AI (PyRIT)

PyRIT is a nicely modular open-source tool written in Python by the Microsoft Red Team. If you want to create a red team for your company (and you should, I have to insist, if you want to create responsible AI solutions), this tool will provide everything you need for your first risk evaluations and the required automation to make them more stable and programmatic, setting the right foundations for incremental improvements.

PyRIT is designed for adversarial probing. Its main function is simpler than it seems: it allows you to automate sending prompts to your AI conversational solution (it might be a model or a finished application) and get the answers for analysis.

Why is it important to test your Gen AI systems for risks?

Default implementation provides a full set of predefined probing prompts created by the Microsoft Red Team to cover the most common cyberattack strategies against AI solutions, so you can do a first set of tests out of the box, and you can also provide your own prompts to refine the specific tests that your own red team is producing.

Its modular design allows for great extensibility, but it is so powerful out of the box that most likely no customizations are needed at all to begin getting value out of it.

PyRIT architectural design includes a number of modules that can be extended individually

On top of that, with a great integration with AI Foundry (in preview), you will be able to see your evaluation results in a nice dashboard automatically updated after every run. Or you can download the raw results to integrate them into your own analytics platform. Your choice.

Where do I start?!

Download it from GitHub (install it as any other Python module in your environment) and check out their documentation as starting point. I recommend starting with their cookbooks to get a few examples working, so you can understand how to extend them to your application’s specific needs:

Repo: https://github.com/Azure/PyRIT
Documentation: https://azure.github.io/PyRIT/
Cookbooks: https://azure.github.io/PyRIT/cookbooks/README.html

Get started with our documentation on how to run an automated scan for safety risks with the AI Red Teaming Agent.

https://learn.microsoft.com/en-us/azure/ai-foundry/concepts/ai-red-teaming-agent

English Red Teaming Testing AI Generative AI LLM AI Ethics Responsible AI

Evaluate your AI application's manners with PyRIT [ENG]

Leer en Español

Python Risk Identification Tool for generative AI (PyRIT)

Where do I start?!

Related Posts