AI Glossary

Red Teaming (AI)

Systematically probing AI systems for vulnerabilities, failures, and harmful behaviors by simulating adversarial attacks and edge cases.

Process

Red teamers try to make the model produce harmful, biased, factually wrong, or otherwise problematic outputs. They test jailbreaks, prompt injections, edge cases, and creative misuse scenarios.

Importance

Red teaming is now standard practice before major model releases. Companies like Anthropic, OpenAI, and Google employ dedicated red teams. External red teaming (by independent researchers) provides additional coverage.

← Back to AI Glossary

Last updated: March 5, 2026