Control 2.20: Adversarial Testing and Red Team Framework
Control ID: 2.20 Pillar: Management Regulatory Reference: OCC 2011-12, Fed SR 11-7, FINRA Rule 3110, NIST AI RMF, MITRE ATLAS Last UI Verified: February 2026 Governance Levels: Baseline / Recommended / Regulated Last Verified: 2026-02-03
Objective
Establish a proactive adversarial testing program to identify vulnerabilities in AI agents before deployment through red team exercises, prompt injection testing, jailbreak simulations, and robustness validation using golden datasets.
Why This Matters for FSI
- OCC 2011-12: Model validation includes stress testing and robustness assessment
- Fed SR 11-7: Independent model review should include adversarial scenarios
- FINRA Rule 3110: Testing evidence demonstrates due diligence for AI supervision
- NIST AI RMF: Structured adversarial testing supports AI risk management
Control Description
This control establishes adversarial testing through:
- Red Team Program - Structured approach to adversarial testing with defined scope and rules
- Attack Library - Curated test cases for prompt injection, jailbreaking, and manipulation
- Golden Dataset Validation - Reference datasets for measuring agent accuracy and robustness
- CI/CD Integration - Automated adversarial testing in deployment pipeline
- Evidence Documentation - Maintain records for regulatory examination
- Remediation Tracking - SLA-based remediation of identified vulnerabilities
Key Configuration Points
- Establish red team program with designated team members and rules of engagement
- Create attack library covering OWASP LLM Top 10 (2025) and MITRE ATLAS techniques
- Build golden dataset with 100+ domain-specific Q&A pairs
- Integrate automated adversarial testing in pre-deployment pipeline
- Define remediation SLAs: Critical (24h), High (7d), Medium (30d)
- Schedule quarterly red team exercises for Zone 3 agents
- Preserve test evidence per retention requirements (7+ years)
Zone-Specific Requirements
| Zone | Requirement | Rationale |
|---|---|---|
| Zone 1 (Personal) | Basic prompt injection tests; pre-deployment only | Low risk, minimal adversarial exposure |
| Zone 2 (Team) | Comprehensive test suite; quarterly testing; golden dataset validation | Shared agents warrant structured testing |
| Zone 3 (Enterprise) | Full red team program; continuous testing; third-party assessment annually; immediate remediation | Customer-facing requires maximum adversarial validation |
Roles & Responsibilities
| Role | Responsibility |
|---|---|
| AI Governance Lead | Red team program governance, remediation oversight |
| Security Team | Execute adversarial testing, maintain attack library |
| QA Lead | Golden dataset management, CI/CD integration |
| Agent Owner | Remediate vulnerabilities, implement fixes |
Related Controls
| Control | Relationship |
|---|---|
| 1.21 - Adversarial Input Logging | Detection complements proactive testing |
| 2.5 - Testing & Validation | Adversarial testing is part of validation |
| 2.6 - Model Risk Management | Red team supports model validation |
| 2.11 - Bias Testing | Complementary testing approach |
Implementation Playbooks
Step-by-Step Implementation
This control has detailed playbooks for implementation, automation, testing, and troubleshooting:
- Portal Walkthrough — Step-by-step portal configuration
- PowerShell Setup — Automation scripts
- Verification & Testing — Test cases and evidence collection
- Troubleshooting — Common issues and resolutions
Verification Criteria
Confirm control effectiveness by verifying:
- Red team program documented with rules of engagement
- Attack library covers OWASP LLM Top 10 (2025) vulnerability categories
- Golden dataset contains representative domain-specific Q&A pairs
- Automated adversarial testing executes in deployment pipeline
- Remediation tracking shows SLA compliance for identified vulnerabilities
Additional Resources
- MITRE ATLAS: AI Adversarial Threat Landscape (approximately 12 core tactics and over 100 techniques)
- OWASP Top 10 for LLM Applications
- Microsoft AI Red Team
- NIST AI Risk Management Framework
Updated: February 2026 | Version: v1.2 | UI Verification Status: Current