Control 2.20: Adversarial Testing and Red Team Framework

Control ID: 2.20
Pillar: Management
Regulatory Reference: OCC Bulletin 2026-13 (formerly OCC Bulletin 2011-12), Federal Reserve SR 26-2 (formerly SR 11-7), FINRA Rule 3110, FINRA RN 24-09, NIST AI RMF (with Generative AI Profile, NIST AI 600-1), MITRE ATLAS, OWASP Top 10 for LLM Applications (2025)
Last UI Verified: May 2026
Governance Levels: Baseline / Recommended / Regulated

Objective

Establish a proactive adversarial testing program to identify vulnerabilities in AI agents before deployment through red team exercises, prompt injection testing, jailbreak simulations, and robustness validation using golden datasets.

Why This Matters for FSI

OCC Bulletin 2026-13 (formerly OCC Bulletin 2011-12) / Federal Reserve SR 26-2 (formerly SR 11-7): Effective challenge of model design helps meet supervisory expectations for independent stress and robustness testing of AI agents used in business decisions.
FINRA Rule 3110 / Regulatory Notice 24-09 (Generative AI / LLM Guidance): Documented adversarial testing supports the firm's supervisory system and Written Supervisory Procedures (WSPs) for AI tools used by associated persons.
SEC Rule 17a-4(b)(4) / 18a-6: Red-team test artifacts (prompts, responses, severity, remediation) are business records that must be preserved when they evidence supervision of an AI system.
GLBA 501(b) Safeguards Rule: Periodic adversarial assessment helps demonstrate ongoing evaluation of safeguards on systems handling customer NPI.
NIST AI RMF (MEASURE 2.6, MANAGE 4.1): Structured red-team programs support the "Measure" and "Manage" functions for generative AI risk.

No companion solution by design

Not all controls have a companion solution in FSI-AgentGov-Solutions; solution mapping is selective by design. This control is operated via native Microsoft admin surfaces and verified by the framework's assessment-engine collectors. See the Solutions Index for the catalog and coverage scope.

Control Description

This control establishes a proactive adversarial testing program — distinct from the detection posture in Control 1.21 (Adversarial Input Logging) and the general QA posture in Control 2.5 (Testing & Validation). The program is built on six elements:

Red Team Program Charter — Documented scope, rules of engagement, authorization, separation of duties, and out-of-scope assets. Aligned with the firm's WSPs and the Microsoft AI Red Team operating model.
Attack Library — Curated, version-controlled test cases covering OWASP Top 10 for LLM Applications (2025), MITRE ATLAS techniques, and FSI-specific abuse cases (NPI exfiltration, MNPI elicitation, unsuitable-recommendation prompting, suitability bypass).
Golden Dataset Validation — Reference datasets of representative domain-specific Q&A pairs used to detect regressions in agent accuracy and refusal behaviour after configuration or model changes.
Tooling and Automation — Integration with Microsoft PyRIT (Python Risk Identification Toolkit) for orchestrated probes against Azure OpenAI / Microsoft Foundry (formerly Azure AI Foundry) -backed agents, and Microsoft Foundry built-in Risk and Safety Evaluations for content-safety scoring.
Evidence and Remediation Tracking — SHA-256-hashed, WORM-retained test packs with severity classification (Critical / High / Medium / Low) and SLA-based remediation tracked to closure.
Cadence and Independence — Pre-deployment gates plus quarterly (Zone 2) or continuous + annual third-party (Zone 3) cycles, with operator role separation enforced for model-validation independence per OCC Bulletin 2026-13 (formerly OCC 2011-12).

Adversarial testing is preventive; Control 1.21 is detective. Both should be deployed together for Zone 2 and Zone 3 agents.

Key Configuration Points

Charter the red team program with named operators, rules of engagement, authorization, and out-of-scope assets; review annually.
Maintain an attack library that maps to OWASP Top 10 for LLM Applications (2025) and MITRE ATLAS technique IDs; version-control under change management.
Build and maintain a golden dataset of at least 100 domain-specific Q&A pairs per Zone 3 agent; refresh after any material model, prompt, or grounding-source change.
Integrate adversarial probes into the pre-deployment pipeline (gate releases on a defined defense-rate threshold per zone).
Use Microsoft PyRIT for orchestrated probing of Azure OpenAI / Microsoft Foundry (formerly Azure AI Foundry)-backed agents; use Microsoft Foundry Risk and Safety Evaluations for content-safety scoring.
Define remediation SLAs: Critical (24 hours to fix or compensating control), High (7 days), Medium (30 days), Low (next release).
Schedule quarterly red-team exercises for Zone 2 agents; continuous testing plus annual independent third-party assessment for Zone 3.
Preserve test artefacts (prompts, responses, severity, remediation, SHA-256 manifest) to immutable storage for the firm's records-retention horizon (commonly 6+ years for FINRA 4511 / SEC 17a-4(b)(4)).
Enforce operator role separation — the red-team operator may not also be the agent's primary developer or owner without a co-signer.

Zone-Specific Requirements

Zone	Requirement	Rationale
Zone 1 (Personal)	Basic prompt injection tests; pre-deployment only	Low risk, minimal adversarial exposure
Zone 2 (Team)	Comprehensive test suite; quarterly testing; golden dataset validation	Shared agents warrant structured testing
Zone 3 (Enterprise)	Full red team program; continuous testing; third-party assessment annually; immediate remediation	Customer-facing requires maximum adversarial validation

Roles & Responsibilities

Role	Responsibility
AI Governance Lead	Charter, scope, rules of engagement, remediation oversight, and reporting to the AI governance committee
Security Architect	Maintain attack library; align test coverage with MITRE ATLAS and OWASP LLM Top 10 (2025)
Cloud Security Architect	Execute adversarial probes; tune detection patterns; reconcile findings with Control 1.21 detection telemetry
Model Risk Manager	Independent challenge per OCC Bulletin 2026-13 (formerly OCC 2011-12) / Fed SR 26-2 (formerly SR 11-7); sign off on golden dataset adequacy
Agent Owner	Remediate identified vulnerabilities within SLA; provide configuration and prompt-design fixes
Microsoft Copilot Studio Agent Author	Implement remediations (topic, instruction, grounding-scope, content-filter changes)
Compliance Officer	Confirm WSP language reflects documented testing cadence and findings retention

Control	Relationship
1.21 - Adversarial Input Logging	Detection complements proactive testing
2.5 - Testing & Validation	Adversarial testing is part of validation
2.6 - Model Risk Management	Red team supports model validation
2.11 - Bias Testing	Complementary testing approach

Implementation Playbooks

Step-by-Step Implementation

This control has detailed playbooks for implementation, automation, testing, and troubleshooting:

Portal Walkthrough — Step-by-step portal configuration
PowerShell Setup — Automation scripts
Verification & Testing — Test cases and evidence collection
Troubleshooting — Common issues and resolutions

Verification Criteria

Confirm control effectiveness by verifying:

Red team program charter exists, is approved, and documents scope, authorization, rules of engagement, and operator role separation.
Attack library covers all OWASP Top 10 for LLM Applications (2025) categories and at least the high-relevance MITRE ATLAS techniques for the agent's deployment surface.
Golden dataset for each Zone 3 agent contains ≥100 representative Q&A pairs and shows a documented refresh date within the last quarter.
Pre-deployment adversarial probes run automatically and gate releases against a defined defense-rate threshold; evidence pack (prompts, responses, severity, manifest with SHA-256 hashes) is generated per run.
Remediation tracking shows SLA conformance for identified vulnerabilities; overdue items have a documented compensating control and risk-acceptance signature.
For Zone 3 agents, an annual independent third-party adversarial assessment report is on file and findings are tracked to closure.
Test artefacts are stored on immutable / WORM-capable media for the firm's records-retention horizon and are reconcilable with Control 1.21 detection events by ConversationId / AgentId.

Additional Resources

Updated: June 2026 | Version: v1.6.2 | UI Verification Status: Current