Control 2.11: Bias Testing and Fairness Assessment

Control ID: 2.11
Pillar: Management
Regulatory Reference: Fed SR 26-2 (formerly SR 11-7), OCC Bulletin 2026-13 (formerly OCC 2011-12), ECOA / Reg B (15 U.S.C. § 1691, 12 CFR Part 1002), FINRA Rule 3110, FINRA 2026 Annual Regulatory Oversight Report (AI focus), CFPB Circular 2023-03 (adverse action notices using AI), NIST AI RMF (MEASURE-2.11)
Last UI Verified: May 2026
Governance Levels: Baseline / Recommended / Regulated

Objective

Implement systematic bias testing and fairness assessment for AI agents to identify and remediate discriminatory outputs, supporting compliance with fair lending laws and regulatory expectations for AI fairness in financial services.

Why This Matters for FSI

FINRA Rule 3110 (Supervision): Written supervisory procedures should cover AI systems used in investor communications and recommendations, including evidence of fairness review prior to deployment.
FINRA 2026 Annual Regulatory Oversight Report: Identifies AI / generative AI as an examination priority, with explicit focus on bias, accuracy, and supervisory documentation.
Fed SR 26-2 (formerly SR 11-7) / OCC Bulletin 2026-13 (formerly OCC 2011-12) (Model Risk Management): Model validation should assess outcomes for potential disparate impact across protected classes; effective challenge by an independent function is expected for higher-risk models.
ECOA / Regulation B (12 CFR § 1002.4): Prohibits discrimination on a prohibited basis in any aspect of a credit transaction. Applies whenever an agent influences credit, lending, account-opening, pricing, or insurance underwriting decisions.
CFPB Circular 2023-03: Reaffirms that creditors using AI / complex models must still provide specific and accurate adverse action reasons under ECOA — opaque "black-box" justifications are not compliant.
SEC Predictive Data Analytics proposal & Reg BI: Examinations focus on conflicts of interest and fairness in AI used for retail customer interactions; pairs with Control 2.18.
NIST AI RMF 1.0 (MEASURE-2.11): Establishes the baseline expectation that fairness and bias of AI systems are evaluated and results documented.

ECOA Applicability

Equal Credit Opportunity Act (ECOA) and Regulation B requirements apply specifically when AI agents influence credit, lending, account-opening, pricing, or insurance underwriting decisions. For agents not involved in these functions, focus on FINRA Rule 3110 supervision, Fed SR 26-2 (formerly SR 11-7) / OCC Bulletin 2026-13 (formerly OCC 2011-12) model risk requirements, and SEC Reg BI fairness expectations. State fair lending laws may extend protected classes (e.g., sexual orientation, gender identity, military status) — consult legal counsel.

Control Description

This control establishes a bias-testing and fairness program through:

Protected Class Identification — Define classes per ECOA (race, color, religion, national origin, sex, marital status, age, receipt of public assistance, good-faith exercise of Consumer Credit Protection Act rights) plus any state-law additions.
Fairness Metrics — Combine outcome-rate measures (demographic parity, disparate impact ratio / four-fifths rule), error-rate measures (equalized odds, equal opportunity), and probability calibration. No single metric is sufficient; selection should match the agent's use case and regulatory context.
Test Dataset Construction — Build representative test datasets that span protected classes, with documented methodology and statistical power calculations. Synthetic data is preferred over production customer data to manage privacy risk under GLBA Safeguards Rule.
Bias Detection Procedures — Run agent outputs across demographic groups, classify outcomes against pre-defined criteria, and apply statistical significance testing (chi-square, Fisher's exact, regression) before declaring pass/fail.
Remediation Workflow — Triage findings by severity, apply system-prompt or knowledge-source changes, and re-test before redeployment. Material model changes invoke re-validation under Fed SR 26-2 (formerly SR 11-7) effective challenge.
Independent Validation (Zone 3) — A function independent of the model owner reviews methodology, results, and remediation, consistent with Fed SR 26-2 (formerly SR 11-7) effective-challenge expectations.
Audit-Defensible Evidence — Retain test inputs, raw outputs, statistical analysis, sign-offs, and remediation history in storage configured for WORM retention (per SEC 17a-4(f) / FINRA Rule 4511 record-keeping requirements).

This control aligns with NIST AI RMF MEASURE-2.11 (fairness and bias evaluated and results documented) and is a required component of the model-risk lifecycle defined in Control 2.6.
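
As a minimal sketch of the disparate-impact ratio and four-fifths check named under Fairness Metrics, the Python below computes per-group favorable-outcome rates and flags any group whose rate falls below 80% of a reference group's rate. The function names, the `(group, favorable)` record format, and the sample counts are illustrative assumptions and are not taken from the published playbooks. A ratio below 0.8 is a screening signal to be read alongside significance testing, not a legal conclusion.

```python
from collections import Counter


def selection_rates(records):
    """Favorable-outcome rate per group from (group, is_favorable) pairs."""
    totals, favorable = Counter(), Counter()
    for group, is_favorable in records:
        totals[group] += 1
        if is_favorable:
            favorable[group] += 1
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact_ratios(records, reference_group):
    """Each group's selection rate divided by the reference group's rate."""
    rates = selection_rates(records)
    reference_rate = rates[reference_group]
    if reference_rate == 0:
        raise ValueError("reference group has no favorable outcomes")
    return {group: rate / reference_rate for group, rate in rates.items()}


def four_fifths_flags(ratios, threshold=0.8):
    """True for any group whose ratio falls below the four-fifths threshold."""
    return {group: ratio < threshold for group, ratio in ratios.items()}


# Hypothetical test results: 60 of 100 favorable for group_a, 42 of 100 for group_b.
records = (
    [("group_a", True)] * 60 + [("group_a", False)] * 40
    + [("group_b", True)] * 42 + [("group_b", False)] * 58
)
ratios = disparate_impact_ratios(records, reference_group="group_a")
flags = four_fifths_flags(ratios)
# group_b's ratio is about 0.70 (0.42 / 0.60), so it is flagged for review.
```

The choice of reference group materially changes the result, so the methodology document should record which group was used and why.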

Fair Lending — CFPB / Reg B Adverse-Action Notice Subsection

The CFPB has elevated AI-driven adverse-action notice generation to an active enforcement priority. CFPB Circular 2022-03 stated that creditors using "complex algorithms," including "black-box" machine-learning models, are not exempt from ECOA's requirement, implemented by Regulation B (12 CFR § 1002.9), to give specific and accurate adverse-action reasons. CFPB Circular 2023-03 reinforced this: using the agency's sample adverse-action checklist forms is not a safe harbor when an AI model relies on factors that the forms' checkboxes do not list. The U.S. Department of the Treasury's report "Uses, Opportunities, and Risks of Artificial Intelligence in the Financial Services Sector" (December 2024) further called out fair-lending principal-reasons disclosure as a recurring supervisory concern across federal financial regulators.

Bias testing is necessary but not sufficient for ECOA compliance. A statistically fair model that cannot generate a specific, accurate, customer-readable principal-reasons disclosure for a denied applicant fails Reg B § 1002.9(a)(2)(i). This subsection adds the operational controls that complement the bias-testing program above:

Adverse-action notice generation control. When an AI agent contributes — directly or as part of a model-assisted workflow — to a credit denial, counter-offer, or other adverse action under ECOA, the firm must generate a written adverse-action notice that lists the principal reasons for the action in plain, customer-readable language. The control responsibility is to capture, persist, and surface those reasons at the point the decision is made, not to reconstruct them after the fact.
Human-in-the-loop checkpoint (Zone 2 / Zone 3). Any agent-recommended adverse credit action requires a documented human reviewer step before the decision is communicated to the consumer. The reviewer affirms (a) the adverse-action notice's principal reasons accurately describe the model factors that drove the outcome, and (b) any factor that cannot be expressed in plain language was not a principal driver.
Audit trail of principal-reasons logic. For every adverse action where an AI agent contributed, retain: the model version and SHAP / feature-importance output (or equivalent explainability artifact), the mapping from model features to plain-language reason codes, the reviewer attestation, and the final notice text delivered to the consumer. Retain in WORM-configured storage per the FINRA / SEC retention period.
Model explainability requirement. Models used in any credit, lending, account-opening, or pricing decision pipeline must produce per-decision feature attributions sufficient to support principal-reasons generation. A model that cannot explain individual decisions in customer-readable terms is not approvable for those use cases under this control, regardless of aggregate fairness scores.
Mapping to Reg B sample-form usage. If the firm uses CFPB sample adverse-action checklist forms, document which checkbox each model factor maps to and what additional written reasons are appended for factors that fall outside the sample-form vocabulary. CFPB Circular 2023-03 specifically warns that ticking a generic checkbox is not compliant when the actual driver is a complex model factor.

This subsection applies whenever an agent is in scope for ECOA / Regulation B (credit, lending, account-opening, pricing, insurance underwriting). Implementation requires coordinated work between Compliance, Model Risk, the agent owner, and the consumer-communication / disclosure operations team. Organizations should validate the principal-reasons generation pathway end-to-end against representative denied-applicant scenarios before promoting any in-scope agent to Zone 3.
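
The mapping from per-decision feature attributions to plain-language reasons described above can be sketched as follows. This is an illustration under stated assumptions: the `REASON_CODES` table, the feature names, the sign convention (negative values push toward denial), and the `max_reasons` default are all hypothetical, and the attribution values stand in for whatever explainability artifact the model actually produces. Features with no approved plain-language mapping are returned separately so that the human reviewer sees them instead of having them silently dropped.

```python
# Hypothetical mapping from model features to approved notice wording.
REASON_CODES = {
    "debt_to_income": "Debt-to-income ratio is too high",
    "recent_delinquencies": "Recent delinquency on one or more accounts",
    "credit_history_length": "Length of credit history is insufficient",
}


def principal_reasons(attributions, reason_codes, max_reasons=4):
    """Return (reasons, unmapped) for the strongest adverse contributors.

    attributions maps feature name to a signed contribution; negative values
    are assumed to push the decision toward denial.
    """
    adverse = sorted(
        ((feature, value) for feature, value in attributions.items() if value < 0),
        key=lambda item: item[1],  # most negative (strongest adverse) first
    )[:max_reasons]
    reasons = [reason_codes[f] for f, _ in adverse if f in reason_codes]
    unmapped = [f for f, _ in adverse if f not in reason_codes]
    return reasons, unmapped


# Hypothetical attributions for one denied application.
attributions = {
    "debt_to_income": -0.31,
    "device_fingerprint_score": -0.25,
    "recent_delinquencies": -0.12,
    "income": 0.20,
}
reasons, unmapped = principal_reasons(attributions, REASON_CODES)
# A non-empty `unmapped` list means a principal driver has no plain-language
# reason code, so the notice should be held for reviewer escalation.
```

Persisting `attributions`, `reasons`, and `unmapped` together at decision time gives the reviewer and the audit trail the same view of what drove the outcome.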

Related Automation

See Control 2.18 for the complementary Conflict of Interest Testing solution. No dedicated bias-testing automation package is currently published for this control; the playbooks below describe the implementation pattern using PowerShell, Power Automate, and Power BI.

Key Configuration Points

Define protected classes relevant to the agent's use case (ECOA + state-specific) and document the rationale for any class scoped out.
Create fairness test datasets with representative demographic distribution and documented sample-size / statistical-power justification.
Establish baseline fairness metrics before deployment, including the disparate-impact ratio (four-fifths rule) for any agent that influences credit, lending, hiring-adjacent, or pricing decisions.
Configure automated bias testing in the CI/CD or release pipeline for Zone 3 agents; gate production promotion on test results.
Set remediation SLAs by severity: Critical (24 hours), High (7 days), Medium (30 days). Material model changes trigger re-validation under Fed SR 26-2 (formerly SR 11-7).
Capture audit-defensible evidence (test inputs, raw outputs, statistical analysis, SHA-256 manifest) in WORM-configured storage.
Schedule recurring bias assessments — quarterly minimum for Zone 3, after every material change, and on protected-class data refreshes.
Require independent validation sign-off for Zone 3 agents before production deployment and on each quarterly cycle.
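
The SHA-256 manifest mentioned in the evidence-capture point above can be sketched as below, assuming the evidence for one test cycle sits in a single directory. The function names and the sample file are hypothetical, and writing the manifest into WORM-configured storage is outside the scope of this sketch.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large evidence files are not loaded whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(evidence_dir):
    """Map each file's path (relative to evidence_dir) to its SHA-256 digest."""
    root = Path(evidence_dir)
    return {
        path.relative_to(root).as_posix(): sha256_of_file(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_manifest(evidence_dir, manifest):
    """Return manifest entries that are now missing or whose hash has changed."""
    current = build_manifest(evidence_dir)
    return sorted(name for name in manifest if current.get(name) != manifest[name])


# Demonstration with a throwaway directory and a hypothetical evidence file.
with tempfile.TemporaryDirectory() as tmp:
    evidence_file = Path(tmp) / "raw_outputs.jsonl"
    evidence_file.write_text('{"case": 1, "outcome": "approve"}\n')
    manifest = build_manifest(tmp)
    manifest_json = json.dumps(manifest, sort_keys=True, indent=2)
    unchanged = verify_manifest(tmp, manifest)   # nothing altered yet
    evidence_file.write_text("tampered\n")
    changed = verify_manifest(tmp, manifest)     # altered file is reported
```

Note that `verify_manifest` only checks files listed in the manifest; detecting files added later would need a separate comparison of the two key sets.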

Zone-Specific Requirements

Zone	Requirement	Rationale
Zone 1 (Personal)	Awareness training; report suspected bias; annual review	Low external impact, basic awareness needed
Zone 2 (Team)	Pre-deployment bias testing; documented assessment; quarterly review	Shared agents warrant structured testing
Zone 3 (Enterprise)	Comprehensive fairness assessment; automated monitoring; independent validation; remediation SLAs	Customer-facing agents require rigorous bias controls

Roles & Responsibilities

Role	Responsibility
AI Governance Lead	Define testing requirements, oversee fairness program, approve methodology
Model Risk Manager	Provide independent challenge of methodology and results (Fed SR 26-2 (formerly SR 11-7))
Data Science Team	Develop fairness metrics, execute statistical analysis, document methodology
Compliance Officer	Validate regulatory alignment, sign off on ECOA / Reg B applicability scoping
Agent Owner	Remediate identified bias, implement corrective actions, request re-validation
Purview Compliance Admin	Configure WORM retention for fairness evidence (SEC Rule 17a-4 / FINRA Rule 4511)

Related Controls

Control	Relationship
2.6 - Model Risk Management	Bias testing is a required component of model validation under Fed SR 26-2 (formerly SR 11-7)
2.5 - Testing & Validation	Fairness testing is integrated with broader QA gates
2.18 - Conflict of Interest Testing	Complementary testing for recommendation bias (COI Testing Framework)
3.10 - Hallucination Feedback	Bias-related findings feed quality and feedback management
3.3 - Compliance and Regulatory Reporting	Bias-testing evidence rolls up into FINRA / SEC examination reporting

Implementation Playbooks

Step-by-Step Implementation

This control has detailed playbooks for implementation, automation, testing, and troubleshooting:

Portal Walkthrough — Step-by-step portal configuration
PowerShell Setup — Automation scripts
Verification & Testing — Test cases and evidence collection
Troubleshooting — Common issues and resolutions

Verification Criteria

Confirm control effectiveness by verifying:

Protected classes documented per ECOA and applicable state law, with rationale for any class scoped out
Fairness test dataset includes representative demographic distribution with sample-size justification
Baseline fairness metrics established and documented (demographic parity, equalized odds, calibration, disparate-impact ratio)
Bias testing executed before every Zone 3 agent deployment and on each quarterly cycle
Bias assessment report includes statistical significance testing (chi-square / Fisher's exact / regression) and the disparate-impact ratio measured against the four-fifths rule
Independent validation sign-off recorded for Zone 3 agents (Fed SR 26-2 (formerly SR 11-7) effective challenge)
Evidence retained in WORM-configured storage with SHA-256 integrity manifest for the FINRA / SEC retention period
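
For the significance-testing criterion above, a standard-library sketch of a Pearson chi-square test on a 2x2 table (two groups by favorable / unfavorable outcome) is shown below. The function name and the counts are illustrative assumptions. The sketch applies no continuity correction and does not implement Fisher's exact test, which is generally the better choice when expected cell counts are small.

```python
import math


def chi_square_2x2(a_favorable, a_unfavorable, b_favorable, b_unfavorable):
    """Pearson chi-square for a 2x2 table; returns (statistic, p_value)."""
    observed = [[a_favorable, a_unfavorable], [b_favorable, b_unfavorable]]
    row_totals = [sum(row) for row in observed]
    col_totals = [a_favorable + b_favorable, a_unfavorable + b_unfavorable]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                raise ValueError("an expected cell count is zero")
            statistic += (observed[i][j] - expected) ** 2 / expected
    # With one degree of freedom, the chi-square upper-tail probability
    # equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: group A 60 favorable / 40 unfavorable,
# group B 42 favorable / 58 unfavorable.
statistic, p_value = chi_square_2x2(60, 40, 42, 58)
significant_at_5_percent = p_value < 0.05
```

The significance level and any multiple-comparison adjustment across protected classes are methodology decisions that belong in the documented assessment, not in the code.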

Additional Resources

Standards crosswalk

NIST AI RMF Crosswalk (FSI-AgentGov) — Control 2.11 maps to MEASURE 2.11 (primary: fairness and bias evaluation), MAP 5.1 (upstream impact identification), and MEASURE 2.6 (general performance measurement)

Updated: June 2026 | Version: v1.6.2 | UI Verification Status: Current