
Control 3.8: Model Risk Management Alignment (OCC 2011-12 / SR 11-7)

Control ID: 3.8
Pillar: Compliance & Audit
Regulatory Reference: OCC Bulletin 2011-12 (Supervisory Guidance on Model Risk Management), OCC Bulletin 2025-26 (Model Risk Management — Community Bank Proportionality), Federal Reserve SR 11-7, ECOA (Equal Credit Opportunity Act), Fair Housing Act
Last Verified: 2026-02-17
Governance Levels: Baseline / Recommended / Regulated


Objective

Align Microsoft 365 Copilot governance with model risk management (MRM) frameworks established under OCC Bulletin 2011-12 and Federal Reserve SR 11-7, including model inventory, validation scope, ongoing monitoring, and documentation -- while recognizing that M365 Copilot is a vendor-provided model with limited internal validation scope, and focusing governance on usage controls and output monitoring. Address fair lending and ECOA considerations where Copilot outputs could reflect bias in lending, advisory, or customer service contexts. Apply the proportionality guidance from OCC Bulletin 2025-26 to align MRM depth with the institution's size, complexity, and Copilot usage scope.

Why This Matters for FSI

OCC Bulletin 2011-12 and Federal Reserve SR Letter 11-7 define model risk management expectations for banking organizations. These frameworks require institutions to identify, manage, and control the risks associated with models used in decision-making, reporting, and risk management activities. While the guidance was written before the widespread adoption of large language models (LLMs), regulatory expectations have expanded to include AI systems that influence or support business decisions.

M365 Copilot presents a unique MRM challenge: it is a general-purpose LLM provided by Microsoft as a vendor service, not a model developed in-house for a specific quantitative purpose. SR 11-7 defines a "model" as "a quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates." Copilot straddles this definition — it processes input and produces output, but it is not a traditional quantitative model designed for a specific financial decision. Institutions cannot validate the model's internal architecture, training data, or algorithmic mechanics in the traditional MRM sense. However, institutions can -- and regulators expect them to -- govern how the model is used, monitor its outputs, document its limitations, and implement compensating controls for known risks.

OCC Bulletin 2025-26 proportionality guidance: In 2025, the OCC issued updated model risk management guidance acknowledging that community banks and smaller institutions require proportional MRM frameworks. OCC Bulletin 2025-26 clarifies that the depth and complexity of model risk management should be commensurate with the institution's size, complexity, and risk profile. For community banks and smaller institutions deploying Copilot, this means a simplified MRM approach -- model inventory registration, basic output monitoring, and vendor due diligence review -- satisfies regulatory expectations without the full validation infrastructure expected of large complex banking organizations. Larger institutions with more complex Copilot deployments (client-facing activities, lending workflows, regulated advisory) should apply the full OCC 2011-12 / SR 11-7 framework.

The fair lending dimension adds urgency to MRM alignment. If Copilot generates customer communications, assists with lending decisions, or supports advisory activities, its outputs could inadvertently reflect biases that violate the Equal Credit Opportunity Act (ECOA) or the Fair Housing Act. Institutions must monitor for and mitigate discriminatory outcomes in Copilot-assisted activities.

Control Description

This control addresses the MRM framework alignment for M365 Copilot, organized around the three pillars of OCC Bulletin 2011-12: model development (adaptation for vendor models), model validation, and ongoing monitoring.

Copilot's Model Status Under SR 11-7

The foundational question for any institution's MRM program is whether M365 Copilot meets the SR 11-7 definition of a "model." The practical answer is that Copilot occupies a gray zone: it processes input data and produces output, but it is not a traditional quantitative model built to generate specific financial estimates. Most institutions resolve this by registering Copilot in their model inventory as a limited-scope or lower-tier model, with governance focused on usage controls, output monitoring, and vendor due diligence rather than the validation infrastructure designed for in-house quantitative models.

The most common approach is to classify Copilot as a Tier 3 (limited-scope) model when used primarily for internal productivity (document drafting, meeting summaries, research support), with escalation to Tier 2 or Tier 1 when deployed in client-facing, lending, or advisory workflows. This tier assignment should be documented in the model inventory with supporting rationale.
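The tier-escalation rule described above can be sketched as a small classification function. This is an illustrative sketch only; the attribute names (`client_facing`, `lending`, etc.) are assumptions for the example, not fields from any regulatory or Microsoft schema.

```python
def assign_copilot_tier(use_case: dict) -> int:
    """Illustrative tier assignment for a Copilot use case.

    Tier 1 (high): client-facing, lending, or advisory workflows.
    Tier 2 (medium): internal business unit decision support.
    Tier 3 (limited-scope): internal productivity only.
    """
    if use_case.get("client_facing") or use_case.get("lending") or use_case.get("advisory"):
        return 1
    if use_case.get("decision_support"):
        return 2
    return 3

# Internal meeting summaries stay limited-scope...
assert assign_copilot_tier({"decision_support": False}) == 3
# ...but lending correspondence escalates to Tier 1.
assert assign_copilot_tier({"lending": True}) == 1
```

The documented rationale for each assignment still belongs in the model inventory; the function only encodes the escalation logic.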

MRM Framework Adaptation for Vendor-Provided LLMs

Traditional MRM Element | M365 Copilot Adaptation
Model development and implementation | Vendor due diligence (Microsoft), usage policy definition, scope limitation
Model validation | Output quality monitoring, use-case-specific testing, vendor attestation review
Ongoing monitoring | Copilot output monitoring, usage analytics, bias detection, performance tracking
Model inventory | Register Copilot in the enterprise model inventory with usage scope and risk tier
Model documentation | Document Copilot capabilities, limitations, intended uses, and prohibited uses
Model governance | AI governance committee oversight, periodic review, risk appetite alignment

Model Inventory Entry for M365 Copilot

Model name: Microsoft 365 Copilot
Model type: Large language model (LLM) -- vendor-provided SaaS
Vendor: Microsoft Corporation
Model owner (internal): [CTO / Head of AI Governance]
Risk tier: Tier 3 (limited-scope) for internal productivity; Tier 2 for business unit decision support; Tier 1 for client-facing or lending workflows
Use cases: Document generation, communication drafting, data summarization, financial analysis assistance, research, meeting support
Prohibited uses: Autonomous investment recommendations, unsupervised lending decisions, compliance certifications without human review
Validation approach: Output monitoring, use-case testing, vendor attestation review
Validation frequency: Quarterly output review; annual comprehensive assessment
Data inputs: Microsoft Graph (user's emails, files, chats, calendar), real-time prompt context
Outputs: Natural language text, document content, summaries, analysis
Known limitations: Hallucination risk, no real-time market data, potential bias in generated content, no guaranteed factual accuracy
Compensating controls: Human review requirements, output verification procedures, Communication Compliance monitoring, DLP policies
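For institutions that maintain the model inventory programmatically (for example, in a GRC platform), the entry above maps to a simple record. The field names below are illustrative, not a specific GRC schema.

```python
# Illustrative inventory record mirroring the Model Inventory Entry table.
copilot_inventory_entry = {
    "model_name": "Microsoft 365 Copilot",
    "model_type": "Large language model (LLM) -- vendor-provided SaaS",
    "vendor": "Microsoft Corporation",
    "risk_tier": 3,  # escalate to 2 or 1 based on documented usage scope
    "validation_frequency": {"output_review": "quarterly", "comprehensive": "annual"},
    "known_limitations": [
        "hallucination risk",
        "no real-time market data",
        "potential bias in generated content",
    ],
    "compensating_controls": [
        "human review requirements",
        "output verification procedures",
        "Communication Compliance monitoring",
        "DLP policies",
    ],
}

# A completeness check before the entry is submitted to the inventory.
required = {"model_name", "vendor", "risk_tier", "known_limitations", "compensating_controls"}
assert required <= copilot_inventory_entry.keys()
```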

Validation Scope for Vendor-Provided Models

Since institutions cannot perform traditional model validation (reviewing source code, training data, or algorithm mechanics), the validation scope focuses on:

  1. Vendor due diligence: Review Microsoft's responsible AI documentation, model cards, safety reports, and third-party audits (see Control 1.10)
  2. Use-case testing: Test Copilot outputs for specific FSI use cases to identify quality, accuracy, and bias issues
  3. Output monitoring: Continuously monitor Copilot outputs for hallucinations, inaccuracies, and biased language
  4. Boundary testing: Verify that Copilot respects configured guardrails (DLP, sensitivity labels, information barriers)
  5. Performance benchmarking: Establish baseline metrics for Copilot output quality and track over time

Fair Lending and ECOA Considerations

Risk Area | Description | Monitoring Approach
Lending communications | Copilot-drafted communications to loan applicants may use language that differs based on applicant demographics | Sample Copilot-drafted lending communications for disparate language patterns
Customer segmentation | Copilot-generated analysis may reflect historical biases in customer data | Review Copilot-assisted segmentation outputs for protected class disparities
Product recommendations | Copilot-assisted product suggestions may vary based on customer characteristics reflected in grounding data | Test product recommendation outputs across diverse customer profiles
Complaint responses | Copilot-drafted complaint responses may exhibit different tone or helpfulness based on customer demographics | Sample and compare Copilot responses across customer segments
Marketing content | Copilot-generated marketing may inadvertently target or exclude protected classes | Review Copilot marketing content for fair lending compliance

Copilot Surface Coverage

Copilot Surface | MRM Relevance | Risk Level | Monitoring Focus
Word Copilot | High -- generates financial documents, loan documents, client proposals | High | Output accuracy, bias in generated language, factual correctness
Excel Copilot | High -- generates formulas, financial analyses, calculations | High | Calculation accuracy, formula correctness, analytical soundness
Outlook Copilot | High -- drafts client communications, lending correspondence | High | Language bias, UDAAP compliance, fair lending language
Microsoft 365 Copilot Chat | Moderate -- research and analysis supporting decisions | Moderate | Hallucination rate, source attribution, factual accuracy
Teams Copilot | Moderate -- meeting recaps, client conversation summaries | Moderate | Accuracy of summaries, completeness, bias in summarization
PowerPoint Copilot | Moderate -- client presentations, investment reviews | Moderate | Data accuracy in generated slides, balanced presentation
Copilot Pages | Low-Moderate -- collaborative content creation | Low | Content accuracy when used for decision-support documents

Governance Levels

Baseline

Applies to institutions of all sizes, calibrated to actual Copilot usage scope per OCC Bulletin 2025-26 proportionality guidance. OCC Bulletin 2025-26 (October 2025) supplements — but does not replace — SR 11-7/OCC 2011-12, providing practical clarifications for community banks. Annual validation is not mandatory for lower-risk models per this bulletin:

  • Register M365 Copilot in the enterprise model inventory with tier classification (Tier 3 for internal productivity, with documentation of the proportionality rationale for community banks per OCC Bulletin 2025-26). The AI inventory entry should include: tool name, version, purpose, business owner, deployment date, and risk classification
  • Document Copilot's intended uses and prohibited uses within the institution
  • Document the usage scope and confirm it aligns with the selected model tier, applying the proportionality principle — validation rigor scales with risk complexity and materiality:
    • Custom AI (credit scoring, fraud detection): Full SR 11-7 validation required
    • M365 Copilot for productivity (no automated decision-making): Lower-risk; vendor due diligence + AI inventory entry sufficient
    • Copilot in regulated workflows (loan memos, customer communications): Elevated risk; SR 11-7 applies proportionally
  • Establish basic output monitoring through supervisory review of Copilot-assisted activities
  • Document Copilot's known limitations and communicate them to all users
  • Include Copilot in the institution's vendor risk management program (see Control 1.10)
  • Designate an internal model owner responsible for Copilot MRM alignment

Recommended

Builds on the Baseline for institutions with broader Copilot deployments in decision-support or client-adjacent workflows:

  • Conduct use-case-specific output quality assessments quarterly
  • Implement automated output monitoring for hallucination and accuracy indicators
  • Establish fair lending testing protocols for Copilot-assisted lending and advisory activities
  • Create Copilot-specific model risk documentation that follows OCC Bulletin 2011-12 structure, including these required elements:
    1. AI inventory entry: Tool name, version, purpose, business owner, deployment date, risk classification
    2. Vendor due diligence: Microsoft service agreements, DPA (Data Processing Agreement), transparency documentation
    3. Use case risk assessment: Input data types, output use, whether outputs inform regulated decisions
    4. Control environment documentation: Data access controls, audit logging enablement, DLP policies, retention policies
    5. Ongoing monitoring plan: Frequency of control testing, escalation procedures
    6. Change management log: Track Microsoft Copilot updates via Message Center
  • Implement periodic output quality monitoring to track accuracy and hallucination rates over time
  • Leverage Microsoft tools to collect MRM-relevant data:
    • DSPM for AI: Prompt/response pairs, sensitive data exposure events, oversharing alerts — feeds risk assessment
    • Purview Audit (UAL): Interaction logs, accessed resources, sensitivity labels on accessed files
    • Microsoft Defender for Cloud Apps: Copilot anomaly detection, unusual usage patterns
    • Viva Insights Copilot Dashboard: Usage frequency, adoption metrics, active user counts
    • Insider Risk Management (Risky AI usage): Alerts on risky Copilot interactions — document as control test results
  • Review Microsoft's responsible AI reports and model updates annually
  • Integrate Copilot MRM into the institution's model risk governance committee
  • Track Copilot output quality metrics over time and report trends to the AI governance committee
  • Document compensating controls for each identified model risk area
  • Conduct vendor due diligence review of Microsoft Copilot documentation (SOC 2, AI Impact Assessment, data processing terms)

Regulated

Applies the full OCC Bulletin 2011-12 / SR 11-7 MRM lifecycle for institutions deploying Copilot in client-facing, lending, or complex financial workflows. Note: For Copilot used solely as a productivity tool with no automated decision-making, annual comprehensive validation is not mandatory per OCC Bulletin 2025-26 — the regulated tier applies when Copilot outputs materially affect bank decisions, customer communications, or lending activities:

  • Conduct annual comprehensive MRM assessment for Copilot aligned with OCC 2011-12 / SR 11-7 expectations, including validation report focused on usage controls and output quality (not internal model architecture, which is Microsoft's responsibility as vendor)
  • Implement ongoing fair lending monitoring for Copilot-assisted activities with statistical testing for disparate impact
  • Commission independent third-party assessment of Copilot governance framework effectiveness
  • Maintain model risk quantification metrics (e.g., output error rates, hallucination frequency, bias detection rates)
  • Establish model risk appetite limits for Copilot and trigger enhanced controls when limits are approached
  • Prepare examination-ready MRM documentation package for OCC, Federal Reserve, or other banking regulators, organized to include: AI inventory, vendor due diligence, use case risk assessments, control environment documentation, monitoring plan, and change management log
  • Implement challenger or benchmark testing where Copilot outputs for specific use cases are compared against human or alternative system outputs
  • Conduct stress testing of Copilot governance controls (e.g., what happens when Copilot output quality degrades)
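The risk quantification and appetite-limit bullets above can be sketched as a simple limit-monitoring routine. The limit values and the 80% early-warning threshold below are illustrative assumptions for the example, not regulatory figures; actual limits are set by the institution's risk committee.

```python
# Illustrative risk-appetite limits, expressed as fractions of reviewed outputs.
LIMITS = {"error_rate": 0.05, "hallucination_rate": 0.02, "bias_flag_rate": 0.01}
EARLY_WARNING = 0.8  # escalate when a metric reaches 80% of its limit

def evaluate_risk_appetite(metrics: dict) -> dict:
    """Classify each tracked metric as within appetite, early-warning, or breached."""
    status = {}
    for name, limit in LIMITS.items():
        value = metrics[name]
        if value >= limit:
            status[name] = "breach: trigger enhanced controls"
        elif value >= EARLY_WARNING * limit:
            status[name] = "early warning: escalate to governance committee"
        else:
            status[name] = "within appetite"
    return status

status = evaluate_risk_appetite(
    {"error_rate": 0.043, "hallucination_rate": 0.01, "bias_flag_rate": 0.002}
)
assert status["error_rate"].startswith("early warning")  # 0.043 >= 0.8 * 0.05
assert status["hallucination_rate"] == "within appetite"
```

Tying the "early warning" band to committee escalation gives the enhanced-controls trigger an auditable, pre-defined threshold rather than an ad hoc judgment.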

Setup & Configuration

Step 1: Register Copilot in Model Inventory

  1. Open the institution's model inventory system (GRC platform, spreadsheet, or MRM tool)
  2. Create a new model entry using the Model Inventory Entry template in this control
  3. Classify Copilot at the appropriate risk tier based on actual usage scope and the institution's MRM policy — applying OCC Bulletin 2025-26 proportionality where appropriate:
    • Tier 1 (High): If Copilot is used in client-facing activities, lending workflows, or financial reporting — full MRM lifecycle applies
    • Tier 2 (Medium): If Copilot is used for internal business unit decision support without direct client impact -- output monitoring and periodic vendor review
    • Tier 3 (Low / Limited-scope): If Copilot usage is limited to internal productivity and non-regulated administrative tasks -- model inventory entry with usage scope documentation; community banks with primarily internal Copilot use may apply this tier per OCC Bulletin 2025-26 proportionality
  4. Assign an internal model owner and document the governance chain
  5. Document the proportionality rationale if applying a simplified MRM approach (cite OCC Bulletin 2025-26 and describe how Copilot usage scope and institution size support the tier selection)

Step 2: Document Use-Case Scope

  1. Catalog all approved Copilot use cases within the institution:
Use Case | Business Unit | Risk Level | Approval Status
Client email drafting | Wealth management | High | Approved with supervisory review
Financial analysis assistance | Corporate finance | High | Approved with output verification
Meeting summarization | All departments | Low | Approved
Research and information gathering | Compliance | Medium | Approved
Lending communication drafting | Consumer lending | High | Approved with fair lending monitoring
Marketing content generation | Marketing | High | Approved with FINRA 2210 review
  2. Document prohibited use cases and the rationale for each prohibition
  3. Establish a process for requesting approval of new Copilot use cases
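The approval gate implied by the steps above can be sketched as a small registry check. The register contents are abbreviated from the tables in this control, and the function name is illustrative.

```python
# Abbreviated use-case register; approved entries map to their review condition.
APPROVED = {
    "client email drafting": "supervisory review",
    "meeting summarization": None,
    "lending communication drafting": "fair lending monitoring",
}
PROHIBITED = {
    "autonomous investment recommendations",
    "unsupervised lending decisions",
}

def check_use_case(name: str):
    """Return (allowed, condition) for a requested Copilot use case."""
    if name in PROHIBITED:
        return False, "prohibited use -- see documented rationale"
    if name in APPROVED:
        return True, APPROVED[name]
    return False, "unregistered use -- approval required before use"

allowed, condition = check_use_case("lending communication drafting")
assert allowed and condition == "fair lending monitoring"
assert check_use_case("unsupervised lending decisions")[0] is False
```

The "unregistered" branch corresponds to the new-use-case approval process: anything not explicitly approved is treated as out of scope until reviewed.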

Step 3: Establish Output Monitoring

  1. Define output quality metrics for each high-risk use case:
    • Accuracy: Percentage of Copilot outputs containing factual errors (target: < 5%)
    • Hallucination rate: Percentage of Copilot outputs containing fabricated information (target: < 2%)
    • Bias indicators: Number of Copilot outputs flagged for potential bias (target: declining trend)
    • Compliance alignment: Percentage of Copilot-drafted communications passing regulatory review (target: > 95%)
  2. Implement monitoring through:
    • Communication Compliance policy results (see Control 3.4)
    • Supervisory review sampling results (see Control 3.6)
    • User-reported output errors
    • Periodic quality audits of Copilot outputs
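The metric definitions and targets above can be sketched as a quarterly calculation. This is a minimal illustration assuming reviewers record boolean flags per sampled output; the sample field names are assumptions, and the thresholds mirror the targets listed in this step.

```python
def output_quality_metrics(samples):
    """Compute quarterly output-quality metrics from reviewed Copilot samples.

    Each sample is a dict of reviewer flags; the thresholds follow the
    targets in Step 3 (error rate < 5%, hallucination < 2%, compliance > 95%).
    """
    n = len(samples)
    error_rate = sum(s["factual_error"] for s in samples) / n
    halluc_rate = sum(s["hallucination"] for s in samples) / n
    pass_rate = sum(s["passed_regulatory_review"] for s in samples) / n
    return {
        "error_rate": error_rate,
        "hallucination_rate": halluc_rate,
        "compliance_pass_rate": pass_rate,
        "breaches": [
            name
            for name, bad in [
                ("accuracy", error_rate >= 0.05),
                ("hallucination", halluc_rate >= 0.02),
                ("compliance", pass_rate <= 0.95),
            ]
            if bad
        ],
    }

# 96 clean samples and 4 problematic ones out of 100 reviewed outputs.
samples = [
    {"factual_error": 0, "hallucination": 0, "passed_regulatory_review": 1}
] * 96 + [
    {"factual_error": 1, "hallucination": 1, "passed_regulatory_review": 0}
] * 4
m = output_quality_metrics(samples)
assert m["error_rate"] == 0.04           # under the 5% accuracy target
assert "hallucination" in m["breaches"]  # 4% exceeds the 2% target
```

The `breaches` list is what feeds the trend reporting to the AI governance committee: a metric can be within target overall while a sibling metric breaches in the same quarter.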

Step 4: Implement Fair Lending Monitoring

  1. Identify Copilot-assisted activities that intersect with fair lending obligations:
    • Loan application correspondence
    • Credit decision support
    • Customer communication for loan products
    • Marketing for lending products
  2. Establish testing methodology:
    • Sample Copilot-drafted lending communications across demographic segments
    • Compare language, tone, helpfulness, and accuracy across segments
    • Test for disparate impact using paired testing scenarios
  3. Document findings and remediation actions
  4. Report fair lending monitoring results to the fair lending officer and compliance committee
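The disparate impact testing in step 2 can be sketched with a standard two-sided two-proportion z-test (standard library only). The flag counts below are hypothetical, and a production program would also consider practical significance, sample design, and multiple-comparison adjustments.

```python
from math import sqrt, erfc

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided two-proportion z-test for disparate outcomes between two
    customer segments (e.g., rate of unfavorable-language flags in
    Copilot-drafted lending letters)."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)          # pooled proportion under H0
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))   # two-sided normal p-value
    return z, p_value

# Hypothetical sampling: 12/200 flagged letters in segment A vs 11/210 in
# segment B -- the difference is not statistically significant.
z, p = two_proportion_ztest(12, 200, 11, 210)
assert p > 0.05
```

A significant result would not by itself establish an ECOA violation, but it is the documented trigger for the remediation and reporting steps that follow.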

Step 5: Prepare MRM Documentation Package

Create a Copilot MRM documentation package containing:

  1. Model overview: Capabilities, limitations, vendor information, and tier classification with proportionality rationale (if applicable)
  2. Use-case register: Approved and prohibited uses
  3. Validation approach: Output monitoring, use-case testing, vendor attestation
  4. Monitoring results: Quarterly output quality metrics and trends
  5. Fair lending assessment: Testing methodology and results
  6. Compensating controls: Human review requirements, DLP, Communication Compliance
  7. Vendor due diligence: Microsoft responsible AI documentation review
  8. Risk assessment: Residual risk after compensating controls
  9. Governance: Committee oversight, review cadence, escalation procedures

Financial Sector Considerations

OCC and Federal Reserve Examination Focus

Banking regulators increasingly ask about AI and LLM governance during examinations. Institutions should expect questions about:

  • How Copilot is classified in the model inventory and whether the proportionality determination is documented
  • What validation has been performed on Copilot outputs (recognizing the vendor-model limitation)
  • How fair lending risks from Copilot are monitored
  • What compensating controls exist for model risk areas that cannot be directly validated
  • How the institution assesses vendor (Microsoft) model governance practices

Proportionality Principle (OCC Bulletin 2025-26)

MRM expectations scale with usage scope, institutional size, and risk. OCC Bulletin 2025-26 formalizes what practitioners have long recognized: a community bank using Copilot only for internal meeting summaries and document drafting has materially different MRM obligations than a large complex banking organization using Copilot to draft lending communications or assist with investment recommendations.

For institutions applying the proportionality principle, the documentation requirement is clear: record the usage scope, institution characteristics, and regulatory citation (OCC Bulletin 2025-26) supporting the simplified MRM approach. This documentation is the primary evidence that the institution made a reasoned, supported proportionality determination — not an informal decision to skip the MRM framework.

Vendor Model Risk vs. In-House Model Risk

Traditional MRM frameworks assume the institution has visibility into model internals. For vendor-provided LLMs like Copilot, the MRM focus shifts to:

  • Input controls: What data Copilot can access (permissions, DLP, information barriers)
  • Output controls: How Copilot outputs are reviewed, verified, and approved
  • Usage controls: Which use cases are approved and which are prohibited
  • Monitoring controls: How output quality and bias are tracked over time
  • Vendor governance: How Microsoft governs the underlying model

Interagency AI Guidance

The 2023 Interagency Guidance on AI from OCC, FDIC, Federal Reserve, NCUA, and CFPB reminds institutions that existing risk management frameworks -- including MRM -- apply to AI technologies. Institutions should not treat Copilot as exempt from MRM simply because it is a vendor-provided general-purpose tool rather than a purpose-built financial model.

Verification Criteria

# | Verification Step | Expected Outcome | Governance Level
1 | Verify Copilot is registered in the model inventory | Model entry exists with complete fields including risk tier, model owner, and proportionality rationale (if applicable) | Baseline
2 | Review use-case documentation | Approved and prohibited uses are documented and communicated | Baseline
3 | Verify known limitations are documented and communicated | Limitation documentation is available to all Copilot users | Baseline
4 | Verify proportionality determination is documented | OCC Bulletin 2025-26 cited with usage scope and institution characteristics supporting tier selection | Baseline (community banks)
5 | Review quarterly output quality metrics | Metrics are tracked and reported for high-risk use cases | Recommended
6 | Verify periodic output quality monitoring is implemented | Output monitoring covers accuracy, hallucination rate, and bias indicators | Recommended
7 | Verify fair lending testing is conducted | Testing methodology and results are documented | Recommended
8 | Review AI governance committee minutes for Copilot discussion | Copilot MRM is a regular agenda item | Recommended
9 | Verify annual comprehensive MRM assessment is completed | Assessment follows OCC 2011-12 structure and documents findings | Regulated
10 | Review third-party assessment results | Independent assessment validates governance framework effectiveness | Regulated
11 | Verify examination-ready MRM documentation package | Complete package is assembled and current | Regulated
12 | Review fair lending statistical testing results | Disparate impact testing shows no statistically significant bias in Copilot-assisted lending activities | Regulated

Additional Resources


FSI Copilot Governance Framework v1.2.1 - March 2026