
Control 3.8: Model Risk Management Alignment (OCC 2011-12 / SR 11-7)

Control ID: 3.8
Pillar: Compliance & Audit
Regulatory Reference: OCC Bulletin 2011-12 (Supervisory Guidance on Model Risk Management), OCC Bulletin 2025-26 (Model Risk Management — Community Bank Proportionality), Federal Reserve SR 11-7, ECOA (Equal Credit Opportunity Act), Fair Housing Act
Last Verified: 2026-02-17
Governance Levels: Baseline / Recommended / Regulated


Objective

Align Microsoft 365 Copilot governance with model risk management (MRM) frameworks established under OCC Bulletin 2011-12 and Federal Reserve SR 11-7, including model inventory, validation scope, ongoing monitoring, and documentation -- while recognizing that M365 Copilot is a vendor-provided model with limited internal validation scope, and focusing governance on usage controls and output monitoring. Address fair lending and ECOA considerations where Copilot outputs could reflect bias in lending, advisory, or customer service contexts. Apply the proportionality guidance from OCC Bulletin 2025-26 to align MRM depth with the institution's size, complexity, and Copilot usage scope.

Why This Matters for FSI

OCC Bulletin 2011-12 and Federal Reserve SR Letter 11-7 define model risk management expectations for banking organizations. These frameworks require institutions to identify, manage, and control the risks associated with models used in decision-making, reporting, and risk management activities. While the guidance was written before the widespread adoption of large language models (LLMs), regulatory expectations have expanded to include AI systems that influence or support business decisions.

M365 Copilot presents a unique MRM challenge: it is a general-purpose LLM provided by Microsoft as a vendor service, not a model developed in-house for a specific quantitative purpose. SR 11-7 defines a "model" as "a quantitative method, system, or approach that applies statistical, economic, financial, or mathematical theories, techniques, and assumptions to process input data into quantitative estimates." Copilot straddles this definition — it processes input and produces output, but it is not a traditional quantitative model designed for a specific financial decision. Institutions cannot validate the model's internal architecture, training data, or algorithmic mechanics in the traditional MRM sense. However, institutions can -- and regulators expect them to -- govern how the model is used, monitor its outputs, document its limitations, and implement compensating controls for known risks.

OCC Bulletin 2025-26 proportionality guidance: In 2025, the OCC issued updated model risk management guidance acknowledging that community banks and smaller institutions require proportional MRM frameworks. OCC Bulletin 2025-26 clarifies that the depth and complexity of model risk management should be commensurate with the institution's size, complexity, and risk profile. For community banks and smaller institutions deploying Copilot, this means a simplified MRM approach -- model inventory registration, basic output monitoring, and vendor due diligence review -- satisfies regulatory expectations without the full validation infrastructure expected of large complex banking organizations. Larger institutions with more complex Copilot deployments (client-facing activities, lending workflows, regulated advisory) should apply the full OCC 2011-12 / SR 11-7 framework.

The fair lending dimension adds urgency to MRM alignment. If Copilot generates customer communications, assists with lending decisions, or supports advisory activities, its outputs could inadvertently reflect biases that violate the Equal Credit Opportunity Act (ECOA) or the Fair Housing Act. Institutions must monitor for and mitigate discriminatory outcomes in Copilot-assisted activities.

Control Description

This control addresses the MRM framework alignment for M365 Copilot, organized around the three pillars of OCC Bulletin 2011-12: model development (adaptation for vendor models), model validation, and ongoing monitoring.

Copilot's Model Status Under SR 11-7

The foundational question for any institution's MRM program is whether M365 Copilot meets the SR 11-7 definition of a "model." The practical answer is that Copilot occupies a gray zone: it processes input data and produces output, but it is not a traditional quantitative model built to generate specific financial estimates. Most institutions resolve this by registering Copilot in their model inventory as a limited-scope or lower-tier model, with governance focused on usage controls, output monitoring, and vendor due diligence rather than the validation infrastructure designed for in-house quantitative models.

The most common approach is to classify Copilot as a Tier 3 (limited-scope) model when used primarily for internal productivity (document drafting, meeting summaries, research support), with escalation to Tier 2 or Tier 1 when deployed in client-facing, lending, or advisory workflows. This tier assignment should be documented in the model inventory with supporting rationale.
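The tier-escalation rule described above can be sketched as a small classification function. This is an illustrative sketch only; the attribute names (`client_facing`, `lending`, etc.) are assumptions for the example, not fields from any regulatory or Microsoft schema.

```python
def assign_copilot_tier(use_case: dict) -> int:
    """Illustrative tier assignment for a Copilot use case.

    Tier 1 (high): client-facing, lending, or advisory workflows.
    Tier 2 (medium): internal business unit decision support.
    Tier 3 (limited-scope): internal productivity only.
    """
    if use_case.get("client_facing") or use_case.get("lending") or use_case.get("advisory"):
        return 1
    if use_case.get("decision_support"):
        return 2
    return 3

# Internal meeting summaries stay limited-scope...
assert assign_copilot_tier({"decision_support": False}) == 3
# ...but lending correspondence escalates to Tier 1.
assert assign_copilot_tier({"lending": True}) == 1
```

The documented rationale for each assignment still belongs in the model inventory; the function only encodes the escalation logic.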

MRM Framework Adaptation for Vendor-Provided LLMs

Traditional MRM Element | M365 Copilot Adaptation
Model development and implementation | Vendor due diligence (Microsoft), usage policy definition, scope limitation
Model validation | Output quality monitoring, use-case-specific testing, vendor attestation review
Ongoing monitoring | Copilot output monitoring, usage analytics, bias detection, performance tracking
Model inventory | Register Copilot in the enterprise model inventory with usage scope and risk tier
Model documentation | Document Copilot capabilities, limitations, intended uses, and prohibited uses
Model governance | AI governance committee oversight, periodic review, risk appetite alignment

Model Inventory Entry for M365 Copilot

Model name: Microsoft 365 Copilot
Model type: Large language model (LLM) -- vendor-provided SaaS
Vendor: Microsoft Corporation
Model owner (internal): [CTO / Head of AI Governance]
Risk tier: Tier 3 (limited-scope) for internal productivity; Tier 2 for business unit decision support; Tier 1 for client-facing or lending workflows
Use cases: Document generation, communication drafting, data summarization, financial analysis assistance, research, meeting support
Prohibited uses: Autonomous investment recommendations, unsupervised lending decisions, compliance certifications without human review
Validation approach: Output monitoring, use-case testing, vendor attestation review
Validation frequency: Quarterly output review; annual comprehensive assessment
Data inputs: Microsoft Graph (user's emails, files, chats, calendar), real-time prompt context
Outputs: Natural language text, document content, summaries, analysis
Known limitations: Hallucination risk, no real-time market data, potential bias in generated content, no guaranteed factual accuracy
Compensating controls: Human review requirements, output verification procedures, Communication Compliance monitoring, DLP policies
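For institutions that maintain the model inventory programmatically (for example, in a GRC platform), the entry above maps to a simple record. The field names below are illustrative, not a specific GRC schema.

```python
# Illustrative inventory record mirroring the Model Inventory Entry table.
copilot_inventory_entry = {
    "model_name": "Microsoft 365 Copilot",
    "model_type": "Large language model (LLM) -- vendor-provided SaaS",
    "vendor": "Microsoft Corporation",
    "risk_tier": 3,  # escalate to 2 or 1 based on documented usage scope
    "validation_frequency": {"output_review": "quarterly", "comprehensive": "annual"},
    "known_limitations": [
        "hallucination risk",
        "no real-time market data",
        "potential bias in generated content",
    ],
    "compensating_controls": [
        "human review requirements",
        "output verification procedures",
        "Communication Compliance monitoring",
        "DLP policies",
    ],
}

# A completeness check before the entry is submitted to the inventory.
required = {"model_name", "vendor", "risk_tier", "known_limitations", "compensating_controls"}
assert required <= copilot_inventory_entry.keys()
```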

Validation Scope for Vendor-Provided Models

Since institutions cannot perform traditional model validation (reviewing source code, training data, or algorithm mechanics), the validation scope focuses on:

  1. Vendor due diligence: Review Microsoft's responsible AI documentation, model cards, safety reports, and third-party audits (see Control 1.10)
  2. Use-case testing: Test Copilot outputs for specific FSI use cases to identify quality, accuracy, and bias issues
  3. Output monitoring: Continuously monitor Copilot outputs for hallucinations, inaccuracies, and biased language
  4. Boundary testing: Verify that Copilot respects configured guardrails (DLP, sensitivity labels, information barriers)
  5. Performance benchmarking: Establish baseline metrics for Copilot output quality and track over time

Fair Lending and ECOA Considerations

Risk Area | Description | Monitoring Approach
Lending communications | Copilot-drafted communications to loan applicants may use language that differs based on applicant demographics | Sample Copilot-drafted lending communications for disparate language patterns
Customer segmentation | Copilot-generated analysis may reflect historical biases in customer data | Review Copilot-assisted segmentation outputs for protected class disparities
Product recommendations | Copilot-assisted product suggestions may vary based on customer characteristics reflected in grounding data | Test product recommendation outputs across diverse customer profiles
Complaint responses | Copilot-drafted complaint responses may exhibit different tone or helpfulness based on customer demographics | Sample and compare Copilot responses across customer segments
Marketing content | Copilot-generated marketing may inadvertently target or exclude protected classes | Review Copilot marketing content for fair lending compliance

Copilot Surface Coverage

Copilot Surface | MRM Relevance | Risk Level | Monitoring Focus
Word Copilot | High -- generates financial documents, loan documents, client proposals | High | Output accuracy, bias in generated language, factual correctness
Excel Copilot | High -- generates formulas, financial analyses, calculations | High | Calculation accuracy, formula correctness, analytical soundness
Outlook Copilot | High -- drafts client communications, lending correspondence | High | Language bias, UDAAP compliance, fair lending language
Microsoft 365 Copilot Chat | Moderate -- research and analysis supporting decisions | Moderate | Hallucination rate, source attribution, factual accuracy
Teams Copilot | Moderate -- meeting recaps, client conversation summaries | Moderate | Accuracy of summaries, completeness, bias in summarization
PowerPoint Copilot | Moderate -- client presentations, investment reviews | Moderate | Data accuracy in generated slides, balanced presentation
Copilot Pages | Low-Moderate -- collaborative content creation | Low | Content accuracy when used for decision-support documents

Governance Levels

Baseline

Applies to institutions of all sizes, calibrated to actual Copilot usage scope per OCC Bulletin 2025-26 proportionality guidance. OCC Bulletin 2025-26 (October 2025) supplements — but does not replace — SR 11-7/OCC 2011-12, providing practical clarifications for community banks. Annual validation is not mandatory for lower-risk models per this bulletin:

  • Register M365 Copilot in the enterprise model inventory with tier classification (Tier 3 for internal productivity, with documentation of the proportionality rationale for community banks per OCC Bulletin 2025-26). The AI inventory entry should include: tool name, version, purpose, business owner, deployment date, and risk classification
  • Document Copilot's intended uses and prohibited uses within the institution
  • Document the usage scope and confirm it aligns with the selected model tier, applying the proportionality principle — validation rigor scales with risk complexity and materiality:
    • Custom AI (credit scoring, fraud detection): Full SR 11-7 validation required
    • M365 Copilot for productivity (no automated decision-making): Lower-risk; vendor due diligence + AI inventory entry sufficient
    • Copilot in regulated workflows (loan memos, customer communications): Elevated risk; SR 11-7 applies proportionally
  • Establish basic output monitoring through supervisory review of Copilot-assisted activities
  • Document Copilot's known limitations and communicate them to all users
  • Include Copilot in the institution's vendor risk management program (see Control 1.10)
  • Designate an internal model owner responsible for Copilot MRM alignment

Recommended

Builds on the Baseline for institutions with broader Copilot deployments in decision-support or client-adjacent workflows:

  • Conduct use-case-specific output quality assessments quarterly
  • Implement automated output monitoring for hallucination and accuracy indicators
  • Establish fair lending testing protocols for Copilot-assisted lending and advisory activities
  • Create Copilot-specific model risk documentation that follows OCC Bulletin 2011-12 structure, including these required elements:
    1. AI inventory entry: Tool name, version, purpose, business owner, deployment date, risk classification
    2. Vendor due diligence: Microsoft service agreements, DPA (Data Processing Agreement), transparency documentation
    3. Use case risk assessment: Input data types, output use, whether outputs inform regulated decisions
    4. Control environment documentation: Data access controls, audit logging enablement, DLP policies, retention policies
    5. Ongoing monitoring plan: Frequency of control testing, escalation procedures
    6. Change management log: Track Microsoft Copilot updates via Message Center
  • Implement periodic output quality monitoring to track accuracy and hallucination rates over time
  • Leverage Microsoft tools to collect MRM-relevant data:
    • DSPM for AI: Prompt/response pairs, sensitive data exposure events, oversharing alerts — feeds risk assessment
    • Purview Audit (UAL): Interaction logs, accessed resources, sensitivity labels on accessed files
    • Microsoft Defender for Cloud Apps: Copilot anomaly detection, unusual usage patterns
    • Viva Insights Copilot Dashboard: Usage frequency, adoption metrics, active user counts
    • Insider Risk Management (Risky AI usage): Alerts on risky Copilot interactions — document as control test results
  • Review Microsoft's responsible AI reports and model updates annually
  • Integrate Copilot MRM into the institution's model risk governance committee
  • Track Copilot output quality metrics over time and report trends to the AI governance committee
  • Document compensating controls for each identified model risk area
  • Conduct vendor due diligence review of Microsoft Copilot documentation (SOC 2, AI Impact Assessment, data processing terms)

Regulated

Applies the full OCC Bulletin 2011-12 / SR 11-7 MRM lifecycle for institutions deploying Copilot in client-facing, lending, or complex financial workflows. Note: For Copilot used solely as a productivity tool with no automated decision-making, annual comprehensive validation is not mandatory per OCC Bulletin 2025-26 — the regulated tier applies when Copilot outputs materially affect bank decisions, customer communications, or lending activities:

  • Conduct annual comprehensive MRM assessment for Copilot aligned with OCC 2011-12 / SR 11-7 expectations, including validation report focused on usage controls and output quality (not internal model architecture, which is Microsoft's responsibility as vendor)
  • Implement ongoing fair lending monitoring for Copilot-assisted activities with statistical testing for disparate impact
  • Commission independent third-party assessment of Copilot governance framework effectiveness
  • Maintain model risk quantification metrics (e.g., output error rates, hallucination frequency, bias detection rates)
  • Establish model risk appetite limits for Copilot and trigger enhanced controls when limits are approached
  • Prepare examination-ready MRM documentation package for OCC, Federal Reserve, or other banking regulators, organized to include: AI inventory, vendor due diligence, use case risk assessments, control environment documentation, monitoring plan, and change management log
  • Implement challenger or benchmark testing where Copilot outputs for specific use cases are compared against human or alternative system outputs
  • Conduct stress testing of Copilot governance controls (e.g., what happens when Copilot output quality degrades)
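The risk quantification and appetite-limit bullets above can be sketched as a simple limit-monitoring routine. The limit values and the 80% early-warning threshold below are illustrative assumptions for the example, not regulatory figures; actual limits are set by the institution's risk committee.

```python
# Illustrative risk-appetite limits, expressed as fractions of reviewed outputs.
LIMITS = {"error_rate": 0.05, "hallucination_rate": 0.02, "bias_flag_rate": 0.01}
EARLY_WARNING = 0.8  # escalate when a metric reaches 80% of its limit

def evaluate_risk_appetite(metrics: dict) -> dict:
    """Classify each tracked metric as within appetite, early-warning, or breached."""
    status = {}
    for name, limit in LIMITS.items():
        value = metrics[name]
        if value >= limit:
            status[name] = "breach: trigger enhanced controls"
        elif value >= EARLY_WARNING * limit:
            status[name] = "early warning: escalate to governance committee"
        else:
            status[name] = "within appetite"
    return status

status = evaluate_risk_appetite(
    {"error_rate": 0.043, "hallucination_rate": 0.01, "bias_flag_rate": 0.002}
)
assert status["error_rate"].startswith("early warning")  # 0.043 >= 0.8 * 0.05
assert status["hallucination_rate"] == "within appetite"
```

Tying the "early warning" band to committee escalation gives the enhanced-controls trigger an auditable, pre-defined threshold rather than an ad hoc judgment.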

Setup & Configuration

Step 1: Register Copilot in Model Inventory

  1. Open the institution's model inventory system (GRC platform, spreadsheet, or MRM tool)
  2. Create a new model entry using the Model Inventory Entry template in this control
  3. Classify Copilot at the appropriate risk tier based on actual usage scope and the institution's MRM policy — applying OCC Bulletin 2025-26 proportionality where appropriate:
    • Tier 1 (High): If Copilot is used in client-facing activities, lending workflows, or financial reporting — full MRM lifecycle applies
    • Tier 2 (Medium): If Copilot is used for internal business unit decision support without direct client impact -- output monitoring and periodic vendor review
    • Tier 3 (Low / Limited-scope): If Copilot usage is limited to internal productivity and non-regulated administrative tasks -- model inventory entry with usage scope documentation; community banks with primarily internal Copilot use may apply this tier per OCC Bulletin 2025-26 proportionality
  4. Assign an internal model owner and document the governance chain
  5. Document the proportionality rationale if applying a simplified MRM approach (cite OCC Bulletin 2025-26 and describe how Copilot usage scope and institution size support the tier selection)

Step 2: Document Use-Case Scope

  1. Catalog all approved Copilot use cases within the institution:
Use Case | Business Unit | Risk Level | Approval Status
Client email drafting | Wealth management | High | Approved with supervisory review
Financial analysis assistance | Corporate finance | High | Approved with output verification
Meeting summarization | All departments | Low | Approved
Research and information gathering | Compliance | Medium | Approved
Lending communication drafting | Consumer lending | High | Approved with fair lending monitoring
Marketing content generation | Marketing | High | Approved with FINRA 2210 review
  2. Document prohibited use cases and the rationale for each prohibition
  3. Establish a process for requesting approval of new Copilot use cases
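The approval gate implied by the steps above can be sketched as a small registry check. The register contents are abbreviated from the tables in this control, and the function name is illustrative.

```python
# Abbreviated use-case register; approved entries map to their review condition.
APPROVED = {
    "client email drafting": "supervisory review",
    "meeting summarization": None,
    "lending communication drafting": "fair lending monitoring",
}
PROHIBITED = {
    "autonomous investment recommendations",
    "unsupervised lending decisions",
}

def check_use_case(name: str):
    """Return (allowed, condition) for a requested Copilot use case."""
    if name in PROHIBITED:
        return False, "prohibited use -- see documented rationale"
    if name in APPROVED:
        return True, APPROVED[name]
    return False, "unregistered use -- approval required before use"

allowed, condition = check_use_case("lending communication drafting")
assert allowed and condition == "fair lending monitoring"
assert check_use_case("unsupervised lending decisions")[0] is False
```

The "unregistered" branch corresponds to the new-use-case approval process: anything not explicitly approved is treated as out of scope until reviewed.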

Step 3: Establish Output Monitoring

  1. Define output quality metrics for each high-risk use case:
    • Accuracy: Percentage of Copilot outputs containing factual errors (target: < 5%)
    • Hallucination rate: Percentage of Copilot outputs containing fabricated information (target: < 2%)
    • Bias indicators: Number of Copilot outputs flagged for potential bias (target: declining trend)
    • Compliance alignment: Percentage of Copilot-drafted communications passing regulatory review (target: > 95%)
  2. Implement monitoring through:
    • Communication Compliance policy results (see Control 3.4)
    • Supervisory review sampling results (see Control 3.6)
    • User-reported output errors
    • Periodic quality audits of Copilot outputs
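The metric definitions and targets above can be sketched as a quarterly calculation. This is a minimal illustration assuming reviewers record boolean flags per sampled output; the sample field names are assumptions, and the thresholds mirror the targets listed in this step.

```python
def output_quality_metrics(samples):
    """Compute quarterly output-quality metrics from reviewed Copilot samples.

    Each sample is a dict of reviewer flags; the thresholds follow the
    targets in Step 3 (error rate < 5%, hallucination < 2%, compliance > 95%).
    """
    n = len(samples)
    error_rate = sum(s["factual_error"] for s in samples) / n
    halluc_rate = sum(s["hallucination"] for s in samples) / n
    pass_rate = sum(s["passed_regulatory_review"] for s in samples) / n
    return {
        "error_rate": error_rate,
        "hallucination_rate": halluc_rate,
        "compliance_pass_rate": pass_rate,
        "breaches": [
            name
            for name, bad in [
                ("accuracy", error_rate >= 0.05),
                ("hallucination", halluc_rate >= 0.02),
                ("compliance", pass_rate <= 0.95),
            ]
            if bad
        ],
    }

# 96 clean samples and 4 problematic ones out of 100 reviewed outputs.
samples = [
    {"factual_error": 0, "hallucination": 0, "passed_regulatory_review": 1}
] * 96 + [
    {"factual_error": 1, "hallucination": 1, "passed_regulatory_review": 0}
] * 4
m = output_quality_metrics(samples)
assert m["error_rate"] == 0.04           # under the 5% accuracy target
assert "hallucination" in m["breaches"]  # 4% exceeds the 2% target
```

The `breaches` list is what feeds the trend reporting to the AI governance committee: a metric can be within target overall while a sibling metric breaches in the same quarter.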

Step 4: Implement Fair Lending Monitoring

  1. Identify Copilot-assisted activities that intersect with fair lending obligations:
    • Loan application correspondence
    • Credit decision support
    • Customer communication for loan products
    • Marketing for lending products
  2. Establish testing methodology:
    • Sample Copilot-drafted lending communications across demographic segments
    • Compare language, tone, helpfulness, and accuracy across segments
    • Test for disparate impact using paired testing scenarios
  3. Document findings and remediation actions
  4. Report fair lending monitoring results to the fair lending officer and compliance committee
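The disparate impact testing in step 2 can be sketched with a standard two-sided two-proportion z-test (standard library only). The flag counts below are hypothetical, and a production program would also consider practical significance, sample design, and multiple-comparison adjustments.

```python
from math import sqrt, erfc

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided two-proportion z-test for disparate outcomes between two
    customer segments (e.g., rate of unfavorable-language flags in
    Copilot-drafted lending letters)."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)          # pooled proportion under H0
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = erfc(abs(z) / sqrt(2))   # two-sided normal p-value
    return z, p_value

# Hypothetical sampling: 12/200 flagged letters in segment A vs 11/210 in
# segment B -- the difference is not statistically significant.
z, p = two_proportion_ztest(12, 200, 11, 210)
assert p > 0.05
```

A significant result would not by itself establish an ECOA violation, but it is the documented trigger for the remediation and reporting steps that follow.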

Step 5: Prepare MRM Documentation Package

Create a Copilot MRM documentation package containing:

  1. Model overview: Capabilities, limitations, vendor information, and tier classification with proportionality rationale (if applicable)
  2. Use-case register: Approved and prohibited uses
  3. Validation approach: Output monitoring, use-case testing, vendor attestation
  4. Monitoring results: Quarterly output quality metrics and trends
  5. Fair lending assessment: Testing methodology and results
  6. Compensating controls: Human review requirements, DLP, Communication Compliance
  7. Vendor due diligence: Microsoft responsible AI documentation review
  8. Risk assessment: Residual risk after compensating controls
  9. Governance: Committee oversight, review cadence, escalation procedures

Financial Sector Considerations

OCC and Federal Reserve Examination Focus

Banking regulators increasingly ask about AI and LLM governance during examinations. Institutions should expect questions about:

  • How Copilot is classified in the model inventory and whether the proportionality determination is documented
  • What validation has been performed on Copilot outputs (recognizing the vendor-model limitation)
  • How fair lending risks from Copilot are monitored
  • What compensating controls exist for model risk areas that cannot be directly validated
  • How the institution assesses vendor (Microsoft) model governance practices

Proportionality Principle (OCC Bulletin 2025-26)

MRM expectations scale with usage scope, institutional size, and risk. OCC Bulletin 2025-26 formalizes what practitioners have long recognized: a community bank using Copilot only for internal meeting summaries and document drafting has materially different MRM obligations than a large complex banking organization using Copilot to draft lending communications or assist with investment recommendations.

For institutions applying the proportionality principle, the documentation requirement is clear: record the usage scope, institution characteristics, and regulatory citation (OCC Bulletin 2025-26) supporting the simplified MRM approach. This documentation is the primary evidence that the institution made a reasoned, supported proportionality determination — not an informal decision to skip the MRM framework.

Vendor Model Risk vs. In-House Model Risk

Traditional MRM frameworks assume the institution has visibility into model internals. For vendor-provided LLMs like Copilot, the MRM focus shifts to:

  • Input controls: What data Copilot can access (permissions, DLP, information barriers)
  • Output controls: How Copilot outputs are reviewed, verified, and approved
  • Usage controls: Which use cases are approved and which are prohibited
  • Monitoring controls: How output quality and bias are tracked over time
  • Vendor governance: How Microsoft governs the underlying model

Interagency AI Guidance

The 2023 Interagency Guidance on AI from OCC, FDIC, Federal Reserve, NCUA, and CFPB reminds institutions that existing risk management frameworks -- including MRM -- apply to AI technologies. Institutions should not treat Copilot as exempt from MRM simply because it is a vendor-provided general-purpose tool rather than a purpose-built financial model.

Verification Criteria

# | Verification Step | Expected Outcome | Governance Level
1 | Verify Copilot is registered in the model inventory | Model entry exists with complete fields including risk tier, model owner, and proportionality rationale (if applicable) | Baseline
2 | Review use-case documentation | Approved and prohibited uses are documented and communicated | Baseline
3 | Verify known limitations are documented and communicated | Limitation documentation is available to all Copilot users | Baseline
4 | Verify proportionality determination is documented | OCC Bulletin 2025-26 cited with usage scope and institution characteristics supporting tier selection | Baseline (community banks)
5 | Review quarterly output quality metrics | Metrics are tracked and reported for high-risk use cases | Recommended
6 | Verify periodic output quality monitoring is implemented | Output monitoring covers accuracy, hallucination rate, and bias indicators | Recommended
7 | Verify fair lending testing is conducted | Testing methodology and results are documented | Recommended
8 | Review AI governance committee minutes for Copilot discussion | Copilot MRM is a regular agenda item | Recommended
9 | Verify annual comprehensive MRM assessment is completed | Assessment follows OCC 2011-12 structure and documents findings | Regulated
10 | Review third-party assessment results | Independent assessment validates governance framework effectiveness | Regulated
11 | Verify examination-ready MRM documentation package | Complete package is assembled and current | Regulated
12 | Review fair lending statistical testing results | Disparate impact testing shows no statistically significant bias in Copilot-assisted lending activities | Regulated

Additional Resources


FSI Copilot Governance Framework v1.2.1 - March 2026