Control 3.10: Hallucination Feedback Loop
Control ID: 3.10 | Pillar: Reporting | Regulatory Reference: CFPB UDAAP, SOX 302, FINRA 4511, SEC 17a-4 | Governance Levels: Baseline / Recommended / Regulated | Last UI Verified: January 2026 | Last Verified: 2026-02-03
Objective
Establish a systematic process for capturing, categorizing, and remediating AI agent hallucinations (factually incorrect, fabricated, or misleading outputs) to enable continuous improvement of agent accuracy and provide quality management evidence for regulatory purposes.
Why This Matters for FSI
- CFPB UDAAP: Tracks and remediates misleading outputs that could constitute unfair or deceptive practices
- SOX 302: Supports the accuracy of financial information delivered by AI agents
- FINRA 4511: Documents quality management processes for books and records
- SEC 17a-4: Preserves hallucination evidence as part of record retention requirements
Control Description
Detection Limitations (January 2026)
No automated hallucination detection exists in Microsoft Copilot Studio. All hallucination identification relies on manual user feedback (CSAT thumbs up/down, explicit flagging) and human review. Organizations must implement structured feedback collection and manual review workflows rather than expecting automatic detection of inaccurate outputs. Published research on LLM accuracy varies significantly by model, domain, and evaluation methodology; organizations should establish their own baseline accuracy metrics for each agent use case rather than relying on generalized industry estimates.
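Since the control calls for per-use-case baselines rather than generalized industry estimates, one way to establish them is to compute accuracy rates from manually reviewed response samples. A minimal sketch (the `ReviewedResponse` shape and agent names are illustrative, not part of any Copilot Studio API):

```python
from dataclasses import dataclass

@dataclass
class ReviewedResponse:
    agent: str       # which agent produced the response
    accurate: bool   # human reviewer's verdict

def baseline_accuracy(reviews: list[ReviewedResponse]) -> dict[str, float]:
    """Per-agent accuracy rate from a manually reviewed sample."""
    totals: dict[str, list[int]] = {}
    for r in reviews:
        hits, n = totals.setdefault(r.agent, [0, 0])
        totals[r.agent] = [hits + r.accurate, n + 1]
    return {agent: hits / n for agent, (hits, n) in totals.items()}
```

Re-running this over periodic review samples gives each agent its own baseline to compare future hallucination rates against.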
This control establishes a hallucination feedback loop through:
- User Feedback Collection - Thumbs down, flag, and report mechanisms
- Hallucination Categorization - Taxonomy for classifying inaccuracy types (factual errors, fabrications, outdated info, calculation errors)
- Remediation Tracking - Workflow with defined SLAs by severity (Critical: 4hrs, High: 24hrs, Medium: 72hrs)
- Trend Analysis - Pattern identification and dashboards to detect systemic problems
- Continuous Improvement - Integration with knowledge source updates and prompt refinement
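The severity-based SLAs above (Critical: 4 hrs, High: 24 hrs, Medium: 72 hrs) can be encoded directly in the remediation workflow so each report carries a computed deadline. A sketch, assuming the three severity labels from this control:

```python
from datetime import datetime, timedelta, timezone

# SLA targets from the control definition.
SLA = {
    "Critical": timedelta(hours=4),
    "High": timedelta(hours=24),
    "Medium": timedelta(hours=72),
}

def remediation_deadline(severity: str, reported_at: datetime) -> datetime:
    """Deadline by which a hallucination report must be remediated."""
    try:
        return reported_at + SLA[severity]
    except KeyError:
        raise ValueError(f"Unknown severity: {severity!r}") from None
```

In a Power Automate implementation the same mapping would drive the escalation timer on each tracking item.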
Key Configuration Points
- Enable user feedback (thumbs up/down) in Copilot Studio agent settings
- Define hallucination taxonomy: Factual Error, Fabrication, Outdated, Misattribution, Calculation Error, Conflation, Overconfidence, Misleading
- Create SharePoint tracking list or integrate with ServiceNow/Jira
- Configure Power Automate workflows for automated routing and escalation
- Set up trend reporting dashboards in Power BI
- Establish severity-based SLAs and escalation paths
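The taxonomy and tracking-list schema above can be made explicit in code so categorization is consistent across SharePoint, ServiceNow, or Jira. A sketch using the eight categories from this control (the `TrackingItem` field names are illustrative, not a fixed SharePoint schema):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum

class HallucinationType(Enum):
    """Taxonomy from the Key Configuration Points."""
    FACTUAL_ERROR = "Factual Error"
    FABRICATION = "Fabrication"
    OUTDATED = "Outdated"
    MISATTRIBUTION = "Misattribution"
    CALCULATION_ERROR = "Calculation Error"
    CONFLATION = "Conflation"
    OVERCONFIDENCE = "Overconfidence"
    MISLEADING = "Misleading"

@dataclass
class TrackingItem:
    """Shape of one tracking-list record (illustrative field names)."""
    agent: str
    category: HallucinationType
    severity: str  # "Critical" / "High" / "Medium"
    reported_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))
    status: str = "New"
```

Pinning the taxonomy in an enum keeps the category values used by the workflow, the dashboards, and the tracking list from drifting apart.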
Automation Available
See Hallucination Tracker in FSI-AgentGov-Solutions for multi-source feedback collection, pattern detection with clustering, and integration with FINRA Supervision Workflow for high-severity hallucinations.
Mitigation Strategies
Since automated detection is not available, organizations should implement proactive mitigation:
| Strategy | Description | FSI Application |
|---|---|---|
| Explicit Fallbacks | Configure "I don't know" responses for low-confidence queries | Helps prevent fabrication in compliance-sensitive contexts |
| Grounding Requirements | Require citation of source documents for factual claims | RAG-based agents surface source attribution |
| Human-in-the-Loop | Require human approval for high-stakes outputs | Investment advice, regulatory filings |
| Response Confidence Thresholds | Filter responses below confidence threshold | Reduce low-quality outputs reaching users |
| Source Restriction | Limit knowledge sources to verified content | Reduce reliance on potentially inaccurate data |
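The "Explicit Fallbacks" and "Response Confidence Thresholds" strategies can be combined in a single response guard. A minimal sketch, assuming the agent platform exposes a confidence score per response (the 0.7 threshold and fallback wording are placeholders to be tuned per use case):

```python
FALLBACK = ("I don't know the answer to that. "
            "Please contact a compliance specialist.")

def guarded_response(answer: str, confidence: float,
                     threshold: float = 0.7) -> str:
    """Return the agent's answer only when confidence clears the
    threshold; otherwise fall back to an explicit 'I don't know'."""
    return answer if confidence >= threshold else FALLBACK
```

The trade-off is deliberate: in compliance-sensitive contexts, declining to answer is cheaper than remediating a fabrication.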
Feedback Capture Mechanisms
| Mechanism | Native Support | Data Location | FSI Use Case |
|---|---|---|---|
| CSAT (Thumbs) | Yes | Copilot Studio Analytics | Basic quality signal |
| Custom Feedback Form | Via Topics | SharePoint/Dataverse | Structured categorization |
| Application Insights | Yes | Custom telemetry | Detailed conversation analysis |
| Conversation Transcript | Yes | Dataverse | Full context for root cause analysis (RCA) |
Zone-Specific Requirements
| Zone | Requirement | Rationale |
|---|---|---|
| Zone 1 (Personal) | Basic thumbs up/down; quarterly review of patterns | Low risk, minimal tracking needed |
| Zone 2 (Team) | Structured feedback with categorization; weekly review; integration with team issue tracking | Shared agents need quality monitoring |
| Zone 3 (Enterprise) | Comprehensive tracking with full workflow; real-time alerting; formal RCA for critical issues; regulatory-ready documentation | Customer-facing agents require rigorous quality control |
Roles & Responsibilities
| Role | Responsibility |
|---|---|
| AI Governance Lead | Feedback process ownership, remediation oversight, trend analysis |
| Power Platform Admin | Configure feedback mechanisms, workflow automation |
| QA Lead | Validate fixes, verify remediation effectiveness |
| Content Owner | Update knowledge sources based on root cause findings |
Related Controls
| Control | Relationship |
|---|---|
| 2.9 - Performance Monitoring | Baseline quality metrics for comparison |
| 3.4 - Incident Reporting | Critical issue escalation path |
| 2.16 - RAG Source Integrity | Knowledge source updates when root cause identified |
Implementation Playbooks
Step-by-Step Implementation
This control has detailed playbooks for implementation, automation, testing, and troubleshooting:
- Portal Walkthrough — Step-by-step portal configuration
- PowerShell Setup — Automation scripts
- Verification & Testing — Test cases and evidence collection
- Troubleshooting — Common issues and resolutions
Verification Criteria
Confirm control effectiveness by verifying:
- A test thumbs-down submission creates a tracking item with the correct categorization
- Critical hallucinations trigger incident creation within 4-hour SLA
- Remediation workflow progresses through all status stages correctly
- Trend reports generate and display metrics (hallucination rate, MTTR, category distribution)
- Root cause findings integrate with knowledge source updates (Control 2.16)
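The trend metrics named above (hallucination rate, MTTR) are simple to define precisely, which helps keep Power BI reports and verification tests consistent. A sketch of both calculations (input shapes are illustrative):

```python
from datetime import datetime, timedelta

def hallucination_rate(confirmed: int, total_responses: int) -> float:
    """Confirmed hallucinations per agent response in a reporting period."""
    return confirmed / total_responses if total_responses else 0.0

def mttr(resolutions: list[tuple[datetime, datetime]]) -> timedelta:
    """Mean time to remediate: average of (resolved - reported) pairs."""
    if not resolutions:
        return timedelta(0)
    total = sum((done - reported for reported, done in resolutions),
                timedelta(0))
    return total / len(resolutions)
```

Comparing MTTR per severity band against the 4/24/72-hour SLAs is one direct way to evidence SLA compliance in the trend reports.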
Additional Resources
- Microsoft Learn: Copilot Studio Analytics
- Microsoft Learn: Customer Satisfaction Settings
- Microsoft Learn: Power Automate Approval Workflows
Updated: January 2026 | Version: v1.2 | UI Verification Status: Current