Skip to content

Control 2.9: Agent Performance Monitoring and Optimization

Control ID: 2.9
Pillar: Management
Regulatory Reference: FINRA 4511, FINRA 25-07, GLBA 501(b), SEC 17a-3/4, SOX 404, OCC 2011-12, Fed SR 11-7
Last UI Verified: April 2026
Governance Levels: Baseline / Recommended / Regulated


Objective

Establish comprehensive performance monitoring and optimization for AI agents to ensure reliable operation, service level compliance, and quality management. Financial regulators require demonstration that automated systems perform reliably and do not degrade over time.


Why This Matters for FSI

  • FINRA 4511, FINRA 25-07: Performance records help meet books-and-records requirements for automated systems and aid supervisory review of AI tools
  • GLBA 501(b): Continuous monitoring helps protect the integrity and quality of customer-information processing
  • SEC 17a-3/4: Agent operation telemetry retained on WORM-capable storage supports regulatory examination of automated systems (SEC 17a-4(f))
  • SOX 404: Control-effectiveness monitoring contributes evidence that automated controls operate as designed
  • OCC 2011-12 / Fed SR 11-7: Ongoing performance monitoring is a required element of model risk management (MRM) — degradation, drift, and outcome quality must be tracked over time for any model used in significant business processes

Automation Available

See Hallucination Tracker in FSI-AgentGov-Solutions for feedback aggregation for hallucination pattern analysis.

Control Description

This control establishes performance monitoring through:

  1. KPI Definition - Define metrics (response time, error rate, resolution rate, CSAT, containment rate)
  2. Platform Analytics - Enable Power Platform and Copilot Studio analytics
  3. Custom Dashboards - Create Power BI dashboards for executive reporting
  4. Alerting Configuration - Configure threshold-based alerts for degradation
  5. Anomaly Detection - Implement AI-powered anomaly detection for Zone 3 agents
  6. Review Cadence - Establish weekly/monthly/quarterly performance review processes

Built-In vs Custom Monitoring Capabilities

Capability Platform Availability Implementation
Response time metrics Built-in (Copilot Studio Analytics) Native configuration
Error rates Built-in Native configuration
CSAT scores Built-in Native configuration
Answer quality scores Built-in Native configuration
Session/conversation counts Built-in Native configuration
Application Insights telemetry Supported integration (link an App Insights resource to the Copilot Studio agent) Native integration; custom KQL required for analysis
RAI-specific telemetry Not built-in Custom implementation via Application Insights custom events
Hallucination tracking Not built-in Custom implementation using Azure AI Evaluation SDK or DSPM for AI signals
Grounding accuracy metrics Not built-in Custom implementation required (citation validation logic)

Custom Implementation Required for RAI Monitoring

Copilot Studio provides operational metrics (response time, errors, CSAT) but does not include built-in responsible AI (RAI) telemetry such as hallucination detection, grounding accuracy measurement, or content safety violation tracking. Organizations requiring these capabilities must implement custom solutions using Azure AI Evaluation SDK, Application Insights custom events, or third-party RAI monitoring tools.


Key Configuration Points

Built-In Analytics (Native)

  • Define performance KPIs per governance zone (response time, error rate, CSAT targets)
  • Enable analytics in Power Platform Admin Center → Analytics → Copilot Studio
  • Configure data export to Azure Data Lake for advanced analytics
  • Create Power BI dashboards with KPI cards, trend analysis, and SLA compliance
  • Set up Power Automate alerts for error rate > threshold (5%/2%/1% by zone)
  • Enable Azure Monitor smart detection for Zone 3 agents
  • Establish performance review meetings (weekly ops, monthly business, quarterly executive)
  • Enable Hybrid Analytics — unified analytics view combining Copilot Studio and Microsoft Foundry agent telemetry into a single monitoring pane
  • Configure Autonomous Agent Analytics — specialized dashboards for monitoring autonomous (unattended) agent performance, including trigger success rates, action completion, and escalation frequency
  • Leverage ROI / Time & Cost Savings Analytics — built-in analytics that quantify agent business value through time saved, cost reduction, and operational efficiency metrics for executive reporting
  • Link an Azure Application Insights resource to each Zone 2/3 agent for enterprise-grade telemetry (sessions, latency percentiles, error stack traces, custom events) queryable via KQL

RAI Telemetry (Custom Implementation Required)

  • Implement Application Insights custom events for content safety violations
  • Deploy Azure AI Evaluation SDK for hallucination detection in test pipelines
  • Create custom grounding accuracy metrics using citation validation logic
  • Build Power Automate flows to aggregate RAI events into dashboards
  • Configure alerting on RAI metric thresholds (e.g., hallucination rate > 5%)

Implementation Guidance

See Verification & Testing playbook for RAI telemetry implementation patterns.


Zone-Specific Requirements

Zone Requirement Implementation Rationale
Zone 1 (Personal) Basic analytics (built-in); monthly review; error rate alerting only Native configuration Low risk, minimal monitoring needed
Zone 2 (Team) Standard dashboards (built-in); weekly review; error rate and response time alerts Native configuration Shared agents need quality monitoring
Zone 3 (Enterprise) Full monitoring including RAI telemetry; daily + real-time review; all metrics including hallucination tracking; 24/7 on-call Native + custom implementation Customer-facing requires maximum visibility including RAI compliance

Zone 3 RAI Monitoring

Zone 3 agents in customer-facing roles should implement custom RAI telemetry to track hallucination rates, grounding accuracy, and content safety events. This requires custom development using Azure AI Evaluation SDK or equivalent tooling.

Sovereign Cloud Limitations (GCC / GCC High / DoD)

Copilot Studio analytics, Application Insights integration, and Power Platform analytics export to Azure Data Lake have reduced or staggered availability in US Government clouds. Verify feature parity for your specific tenant against Microsoft Learn before committing to a monitoring design — a configuration that produces complete telemetry in commercial may produce partial or empty data in GCC High / DoD, creating false-clean evidence for audit. Review the Microsoft 365 US Government service descriptions for current parity.


Roles & Responsibilities

Role Responsibility
Power Platform Admin Configure tenant analytics, enable Power Platform analytics export, manage Power BI workspaces
AI Administrator Configure Copilot Studio agent-level analytics and Application Insights linkage
AI Governance Lead Define KPIs per zone, oversee review cadence, trend analysis, MRM coordination
Model Risk Manager Review monitoring evidence against OCC 2011-12 / Fed SR 11-7 requirements
Operations Team Monitor alerts, triage degradation, execute optimization runbooks
Agent Owner Review agent-specific performance, implement improvements, attest accuracy

Control Relationship
3.1 - Agent Inventory Performance linked to inventory metadata
3.2 - Usage Analytics Usage metrics inform performance baselines
2.6 - Model Risk Management Performance monitoring supports MRM validation
3.10 - Hallucination Feedback Hallucination patterns inform model degradation (Hallucination Tracker)
1.7 - Audit Logging Performance events captured in audit logs

Implementation Playbooks

Step-by-Step Implementation

This control has detailed playbooks for implementation, automation, testing, and troubleshooting:

Agent Usage & Performance Workbook

The Agent Usage & Performance Workbook provides pre-built performance monitoring dashboards including response latency percentiles (p50/p95/p99), error rates by type, and operational health indicators — deployable via Azure Monitor with RBAC-scoped access. See the Deployment Guide for setup instructions.


Verification Criteria

Confirm control effectiveness by verifying:

Built-In Analytics (All Zones)

  1. Analytics data flows to Power Platform Admin Center reports
  2. Power BI dashboard displays current metrics with working drill-down
  3. Test alert triggers notification when threshold temporarily lowered
  4. Performance review meetings scheduled with documented actions
  5. Usage insights digest arrives at configured recipient addresses

RAI Telemetry (Zone 3, Custom Implementation)

  1. Custom Application Insights events capture content safety violations (if implemented)
  2. Hallucination detection pipeline produces accuracy reports (if implemented)
  3. Grounding accuracy metrics visible in custom dashboard (if implemented)

Verification Scope

Items 6-8 apply only to organizations that have implemented custom RAI telemetry. These are not built-in platform capabilities.


Additional Resources


Updated: April 2026 | Version: v1.4.0 | UI Verification Status: Current