Control 2.9: Agent Performance Monitoring and Optimization
Overview
Control ID: 2.9
Control Name: Agent Performance Monitoring and Optimization
Regulatory Reference: FINRA 4511, GLBA 501(b), SEC 17a-3/4, SOX 404
Setup Time: 2-3 hours initial setup, ongoing monitoring
Purpose
This control establishes comprehensive performance monitoring and optimization for AI agents in financial institutions. Regulators require financial services organizations to demonstrate that automated systems perform reliably, meet service level objectives, and do not degrade over time. This control defines KPIs, monitoring infrastructure, alerting thresholds, and optimization processes to ensure agents meet performance standards required for customer-facing and business-critical operations.
Prerequisites
Primary Owner Admin Role: Power Platform Admin
Supporting Roles: None
Required Licenses
- Microsoft 365 E3/E5 (for Microsoft 365 admin analytics)
- Power Platform per-user or per-app licenses
- Copilot Studio standard/premium
- Azure Monitor (for custom dashboards)
Required Permissions
- Power Platform Administrator (analytics access)
- Copilot Studio Environment Maker (agent analytics)
- Azure Monitor Contributor (custom monitoring)
Dependencies
- Control 3.1 (Agent Inventory)
- Control 3.2 (Usage Analytics)
- Control 2.6 (Model Risk Management)
Pre-Setup Checklist
- [ ] Performance baselines defined
- [ ] SLA/SLO targets documented
- [ ] Alert recipients identified
- [ ] Escalation procedures established
Governance Levels
Baseline (Level 1)
Monitor agent performance (latency, error rate, usage); document baseline metrics.
Recommended (Level 2-3)
Real-time performance dashboard; alert on performance degradation; monthly optimization review.
Regulated/High-Risk (Level 4)
Comprehensive performance monitoring with continuous anomaly detection; SLA compliance tracking.
Setup & Configuration
Step 1: Define Performance KPIs
Establish key performance indicators for agent monitoring.
Core Performance Metrics:
| Metric | Description | Tier 1 Target | Tier 2 Target | Tier 3 Target |
|---|---|---|---|---|
| Response Time | P50 latency for agent responses | < 5 sec | < 3 sec | < 2 sec |
| P95 Response Time | 95th percentile latency | < 10 sec | < 7 sec | < 5 sec |
| Error Rate | % of failed interactions | < 5% | < 2% | < 1% |
| Availability | Uptime percentage | 99% | 99.5% | 99.9% |
| Resolution Rate | % resolved without escalation | > 60% | > 75% | > 85% |
| User Satisfaction | CSAT score (1-5) | > 3.5 | > 4.0 | > 4.2 |
| Containment Rate | % handled without human | > 50% | > 65% | > 75% |
| Topic Match Rate | % correctly routed | > 70% | > 80% | > 90% |
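The latency targets above are percentile-based. As a minimal sketch (assuming you have an export of per-session records; the file name and the DurationSeconds/Outcome columns are illustrative placeholders, not a fixed Copilot Studio schema), P50/P95 latency and error rate can be derived as follows:
# Sketch: derive P50/P95 latency and error rate from an exported set of session records
# Column names (DurationSeconds, Outcome) are assumptions about your export format
$Sessions  = Import-Csv "Agent_Sessions_Export.csv"
$Durations = $Sessions | ForEach-Object { [double]$_.DurationSeconds } | Sort-Object
$P50 = $Durations[[int][math]::Floor(0.50 * ($Durations.Count - 1))]
$P95 = $Durations[[int][math]::Floor(0.95 * ($Durations.Count - 1))]
$ErrorRate = 100 * ($Sessions | Where-Object { $_.Outcome -eq 'Error' }).Count / $Sessions.Count
Write-Host ("P50: {0}s  P95: {1}s  Error rate: {2:N1}%" -f $P50, $P95, $ErrorRate)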
Step 2: Enable Power Platform Analytics
Power Platform Admin Center:
- Navigate to admin.powerplatform.microsoft.com
- Go to Analytics → Copilot Studio
- Verify analytics is enabled
- Review available reports:
- Sessions: Total interactions, session duration
- Topics: Most used, escalation rates
- Outcomes: Resolution rates, abandonment
- Customer satisfaction: Survey responses
Configure Data Export (for advanced analytics):
- Go to Settings → Data export
- Enable export to Azure Data Lake or Azure Synapse
- Configure export frequency: Near real-time
Step 3: Configure Copilot Studio Analytics
Copilot Studio:
- Navigate to copilotstudio.microsoft.com
- Select your agent → Analytics
- Review built-in dashboards:
- Summary: Key metrics overview
- Customer satisfaction: Survey results
- Sessions: Detailed session analytics
- Topics: Topic performance
Enable Enhanced Analytics:
- Go to Settings → Generative AI
- Enable Conversation transcripts (with appropriate consent)
- Configure AI-generated analytics insights
Step 4: Create Custom Performance Dashboard
Build a Power BI dashboard for executive reporting.
Create Power BI Report:
- Navigate to app.powerbi.com
- Create new Report → Connect to Copilot Studio data
- Import data from Copilot Studio analytics API
- Create visualizations:
Dashboard Sections:
| Section | Visuals | Refresh |
|---|---|---|
| Executive Summary | KPI cards (sessions, resolution rate, CSAT) | Daily |
| Trend Analysis | Line charts (7-day, 30-day, 90-day trends) | Daily |
| Topic Performance | Bar chart by topic, heat map | Daily |
| Error Analysis | Error breakdown, escalation funnel | Real-time |
| SLA Compliance | Gauge charts vs. targets, trend | Real-time |
| Agent Comparison | Table comparing all agents | Weekly |
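If you need a scriptable data source alongside the built-in connectors, one practical option is reading the environment's Dataverse conversationtranscript table through the Dataverse Web API. The sketch below is illustrative only; it assumes you already hold an OAuth access token for the environment, and the organization URL and token are placeholders:
# Sketch: pull recent conversation transcripts from Dataverse as a dashboard data source
$OrgUrl      = "https://yourorg.crm.dynamics.com"   # placeholder - replace with your environment URL
$AccessToken = "<bearer token>"                     # placeholder - acquire via your standard app registration flow
$Since  = (Get-Date).ToUniversalTime().AddDays(-7).ToString("yyyy-MM-ddTHH:mm:ssZ")
$Filter = [uri]::EscapeDataString("createdon ge $Since")
$Uri     = "$OrgUrl/api/data/v9.2/conversationtranscripts?`$select=conversationtranscriptid,createdon&`$filter=$Filter"
$Headers = @{ Authorization = "Bearer $AccessToken"; "OData-MaxVersion" = "4.0"; Accept = "application/json" }
$Transcripts = (Invoke-RestMethod -Uri $Uri -Headers $Headers -Method Get).value
Write-Host "Sessions captured in the last 7 days: $($Transcripts.Count)"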
Step 5: Configure Alerting
Set up alerts for performance degradation.
Power Automate Alert Flow:
- Navigate to make.powerautomate.com
- Create Scheduled cloud flow (every 15 minutes)
- Implement performance checks:
// Flow Logic
1. Query Copilot Studio analytics API for last 15 minutes
2. Calculate current error rate, avg response time
3. Compare against threshold:
- Error rate > 5%: Send alert
- Response time > 5 seconds: Send alert
- No sessions in 30 min (for 24/7 agent): Send alert
4. If threshold exceeded:
- Send Teams message to #agent-operations channel
- Create incident in ServiceNow/Jira
- Log alert in SharePoint for tracking
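The same threshold check can also run as a scheduled script outside Power Automate. The sketch below assumes the current metrics have already been retrieved into variables and that a Teams incoming webhook has been provisioned for the #agent-operations channel; the metric values and webhook URL are placeholders:
# Sketch: threshold comparison plus Teams incoming-webhook notification
$CurrentErrorRate    = 6.2    # percent, example value from your analytics export
$CurrentResponseTime = 7.8    # seconds, example value from your analytics export
$WebhookUrl          = "https://contoso.webhook.office.com/webhookb2/..."   # placeholder
$Violations = @()
if ($CurrentErrorRate -gt 5)    { $Violations += "Error rate ${CurrentErrorRate}% exceeds the 5% threshold" }
if ($CurrentResponseTime -gt 5) { $Violations += "Response time ${CurrentResponseTime}s exceeds the 5s threshold" }
if ($Violations.Count -gt 0) {
    $Body = @{ text = "Agent performance alert:`n" + ($Violations -join "`n") } | ConvertTo-Json
    Invoke-RestMethod -Uri $WebhookUrl -Method Post -ContentType 'application/json' -Body $Body
}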
Azure Monitor Alerts (for advanced scenarios):
- Navigate to portal.azure.com → Monitor
- Go to Alerts → Create alert rule
- Configure:
- Scope: Copilot Studio/Power Platform
- Condition: Custom log query for error spikes
- Action group: Email, Teams, SMS to on-call team
Step 6: Implement Anomaly Detection
Configure AI-powered anomaly detection for Tier 3 agents.
Azure Application Insights (if integrated):
- Navigate to portal.azure.com → Application Insights
- Go to Smart Detection → Settings
- Enable:
- Failure Anomalies: Auto-detect spike in failures
- Slow Response Detection: Detect latency degradation
- Memory Leak Detection: Identify resource issues
Custom Anomaly Detection with Power Automate:
- Maintain rolling 7-day average for key metrics
- Alert when current value deviates > 2 standard deviations
- Correlate anomalies with deployments/changes
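As a minimal sketch of the 2-standard-deviation rule above (the 7-day metric history and today's value are illustrative):
# Sketch: flag a metric that deviates more than 2 standard deviations from its 7-day average
$History = 1.2, 1.4, 1.1, 1.6, 1.3, 1.5, 1.2   # illustrative daily error-rate history (%)
$Today   = 3.8                                  # illustrative current value (%)
$Avg    = ($History | Measure-Object -Average).Average
$Var    = ($History | ForEach-Object { [math]::Pow($_ - $Avg, 2) } | Measure-Object -Sum).Sum / $History.Count
$StdDev = [math]::Sqrt($Var)
if ([math]::Abs($Today - $Avg) -gt (2 * $StdDev)) {
    Write-Warning ("Anomaly: current value {0} deviates more than 2 standard deviations from the 7-day average ({1:N2})" -f $Today, $Avg)
}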
Step 7: Establish Performance Review Cadence
Document recurring performance review process.
Weekly Operational Review:
- Participants: Platform team, Agent owners
- Metrics reviewed: Error rate, response time, top escalations
- Action: Immediate fixes for any red metrics
- Duration: 30 minutes
Monthly Performance Review:
- Participants: AI Governance Lead, Business owners, Platform team
- Metrics reviewed: All KPIs, trend analysis, SLA compliance
- Action: Optimization roadmap updates
- Duration: 1 hour
Quarterly Business Review:
- Participants: Executive sponsor, AI Governance Lead, Compliance
- Metrics reviewed: Business outcomes, ROI, compliance metrics
- Action: Strategic direction, budget, expansion plans
- Duration: 2 hours
Step 8: Document Optimization Process
Create optimization runbook for performance issues.
Optimization Playbook:
| Issue | Diagnostic Steps | Resolution Options |
|---|---|---|
| High Error Rate | 1. Check error logs 2. Identify failing topics 3. Review recent changes | Rollback recent changes, fix topic logic, update knowledge sources |
| Slow Response | 1. Check API latency 2. Review topic complexity 3. Check connector performance | Optimize topic flows, cache connector responses, reduce entity extractions |
| Low Resolution | 1. Analyze escalation topics 2. Review conversation transcripts 3. Check knowledge gaps | Add missing topics, improve knowledge sources, refine entity definitions |
| Poor CSAT | 1. Review low-rated conversations 2. Identify pain points 3. Analyze sentiment | Improve response quality, add escalation paths, enhance personalization |
PowerShell Configuration
# ============================================================
# Control 2.9: Agent Performance Monitoring and Optimization
# ============================================================
# Connect to required services (the Microsoft Graph connection is optional here; the agent inventory below uses the Power Apps admin module)
Connect-MgGraph -Scopes "Reports.Read.All"
Import-Module Microsoft.PowerApps.Administration.PowerShell
# -------------------------------------------------------------
# Section 1: Retrieve Agent Performance Data
# -------------------------------------------------------------
Write-Host "Retrieving agent performance data..." -ForegroundColor Cyan
# Get all Copilot Studio agents
$Environments = Get-AdminPowerAppEnvironment
$AgentPerformance = @()
foreach ($Env in $Environments) {
    # Get chatbots (Copilot Studio agents)
    $Chatbots = Get-AdminPowerVirtualAgentsBots -EnvironmentName $Env.EnvironmentName -ErrorAction SilentlyContinue
    foreach ($Bot in $Chatbots) {
        # Note: Detailed analytics require Copilot Studio API or Dataverse queries
        # This provides basic inventory with metadata
        $AgentPerformance += [PSCustomObject]@{
            Environment = $Env.DisplayName
            AgentName   = $Bot.Name
            AgentId     = $Bot.BotId
            Created     = $Bot.CreatedDateTime
            Modified    = $Bot.LastModifiedDateTime
            Status      = $Bot.Status
            Version     = $Bot.Version
        }
    }
}
Write-Host "Found $($AgentPerformance.Count) agents across all environments" -ForegroundColor Green
$AgentPerformance | Format-Table -AutoSize
# -------------------------------------------------------------
# Section 2: Configure Performance Thresholds
# -------------------------------------------------------------
Write-Host "`nConfiguring performance thresholds..." -ForegroundColor Cyan
$PerformanceThresholds = @{
    Tier1 = @{
        ResponseTime_P50_Seconds = 5
        ResponseTime_P95_Seconds = 10
        ErrorRate_Percent        = 5
        Availability_Percent     = 99
        ResolutionRate_Percent   = 60
        CSAT_Min                 = 3.5
    }
    Tier2 = @{
        ResponseTime_P50_Seconds = 3
        ResponseTime_P95_Seconds = 7
        ErrorRate_Percent        = 2
        Availability_Percent     = 99.5
        ResolutionRate_Percent   = 75
        CSAT_Min                 = 4.0
    }
    Tier3 = @{
        ResponseTime_P50_Seconds = 2
        ResponseTime_P95_Seconds = 5
        ErrorRate_Percent        = 1
        Availability_Percent     = 99.9
        ResolutionRate_Percent   = 85
        CSAT_Min                 = 4.2
    }
}
# Export thresholds for reference
$PerformanceThresholds | ConvertTo-Json -Depth 3 | Out-File "Performance_Thresholds.json"
Write-Host "Performance thresholds exported to Performance_Thresholds.json" -ForegroundColor Green
# -------------------------------------------------------------
# Section 3: Create Performance Monitoring Dataverse Table
# -------------------------------------------------------------
Write-Host "`nDataverse table schema for performance tracking..." -ForegroundColor Cyan
$TableSchema = @"
CREATE DATAVERSE TABLE: fsi_agentperformancemetrics
Columns:
---------
| Column Name | Type | Description |
|-------------------------|-------------|---------------------------------------|
| fsi_agentid | Lookup | Link to Agent Registry |
| fsi_measurementdate | Date/Time | Timestamp of measurement |
| fsi_totalsessions | Whole Number| Total sessions in period |
| fsi_avgresponsetime | Decimal | Average response time (seconds) |
| fsi_p95responsetime | Decimal | P95 response time (seconds) |
| fsi_errorcount | Whole Number| Number of errors |
| fsi_errorrate | Decimal | Error rate percentage |
| fsi_resolutionrate | Decimal | Resolution rate percentage |
| fsi_escalationcount | Whole Number| Number of escalations |
| fsi_csatscore | Decimal | Average CSAT (1-5) |
| fsi_csatresponses | Whole Number| Number of CSAT responses |
| fsi_containmentrate | Decimal | Containment rate percentage |
| fsi_zone | Choice | Tier 1/2/3 |
| fsi_slacompliant | Yes/No | Meets SLA targets |
| fsi_periodtype | Choice | Hourly/Daily/Weekly/Monthly |
Create in: Power Apps Maker Portal -> Tables -> New table
"@
Write-Host $TableSchema -ForegroundColor Yellow
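# --- Optional illustration (sketch, not part of the baseline configuration) -----------------
# Example of shaping a daily measurement row for the fsi_agentperformancemetrics table,
# e.g. for loading via a dataflow, CSV import, or the Dataverse Web API.
# All values below are placeholders; the table must already exist with the schema above.
$PerformanceRecord = [PSCustomObject]@{
    fsi_agentid         = "<agent registry record id>"   # placeholder lookup value
    fsi_measurementdate = (Get-Date).ToString("s")
    fsi_totalsessions   = 412
    fsi_avgresponsetime = 2.7
    fsi_p95responsetime = 6.3
    fsi_errorcount      = 9
    fsi_errorrate       = 2.2
    fsi_resolutionrate  = 77.5
    fsi_escalationcount = 31
    fsi_csatscore       = 4.1
    fsi_csatresponses   = 58
    fsi_containmentrate = 71.0
    fsi_zone            = "Tier 2"
    fsi_slacompliant    = $true
    fsi_periodtype      = "Daily"
}
$PerformanceRecord | Export-Csv "Agent_Performance_Daily.csv" -NoTypeInformation -Append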
# -------------------------------------------------------------
# Section 4: Generate Sample Alert Configuration
# -------------------------------------------------------------
Write-Host "`nGenerating alert configuration..." -ForegroundColor Cyan
$AlertConfig = @{
    Alerts = @(
        @{
            Name           = "High Error Rate Alert"
            Condition      = "ErrorRate > Threshold"
            TierThresholds = @{ Tier1 = "5%"; Tier2 = "2%"; Tier3 = "1%" }
            Severity       = "Critical"
            Actions        = @("Teams notification", "ServiceNow ticket", "Page on-call")
            CheckInterval  = "15 minutes"
        },
        @{
            Name           = "Slow Response Alert"
            Condition      = "P95ResponseTime > Threshold"
            TierThresholds = @{ Tier1 = "10s"; Tier2 = "7s"; Tier3 = "5s" }
            Severity       = "High"
            Actions        = @("Teams notification", "Log to SharePoint")
            CheckInterval  = "15 minutes"
        },
        @{
            Name           = "Availability Alert"
            Condition      = "NoSessions in 30 minutes (for 24/7 agents)"
            TierThresholds = @{ Tier3 = "24/7 monitoring" }
            Severity       = "Critical"
            Actions        = @("Page on-call", "Escalate to platform team")
            CheckInterval  = "5 minutes"
        },
        @{
            Name           = "CSAT Drop Alert"
            Condition      = "CSAT drops > 0.5 from 7-day average"
            TierThresholds = @{ All = "0.5 point drop" }
            Severity       = "Medium"
            Actions        = @("Email to agent owner", "Add to review queue")
            CheckInterval  = "Daily"
        },
        @{
            Name           = "Anomaly Detection Alert"
            Condition      = "Any metric > 2 std dev from 7-day average"
            TierThresholds = @{ Tier3 = "Enabled" }
            Severity       = "High"
            Actions        = @("Teams notification", "Auto-create investigation ticket")
            CheckInterval  = "Hourly"
        }
    )
}
$AlertConfig | ConvertTo-Json -Depth 4 | Out-File "Alert_Configuration.json"
Write-Host "Alert configuration exported to Alert_Configuration.json" -ForegroundColor Green
# -------------------------------------------------------------
# Section 5: Performance Review Report Template
# -------------------------------------------------------------
Write-Host "`nGenerating performance review report template..." -ForegroundColor Cyan
$ReportTemplate = @"
===============================================================================
AGENT PERFORMANCE REVIEW REPORT
Report Period: [PERIOD START] to [PERIOD END]
Generated: $(Get-Date -Format 'yyyy-MM-dd HH:mm:ss')
===============================================================================
EXECUTIVE SUMMARY
-----------------
Total Active Agents: [COUNT]
Total Sessions: [SESSION COUNT]
Average Resolution Rate: [RATE]%
Average CSAT: [SCORE]/5
SLA COMPLIANCE
--------------
| Tier | Agents | SLA Met | SLA Missed | Compliance % |
|--------|--------|---------|------------|--------------|
| Tier 1 | [X] | [Y] | [Z] | [%] |
| Tier 2 | [X] | [Y] | [Z] | [%] |
| Tier 3 | [X] | [Y] | [Z] | [%] |
TOP PERFORMING AGENTS
---------------------
1. [Agent Name] - Resolution: [X]%, CSAT: [Y], Errors: [Z]%
2. [Agent Name] - Resolution: [X]%, CSAT: [Y], Errors: [Z]%
3. [Agent Name] - Resolution: [X]%, CSAT: [Y], Errors: [Z]%
AGENTS REQUIRING ATTENTION
--------------------------
1. [Agent Name] - Issue: [Description] - Action: [Required action]
2. [Agent Name] - Issue: [Description] - Action: [Required action]
TREND ANALYSIS
--------------
- Session volume: [Trend direction and %]
- Resolution rate: [Trend direction and %]
- Error rate: [Trend direction and %]
- CSAT: [Trend direction and points]
KEY INCIDENTS THIS PERIOD
-------------------------
| Date | Agent | Issue | Duration | Root Cause | Resolution |
|------------|----------|----------------|----------|------------|------------|
| [Date] | [Agent] | [Description] | [Time] | [Cause] | [Action] |
OPTIMIZATION ACTIONS TAKEN
--------------------------
1. [Action description] - Impact: [Measured improvement]
2. [Action description] - Impact: [Measured improvement]
RECOMMENDATIONS
---------------
1. [Recommendation for improvement]
2. [Recommendation for improvement]
3. [Recommendation for improvement]
NEXT REVIEW DATE: [DATE]
===============================================================================
"@
$ReportTemplate | Out-File "Performance_Review_Template.txt"
Write-Host "Report template exported to Performance_Review_Template.txt" -ForegroundColor Green
# -------------------------------------------------------------
# Section 6: Generate Compliance Summary
# -------------------------------------------------------------
Write-Host "`nGenerating performance monitoring compliance summary..." -ForegroundColor Cyan
$ComplianceSummary = @"
===============================================================================
PERFORMANCE MONITORING COMPLIANCE STATUS
Generated: $(Get-Date -Format 'yyyy-MM-dd HH:mm:ss')
===============================================================================
CONTROL 2.9 IMPLEMENTATION STATUS
---------------------------------
[?] KPI definitions documented
[?] Power Platform analytics enabled
[?] Custom dashboards created
[?] Alerting configured
[?] Anomaly detection enabled (Tier 3)
[?] Weekly review cadence established
[?] Monthly review cadence established
[?] Optimization runbook documented
REGULATORY ALIGNMENT
--------------------
✅ FINRA 4511: Performance records maintained
✅ GLBA 501(b): Customer interaction quality monitored
✅ SEC 17a-3/4: Agent operation records preserved
✅ SOX 404: Control effectiveness monitored
NEXT STEPS
----------
1. Complete all checklist items above
2. Document baseline metrics for each agent
3. Conduct first performance review
4. Validate alert routing works correctly
===============================================================================
"@
Write-Host $ComplianceSummary
Write-Host "`nPerformance monitoring configuration complete" -ForegroundColor Green
Financial Sector Considerations
Regulatory Alignment
| Regulation | Performance Requirement | Control Implementation |
|---|---|---|
| FINRA 4511 | Records of automated system performance | Performance metrics logged and retained |
| GLBA 501(b) | Protect customer information in processing | Monitor for data handling errors |
| SEC 17a-3/4 | Maintain records of communications | Conversation analytics retained |
| SOX 404 | Monitor control effectiveness | SLA compliance tracking |
| FFIEC CAT | System performance monitoring | Real-time dashboards and alerting |
| OCC 2011-12 | Ongoing model performance monitoring | Drift detection, revalidation triggers |
Zone-Specific Configuration
| Zone | Monitoring Level | Review Cadence | Alerting |
|---|---|---|---|
| Zone 1 - Personal | Basic analytics | Monthly | Error rate only |
| Zone 2 - Team | Standard dashboards | Weekly | Error rate, response time |
| Zone 3 - Enterprise | Full monitoring + anomaly detection | Daily + real-time | All metrics, 24/7 on-call |
FSI Performance Considerations
Customer-Facing Agents:
- Response time directly impacts customer experience
- Escalation to human agent must be seamless
- CSAT tracking essential for service quality
Regulatory Communication Agents:
- Accuracy paramount (incorrect information = regulatory risk)
- All interactions must be captured for compliance
- Performance degradation may indicate data quality issues
Trading/Transaction Agents:
- Near-zero latency requirements
- Error rate tolerance near zero
- Availability critical during market hours
Verification & Testing
Verification Steps
- Analytics Enabled
  - Power Platform Admin Center → Analytics → Copilot Studio
  - Verify data is flowing to reports
  - Confirm export to Data Lake (if configured)
- Dashboard Functional
  - Power BI → Open performance dashboard
  - Verify data refresh is current
  - Test drill-down functionality
- Alerting Active
  - Trigger test alert (temporarily lower threshold)
  - Verify notification received
  - Confirm incident ticket created
- Review Process Active
  - Check calendar for scheduled reviews
  - Verify review documentation exists
  - Confirm actions from reviews tracked
Compliance Checklist
- [ ] Performance KPIs defined for each governance tier
- [ ] Power Platform analytics enabled
- [ ] Custom Power BI dashboard deployed
- [ ] Alert thresholds configured per governance tier
- [ ] Alert routing verified (Teams, email, SMS)
- [ ] Anomaly detection enabled (Tier 3)
- [ ] Weekly review meetings scheduled
- [ ] Monthly review process documented
- [ ] Performance data retained per policy
Troubleshooting & Validation
Issue: Analytics Data Not Appearing
Symptoms: Empty dashboards, no data in reports
Solution:
- Verify analytics is enabled in Power Platform Admin Center
- Check agent has recent sessions (test with sample interaction)
- Confirm data export is configured correctly
- Review data latency (can be up to 24 hours for some reports)
Issue: Alerts Not Firing
Symptoms: Known issues not triggering alerts
Solution:
- Verify Power Automate flow is turned on
- Check flow run history for errors
- Validate API connection to analytics
- Review alert threshold configuration
- Test with manually lowered thresholds
Issue: High Error Rate With No Clear Cause
Symptoms: Elevated errors, no obvious pattern
Solution:
- Review error messages in conversation transcripts
- Check for API/connector failures
- Validate knowledge sources are accessible
- Review recent deployments/changes
- Check for platform-level issues (Microsoft Service Health)
Issue: Performance Dashboard Slow
Symptoms: Power BI dashboard takes long to load
Solution:
- Optimize data model (reduce unnecessary columns)
- Implement incremental refresh
- Use aggregated tables for high-volume data
- Schedule refresh during off-peak hours
- Consider DirectQuery for real-time needs
Additional Resources
- Copilot Studio analytics overview
- Power Platform analytics
- Export analytics data to Azure
- Create Power BI reports from Dataverse
- Azure Monitor alerts
Related Controls
| Control | Relationship |
|---|---|
| 3.1 - Agent Inventory | Performance linked to inventory |
| 3.2 - Usage Analytics | Usage metrics inform performance |
| 2.6 - Model Risk Management | Performance monitoring part of MRM |
| 1.7 - Audit Logging | Performance events logged |
| 2.4 - BC/DR | Performance during recovery |
Support & Questions
For implementation support or questions about this control, contact:
- AI Governance Lead (governance direction)
- Compliance Officer (regulatory requirements)
- Technical Implementation Team (platform setup)
Updated: Dec 2025
Version: v1.0 Beta (Dec 2025)
UI Verification Status: ❌ Needs verification