Control 2.4: Business Continuity and Disaster Recovery
Overview
- Control ID: 2.4
- Control Name: Business Continuity and Disaster Recovery
- Pillar: Management
- Regulatory Reference: GLBA 501(b), SOX 404, FINRA 4511, OCC 2011-12, FFIEC BC/DR Guidance
- Setup Time: 3-4 hours
Purpose
Business Continuity and Disaster Recovery (BC/DR) ensures that critical Copilot Studio agents remain available or can be rapidly restored following service disruptions, regional outages, or disaster scenarios. For financial services, agent availability may be essential for customer service, trading operations, or compliance functions. This control establishes backup procedures, recovery objectives, failover capabilities, and regular testing to meet regulatory requirements for operational resilience.
This control addresses key FSI requirements:
- Recovery Time Objective (RTO): Maximum acceptable downtime
- Recovery Point Objective (RPO): Maximum acceptable data loss
- Geo-Redundancy: Cross-region failover capability
- Regular Testing: Annual DR test with documented results
- Regulatory Compliance: FFIEC BC/DR examination readiness
Prerequisites
- Primary Owner Admin Role: Power Platform Admin
- Supporting Roles: Environment Admin
Required Licenses
| License | Purpose |
|---|---|
| Power Platform Environment capacity | Secondary region environments |
| Azure subscription | Backup storage (optional) |
| Microsoft 365 E3/E5 | Core platform availability |
Required Permissions
| Permission | Scope | Purpose |
|---|---|---|
| Power Platform Admin | Tenant-wide | Environment management |
| Environment Admin | All environments | Solution deployment and backup |
| Global Admin | Initial setup | Cross-region configuration |
| Storage Blob Data Contributor (Azure RBAC) | Backup storage account | Solution artifact storage |
Dependencies
- Control 2.1: Managed Environments - Environment governance
- Control 2.3: Change Management - Solution versioning
- Control 3.1: Agent Inventory - Critical agent list
Pre-Setup Checklist
- [ ] Critical agents identified and classified
- [ ] RTO/RPO requirements defined per agent
- [ ] Secondary region selected for geo-redundancy
- [ ] Backup storage location established
- [ ] DR testing schedule approved
Governance Levels
Baseline (Level 1)
Document a BC/DR plan for critical agents, including backup configurations and dependencies.
Recommended (Level 2-3)
Automated backup to secondary region; recovery time objective (RTO) <4 hours.
Regulated/High-Risk (Level 4)
Geo-redundant backup with <1 hour RTO; annual disaster recovery test with documented results.
Setup & Configuration
Step 1: Classify Agents by Criticality
Create Agent Criticality Assessment:
- Review Agent Inventory (from Control 3.1)
- Assign Criticality Tier:
| Tier | Description | RTO | RPO | Examples |
|---|---|---|---|---|
| Tier 1 - Critical | Business cannot operate | <1 hour | <15 min | Trading assistant, Payment processor |
| Tier 2 - High | Significant impact | <4 hours | <1 hour | Customer service, Compliance agent |
| Tier 3 - Medium | Moderate impact | <24 hours | <4 hours | Internal HR bot, IT help desk |
| Tier 4 - Low | Minimal impact | <72 hours | <24 hours | Personal productivity agents |
- Document in BC/DR Plan:
  - Agent ID and name
  - Business owner
  - Criticality tier
  - Dependencies (data sources, connectors, integrations)
  - Recovery priority order
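
The tier table and plan fields above can be captured as structured records so that recovery priority order falls out of a sort. The following is a minimal PowerShell sketch, not a prescribed implementation: the function name `New-AgentRecoveryRecord`, the `$RecoveryObjectives` table, and the sample agents and owners are hypothetical; only the tier names and RTO/RPO limits come from the table.

```powershell
# Hypothetical lookup table mirroring the criticality tiers in this control.
# RTO/RPO values are the upper bounds from the tier table.
$RecoveryObjectives = @{
    1 = @{ Name = 'Critical'; Rto = New-TimeSpan -Hours 1;  Rpo = New-TimeSpan -Minutes 15 }
    2 = @{ Name = 'High';     Rto = New-TimeSpan -Hours 4;  Rpo = New-TimeSpan -Hours 1 }
    3 = @{ Name = 'Medium';   Rto = New-TimeSpan -Hours 24; Rpo = New-TimeSpan -Hours 4 }
    4 = @{ Name = 'Low';      Rto = New-TimeSpan -Hours 72; Rpo = New-TimeSpan -Hours 24 }
}

function New-AgentRecoveryRecord {
    param(
        [Parameter(Mandatory)] [string]$AgentName,
        [Parameter(Mandatory)] [string]$BusinessOwner,
        [Parameter(Mandatory)] [ValidateRange(1, 4)] [int]$Tier,
        [string[]]$Dependencies = @()
    )
    $objective = $RecoveryObjectives[$Tier]
    [PSCustomObject]@{
        AgentName     = $AgentName
        BusinessOwner = $BusinessOwner
        Tier          = "Tier $Tier - $($objective.Name)"
        RtoHours      = $objective.Rto.TotalHours
        RpoMinutes    = $objective.Rpo.TotalMinutes
        Dependencies  = $Dependencies
    }
}

# Illustrative agents only; sorting by RTO gives a recovery priority order.
$plan = @(
    New-AgentRecoveryRecord -AgentName 'IT help desk' -BusinessOwner 'IT Ops' -Tier 3
    New-AgentRecoveryRecord -AgentName 'Trading assistant' -BusinessOwner 'Trading Desk' -Tier 1 -Dependencies 'Dataverse'
) | Sort-Object RtoHours

$plan | Format-Table AgentName, Tier, RtoHours, RpoMinutes
```

Sorting on the RTO value rather than the tier label keeps the ordering meaningful if an agent is later given a tighter objective than its tier default.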
Step 2: Create Secondary Region Environments
Portal Path: Power Platform Admin Center → Environments → + New
- Navigate to Power Platform Admin Center (admin.powerplatform.microsoft.com)
- Select Environments → Click + New
- Create DR Environment:
  - Name: "[Primary Env Name]-DR"
  - Region: Select different geographic region
  - Type: Production
  - Purpose: Disaster recovery for [Primary Env]
- Configure Environment:
  - Enable Managed Environment
  - Apply same DLP policies as primary
  - Configure same security roles
- Document Region Mapping:

| Primary Region | DR Region |
|---|---|
| East US | West US |

Select two regions that meet your organization's data residency and operational resilience requirements.
Step 3: Configure Automated Solution Backup
Portal Path: Azure DevOps → Pipelines → Create backup pipeline
- Create Backup Pipeline:
# backup-pipeline.yml
trigger: none  # No CI trigger; the pipeline runs only on the schedule below

schedules:
  - cron: "0 2 * * *"  # Daily at 2 AM (Azure DevOps cron schedules use UTC)
    displayName: Daily Solution Backup
    branches:
      include:
        - main
    always: true  # Run even when the branch has no new commits

pool:
  vmImage: 'windows-latest'

variables:
  - group: 'PowerPlatform-Backup'
  - name: BackupDate
    value: $[format('{0:yyyyMMdd_HHmm}', pipeline.startTime)]

stages:
  - stage: BackupTier1
    displayName: 'Backup Tier 1 Critical Agents'
    jobs:
      - job: BackupCritical
        steps:
          # The tool installer must run before any other Power Platform task
          - task: PowerPlatformToolInstaller@2
            displayName: 'Install Power Platform CLI'

          - task: PowerPlatformExportSolution@2
            displayName: 'Export Trading Assistant Solution'
            inputs:
              authenticationType: 'PowerPlatformSPN'
              PowerPlatformSPN: 'Prod-Environment-Connection'
              SolutionName: 'TradingAssistant'
              SolutionOutputFile: '$(Build.ArtifactStagingDirectory)/TradingAssistant_$(BackupDate).zip'
              Managed: true

          - task: AzureCLI@2
            displayName: 'Upload to Azure Blob'
            inputs:
              azureSubscription: 'DR-Storage-Connection'
              scriptType: 'ps'
              scriptLocation: 'inlineScript'
              inlineScript: |
                az storage blob upload `
                  --account-name fsibackupstorage `
                  --container-name solution-backups `
                  --file "$(Build.ArtifactStagingDirectory)/TradingAssistant_$(BackupDate).zip" `
                  --name "TradingAssistant/TradingAssistant_$(BackupDate).zip"

          - task: PublishBuildArtifacts@1
            inputs:
              PathtoPublish: '$(Build.ArtifactStagingDirectory)'
              ArtifactName: 'Tier1Backup_$(BackupDate)'

  - stage: BackupTier2
    displayName: 'Backup Tier 2 High Priority Agents'
    dependsOn: BackupTier1
    jobs:
      - job: BackupHighPriority
        steps:
          # Similar steps for Tier 2 agents
          - script: echo "Backup Tier 2 agents"
- Configure Retention:
  - Daily backups: Retain 30 days
  - Weekly backups: Retain 12 weeks
  - Monthly backups: Retain 12 months
  - Annual backups: Retain 7 years (regulatory)
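
The retention schedule above implies a rule for deciding which class a given backup belongs to and when it may be pruned. This control does not define that rule, so the sketch below assumes one for illustration: a backup taken on 1 January counts as annual, on the first of any other month as monthly, on a Sunday as weekly, and otherwise as daily. The function names are hypothetical.

```powershell
function Get-BackupRetentionClass {
    param([Parameter(Mandatory)] [datetime]$BackupDate)
    # Assumed promotion rules (not defined by this control):
    # 1 January -> Annual, first of month -> Monthly, Sunday -> Weekly, else Daily.
    if ($BackupDate.Month -eq 1 -and $BackupDate.Day -eq 1) { return 'Annual' }
    if ($BackupDate.Day -eq 1) { return 'Monthly' }
    if ($BackupDate.DayOfWeek -eq [DayOfWeek]::Sunday) { return 'Weekly' }
    return 'Daily'
}

function Test-BackupExpired {
    param(
        [Parameter(Mandatory)] [datetime]$BackupDate,
        [datetime]$AsOf = (Get-Date)
    )
    # Retention periods taken from the schedule in this step.
    $expiry = switch (Get-BackupRetentionClass -BackupDate $BackupDate) {
        'Annual'  { $BackupDate.AddYears(7) }
        'Monthly' { $BackupDate.AddMonths(12) }
        'Weekly'  { $BackupDate.AddDays(7 * 12) }
        'Daily'   { $BackupDate.AddDays(30) }
    }
    return $AsOf -gt $expiry
}

# A Wednesday backup is daily and has expired two months later;
# a Sunday backup is weekly and is still inside its 12-week window.
Test-BackupExpired -BackupDate '2025-03-05' -AsOf '2025-05-01'
Test-BackupExpired -BackupDate '2025-03-02' -AsOf '2025-05-01'
```

In practice a storage lifecycle policy may be a better enforcement point than a script; the sketch is mainly useful for checking that the chosen rules give the retention the schedule promises.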
Step 4: Configure Dataverse Backup Settings
Portal Path: Power Platform Admin Center → Environments → [Environment] → Backups
- Navigate to Power Platform Admin Center
- Select Environments → Choose production environment
- Click Backups in the left navigation
- Review System Backups:
  - Microsoft takes system backups automatically; no configuration is required
  - Retention is 7 days by default and up to 28 days for production environments with Dynamics 365 apps (confirm the retention that applies to your environment)
- Configure Additional Protection:
  - Create a manual (on-demand) backup before major changes
  - Document backup schedule in BC/DR plan
- Note Limitations:
  - System backups restore the entire environment
  - For granular agent restore, use solution-based backup
Step 5: Deploy Agents to DR Environment
Portal Path: Power Apps (make.powerapps.com) → [DR Environment] → Solutions → Import solution
- Initial DR Deployment:
  - Import latest solution backup to DR environment
  - Configure environment variables for DR region
  - Test agent functionality in DR environment
- Configure Connection References:
  - Update connection references to use DR-region resources
  - Create service accounts for DR environment
  - Document connection mapping
- Automate DR Sync:
# dr-sync-pipeline.yml
# Runs weekly to keep DR environment current
trigger: none

schedules:
  - cron: "0 4 * * 0"  # Weekly on Sunday at 4 AM (UTC)
    displayName: Weekly DR Sync
    branches:
      include:
        - main
    always: true

pool:
  vmImage: 'windows-latest'

stages:
  - stage: SyncDR
    jobs:
      - job: DeployToDR
        steps:
          - task: PowerPlatformToolInstaller@2
            displayName: 'Install Power Platform CLI'
          # Assumes an earlier step has downloaded the latest backup into latest-backup/
          - task: PowerPlatformImportSolution@2
            displayName: 'Import to DR Environment'
            inputs:
              authenticationType: 'PowerPlatformSPN'
              PowerPlatformSPN: 'DR-Environment-Connection'
              SolutionInputFile: '$(Pipeline.Workspace)/latest-backup/solution.zip'
Step 6: Create DR Runbook
Document Recovery Procedures:
- DR Declaration Process:
  - Trigger Criteria:
    - Primary region unavailable >30 minutes
    - Microsoft declares regional outage
    - Security incident requiring failover
  - Declaration Authority:
    - CIO or delegate
    - IT Operations Director
    - On-call manager (after hours)
  - Notification List:
    - Executive leadership
    - Business unit heads
    - IT operations team
    - Customer communications team
- Recovery Runbook - Tier 1 Agents:
  - Step 1: Verify DR Environment Status (5 min)
    - [ ] Log in to DR environment
    - [ ] Verify agent solutions are deployed
    - [ ] Confirm connectivity to data sources
  - Step 2: Activate DR Agents (15 min)
    - [ ] Update DNS/traffic routing to DR
    - [ ] Enable agent endpoints
    - [ ] Verify OAuth connections
  - Step 3: Test Agent Functionality (15 min)
    - [ ] Execute test conversations
    - [ ] Verify data source connectivity
    - [ ] Confirm critical functions working
  - Step 4: Communicate Status (5 min)
    - [ ] Notify stakeholders of DR activation
    - [ ] Update status page
    - [ ] Log DR event for regulatory purposes
  - Total Estimated Time: 40 minutes
- Failback Procedure:
  - Pre-Failback Checklist:
    - [ ] Primary region confirmed stable
    - [ ] Microsoft all-clear received
    - [ ] Data sync verification complete
  - Failback Steps:
    - [ ] Synchronize any DR changes to primary
    - [ ] Update routing to primary region
    - [ ] Test primary agent functionality
    - [ ] Deactivate DR agents (standby mode)
    - [ ] Document failback completion
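
The Tier 1 runbook estimate can be checked against the RTO target with simple arithmetic. The sketch below sums the step durations from the runbook and compares the result with the 1-hour Tier 1 target. It also shows the total if the clock is assumed to start at outage onset and declaration waits for the ">30 minutes" trigger. Whether RTO is measured from outage or from declaration is not stated in this control, so the 30-minute delay is an assumption included only to make the difference visible.

```powershell
# Step durations in minutes, copied from the Tier 1 recovery runbook.
$runbookSteps = [ordered]@{
    'Verify DR environment status' = 5
    'Activate DR agents'           = 15
    'Test agent functionality'     = 15
    'Communicate status'           = 5
}
$rtoMinutes              = 60  # Tier 1 target (<1 hour)
$declarationDelayMinutes = 30  # Assumption: declaration waits for the ">30 minutes" trigger

$runbookMinutes = ($runbookSteps.Values | Measure-Object -Sum).Sum
$fromOutage     = $runbookMinutes + $declarationDelayMinutes

$rtoCheck = [PSCustomObject]@{
    RunbookMinutes          = $runbookMinutes
    MeetsRtoFromDeclaration = $runbookMinutes -lt $rtoMinutes
    MinutesFromOutage       = $fromOutage
    MeetsRtoFromOutage      = $fromOutage -lt $rtoMinutes
}
$rtoCheck
```

Under these assumptions the runbook fits within the target when timed from declaration (40 minutes) but not when timed from the start of the outage (70 minutes), which is worth resolving explicitly in the BC/DR plan.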
Step 7: Configure Monitoring and Alerting
Portal Path: Microsoft 365 Admin Center → Health → Service health
- Configure Service Health Alerts:
  - Navigate to Microsoft 365 Admin Center
  - Select Health → Service health
  - Click Preferences → Email
  - Enable alerts for:
    - Power Platform
    - Dataverse
    - SharePoint Online (if used for knowledge)
    - Microsoft Entra ID (formerly Azure Active Directory)
- Create Custom Alerts:
  - Power Automate flow monitoring agent availability
  - Alert on agent response time degradation
  - Monitor connector health status
- Establish Communication Plan:

| Severity | Response Time | Notification |
|---|---|---|
| Critical | Immediate | Phone + SMS + Email |
| High | 15 minutes | Email + Teams |
| Medium | 1 hour | Email |
| Low | Next business day | Email |

Step 8: Schedule and Conduct DR Testing
Annual DR Test Process:
- Pre-Test Planning (2 weeks before):
  - Schedule test window with stakeholders
  - Notify affected business units
  - Prepare test scenarios
  - Brief DR team
- DR Test Execution:
  - Test Scenario: Regional Outage Simulation
  - Phase 1: Failover (Target: <1 hour for Tier 1)
    - Simulate primary region unavailability
    - Execute DR declaration process
    - Activate DR environment
    - Route traffic to DR
  - Phase 2: Operation (2-4 hours)
    - Execute business-critical transactions
    - Test all Tier 1 agent functions
    - Verify data integrity
    - Monitor performance
  - Phase 3: Failback (Target: <2 hours)
    - Confirm primary region available
    - Synchronize data changes
    - Execute failback procedure
    - Verify primary operation
- Post-Test Documentation:
  - Test results summary
  - RTO/RPO achievement status
  - Issues encountered
  - Corrective actions
  - Lessons learned
  - Sign-off from business owners
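
The "RTO/RPO achievement status" item in the post-test documentation can be derived from three timestamps recorded during the test. A minimal sketch follows, assuming RTO is measured from simulated outage start to service restoration and RPO from the last good backup to outage start. The function name and the sample timestamps are illustrative, not results from an actual test.

```powershell
function Get-DrTestResult {
    param(
        [Parameter(Mandatory)] [datetime]$OutageStart,
        [Parameter(Mandatory)] [datetime]$ServiceRestored,
        [Parameter(Mandatory)] [datetime]$LastGoodBackup,
        [Parameter(Mandatory)] [timespan]$RtoTarget,
        [Parameter(Mandatory)] [timespan]$RpoTarget
    )
    $actualRto = $ServiceRestored - $OutageStart   # downtime experienced
    $actualRpo = $OutageStart - $LastGoodBackup    # window of potential data loss
    [PSCustomObject]@{
        ActualRtoMinutes = [math]::Round($actualRto.TotalMinutes, 1)
        ActualRpoMinutes = [math]::Round($actualRpo.TotalMinutes, 1)
        RtoMet           = $actualRto -lt $RtoTarget
        RpoMet           = $actualRpo -lt $RpoTarget
    }
}

# Illustrative timestamps for a Tier 1 agent (targets: RTO <1 hour, RPO <15 min).
$result = Get-DrTestResult `
    -OutageStart '2025-06-14 09:00' `
    -ServiceRestored '2025-06-14 09:42' `
    -LastGoodBackup '2025-06-14 08:50' `
    -RtoTarget (New-TimeSpan -Hours 1) `
    -RpoTarget (New-TimeSpan -Minutes 15)
$result
```

Recording the raw timestamps alongside the pass/fail flags gives examiners the evidence behind the stated outcome.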
PowerShell Configuration
Automated Solution Backup Script
# Automated BC/DR Backup Script
param(
[string]$EnvironmentUrl = "https://yourorg.crm.dynamics.com",
[string]$BackupPath = "\\fileserver\backups\PowerPlatform",
[string[]]$SolutionNames = @("TradingAssistant", "CustomerService", "ComplianceBot")
)
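# Authenticate the Power Platform CLI (pac) against the source environment.
# Assumes pac is installed and on PATH; sign-in is interactive unless a service principal is configured.
pac auth create --environment $EnvironmentUrl | Out-Host
if ($LASTEXITCODE -ne 0) { throw "pac authentication failed for $EnvironmentUrl" }
# Create backup folder
$backupDate = Get-Date -Format "yyyyMMdd_HHmm"
$backupFolder = Join-Path $BackupPath $backupDate
New-Item -ItemType Directory -Path $backupFolder -Force | Out-Null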
# Backup each solution
$backupResults = @()
foreach ($solution in $SolutionNames) {
    Write-Host "Backing up solution: $solution" -ForegroundColor Cyan
    try {
        $outputFile = Join-Path $backupFolder "$solution.zip"
        # Export solution using Power Platform CLI (requires an authenticated pac profile).
        # Out-Host keeps CLI output out of the script's return value.
        pac solution export --name $solution --path $outputFile --managed true | Out-Host
        # Native commands do not throw on failure, so check the exit code explicitly.
        if ($LASTEXITCODE -ne 0) {
            throw "pac solution export exited with code $LASTEXITCODE"
        }
        $backupResults += [PSCustomObject]@{
            Solution  = $solution
            Status    = "Success"
            FilePath  = $outputFile
            FileSize  = (Get-Item $outputFile).Length
            Timestamp = Get-Date
        }
    }
    catch {
        $backupResults += [PSCustomObject]@{
            Solution  = $solution
            Status    = "Failed"
            Error     = $_.Exception.Message
            Timestamp = Get-Date
        }
    }
}
# Generate backup report
$reportPath = Join-Path $backupFolder "BackupReport.html"
$html = @"
<!DOCTYPE html>
<html>
<head><title>BC/DR Backup Report</title>
<style>
body { font-family: 'Segoe UI', sans-serif; margin: 20px; }
h1 { color: #0078d4; }
.success { color: green; }
.failed { color: red; }
table { border-collapse: collapse; width: 100%; }
th, td { padding: 10px; text-align: left; border-bottom: 1px solid #ddd; }
th { background: #0078d4; color: white; }
</style>
</head>
<body>
<h1>BC/DR Backup Report</h1>
<p>Backup Date: $backupDate</p>
<p>Environment: $EnvironmentUrl</p>
<table>
<tr><th>Solution</th><th>Status</th><th>File Size</th><th>Timestamp</th></tr>
$(
$backupResults | ForEach-Object {
$statusClass = if ($_.Status -eq "Success") { "success" } else { "failed" }
"<tr><td>$($_.Solution)</td><td class='$statusClass'>$($_.Status)</td><td>$([math]::Round($_.FileSize/1MB, 2)) MB</td><td>$($_.Timestamp)</td></tr>"
}
)
</table>
</body>
</html>
"@
$html | Out-File -FilePath $reportPath
Write-Host "Backup complete. Report: $reportPath" -ForegroundColor Green
# Return results
$backupResults
DR Environment Health Check
# DR Environment Health Check Script
param(
[string]$DREnvironmentUrl = "https://yourorg-dr.crm.dynamics.com"
)
Write-Host "=== DR Environment Health Check ===" -ForegroundColor Cyan
Write-Host "Environment: $DREnvironmentUrl"
Write-Host "Timestamp: $(Get-Date)"
Write-Host ""
# Check environment connectivity
Write-Host "Checking environment connectivity..." -ForegroundColor Yellow
try {
$response = Invoke-WebRequest -Uri $DREnvironmentUrl -Method Head -TimeoutSec 30
Write-Host " Environment reachable: Yes" -ForegroundColor Green
Write-Host " Response code: $($response.StatusCode)"
}
catch {
Write-Host " Environment reachable: No" -ForegroundColor Red
Write-Host " Error: $($_.Exception.Message)"
}
# Get Power Platform environment status
Write-Host "`nChecking Power Platform status..." -ForegroundColor Yellow
Import-Module Microsoft.PowerApps.Administration.PowerShell
Add-PowerAppsAccount  # Interactive sign-in
$environments = Get-AdminPowerAppEnvironment | Where-Object {
    $_.DisplayName -like "*-DR"
}
# $drEnv is used instead of $env to avoid confusion with the env: drive
foreach ($drEnv in $environments) {
    Write-Host " Environment: $($drEnv.DisplayName)"
    Write-Host " Type: $($drEnv.EnvironmentType)"
    Write-Host " Region: $($drEnv.Location)"
    Write-Host " Last Modified: $($drEnv.LastModifiedTime)"
}
# Check solution versions
Write-Host "`nChecking solution deployment status..." -ForegroundColor Yellow
# Would require Dataverse connection to verify solutions
Write-Host "`n=== Health Check Complete ===" -ForegroundColor Cyan
Financial Sector Considerations
Regulatory Alignment
| Regulation | Requirement | BC/DR Implementation |
|---|---|---|
| GLBA 501(b) | Protect against threats | Resilient agent architecture |
| SOX 404 | Internal controls | Documented recovery procedures |
| FINRA 4370 / 4511 | Business continuity plans; books and records | Agent availability requirements; retained DR test records |
| OCC 2011-12 | Model risk management | BC/DR for critical agents |
| FFIEC BC/DR | Recovery testing | Annual DR test with documentation |
| SEC Reg SCI | Systems compliance | Defined RTO/RPO objectives |
Zone-Specific Configuration
| Configuration | Zone 1 (Personal Productivity) | Zone 2 (Team Collaboration) | Zone 3 (Enterprise Managed) |
|---|---|---|---|
| RTO Target | 72 hours | 4 hours | <1 hour |
| RPO Target | 24 hours | 1 hour | 15 minutes |
| Backup Frequency | Weekly | Daily | Continuous |
| DR Environment | None | Warm standby | Hot standby |
| DR Testing | None | Annual | Quarterly |
| Geo-Redundancy | No | Recommended | Required |
FSI Use Case Example
Scenario: Trading Floor Agent BC/DR
Requirements:
- Agent supports real-time trading decisions
- Maximum 15-minute data loss acceptable
- Maximum 1-hour downtime during market hours
- Regulatory requirement for annual DR test
BC/DR Implementation:
- Architecture:
  - Primary: East US region
  - DR: West US region
  - Hot standby with near-real-time sync
- Backup Strategy:
  - Solution backup: Every 4 hours
  - Configuration backup: Continuous
  - Data backup: Near-real-time replication
- Recovery Procedures:
  - Automated failover via Azure Traffic Manager
  - Agent endpoints switchover in <15 minutes
  - Full functionality verification in <30 minutes
- Testing:
  - Quarterly tabletop exercises
  - Annual full DR failover test
  - Results documented for regulators
Regulatory Benefit:
- Demonstrates operational resilience to examiners
- Documented RTO/RPO achievement
- Evidence of regular testing
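
One check worth making on a design like this is whether each backup mechanism can meet the RPO on its own. For a periodic backup, the worst-case loss is one full interval, so the interval must not exceed the RPO. The sketch below applies that rule to the scenario's figures. The function name is hypothetical, and the 5-minute replication lag is an assumed value standing in for "near-real-time".

```powershell
function Test-RpoCoverage {
    param(
        [Parameter(Mandatory)] [timespan]$BackupInterval,
        [Parameter(Mandatory)] [timespan]$RpoTarget
    )
    # Worst-case loss for a periodic backup is one full interval.
    $BackupInterval -le $RpoTarget
}

$rpo = New-TimeSpan -Minutes 15

# Solution backup every 4 hours: does not meet a 15-minute RPO by itself.
$solutionCovered = Test-RpoCoverage -BackupInterval (New-TimeSpan -Hours 4) -RpoTarget $rpo

# Data replication with an assumed 5-minute lag: within the 15-minute RPO.
$dataCovered = Test-RpoCoverage -BackupInterval (New-TimeSpan -Minutes 5) -RpoTarget $rpo

"Solution backup meets RPO: $solutionCovered"
"Data replication meets RPO: $dataCovered"
```

Read this way, the 15-minute RPO in the scenario is carried by data replication, while the 4-hour solution backup protects agent definitions, which change far less often than transactional data.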
Verification & Testing
Verification Steps
- Backup Verification:
  - [ ] Automated backups running on schedule
  - [ ] Backup files accessible and valid
  - [ ] Retention policy enforced
  - [ ] Backup alerts configured
- DR Environment Verification:
  - [ ] DR environment accessible
  - [ ] Solutions deployed and current
  - [ ] Connections configured
  - [ ] Test agents functional
- Runbook Verification:
  - [ ] Runbook documented and current
  - [ ] Contact lists updated
  - [ ] Escalation procedures defined
  - [ ] Communication templates ready
- Test Results:
  - [ ] Annual DR test completed
  - [ ] RTO/RPO targets met
  - [ ] Issues documented and remediated
  - [ ] Sign-off obtained
Compliance Checklist
- [ ] BC/DR plan documented and approved
- [ ] Critical agents identified and classified
- [ ] RTO/RPO objectives defined
- [ ] DR environment configured
- [ ] Automated backups operational
- [ ] Annual DR test scheduled
- [ ] Test results retained for regulators
Troubleshooting & Validation
Issue 1: DR Environment Out of Sync
Symptoms: DR agents don't match production version
Resolution:
- Check sync pipeline execution logs
- Verify service connection permissions
- Manually import latest solution
- Adjust sync schedule if needed
- Enable alerting for sync failures
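
Drift between production and DR can be detected by comparing solution version numbers from the two environments. The sketch below assumes the versions have already been collected into two hashtables keyed by solution name (for example from each environment's solution list); how they are collected is outside the sketch. The function name and the version numbers are hypothetical.

```powershell
function Compare-SolutionVersion {
    param(
        [Parameter(Mandatory)] [hashtable]$Primary,
        [Parameter(Mandatory)] [hashtable]$DR
    )
    foreach ($name in ($Primary.Keys | Sort-Object)) {
        $primaryVersion = [version]$Primary[$name]
        $drVersion = if ($DR.ContainsKey($name)) { [version]$DR[$name] } else { $null }
        $status = if ($null -eq $drVersion) {
            'Missing in DR'
        } elseif ($drVersion -lt $primaryVersion) {
            'DR behind'
        } elseif ($drVersion -gt $primaryVersion) {
            'DR ahead'
        } else {
            'In sync'
        }
        [PSCustomObject]@{
            Solution = $name
            Primary  = $primaryVersion
            DR       = $drVersion
            Status   = $status
        }
    }
}

# Illustrative version numbers only.
$primaryVersions = @{ TradingAssistant = '1.4.0.2'; CustomerService = '2.1.0.0'; ComplianceBot = '1.0.3.0' }
$drVersions      = @{ TradingAssistant = '1.4.0.2'; CustomerService = '2.0.9.0' }

$drift = @(Compare-SolutionVersion -Primary $primaryVersions -DR $drVersions |
    Where-Object { $_.Status -ne 'In sync' })
$drift | Format-Table Solution, Primary, DR, Status
```

Casting to `[version]` makes the comparison numeric per segment, so `2.0.10.0` is correctly treated as newer than `2.0.9.0`, which a plain string comparison would get wrong.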
Issue 2: Connection Failures in DR
Symptoms: Agents can't connect to data sources in DR
Resolution:
- Verify connection references updated for DR
- Check service account permissions
- Validate network connectivity
- Update OAuth tokens if expired
- Test individual connectors
Issue 3: Backup Pipeline Failures
Symptoms: Scheduled backups not completing
Resolution:
- Review pipeline error logs
- Check service connection expiration
- Verify storage account accessibility
- Validate solution export permissions
- Test manual backup
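
When testing a manual backup, it helps to confirm that the exported file is a readable solution package and not a truncated or empty upload. A minimal sketch follows, assuming that a valid export is a zip archive with `solution.xml` at its root. The function name is hypothetical, and a passing result does not prove that the solution will import successfully.

```powershell
Add-Type -AssemblyName System.IO.Compression.FileSystem

function Test-SolutionBackup {
    param([Parameter(Mandatory)] [string]$Path)

    if (-not (Test-Path -LiteralPath $Path -PathType Leaf)) { return $false }
    try {
        $fullPath = (Resolve-Path -LiteralPath $Path).ProviderPath
        $zip = [System.IO.Compression.ZipFile]::OpenRead($fullPath)
        try {
            # An exported solution package carries solution.xml at its root.
            return [bool]($zip.Entries | Where-Object { $_.FullName -eq 'solution.xml' })
        }
        finally {
            $zip.Dispose()
        }
    }
    catch {
        # Corrupt or truncated archives fail to open.
        return $false
    }
}
```

A typical use is to run the check over every `.zip` file in a backup folder and flag any file for which it returns `$false` before relying on that backup for recovery.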
Issue 4: RTO Target Not Met
Symptoms: DR activation exceeds time target
Resolution:
- Review and streamline runbook
- Pre-stage more configuration in DR
- Automate manual steps
- Increase DR environment tier
- Conduct additional training
Additional Resources
- Power Platform backup and restore
- Environment strategy
- Business continuity guidance
- Azure regions for Power Platform
- Solution ALM
Related Controls
| Control ID | Control Name | Relationship |
|---|---|---|
| 2.1 | Managed Environments | DR environment governance |
| 2.3 | Change Management | Solution versioning for backup |
| 3.1 | Agent Inventory | Critical agent identification |
| 3.4 | Incident Reporting | DR event documentation |
Support & Questions
For implementation support or questions about this control, contact:
- AI Governance Lead (governance direction)
- Compliance Officer (regulatory requirements)
- Technical Implementation Team (platform setup)
Updated: Dec 2025
Version: v1.0 Beta (Dec 2025)
UI Verification Status: ❌ Needs verification