Control 2.4: Business Continuity and Disaster Recovery

Control ID: 2.4
Pillar: Management
Regulatory Reference: GLBA 501(b), SOX 404, FINRA 4511, OCC 2011-12, FFIEC BC/DR Guidance
Last UI Verified: January 2026
Governance Levels: Baseline / Recommended / Regulated
Last Verified: 2026-02-03


Agent 365 Architecture Update

Agent 365 Observability supports business continuity planning by providing unified telemetry for disaster recovery testing and agent health monitoring across platforms. See Unified Agent Governance for observability architecture.

Objective

Ensure critical Copilot Studio agents remain available or can be rapidly restored following service disruptions, regional outages, or disaster scenarios. Establish backup procedures, recovery objectives, failover capabilities, and regular testing to meet regulatory requirements for operational resilience.


Why This Matters for FSI

  • GLBA 501(b): Protect against threats to availability - resilient agent architecture required
  • SOX 404: Internal controls must include documented recovery procedures
  • FINRA 4511: Business continuity requirements extend to agent availability
  • OCC 2011-12: Operational risk management includes BC/DR for critical systems
  • FFIEC BC/DR: Recovery testing required with documented results

Control Description

This control addresses key FSI requirements for operational resilience of AI agents:

| Capability | Description | FSI Relevance |
| --- | --- | --- |
| Recovery Time Objective (RTO) | Maximum acceptable downtime | Regulatory examination focus |
| Recovery Point Objective (RPO) | Maximum acceptable data loss | Data integrity requirements |
| Geo-Redundancy | Cross-region failover capability | Regional outage protection |
| Regular Testing | Annual DR test with documented results | FFIEC examination readiness |

Agent Criticality Classification

| Tier | Description | RTO | RPO | Examples |
| --- | --- | --- | --- | --- |
| Tier 1 - Critical | Business cannot operate | <1 hour | <15 min | Trading assistant, Payment processor |
| Tier 2 - High | Significant impact | <4 hours | <1 hour | Customer service, Compliance agent |
| Tier 3 - Medium | Moderate impact | <24 hours | <4 hours | Internal HR bot, IT help desk |
| Tier 4 - Low | Minimal impact | <72 hours | <24 hours | Personal productivity agents |
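
The tier table can be expressed as data so that a measured recovery is checked against its targets automatically. The following is a minimal Python sketch, not part of the FSI-AgentGov-Solutions tooling: `RecoveryTargets`, `TIER_TARGETS`, and `recovery_gaps` are hypothetical names, and the only values taken from this control are the RTO/RPO limits in the table above.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class RecoveryTargets:
    rto: timedelta  # maximum acceptable downtime
    rpo: timedelta  # maximum acceptable data loss


# Limits copied from the tier table; each is a strict upper bound ("<").
TIER_TARGETS = {
    1: RecoveryTargets(rto=timedelta(hours=1), rpo=timedelta(minutes=15)),
    2: RecoveryTargets(rto=timedelta(hours=4), rpo=timedelta(hours=1)),
    3: RecoveryTargets(rto=timedelta(hours=24), rpo=timedelta(hours=4)),
    4: RecoveryTargets(rto=timedelta(hours=72), rpo=timedelta(hours=24)),
}


def recovery_gaps(tier, downtime, data_loss):
    """Return which targets ("RTO", "RPO") a measured recovery missed.

    Hypothetical helper: a measurement equal to the limit counts as a
    miss because the table states the targets as strictly "less than".
    """
    targets = TIER_TARGETS[tier]
    gaps = []
    if downtime >= targets.rto:
        gaps.append("RTO")
    if data_loss >= targets.rpo:
        gaps.append("RPO")
    return gaps
```

For example, a Tier 1 agent restored in 45 minutes with 20 minutes of lost data meets its RTO but misses its RPO, so `recovery_gaps(1, timedelta(minutes=45), timedelta(minutes=20))` reports the RPO gap only.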

Key Configuration Points

  • Classify agents by criticality tier with defined RTO/RPO
  • Create secondary region environments for geo-redundancy
  • Configure automated solution backup with appropriate retention
  • Deploy agents to DR environment with connection references updated
  • Create DR runbook with declaration criteria and recovery procedures
  • Establish service health monitoring and alerting
  • Schedule and document annual DR testing
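
As an illustration of the backup retention point, the sketch below selects which solution backups fall outside a retention window. It is an assumption-laden example rather than a Power Platform feature: `backups_to_prune` is a hypothetical helper, the 30-day default is a placeholder (this control does not state a retention period), and the backup timestamps are assumed to come from whatever backup job the organization runs.

```python
from datetime import datetime, timedelta


def backups_to_prune(backup_times, now, retention=timedelta(days=30)):
    """Return backup timestamps older than the retention window, oldest first.

    The newest backup is always kept, even when it is past retention,
    so a stalled backup job never removes the last restore point.
    The 30-day default is a placeholder, not a value from the control.
    """
    if not backup_times:
        return []
    newest = max(backup_times)
    cutoff = now - retention
    return sorted(t for t in backup_times if t < cutoff and t != newest)
```

Keeping the newest backup unconditionally is a deliberate design choice: pruning should never be the reason a recovery fails.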

Automation Available

See DR Testing Framework in FSI-AgentGov-Solutions for automated validation of AI agent disaster recovery procedures against defined RTO/RPO targets with gap identification.


Zone-Specific Requirements

| Zone | Requirement | Rationale |
| --- | --- | --- |
| Zone 1 (Personal) | Weekly backup; 72-hour RTO acceptable; no DR environment required | Low criticality, minimal business impact |
| Zone 2 (Team) | Daily backup; 4-hour RTO; warm standby DR recommended; annual testing | Shared agents require higher availability |
| Zone 3 (Enterprise) | Continuous backup; <1-hour RTO; hot standby required; quarterly testing | Customer-facing, regulatory examination focus |
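
The zone requirements above can also be captured as a small policy table, for example to flag environments whose DR test is overdue. This is a hedged Python sketch: `ZonePolicy`, `ZONE_POLICIES`, and `dr_test_overdue` are hypothetical names, "annual" and "quarterly" are approximated as 365 and 92 days (an assumption), and Zone 1 has no test interval because the table states none.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)
class ZonePolicy:
    backup: str                          # backup cadence from the table
    rto_hours: int                       # RTO upper bound in hours
    standby: str                         # "none", "warm" (recommended) or "hot" (required)
    test_interval: Optional[timedelta]   # None: no DR test cadence stated


ZONE_POLICIES = {
    1: ZonePolicy("weekly", 72, "none", None),
    2: ZonePolicy("daily", 4, "warm", timedelta(days=365)),    # annual (assumed 365 days)
    3: ZonePolicy("continuous", 1, "hot", timedelta(days=92)),  # quarterly (assumed 92 days)
}


def dr_test_overdue(zone, last_test, today):
    """Return True when a zone's DR test is missing or older than its interval."""
    interval = ZONE_POLICIES[zone].test_interval
    if interval is None:
        return False  # no testing requirement stated for this zone
    if last_test is None:
        return True   # a required test has never been run
    return today - last_test > interval
```

A Zone 3 environment last tested on 2025-09-01 would be flagged as overdue on 2026-02-03, while a Zone 2 environment with the same test date would not.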

Roles & Responsibilities

| Role | Responsibility |
| --- | --- |
| Power Platform Admin | Environment management, backup configuration, DR environment setup |
| IT Operations | DR activation, failover execution, monitoring |
| AI Governance Lead | Criticality classification, RTO/RPO approval |
| Compliance Officer | DR test documentation, regulatory examination readiness |

| Control | Relationship |
| --- | --- |
| 2.1 - Managed Environments | DR environment governance |
| 2.3 - Change Management | Solution versioning for backup |
| 3.1 - Agent Inventory | Critical agent identification |
| 3.4 - Incident Reporting | DR event documentation |

Implementation Playbooks

Step-by-Step Implementation

This control has detailed playbooks for implementation, automation, testing, and troubleshooting.


Verification Criteria

Confirm control effectiveness by verifying:

  1. All agents classified by criticality tier with documented RTO/RPO
  2. Automated backups running on schedule with retention policy enforced
  3. DR environment configured and current with production
  4. DR runbook documented with declaration criteria and recovery steps
  5. Annual DR test completed with RTO/RPO targets met
  6. Test results documented and retained for regulatory examination
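
The six criteria above lend themselves to a simple evidence checklist. The sketch below is illustrative only: the criterion keys and the `open_findings` helper are hypothetical names, and the evidence dictionary is assumed to be filled in by whoever performs the review.

```python
# One key per verification criterion, in the order listed above.
VERIFICATION_CRITERIA = (
    "agents_classified",       # 1. criticality tier with documented RTO/RPO
    "backups_on_schedule",     # 2. automated backups and retention enforced
    "dr_environment_current",  # 3. DR environment current with production
    "runbook_documented",      # 4. declaration criteria and recovery steps
    "annual_test_passed",      # 5. annual DR test met RTO/RPO targets
    "test_results_retained",   # 6. results retained for examination
)


def open_findings(evidence):
    """Return the criteria lacking positive evidence, in checklist order.

    A criterion that is absent from `evidence` is treated as not met.
    """
    return [c for c in VERIFICATION_CRITERIA if not evidence.get(c, False)]
```

Treating missing evidence as a finding keeps the check conservative: the control is considered effective only when `open_findings` returns an empty list.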


Updated: January 2026 | Version: v1.2 | UI Verification Status: Current