Agent Behavioral Incident Response Playbook
Procedures for responding to AI agent behavioral incidents — unauthorized data access, unexpected actions, prompt injection, and agent runaway scenarios — in Microsoft 365 Copilot and Agent 365 environments.
Disclaimer
This playbook is provided for informational purposes only and does not constitute legal or regulatory advice. Consult legal counsel for specific compliance requirements.
Scope
This playbook covers incidents involving:
- Agent unauthorized data access — an agent accesses data beyond its intended scope or outside information barrier boundaries
- Agent unauthorized actions — an agent takes actions (sending emails, updating records, creating content) that were not intended or approved
- Prompt injection — a malicious or accidental input causes an agent to behave outside its designed parameters
- Agent runaway — an agent enters a loop or escalation pattern, generating excessive actions or consuming resources without producing intended outputs
- Multi-agent chain failure — an agent-to-agent orchestration (A2A) chain produces unintended data flows or unauthorized cross-agent actions
This playbook supplements the AI Incident Response Playbook, which covers data exposure and governance control failures.
Incident Categories
Category A: Agent Unauthorized Data Access
Definition: An agent accesses, surfaces, or processes data that falls outside its approved data scope — including cross-barrier data access, access to restricted SharePoint sites, or access to data in systems connected via federated connectors.
Detection signals:
- Purview audit log shows agent accessing content in restricted SharePoint sites or across information barrier segments
- Defender XDR agent threat detection alert triggers for anomalous agent data access patterns
- User reports Copilot response containing data from an unrelated business unit or system
- Agent Insight Report shows agent accessing sites outside its approved scope
Immediate actions:
- Disable the agent — In M365 Admin Center > Agents, disable the agent immediately
- Preserve evidence — Export Purview audit logs for the agent's activity (search by AgentId)
- Assess exposure — Determine what data was accessed and whether it includes NPI, MNPI, or other regulated data
- Notify stakeholders — Alert the agent owner, information security, and compliance per the firm's incident response escalation matrix
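The exposure assessment above can be sketched as a comparison between the sites an agent touched and the sites it was approved for. This is a minimal illustration, not a Microsoft-provided tool: `Find-OutOfScopeAccess` is a hypothetical helper, and the `AgentId` and `SiteUrl` property names are assumptions about how the firm has flattened its audit export. The sample records are synthetic.

```powershell
# Hypothetical helper: returns access records whose URL is not under any approved prefix.
# The SiteUrl property name is an assumption about the flattened audit export.
function Find-OutOfScopeAccess {
    param(
        [Parameter(Mandatory)] [object[]] $AccessRecords,
        [Parameter(Mandatory)] [string[]] $ApprovedSitePrefixes
    )
    foreach ($record in $AccessRecords) {
        $url = [string]$record.SiteUrl   # a missing value becomes an empty string
        $inScope = $false
        foreach ($prefix in $ApprovedSitePrefixes) {
            if ($url.StartsWith($prefix, [System.StringComparison]::OrdinalIgnoreCase)) {
                $inScope = $true
                break
            }
        }
        if (-not $inScope) { $record }
    }
}

# Synthetic sample data for illustration only
$records = @(
    [pscustomobject]@{ AgentId = 'agent-001'; SiteUrl = 'https://contoso.sharepoint.com/sites/Research/report.docx' },
    [pscustomobject]@{ AgentId = 'agent-001'; SiteUrl = 'https://contoso.sharepoint.com/sites/MandA/deal.xlsx' }
)
$approved = @('https://contoso.sharepoint.com/sites/Research/')

$outOfScope = @(Find-OutOfScopeAccess -AccessRecords $records -ApprovedSitePrefixes $approved)
$outOfScope | Format-Table AgentId, SiteUrl
```

Ending each approved prefix with a trailing slash keeps a site such as `/sites/Research2` from matching `/sites/Research`. Records with no URL are reported as out of scope so that they get a manual look rather than being silently dropped.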
Category B: Agent Unauthorized Actions
Definition: An agent performs actions (sending communications, updating records, modifying files) that were not intended or approved by the agent's configuration or the user's instructions.
Detection signals:
- Users report receiving unexpected emails, Teams messages, or notifications generated by an agent
- Audit log shows agent-initiated actions (email sends, file modifications, CRM updates) outside normal operating parameters
- Power Automate flow execution logs show unexpected agent-triggered flow runs
Immediate actions:
- Disable the agent — Remove the agent from all user assignments
- Revoke agent permissions — In Entra, revoke the agent's Entra Agent ID permissions
- Assess impact — Determine what actions were taken, who was affected, and whether any external communications were sent
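- Recall communications — Attempt message recall for internal recipients. Exchange Online message recall does not reach recipients outside the organization, so if external communications were sent, notify the affected recipients directly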
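Deciding whether any agent-sent communication left the firm usually starts with splitting recipients by domain. The sketch below assumes the investigator already has a list of recipient addresses and a list of the firm's accepted domains; `Get-ExternalRecipient` is a hypothetical helper, and the addresses are synthetic.

```powershell
# Hypothetical helper: returns recipient addresses whose domain is not in the internal list.
function Get-ExternalRecipient {
    param(
        [Parameter(Mandatory)] [string[]] $Recipients,
        [Parameter(Mandatory)] [string[]] $InternalDomains
    )
    foreach ($address in $Recipients) {
        # Take the text after the last '@' as the domain
        $domain = ($address -split '@')[-1].Trim().ToLowerInvariant()
        if ($InternalDomains -notcontains $domain) { $address }
    }
}

# Synthetic sample data for illustration only
$recipients = @('alice@contoso.com', 'client@fabrikam.com', 'BOB@CONTOSO.COM')
$external = @(Get-ExternalRecipient -Recipients $recipients -InternalDomains @('contoso.com'))
$external
```

The match is exact on the domain, so subdomains such as `mail.contoso.com` would be treated as external unless they are listed explicitly.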
Category C: Prompt Injection
Definition: A malicious or accidental input causes an agent to bypass its instructions, reveal system prompts, access data outside its scope, or perform unintended actions.
Detection signals:
- Copilot response contains system prompt content, internal instructions, or configuration details
- Agent produces outputs that contradict its configured behavior or safety guardrails
- Communication compliance flags agent-generated content with unexpected or inappropriate language
- User reports that an agent responded with content clearly outside its designated domain
Immediate actions:
- Disable the agent — Prevent further interactions
- Assess injection vector — Determine whether the injection came from user input, grounding data, or a connected data source
- Review agent instructions — Check whether system prompt protections are adequate
- Assess data exposure — Determine whether the injection caused unauthorized data access or disclosure
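One of the detection signals above, a response that contains system prompt content, can be checked mechanically once a suspect response has been captured. The following is a rough sketch under stated assumptions: `Test-InstructionLeak` is a hypothetical helper, the six-word window is an arbitrary starting point, and both strings are synthetic. It only catches verbatim runs of words and will miss paraphrased leaks.

```powershell
# Hypothetical helper: returns $true if any run of $WindowWords consecutive words
# from the agent instructions appears verbatim (case-insensitive) in the response.
function Test-InstructionLeak {
    param(
        [string] $Instructions,
        [string] $Response,
        [int] $WindowWords = 6
    )
    $words = @($Instructions -split '\s+' | Where-Object { $_ })
    # Collapse whitespace so line breaks in the response do not hide a match
    $normalizedResponse = @($Response -split '\s+' | Where-Object { $_ }) -join ' '
    for ($i = 0; $i -le $words.Count - $WindowWords; $i++) {
        $phrase = $words[$i..($i + $WindowWords - 1)] -join ' '
        if ($normalizedResponse.IndexOf($phrase, [System.StringComparison]::OrdinalIgnoreCase) -ge 0) {
            return $true
        }
    }
    return $false
}

# Synthetic sample data for illustration only
$instructions = 'You are the Research Assistant. Only answer questions about published equity research reports.'
$leaky = 'Sure! My instructions say: only answer questions about published equity research reports.'
$clean = 'The Q3 report covers semiconductor demand.'

$leakDetected  = Test-InstructionLeak -Instructions $instructions -Response $leaky
$cleanDetected = Test-InstructionLeak -Instructions $instructions -Response $clean
```

A positive result is a prompt for manual review, not proof of injection, since an agent may legitimately restate part of its purpose.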
Category D: Agent Runaway
Definition: An agent enters a loop, escalation, or recursive pattern that generates excessive actions, consumes disproportionate resources, or produces unintended outputs at scale.
Detection signals:
- Pay-as-you-go (PAYG) billing alerts trigger for unexpected compute consumption
- Audit log shows a high volume of agent actions in a short time period
- Users report receiving a large volume of agent-generated notifications or outputs
- Agent Insight Report shows abnormal activity volumes
Immediate actions:
- Disable the agent — Immediate removal from all user assignments
- Check budget controls — Verify PAYG budget caps are in place and assess financial impact
- Assess downstream impact — Determine whether runaway actions affected external systems, sent communications, or modified data
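The "high volume of agent actions in a short time period" signal can be approximated by counting actions per agent per minute and flagging any bucket above a threshold. This is a sketch only: `Find-RunawayWindow` is a hypothetical helper, the threshold of 30 actions per minute is an arbitrary placeholder that each firm would tune, and the `AgentId` and `CreationDate` property names are assumptions about the parsed audit export. The events are synthetic.

```powershell
# Hypothetical helper: counts events per agent per calendar minute and returns
# the buckets whose count exceeds the threshold.
function Find-RunawayWindow {
    param(
        [Parameter(Mandatory)] [object[]] $AuditEvents,
        [int] $MaxActionsPerMinute = 30
    )
    $counts = @{}
    foreach ($e in $AuditEvents) {
        $minute = ([datetime]$e.CreationDate).ToString('yyyy-MM-dd HH:mm')
        $key = "$($e.AgentId)|$minute"
        $counts[$key] = 1 + [int]$counts[$key]   # a missing key counts as 0
    }
    foreach ($key in $counts.Keys) {
        if ($counts[$key] -gt $MaxActionsPerMinute) {
            $agentId, $minute = $key -split '\|', 2
            [pscustomobject]@{ AgentId = $agentId; Minute = $minute; Actions = $counts[$key] }
        }
    }
}

# Synthetic sample data: 50 actions inside one minute for agent-001, 3 spread-out actions for agent-002
$start = [datetime]'2026-04-01T09:00:00'
$events = @()
$events += 0..49 | ForEach-Object { [pscustomobject]@{ AgentId = 'agent-001'; CreationDate = $start.AddSeconds($_) } }
$events += 0..2  | ForEach-Object { [pscustomobject]@{ AgentId = 'agent-002'; CreationDate = $start.AddMinutes($_) } }

$flagged = @(Find-RunawayWindow -AuditEvents $events)
$flagged | Format-Table AgentId, Minute, Actions
```

Fixed one-minute buckets are simple but can split a burst across a minute boundary, so a sliding window may be a better fit where bursts are short.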
Investigation Procedures
Step 1: Evidence Collection
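```powershell
# Requires an Exchange Online PowerShell session (Connect-ExchangeOnline).
# Search Copilot interaction audit events from the last 7 days and keep those that reference an agent.
# -ResultSize 5000 is the per-call maximum; for larger result sets, page with
# -SessionId and -SessionCommand ReturnLargeSet.
Search-UnifiedAuditLog -StartDate (Get-Date).AddDays(-7) -EndDate (Get-Date) `
    -RecordType "CopilotInteraction" -ResultSize 5000 |
    Where-Object { $_.AuditData -like "*AgentId*" } |
    Export-Csv -Path "agent-incident-audit.csv" -NoTypeInformation

# Search for agent admin activity events
Search-UnifiedAuditLog -StartDate (Get-Date).AddDays(-7) -EndDate (Get-Date) `
    -Operations "AgentAdminActivity" -ResultSize 5000
```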
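Once the audit export exists, two follow-up steps are commonly useful: recording a hash of the file so its integrity can be shown later, and expanding the JSON in the `AuditData` column into queryable fields. The sketch below uses a small synthetic CSV in place of `agent-incident-audit.csv`. The presence of a top-level `AgentId` field inside `AuditData` is an assumption and should be checked against real records in the tenant.

```powershell
# Build a synthetic stand-in for the audit export (illustration only)
$csvPath = Join-Path ([System.IO.Path]::GetTempPath()) 'agent-incident-audit-sample.csv'
@(
    [pscustomobject]@{ CreationDate = '2026-04-01T09:00:00'; UserIds = 'alice@contoso.com'; AuditData = '{"AgentId":"agent-001","Operation":"CopilotInteraction"}' },
    [pscustomobject]@{ CreationDate = '2026-04-01T09:05:00'; UserIds = 'bob@contoso.com';   AuditData = '{"AgentId":"agent-002","Operation":"CopilotInteraction"}' }
) | Export-Csv -Path $csvPath -NoTypeInformation

# Record a SHA-256 hash of the export for the evidence log
$hash = Get-FileHash -Path $csvPath -Algorithm SHA256

# Expand the AuditData JSON into flat, filterable objects
$parsed = @(
    Import-Csv -Path $csvPath | ForEach-Object {
        $data = $_.AuditData | ConvertFrom-Json
        [pscustomobject]@{
            CreationDate = $_.CreationDate
            UserId       = $_.UserIds
            AgentId      = $data.AgentId      # assumed field name
            Operation    = $data.Operation
        }
    }
)

$parsed | Where-Object { $_.AgentId -eq 'agent-001' } | Format-Table
"SHA256: $($hash.Hash)"
```

Hashing before any filtering or editing means the stored value describes the export exactly as it was collected.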
Step 2: Impact Assessment
| Assessment Area | Questions to Answer | Evidence Source |
|---|---|---|
| Data scope | What data did the agent access? Was any of it NPI, MNPI, or restricted? | Purview audit logs, Agent Insight Report |
| Action scope | What actions did the agent take? Were any external-facing? | Audit logs, Power Automate flow logs |
| User impact | Which users interacted with or were affected by the agent? | Audit logs, user reports |
| Regulatory impact | Does this trigger notification obligations under Reg S-P, FINRA, or state laws? | Legal/compliance assessment |
| Financial impact | What PAYG or compute costs were incurred? | M365 Admin Center billing |
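The user impact row can be answered with a simple grouping over parsed audit records. This sketch assumes records have already been flattened to objects with `UserId` and `AgentId` properties (assumed names); the data is synthetic.

```powershell
# Synthetic records for illustration; UserId and AgentId are assumed property names
$records = @(
    [pscustomobject]@{ UserId = 'alice@contoso.com'; AgentId = 'agent-001' },
    [pscustomobject]@{ UserId = 'alice@contoso.com'; AgentId = 'agent-001' },
    [pscustomobject]@{ UserId = 'bob@contoso.com';   AgentId = 'agent-001' },
    [pscustomobject]@{ UserId = 'carol@contoso.com'; AgentId = 'agent-002' }
)

# Count interactions per user for the agent under investigation
$userImpact = @(
    $records |
        Where-Object { $_.AgentId -eq 'agent-001' } |
        Group-Object -Property UserId |
        ForEach-Object { [pscustomobject]@{ UserId = $_.Name; Interactions = $_.Count } } |
        Sort-Object -Property Interactions -Descending
)
$userImpact | Format-Table UserId, Interactions
```

Audit logs show only users who interacted with the agent. Users affected indirectly, for example recipients of agent-sent messages, still need to be identified from user reports and message traces.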
Step 3: Regulatory Notification Assessment
| Trigger | Notification Obligation | Timeline |
|---|---|---|
| NPI accessed by unauthorized agent | SEC Reg S-P: service providers must notify the firm within 72 hours of becoming aware of a breach; the firm must notify affected customers within 30 days | 72 hours / 30 days |
| MNPI surfaced across barrier | SEC Rule 10b-5 assessment; potential self-report | Immediate assessment |
| External communications sent | FINRA Rule 3110 supervisory failure review | Within supervisory cycle |
| BSA/AML data exposed | FinCEN SAR assessment | Per BSA timelines |
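To keep the 72-hour and 30-day windows from the table visible during an incident, the target dates can be computed from the time the firm became aware of the event. The sketch below is plain date arithmetic. `Get-NotificationDeadline` is a hypothetical helper, and when the clock actually starts is a legal determination that this code does not make.

```powershell
# Hypothetical helper: derives the 72-hour and 30-day target dates from an awareness timestamp.
function Get-NotificationDeadline {
    param(
        [Parameter(Mandatory)] [datetime] $AwareAtUtc
    )
    [pscustomobject]@{
        AwareAtUtc        = $AwareAtUtc
        SeventyTwoHourDue = $AwareAtUtc.AddHours(72)
        ThirtyDayDue      = $AwareAtUtc.AddDays(30)
    }
}

# Synthetic timestamp for illustration only
$deadlines = Get-NotificationDeadline -AwareAtUtc ([datetime]'2026-04-01T09:00:00')
$deadlines | Format-List
```

Working in UTC throughout avoids off-by-one-day errors when responders sit in different time zones.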
Post-Incident Actions
- Root cause analysis — Document the root cause using the firm's incident management framework
- Remediation — Implement specific fixes (permission tightening, instruction updates, DLP policy additions)
- Agent re-assessment — Before re-enabling, conduct a full security review per Control 2.14 and Control 4.13
- Lessons learned — Update agent governance policies and this playbook based on findings
- Regulatory documentation — Retain all investigation records per FINRA Rule 4511(a) and SEC Rule 17a-4
Related Controls
- 2.9 Defender for Cloud Apps — Agent threat detection in Defender XDR
- 2.14 Declarative Agents Governance — Agent security controls
- 2.16 Federated Connector/MCP Governance — Connector-related agent incidents
- 4.13 Extensibility Governance — Agent lifecycle and multi-agent orchestration
- 4.9 Incident Reporting — General incident reporting procedures
- AI Incident Response Playbook — Data exposure and governance control failure incidents
FSI Copilot Governance Framework v1.4 - April 2026