Agent Behavioral Incident Response Playbook

Procedures for responding to AI agent behavioral incidents — unauthorized data access, unexpected actions, prompt injection, and agent runaway scenarios — in Microsoft 365 Copilot and Agent 365 environments.

Disclaimer

This playbook is provided for informational purposes only and does not constitute legal or regulatory advice. Consult legal counsel for specific compliance requirements.


Scope

This playbook covers incidents involving:

  • Agent unauthorized data access — an agent accesses data beyond its intended scope or outside information barrier boundaries
  • Agent unauthorized actions — an agent takes actions (sending emails, updating records, creating content) that were not intended or approved
  • Prompt injection — a malicious or accidental input causes an agent to behave outside its designed parameters
  • Agent runaway — an agent enters a loop or escalation pattern, generating excessive actions or consuming resources without producing intended outputs
  • Multi-agent chain failure — an agent-to-agent orchestration (A2A) chain produces unintended data flows or unauthorized cross-agent actions

This playbook supplements the AI Incident Response Playbook, which covers data exposure and governance control failures.


Incident Categories

Category A: Agent Unauthorized Data Access

Definition: An agent accesses, surfaces, or processes data that falls outside its approved data scope — including cross-barrier data access, access to restricted SharePoint sites, or access to data in systems connected via federated connectors.

Detection signals:

  • Purview audit log shows agent accessing content in restricted SharePoint sites or across information barrier segments
  • Defender XDR agent threat detection alert triggers for anomalous agent data access patterns
  • User reports Copilot response containing data from an unrelated business unit or system
  • Agent Insight Report shows agent accessing sites outside its approved scope

Immediate actions:

  1. Disable the agent — In M365 Admin Center > Agents, disable the agent immediately
  2. Preserve evidence — Export Purview audit logs for the agent's activity (search by AgentId)
  3. Assess exposure — Determine what data was accessed and whether it includes NPI, MNPI, or other regulated data
  4. Notify stakeholders — Alert the agent owner, information security, and compliance per the firm's incident response escalation matrix
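
One way to support the exposure assessment is to compare the locations the agent touched against its approved scope. The sketch below is illustrative only: the function name `Get-OutOfScopeUrl`, the `contoso` URLs, and the idea of expressing approved scope as a list of site URLs are all assumptions, not part of any Microsoft tooling. In practice the accessed URLs would come from the exported audit records.

```powershell
# Hypothetical helper: returns accessed URLs that fall outside every approved site.
function Get-OutOfScopeUrl {
    param(
        [Parameter(Mandatory)] [string[]] $AccessedUrl,
        [Parameter(Mandatory)] [string[]] $ApprovedSite
    )
    # Normalize approved sites to end in "/" so ".../sites/hr"
    # does not also match ".../sites/hr-restricted".
    $approved = $ApprovedSite | ForEach-Object { $_.TrimEnd('/') + '/' }

    $AccessedUrl | Sort-Object -Unique | Where-Object {
        $url = $_.TrimEnd('/') + '/'
        # Keep the URL only if no approved site is a prefix of it.
        -not ($approved | Where-Object {
            $url.StartsWith($_, [System.StringComparison]::OrdinalIgnoreCase)
        })
    }
}

# Sample data (assumed values, for illustration only)
$accessed = @(
    'https://contoso.sharepoint.com/sites/hr/Shared Documents/a.docx'
    'https://contoso.sharepoint.com/sites/hr-restricted/b.docx'
    'https://contoso.sharepoint.com/sites/trading/c.xlsx'
)
$approvedSites = @('https://contoso.sharepoint.com/sites/hr')

$outOfScope = @(Get-OutOfScopeUrl -AccessedUrl $accessed -ApprovedSite $approvedSites)
$outOfScope
```

Prefix matching on a normalized URL is a deliberately simple choice; a real scope check may need to account for site IDs, redirects, or connector-backed sources.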

Category B: Agent Unauthorized Actions

Definition: An agent performs actions (sending communications, updating records, modifying files) that were not intended or approved by the agent's configuration or the user's instructions.

Detection signals:

  • Users report receiving unexpected emails, Teams messages, or notifications generated by an agent
  • Audit log shows agent-initiated actions (email sends, file modifications, CRM updates) outside normal operating parameters
  • Power Automate flow execution logs show unexpected agent-triggered flow runs

Immediate actions:

  1. Disable the agent — Remove the agent from all user assignments
  2. Revoke agent permissions — In Entra, revoke the agent's Entra Agent ID permissions
  3. Assess impact — Determine what actions were taken, who was affected, and whether any external communications were sent
  4. Recall communications — If external communications were sent, attempt message recall where supported (recall generally works only for recipients inside the same Exchange organization) and notify affected recipients

Category C: Prompt Injection

Definition: A malicious or accidental input causes an agent to bypass its instructions, reveal system prompts, access data outside its scope, or perform unintended actions.

Detection signals:

  • Copilot response contains system prompt content, internal instructions, or configuration details
  • Agent produces outputs that contradict its configured behavior or safety guardrails
  • Communication compliance flags agent-generated content with unexpected or inappropriate language
  • User reports that an agent responded with content clearly outside its designated domain

Immediate actions:

  1. Disable the agent — Prevent further interactions
  2. Assess injection vector — Determine whether the injection came from user input, grounding data, or a connected data source
  3. Review agent instructions — Check whether system prompt protections are adequate
  4. Assess data exposure — Determine whether the injection caused unauthorized data access or disclosure

Category D: Agent Runaway

Definition: An agent enters a loop, escalation, or recursive pattern that generates excessive actions, consumes disproportionate resources, or produces unintended outputs at scale.

Detection signals:

  • Pay-as-you-go (PAYG) billing alerts trigger for unexpected compute consumption
  • Audit log shows a high volume of agent actions in a short time period
  • Users report receiving a large volume of agent-generated notifications or outputs
  • Agent Insight Report shows abnormal activity volumes

Immediate actions:

  1. Disable the agent — Immediate removal from all user assignments
  2. Check budget controls — Verify PAYG budget caps are in place and assess financial impact
  3. Assess downstream impact — Determine whether runaway actions affected external systems, sent communications, or modified data
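
The "high volume of agent actions in a short time period" signal can be approximated by counting audit events per fixed time window. The following is a minimal sketch, assuming the event timestamps have already been extracted from the audit log. The function name `Get-AgentActionBurst`, the 5-minute window, and the threshold of 100 actions are placeholders to tune for each agent, not recommended values.

```powershell
# Hypothetical helper: groups timestamps into fixed windows and
# returns the windows whose action count exceeds the threshold.
function Get-AgentActionBurst {
    param(
        [Parameter(Mandatory)] [datetime[]] $Timestamp,
        [int] $WindowMinutes = 5,   # assumed window size
        [int] $Threshold = 100      # assumed per-window limit
    )
    $windowTicks = [timespan]::FromMinutes($WindowMinutes).Ticks

    $Timestamp |
        # Key each timestamp by the start tick of its window (integer math only)
        Group-Object { $_.Ticks - ($_.Ticks % $windowTicks) } |
        Where-Object { $_.Count -gt $Threshold } |
        ForEach-Object {
            [pscustomobject]@{
                WindowStart = [datetime]::new([long]$_.Name)
                ActionCount = $_.Count
            }
        }
}

# Sample data: 150 actions within a few minutes, then three isolated actions
$base   = [datetime]'2026-04-01T10:00:00'
$stamps = @(0..149 | ForEach-Object { $base.AddSeconds($_) }) +
          @($base.AddHours(1), $base.AddHours(2), $base.AddHours(3))

$bursts = @(Get-AgentActionBurst -Timestamp $stamps -WindowMinutes 5 -Threshold 100)
$bursts
```

Fixed windows are easy to reason about but can split a burst across a boundary; a sliding window would catch that case at the cost of more computation.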

Investigation Procedures

Step 1: Evidence Collection

# Requires an Exchange Online PowerShell session (Connect-ExchangeOnline)
# Use one fixed window so both searches cover the same period
$start = (Get-Date).AddDays(-7)
$end   = Get-Date

# Search for agent-specific audit events
# (-ResultSize 5000 is the per-call maximum; larger result sets need paging)
Search-UnifiedAuditLog -StartDate $start -EndDate $end `
  -RecordType "CopilotInteraction" -ResultSize 5000 |
  Where-Object { $_.AuditData -like "*AgentId*" } |
  Export-Csv -Path "agent-incident-audit.csv" -NoTypeInformation

# Search for agent admin activity events
Search-UnifiedAuditLog -StartDate $start -EndDate $end `
  -Operations "AgentAdminActivity" -ResultSize 5000
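
The `-like "*AgentId*"` filter keeps every record that mentions an agent at all. To isolate a single agent, the `AuditData` JSON can be parsed and compared against a specific ID. The sketch below assumes `AuditData` exposes a top-level `AgentId` property; the real schema may nest or name it differently, so check an exported record first. `Select-AgentAuditRecord` and the sample records are hypothetical.

```powershell
# Hypothetical helper: passes through only the audit records whose
# parsed AuditData carries the requested agent ID.
# Assumes a top-level "AgentId" property in the AuditData JSON.
function Select-AgentAuditRecord {
    param(
        [Parameter(Mandatory, ValueFromPipeline)] $Record,
        [Parameter(Mandatory)] [string] $AgentId
    )
    process {
        try {
            $data = $Record.AuditData | ConvertFrom-Json -ErrorAction Stop
        } catch {
            return  # skip records whose AuditData is not valid JSON
        }
        if ($data.AgentId -eq $AgentId) { $Record }
    }
}

# Sample records standing in for Search-UnifiedAuditLog output
$sample = @(
    [pscustomobject]@{ CreationDate = '2026-04-01T10:00:00Z'; AuditData = '{"AgentId":"agent-001","Operation":"CopilotInteraction"}' }
    [pscustomobject]@{ CreationDate = '2026-04-01T10:05:00Z'; AuditData = '{"AgentId":"agent-002","Operation":"CopilotInteraction"}' }
    [pscustomobject]@{ CreationDate = '2026-04-01T10:06:00Z'; AuditData = 'not json' }
)

$matched = @($sample | Select-AgentAuditRecord -AgentId 'agent-001')
$matched
```

In an incident, the same function could sit between `Search-UnifiedAuditLog` and `Export-Csv` in the pipeline, so the exported evidence covers only the agent under investigation.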

Step 2: Impact Assessment

| Assessment Area | Questions to Answer | Evidence Source |
|---|---|---|
| Data scope | What data did the agent access? Was any of it NPI, MNPI, or restricted? | Purview audit logs, Agent Insight Report |
| Action scope | What actions did the agent take? Were any external-facing? | Audit logs, Power Automate flow logs |
| User impact | Which users interacted with or were affected by the agent? | Audit logs, user reports |
| Regulatory impact | Does this trigger notification obligations under Reg S-P, FINRA, or state laws? | Legal/compliance assessment |
| Financial impact | What PAYG or compute costs were incurred? | M365 Admin Center billing |

Step 3: Regulatory Notification Assessment

| Trigger | Notification Obligation | Timeline |
|---|---|---|
| NPI accessed by unauthorized agent | SEC Reg S-P 72-hour vendor notification; 30-day customer notification | 72 hours / 30 days |
| MNPI surfaced across barrier | SEC Rule 10b-5 assessment; potential self-report | Immediate assessment |
| External communications sent | FINRA Rule 3110 supervisory failure review | Within supervisory cycle |
| BSA/AML data exposed | FinCEN SAR assessment | Per BSA timelines |
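
For the fixed timelines in the first row, the target dates are simple date arithmetic from the moment the clock starts. The sketch below uses the 72-hour and 30-day figures from the table. Treating the detection time as the start of both clocks is an assumption to confirm with legal counsel, and `Get-NotificationDeadline` is a hypothetical helper name.

```powershell
# Hypothetical helper: derives target notification dates from a start time.
# Assumes both clocks start at detection; counsel should confirm the trigger.
function Get-NotificationDeadline {
    param(
        [Parameter(Mandatory)] [datetime] $DetectedAt
    )
    [pscustomobject]@{
        DetectedAt        = $DetectedAt
        VendorNoticeDue   = $DetectedAt.AddHours(72)  # 72-hour timeline from the table
        CustomerNoticeDue = $DetectedAt.AddDays(30)   # 30-day timeline from the table
    }
}

# Example with an assumed detection time
$deadline = Get-NotificationDeadline -DetectedAt ([datetime]'2026-04-01T09:30:00')
$deadline
```

Recording the computed dates in the incident ticket at detection time gives the response team a shared reference while the legal assessment proceeds.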

Post-Incident Actions

  1. Root cause analysis — Document the root cause using the firm's incident management framework
  2. Remediation — Implement specific fixes (permission tightening, instruction updates, DLP policy additions)
  3. Agent re-assessment — Before re-enabling, conduct a full security review per Control 2.14 and Control 4.13
  4. Lessons learned — Update agent governance policies and this playbook based on findings
  5. Regulatory documentation — Retain all investigation records per FINRA Rule 4511(a) and SEC Rule 17a-4


FSI Copilot Governance Framework v1.4 - April 2026