AI Incident Response Plan
Purpose
Incident response procedures tailored for AI-specific incidents including model compromise, prompt injection exploitation, and data leakage through AI systems.
Related Controls
1. AI Incident Categories
Define the categories of AI-specific incidents that this plan covers.
Incident Taxonomy
This incident response plan covers incidents that are unique to or significantly affected by AI systems. Traditional cybersecurity incidents (malware, unauthorized access, DDoS) follow the existing IR plan; this plan supplements it for AI-specific scenarios.
Category 1: Prompt Injection Exploitation
- Description: An attacker successfully injects instructions that override the AI system's intended behavior
- Indicators: Unexpected AI outputs, system prompt exposure, unauthorized actions by AI agents, anomalous output patterns detected by monitoring
- Examples: Customer-facing chatbot providing unauthorized information, AI agent executing unintended tool calls, system prompt extracted and published
Category 2: AI Data Leakage
- Description: Sensitive data is exposed through AI system outputs, whether from training data memorization, RAG retrieval errors, or prompt-response logging
- Indicators: AI outputs containing PII, credentials, or confidential data not present in the user's input; unauthorized data appearing in AI-generated content
- Examples: AI chatbot revealing another customer's data, code assistant outputting API keys from training data, RAG system surfacing confidential documents to unauthorized users
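Detection of this category can be partially automated by scanning AI outputs for leakage indicators. The sketch below is illustrative only: the pattern names and regexes are assumptions for demonstration, and a production deployment would rely on a DLP service or trained classifiers rather than regexes alone.

```python
import re

# Illustrative patterns only (assumptions, not part of this plan).
LEAKAGE_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)[-_][A-Za-z0-9]{16,}\b"),
}

def scan_output(text: str) -> list[str]:
    """Return the names of leakage indicators found in an AI output."""
    return [name for name, pat in LEAKAGE_PATTERNS.items() if pat.search(text)]
```

A hit from a scan like this would feed the triage step in Phase 1 as an alert, not a verdict; human review still confirms whether the match is genuine exposure.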
Category 3: Model Compromise
- Description: The AI model itself is tampered with — poisoned training data, modified weights, or substituted model
- Indicators: Sudden behavioral changes, degraded accuracy on validation sets, new biases or harmful outputs that were not present previously
- Examples: Supply chain attack on model weights, adversarial fine-tuning through data poisoning, model swapping in the deployment pipeline
Category 4: AI-Enabled Attack Amplification
- Description: An attacker uses the organization's AI systems to amplify a traditional attack — generating phishing content, automating reconnaissance, or exploiting tool integrations
- Indicators: AI systems generating high volumes of outbound communications, unusual tool usage patterns, AI-generated content used in social engineering
- Examples: Attacker uses internal AI to generate targeted phishing emails, AI agent exploited to exfiltrate data through approved API integrations
Category 5: Bias and Fairness Incidents
- Description: AI system produces discriminatory, harmful, or unfair outputs affecting individuals or groups
- Indicators: Customer complaints about discriminatory treatment, media reports, internal detection through fairness monitoring
- Examples: Hiring AI systematically disadvantaging a protected group, content moderation AI disproportionately flagging content from specific demographics
2. Severity Classification
Define severity levels for AI incidents with clear criteria and escalation requirements.
Severity Matrix
| Severity | Criteria | Examples | Response Time | Escalation |
|---|---|---|---|---|
| SEV-1 (Critical) | Active exploitation with confirmed data exposure, regulatory breach, or widespread customer impact | Mass PII leakage through AI, model compromise in production, AI system used in active attack on customers | Immediate (within 15 minutes) | CISO, CTO, Legal, CEO |
| SEV-2 (High) | Confirmed vulnerability exploitation with limited data exposure or significant potential for escalation | Successful prompt injection with internal data exposure, AI generating harmful content to customers, unauthorized AI agent actions | Within 1 hour | CISO, Engineering Lead, AI System Owner |
| SEV-3 (Medium) | Vulnerability confirmed but no evidence of exploitation or data exposure | Prompt injection bypass discovered in testing, misconfigured access controls on AI endpoints, bias detected in AI outputs | Within 4 hours | Security Lead, AI System Owner |
| SEV-4 (Low) | Potential vulnerability or minor policy violation with no evidence of impact | Employee submits internal data to public AI tool, minor output anomaly detected, failed injection attempt logged | Within 24 hours | Security Analyst, AI System Owner |
Classification Decision Tree
- Is there confirmed data exposure involving regulated data (PII, PHI, PCI)? → If yes, minimum SEV-2; if mass exposure, SEV-1
- Is the AI system actively being exploited? → If yes, minimum SEV-2; if exploitation affects customers, SEV-1
- Has the AI model been compromised (weights, training data, configuration)? → If yes, minimum SEV-2
- Is the AI system producing harmful or discriminatory outputs to end users? → If yes, minimum SEV-2
- Is this a confirmed vulnerability with no evidence of exploitation? → SEV-3
- Is this a policy violation or potential vulnerability with no confirmed impact? → SEV-4
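The decision tree above can be encoded as an ordered series of checks so that triage tooling and on-call staff classify consistently. A minimal sketch, assuming boolean triage inputs (the field names are illustrative); it returns the floor severity, which the Incident Commander may raise:

```python
def classify_severity(
    regulated_data_exposed: bool,
    mass_exposure: bool,
    active_exploitation: bool,
    customer_impact: bool,
    model_compromised: bool,
    harmful_outputs: bool,
    confirmed_vulnerability: bool,
) -> str:
    """Apply the classification decision tree in order; first match wins."""
    if regulated_data_exposed:
        return "SEV-1" if mass_exposure else "SEV-2"
    if active_exploitation:
        return "SEV-1" if customer_impact else "SEV-2"
    if model_compromised or harmful_outputs:
        return "SEV-2"
    if confirmed_vulnerability:
        return "SEV-3"
    return "SEV-4"
```

Because severity may only be upgraded without approval, automation built on this function should never lower a previously assigned severity.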
Severity Reclassification
Severity may be upgraded at any point during the response as new information becomes available. Severity downgrades require approval from the Incident Commander and documented justification.
3. Response Procedures
Document step-by-step response procedures for each incident phase.
Phase 1: Detection and Triage (0-30 minutes)
- Alert Received: Incident detected through monitoring, user report, or external notification
- Initial Assessment: On-call security analyst reviews alert context, confirms it is AI-related, and performs initial severity classification
- Incident Commander Assigned: Based on severity:
- SEV-1/SEV-2: Senior security engineer or CISO
- SEV-3/SEV-4: On-call security analyst
- Communication Channel Established: Dedicated incident channel created in [TOOL] with naming convention ai-incident-[DATE]-[SEQ]
- Initial Notification: Stakeholders notified per the severity escalation matrix
Phase 2: Containment (30 minutes - 4 hours)
Immediate Containment Actions by Category
| Category | Primary Containment | Secondary Containment |
|---|---|---|
| Prompt Injection | Block attacking IP/user, enable enhanced input filtering | Disable affected endpoint, activate emergency system prompt |
| Data Leakage | Disable affected AI endpoint, revoke compromised sessions | Isolate affected data stores, initiate breach assessment |
| Model Compromise | Rollback to last known good model version | Isolate model serving infrastructure, suspend all model updates |
| Attack Amplification | Disable AI system's external communication capabilities | Revoke AI agent tool permissions, isolate AI system from network |
| Bias/Fairness | Disable automated decision-making for affected use case | Redirect to human reviewers, preserve affected decision logs |
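Encoding the containment matrix as data gives runbooks and any containment automation a single source of truth. A sketch of the table above (the category keys are illustrative assumptions):

```python
# Containment actions transcribed from the matrix above.
CONTAINMENT_ACTIONS = {
    "prompt_injection": {
        "primary": ["block attacking IP/user", "enable enhanced input filtering"],
        "secondary": ["disable affected endpoint", "activate emergency system prompt"],
    },
    "data_leakage": {
        "primary": ["disable affected AI endpoint", "revoke compromised sessions"],
        "secondary": ["isolate affected data stores", "initiate breach assessment"],
    },
    "model_compromise": {
        "primary": ["rollback to last known good model version"],
        "secondary": ["isolate model serving infrastructure", "suspend all model updates"],
    },
    "attack_amplification": {
        "primary": ["disable AI system's external communication capabilities"],
        "secondary": ["revoke AI agent tool permissions", "isolate AI system from network"],
    },
    "bias_fairness": {
        "primary": ["disable automated decision-making for affected use case"],
        "secondary": ["redirect to human reviewers", "preserve affected decision logs"],
    },
}

def containment_steps(category: str) -> list[str]:
    """Return primary then secondary containment actions for a category."""
    actions = CONTAINMENT_ACTIONS[category]
    return actions["primary"] + actions["secondary"]
```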
Phase 3: Investigation (4-48 hours)
- Evidence Collection: Preserve all logs, prompts, responses, model versions, and system configurations
- Root Cause Analysis: Determine how the incident occurred, what vulnerability was exploited, and what the blast radius is
- Impact Assessment: Identify all affected users, data, and systems
- Timeline Construction: Build a detailed timeline from initial compromise to detection
Phase 4: Eradication and Recovery (1-7 days)
- Vulnerability Remediation: Implement fixes for the root cause
- System Restoration: Restore AI systems from verified clean state
- Verification Testing: Run the AI red team playbook against the remediated system
- Monitoring Enhancement: Deploy additional monitoring for the specific attack pattern
4. Communication Plan
Define internal and external communication procedures during an AI incident.
Internal Communication
Notification Matrix
| Severity | Notify Within | Stakeholders | Channel |
|---|---|---|---|
| SEV-1 | 15 minutes | CISO, CTO, CEO, Legal, PR, Board (if data breach) | Phone + Email + Incident Channel |
| SEV-2 | 1 hour | CISO, Engineering Lead, AI System Owner, Legal | Email + Incident Channel |
| SEV-3 | 4 hours | Security Lead, AI System Owner | Incident Channel |
| SEV-4 | 24 hours | AI System Owner | Email |
Status Update Cadence
| Severity | Update Frequency | Format |
|---|---|---|
| SEV-1 | Every 30 minutes during active response; every 4 hours during investigation | Verbal (phone/standup) + written summary |
| SEV-2 | Every 2 hours during active response; daily during investigation | Written summary in incident channel |
| SEV-3 | Daily during active response | Written summary in incident channel |
| SEV-4 | As needed | Written summary via email |
Status Update Template
AI INCIDENT STATUS UPDATE
Incident ID: [ID]
Severity: [SEV-X]
Status: [Investigating / Containing / Eradicating / Recovering / Closed]
Update Time: [TIMESTAMP]
Current Situation: [1-2 sentence summary]
Actions Taken Since Last Update: [Bullet list]
Next Steps: [Bullet list]
ETA for Next Update: [TIMESTAMP]
Incident Commander: [NAME]
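The template above can be filled programmatically so updates posted to the incident channel keep a uniform shape. A minimal sketch (the function and field names are assumptions for illustration):

```python
# Template mirrors the status update format defined in this plan.
STATUS_TEMPLATE = """AI INCIDENT STATUS UPDATE
Incident ID: {incident_id}
Severity: {severity}
Status: {status}
Update Time: {update_time}
Current Situation: {situation}
Actions Taken Since Last Update: {actions}
Next Steps: {next_steps}
ETA for Next Update: {next_eta}
Incident Commander: {commander}"""

def render_status_update(**fields: str) -> str:
    """Fill the status update template; callers must supply every field."""
    return STATUS_TEMPLATE.format(**fields)
```

Requiring every field (rather than defaulting blanks) forces the Incident Commander to state explicitly when a section has nothing to report.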
External Communication
Regulatory Notification
| Regulation | Notification Trigger | Timeline | Responsible |
|---|---|---|---|
| GDPR | Personal data breach affecting EU residents | 72 hours from awareness | Data Protection Officer |
| CCPA | Breach of unencrypted personal information | "In the most expedient time possible" | Legal |
| HIPAA | Breach of unsecured PHI | 60 days to individuals; 60 days to HHS for breaches affecting 500+ individuals (annual log otherwise) | Privacy Officer |
| SEC (if public company) | Material cybersecurity incident | 4 business days of materiality determination | Legal + CFO |
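For regulations with a fixed notification window, the deadline can be computed from the moment of awareness so Legal sees a concrete timestamp rather than a rule name. A sketch covering only the fixed-window rows above; CCPA ("most expedient time possible") has no fixed window, and SEC's 4 business days needs a business-day calendar, so both are deliberately excluded:

```python
from datetime import datetime, timedelta

# Fixed statutory windows only, per the regulatory notification table.
NOTIFICATION_WINDOWS = {
    "GDPR": timedelta(hours=72),           # from awareness of the breach
    "HIPAA_individuals": timedelta(days=60),
}

def notification_deadline(regulation: str, awareness: datetime) -> datetime:
    """Return the latest permissible notification time for a fixed-window rule."""
    return awareness + NOTIFICATION_WINDOWS[regulation]
```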
Customer Communication
Customer notification is required when AI incidents result in exposure of customer data, provision of materially incorrect AI-generated advice, or discriminatory outcomes affecting customers. All customer communications must be reviewed by Legal and PR before distribution.
5. Post-Incident Review
Define the post-incident review process to ensure thorough analysis and documentation.
Post-Incident Review Meeting
Timeline: Conducted within 5 business days of incident closure
Attendees:
- Incident Commander
- All responders involved in the incident
- AI System Owner
- Security Lead
- Engineering Lead
- [ROLE TITLE] (AI Governance Committee representative)
Agenda:
- Incident Timeline Review (15 min) — Walk through the complete timeline from detection to closure
- Root Cause Analysis (30 min) — Present and discuss the technical root cause
- Response Effectiveness (20 min) — Evaluate what went well and what could be improved
- Detection Gap Analysis (15 min) — Assess why the incident was not detected sooner
- Remediation Validation (10 min) — Confirm all remediations are in place and verified
- Action Items (15 min) — Assign and schedule follow-up actions
Post-Incident Report
The Incident Commander produces a written report within 10 business days of the review meeting:
| Section | Content |
|---|---|
| Executive Summary | 1-paragraph overview for leadership |
| Incident Description | What happened, categorization, severity |
| Timeline | Chronological event log from first indicator to closure |
| Root Cause | Technical root cause with supporting evidence |
| Impact Assessment | Users affected, data exposed, financial impact, reputational impact |
| Response Assessment | Detection time, containment time, resolution time, adherence to procedures |
| Remediation Summary | Actions taken to resolve and prevent recurrence |
| Action Items | Specific tasks with owners, due dates, and priority |
Metrics Tracked
| Metric | Definition | Target |
|---|---|---|
| Mean Time to Detect (MTTD) | Time from incident start to detection | ≤ 1 hour |
| Mean Time to Contain (MTTC) | Time from detection to containment | ≤ 4 hours (SEV-1/2) |
| Mean Time to Resolve (MTTR) | Time from detection to full resolution | ≤ 72 hours (SEV-1) |
| Post-incident review completion | Review completed within SLA | 100% |
| Action item completion rate | Action items completed by due date | ≥ 95% |
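The time-based metrics above follow directly from four timestamps in the incident record. A sketch of the per-incident calculation, using the definitions from the table (MTTR is measured from detection, not from incident start):

```python
from datetime import datetime

def response_metrics(start: datetime, detected: datetime,
                     contained: datetime, resolved: datetime) -> dict[str, float]:
    """Compute MTTD/MTTC/MTTR in hours for one incident, per the table."""
    def hours(a: datetime, b: datetime) -> float:
        return (b - a).total_seconds() / 3600

    return {
        "mttd_hours": hours(start, detected),      # incident start -> detection
        "mttc_hours": hours(detected, contained),  # detection -> containment
        "mttr_hours": hours(detected, resolved),   # detection -> full resolution
    }
```

The "mean" in each metric comes from averaging these per-incident values across a reporting period before comparing against the targets.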
6. Lessons Learned
Define how lessons learned are captured, distributed, and incorporated into organizational processes.
Lessons Learned Process
Capture
Lessons learned are captured from three sources:
- Post-Incident Review Meeting: Facilitator documents lessons identified during discussion
- Responder Retrospectives: Each responder submits individual observations within 3 business days of incident closure
- Metrics Analysis: Quantitative analysis of response metrics compared to targets
Classification
Each lesson is classified by category:
| Category | Examples |
|---|---|
| Detection | "Our monitoring did not have alerts for indirect prompt injection via RAG documents" |
| Process | "The escalation path for AI-specific incidents was unclear to the on-call team" |
| Technical | "Output filtering did not catch the specific encoding used in the attack" |
| Training | "Responders were unfamiliar with AI-specific forensic techniques" |
| Communication | "Customer notification template did not adequately explain AI-specific data exposure" |
| Tooling | "We lacked automated tools to analyze AI interaction logs at scale during the investigation" |
Distribution
| Audience | Content | Format | Timeline |
|---|---|---|---|
| AI Governance Committee | Full lessons learned report | Written report + presentation | Within 15 business days |
| Security team | Technical lessons and detection improvements | Team briefing | Within 10 business days |
| Engineering team | Technical root cause and remediation details | Technical brief | Within 10 business days |
| All personnel (if relevant) | Awareness-level summary | Newsletter or all-hands mention | Within 30 days |
| Executive leadership | Impact summary and investment recommendations | Executive brief | Within 15 business days |
Integration into Organizational Processes
Lessons learned must result in concrete updates to at least one of the following:
- This IR Plan: Update procedures, categories, or severity criteria based on new incident types
- AI Red Team Playbook: Add new attack scenarios discovered during incidents
- Prompt Injection Defense Checklist: Update technical controls based on observed attack techniques
- AI Deployment Validation Checklist: Add checks that would have prevented the incident
- Training Materials: Update AI security training with real-world case studies (sanitized)
- Monitoring and Detection: Deploy new detection rules and alerting based on observed indicators
Tracking
All lessons learned are logged in the AI Incident Knowledge Base maintained by [DEPARTMENT]. Each entry includes: incident reference, lesson description, category, action taken, date implemented, and effectiveness assessment (evaluated at the next quarterly review).