AI Incident Response Plan

Procedure ASSURANCE

Purpose

Incident response procedures tailored for AI-specific incidents including model compromise, prompt injection exploitation, and data leakage through AI systems.

Related Controls

ISO A.6 · NIST MG-3 · OWASP LLM01 · OWASP ASI07

1. AI Incident Categories

Define the categories of AI-specific incidents that this plan covers.

Incident Taxonomy

This incident response plan covers incidents that are unique to or significantly affected by AI systems. Traditional cybersecurity incidents (malware, unauthorized access, DDoS) follow the existing IR plan; this plan supplements it for AI-specific scenarios.

Category 1: Prompt Injection Exploitation

  • Description: An attacker successfully injects instructions that override the AI system's intended behavior
  • Indicators: Unexpected AI outputs, system prompt exposure, unauthorized actions by AI agents, anomalous output patterns detected by monitoring
  • Examples: Customer-facing chatbot providing unauthorized information, AI agent executing unintended tool calls, system prompt extracted and published
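
As a sketch of how these indicators might feed automated monitoring, the following checks a model response for a leaked system-prompt canary token and common injection phrases. The token value and regex patterns are illustrative assumptions, not values from this plan:

```python
import re

# Hypothetical canary token embedded in the system prompt; if it ever
# appears in a model response, the prompt has likely been extracted.
CANARY_TOKEN = "cnry-7f3a91"  # illustrative value, not a real deployment secret

# Phrases that commonly accompany a leaked system prompt or injection attempt.
LEAK_PATTERNS = [
    re.compile(r"my (system )?instructions (are|say)", re.IGNORECASE),
    re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
]

def flag_prompt_injection(response_text: str) -> list[str]:
    """Return indicator labels found in a model response; empty list if clean."""
    indicators = []
    if CANARY_TOKEN in response_text:
        indicators.append("canary_token_leaked")
    for pattern in LEAK_PATTERNS:
        if pattern.search(response_text):
            indicators.append(f"leak_phrase:{pattern.pattern}")
    return indicators
```

A non-empty result would raise the "anomalous output patterns" alert described above for triage.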

Category 2: AI Data Leakage

  • Description: Sensitive data is exposed through AI system outputs, whether from training data memorization, RAG retrieval errors, or prompt-response logging
  • Indicators: AI outputs containing PII, credentials, or confidential data not present in the user's input; unauthorized data appearing in AI-generated content
  • Examples: AI chatbot revealing another customer's data, code assistant outputting API keys from training data, RAG system surfacing confidential documents to unauthorized users
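
The "not present in the user's input" indicator can be approximated with a differential scan of output versus input. A minimal sketch with illustrative regex patterns; a production scanner would use a dedicated PII detection service rather than regexes alone:

```python
import re

# Illustrative PII patterns (assumptions, not an exhaustive set).
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)[-_][A-Za-z0-9]{16,}\b"),
}

def scan_output_for_pii(output: str, user_input: str) -> list[str]:
    """Flag PII types present in the AI output but absent from the user's
    own input, matching the leakage indicator described above."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        out_matches = set(pattern.findall(output))
        in_matches = set(pattern.findall(user_input))
        if out_matches - in_matches:
            findings.append(label)
    return findings
```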

Category 3: Model Compromise

  • Description: The AI model itself is tampered with — poisoned training data, modified weights, or substituted model
  • Indicators: Sudden behavioral changes, degraded accuracy on validation sets, new biases or harmful outputs that were not present previously
  • Examples: Supply chain attack on model weights, adversarial fine-tuning through data poisoning, model swapping in the deployment pipeline
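
Model substitution and tampered weights can be caught by comparing deployed artifacts against hashes recorded at release time. A minimal sketch using SHA-256; the existence of a release-time hash registry is assumed:

```python
import hashlib

def sha256_of_file(path: str) -> str:
    """Stream-hash a model artifact so large weight files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_model_artifact(path: str, expected_sha256: str) -> bool:
    """Compare a deployed model file against the hash recorded at release time.
    A mismatch is an indicator of model substitution or tampering."""
    return sha256_of_file(path) == expected_sha256.lower()
```

Running this check in the deployment pipeline and on a schedule covers both the "model swapping" and "supply chain attack on model weights" examples.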

Category 4: AI-Enabled Attack Amplification

  • Description: An attacker uses the organization's AI systems to amplify a traditional attack — generating phishing content, automating reconnaissance, or exploiting tool integrations
  • Indicators: AI systems generating high volumes of outbound communications, unusual tool usage patterns, AI-generated content used in social engineering
  • Examples: Attacker uses internal AI to generate targeted phishing emails, AI agent exploited to exfiltrate data through approved API integrations

Category 5: Bias and Fairness Incidents

  • Description: AI system produces discriminatory, harmful, or unfair outputs affecting individuals or groups
  • Indicators: Customer complaints about discriminatory treatment, media reports, internal detection through fairness monitoring
  • Examples: Hiring AI systematically disadvantaging a protected group, content moderation AI disproportionately flagging content from specific demographics

2. Severity Classification

Define severity levels for AI incidents with clear criteria and escalation requirements.

Severity Matrix

| Severity | Criteria | Examples | Response Time | Escalation |
|---|---|---|---|---|
| SEV-1 (Critical) | Active exploitation with confirmed data exposure, regulatory breach, or widespread customer impact | Mass PII leakage through AI, model compromise in production, AI system used in active attack on customers | Immediate (within 15 minutes) | CISO, CTO, Legal, CEO |
| SEV-2 (High) | Confirmed vulnerability exploitation with limited data exposure or significant potential for escalation | Successful prompt injection with internal data exposure, AI generating harmful content to customers, unauthorized AI agent actions | Within 1 hour | CISO, Engineering Lead, AI System Owner |
| SEV-3 (Medium) | Vulnerability confirmed but no evidence of exploitation or data exposure | Prompt injection bypass discovered in testing, misconfigured access controls on AI endpoints, bias detected in AI outputs | Within 4 hours | Security Lead, AI System Owner |
| SEV-4 (Low) | Potential vulnerability or minor policy violation with no evidence of impact | Employee submits internal data to public AI tool, minor output anomaly detected, failed injection attempt logged | Within 24 hours | Security Analyst, AI System Owner |

Classification Decision Tree

  1. Is there confirmed data exposure involving regulated data (PII, PHI, PCI)? → If yes, minimum SEV-2; if mass exposure, SEV-1
  2. Is the AI system actively being exploited? → If yes, minimum SEV-2; if exploitation affects customers, SEV-1
  3. Has the AI model been compromised (weights, training data, configuration)? → If yes, minimum SEV-2
  4. Is the AI system producing harmful or discriminatory outputs to end users? → If yes, minimum SEV-2
  5. Is this a confirmed vulnerability with no evidence of exploitation? → SEV-3
  6. Is this a policy violation or potential vulnerability with no confirmed impact? → SEV-4
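
The decision tree above can be encoded directly so triage tooling applies the questions in order and always returns the most severe matching level. A sketch, with boolean inputs corresponding to the six questions:

```python
def classify_severity(
    regulated_data_exposed: bool,
    mass_exposure: bool,
    active_exploitation: bool,
    customers_affected: bool,
    model_compromised: bool,
    harmful_outputs_to_users: bool,
    confirmed_vulnerability: bool,
) -> str:
    """Apply the classification decision tree top to bottom, returning the
    highest (most severe) level whose criteria match."""
    # Questions 1-2: escalate to SEV-1 for mass exposure or customer-facing exploitation.
    if (regulated_data_exposed and mass_exposure) or (active_exploitation and customers_affected):
        return "SEV-1"
    # Questions 1-4: any of these conditions is at minimum SEV-2.
    if regulated_data_exposed or active_exploitation or model_compromised or harmful_outputs_to_users:
        return "SEV-2"
    # Question 5: confirmed vulnerability, no evidence of exploitation.
    if confirmed_vulnerability:
        return "SEV-3"
    # Question 6: policy violation or potential vulnerability only.
    return "SEV-4"
```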

Severity Reclassification

Severity may be upgraded at any point during the response as new information becomes available. Severity downgrades require approval from the Incident Commander and documented justification.

3. Response Procedures

Document step-by-step response procedures for each incident phase.

Phase 1: Detection and Triage (0-30 minutes)

  1. Alert Received: Incident detected through monitoring, user report, or external notification
  2. Initial Assessment: On-call security analyst reviews alert context, confirms it is AI-related, and performs initial severity classification
  3. Incident Commander Assigned: Based on severity:
     • SEV-1/SEV-2: Senior security engineer or CISO
     • SEV-3/SEV-4: On-call security analyst
  4. Communication Channel Established: Dedicated incident channel created in [TOOL] with naming convention: ai-incident-[DATE]-[SEQ]
  5. Initial Notification: Stakeholders notified per the severity escalation matrix

Phase 2: Containment (30 minutes - 4 hours)

Immediate Containment Actions by Category

| Category | Primary Containment | Secondary Containment |
|---|---|---|
| Prompt Injection | Block attacking IP/user, enable enhanced input filtering | Disable affected endpoint, activate emergency system prompt |
| Data Leakage | Disable affected AI endpoint, revoke compromised sessions | Isolate affected data stores, initiate breach assessment |
| Model Compromise | Roll back to last known good model version | Isolate model serving infrastructure, suspend all model updates |
| Attack Amplification | Disable AI system's external communication capabilities | Revoke AI agent tool permissions, isolate AI system from network |
| Bias/Fairness | Disable automated decision-making for affected use case | Redirect to human reviewers, preserve affected decision logs |
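
The containment actions above can be encoded as a lookup so runbooks and automation stay in sync with this plan. A sketch; the category keys are illustrative identifiers, not names mandated by the plan:

```python
# Containment playbook mirroring the table above.
CONTAINMENT_PLAYBOOK = {
    "prompt_injection": {
        "primary": ["Block attacking IP/user", "Enable enhanced input filtering"],
        "secondary": ["Disable affected endpoint", "Activate emergency system prompt"],
    },
    "data_leakage": {
        "primary": ["Disable affected AI endpoint", "Revoke compromised sessions"],
        "secondary": ["Isolate affected data stores", "Initiate breach assessment"],
    },
    "model_compromise": {
        "primary": ["Roll back to last known good model version"],
        "secondary": ["Isolate model serving infrastructure", "Suspend all model updates"],
    },
    "attack_amplification": {
        "primary": ["Disable AI system's external communication capabilities"],
        "secondary": ["Revoke AI agent tool permissions", "Isolate AI system from network"],
    },
    "bias_fairness": {
        "primary": ["Disable automated decision-making for affected use case"],
        "secondary": ["Redirect to human reviewers", "Preserve affected decision logs"],
    },
}

def containment_actions(category: str, escalate: bool = False) -> list[str]:
    """Return primary actions, plus secondary actions when escalation is needed."""
    entry = CONTAINMENT_PLAYBOOK[category]
    return entry["primary"] + (entry["secondary"] if escalate else [])
```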

Phase 3: Investigation (4-48 hours)

  1. Evidence Collection: Preserve all logs, prompts, responses, model versions, and system configurations
  2. Root Cause Analysis: Determine how the incident occurred, what vulnerability was exploited, and what the blast radius is
  3. Impact Assessment: Identify all affected users, data, and systems
  4. Timeline Construction: Build a detailed timeline from initial compromise to detection

Phase 4: Eradication and Recovery (1-7 days)

  1. Vulnerability Remediation: Implement fixes for the root cause
  2. System Restoration: Restore AI systems from verified clean state
  3. Verification Testing: Run the AI red team playbook against the remediated system
  4. Monitoring Enhancement: Deploy additional monitoring for the specific attack pattern

4. Communication Plan

Define internal and external communication procedures during an AI incident.

Internal Communication

Notification Matrix

| Severity | Notify Within | Stakeholders | Channel |
|---|---|---|---|
| SEV-1 | 15 minutes | CISO, CTO, CEO, Legal, PR, Board (if data breach) | Phone + Email + Incident Channel |
| SEV-2 | 1 hour | CISO, Engineering Lead, AI System Owner, Legal | Email + Incident Channel |
| SEV-3 | 4 hours | Security Lead, AI System Owner | Incident Channel |
| SEV-4 | 24 hours | AI System Owner | Email |

Status Update Cadence

| Severity | Update Frequency | Format |
|---|---|---|
| SEV-1 | Every 30 minutes during active response; every 4 hours during investigation | Verbal (phone/standup) + written summary |
| SEV-2 | Every 2 hours during active response; daily during investigation | Written summary in incident channel |
| SEV-3 | Daily during active response | Written summary in incident channel |
| SEV-4 | As needed | Written summary via email |

Status Update Template

AI INCIDENT STATUS UPDATE
Incident ID: [ID]
Severity: [SEV-X]
Status: [Investigating / Containing / Eradicating / Recovering / Closed]
Update Time: [TIMESTAMP]

Current Situation: [1-2 sentence summary]
Actions Taken Since Last Update: [Bullet list]
Next Steps: [Bullet list]
ETA for Next Update: [TIMESTAMP]
Incident Commander: [NAME]
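
The template above can be rendered programmatically so updates stay consistent across responders. A sketch of such a renderer; the field names mirror the template, and the bullet style is an illustrative choice:

```python
def render_status_update(
    incident_id: str,
    severity: str,
    status: str,
    update_time: str,
    situation: str,
    actions: list[str],
    next_steps: list[str],
    next_update_eta: str,
    commander: str,
) -> str:
    """Fill the status update template from structured fields."""
    def bullets(items: list[str]) -> str:
        return "\n".join(f"  - {item}" for item in items)

    return (
        "AI INCIDENT STATUS UPDATE\n"
        f"Incident ID: {incident_id}\n"
        f"Severity: {severity}\n"
        f"Status: {status}\n"
        f"Update Time: {update_time}\n\n"
        f"Current Situation: {situation}\n"
        f"Actions Taken Since Last Update:\n{bullets(actions)}\n"
        f"Next Steps:\n{bullets(next_steps)}\n"
        f"ETA for Next Update: {next_update_eta}\n"
        f"Incident Commander: {commander}"
    )
```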

External Communication

Regulatory Notification

| Regulation | Notification Trigger | Timeline | Responsible |
|---|---|---|---|
| GDPR | Personal data breach affecting EU residents | 72 hours from awareness | Data Protection Officer |
| CCPA | Breach of unencrypted personal information | "In the most expedient time possible" | Legal |
| HIPAA | Breach of unsecured PHI | 60 days (individuals); 60 days (HHS) | Privacy Officer |
| SEC (if public company) | Material cybersecurity incident | 4 business days from materiality determination | Legal + CFO |
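
Notification deadlines can be computed automatically at incident open time. A sketch for the GDPR and SEC windows; the business-day calculation skips weekends only and does not model market holidays, which a real implementation would need:

```python
from datetime import datetime, timedelta

def gdpr_deadline(awareness: datetime) -> datetime:
    """GDPR: 72 clock hours from awareness of the breach."""
    return awareness + timedelta(hours=72)

def sec_deadline(materiality_determined: datetime) -> datetime:
    """SEC: 4 business days from the materiality determination.
    Weekends are skipped; holidays are not modeled in this sketch."""
    deadline = materiality_determined
    remaining = 4
    while remaining > 0:
        deadline += timedelta(days=1)
        if deadline.weekday() < 5:  # Monday-Friday
            remaining -= 1
    return deadline
```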

Customer Communication

Customer notification is required when AI incidents result in exposure of customer data, provision of materially incorrect AI-generated advice, or discriminatory outcomes affecting customers. All customer communications must be reviewed by Legal and PR before distribution.

5. Post-Incident Review

Define the post-incident review process to ensure thorough analysis and documentation.

Post-Incident Review Meeting

Timeline: Conducted within 5 business days of incident closure

Attendees:

  • Incident Commander
  • All responders involved in the incident
  • AI System Owner
  • Security Lead
  • Engineering Lead
  • [ROLE TITLE] (AI Governance Committee representative)

Agenda:

  1. Incident Timeline Review (15 min) — Walk through the complete timeline from detection to closure
  2. Root Cause Analysis (30 min) — Present and discuss the technical root cause
  3. Response Effectiveness (20 min) — Evaluate what went well and what could be improved
  4. Detection Gap Analysis (15 min) — Assess why the incident was not detected sooner
  5. Remediation Validation (10 min) — Confirm all remediations are in place and verified
  6. Action Items (15 min) — Assign and schedule follow-up actions

Post-Incident Report

The Incident Commander produces a written report within 10 business days of the review meeting:

| Section | Content |
|---|---|
| Executive Summary | 1-paragraph overview for leadership |
| Incident Description | What happened, categorization, severity |
| Timeline | Chronological event log from first indicator to closure |
| Root Cause | Technical root cause with supporting evidence |
| Impact Assessment | Users affected, data exposed, financial impact, reputational impact |
| Response Assessment | Detection time, containment time, resolution time, adherence to procedures |
| Remediation Summary | Actions taken to resolve and prevent recurrence |
| Action Items | Specific tasks with owners, due dates, and priority |

Metrics Tracked

| Metric | Definition | Target |
|---|---|---|
| Mean Time to Detect (MTTD) | Time from incident start to detection | ≤ 1 hour |
| Mean Time to Contain (MTTC) | Time from detection to containment | ≤ 4 hours (SEV-1/2) |
| Mean Time to Resolve (MTTR) | Time from detection to full resolution | ≤ 72 hours (SEV-1) |
| Post-incident review completion | Review completed within SLA | 100% |
| Action item completion rate | Action items completed by due date | ≥ 95% |
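
The time-based metrics above are means over per-incident durations. A sketch that computes a single incident's durations from its timestamps and checks them against the targets in the table:

```python
from datetime import datetime, timedelta

def response_metrics(
    incident_start: datetime,
    detected_at: datetime,
    contained_at: datetime,
    resolved_at: datetime,
) -> dict[str, timedelta]:
    """Compute the per-incident durations behind the MTTD/MTTC/MTTR means."""
    return {
        "time_to_detect": detected_at - incident_start,
        "time_to_contain": contained_at - detected_at,
        "time_to_resolve": resolved_at - detected_at,
    }

def meets_targets(metrics: dict[str, timedelta], severity: str) -> dict[str, bool]:
    """Check one incident against the targets; only targets scoped to the
    given severity are evaluated."""
    checks = {"time_to_detect": metrics["time_to_detect"] <= timedelta(hours=1)}
    if severity in ("SEV-1", "SEV-2"):
        checks["time_to_contain"] = metrics["time_to_contain"] <= timedelta(hours=4)
    if severity == "SEV-1":
        checks["time_to_resolve"] = metrics["time_to_resolve"] <= timedelta(hours=72)
    return checks
```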

6. Lessons Learned

Define how lessons learned are captured, distributed, and incorporated into organizational processes.

Lessons Learned Process

Capture

Lessons learned are captured from three sources:

  1. Post-Incident Review Meeting: Facilitator documents lessons identified during discussion
  2. Responder Retrospectives: Each responder submits individual observations within 3 business days of incident closure
  3. Metrics Analysis: Quantitative analysis of response metrics compared to targets

Classification

Each lesson is classified by category:

| Category | Examples |
|---|---|
| Detection | "Our monitoring did not have alerts for indirect prompt injection via RAG documents" |
| Process | "The escalation path for AI-specific incidents was unclear to the on-call team" |
| Technical | "Output filtering did not catch the specific encoding used in the attack" |
| Training | "Responders were unfamiliar with AI-specific forensic techniques" |
| Communication | "Customer notification template did not adequately explain AI-specific data exposure" |
| Tooling | "We lacked automated tools to analyze AI interaction logs at scale during the investigation" |

Distribution

| Audience | Content | Format | Timeline |
|---|---|---|---|
| AI Governance Committee | Full lessons learned report | Written report + presentation | Within 15 business days |
| Security team | Technical lessons and detection improvements | Team briefing | Within 10 business days |
| Engineering team | Technical root cause and remediation details | Technical brief | Within 10 business days |
| All personnel (if relevant) | Awareness-level summary | Newsletter or all-hands mention | Within 30 days |
| Executive leadership | Impact summary and investment recommendations | Executive brief | Within 15 business days |

Integration into Organizational Processes

Lessons learned must result in concrete updates to at least one of the following:

  1. This IR Plan: Update procedures, categories, or severity criteria based on new incident types
  2. AI Red Team Playbook: Add new attack scenarios discovered during incidents
  3. Prompt Injection Defense Checklist: Update technical controls based on observed attack techniques
  4. AI Deployment Validation Checklist: Add checks that would have prevented the incident
  5. Training Materials: Update AI security training with real-world case studies (sanitized)
  6. Monitoring and Detection: Deploy new detection rules and alerting based on observed indicators

Tracking

All lessons learned are logged in the AI Incident Knowledge Base maintained by [DEPARTMENT]. Each entry includes: incident reference, lesson description, category, action taken, date implemented, and effectiveness assessment (evaluated at the next quarterly review).
