MODEL Owner: ML Engineering Lead / Security Engineers / AppSec

AI Model Controls

Focus on securing the AI models themselves against adversarial attacks and ensuring output quality.

Framework Mapping

Controls from each source framework that map to this domain.

Framework → Mapped Controls

ISO 42001: A.9 Robustness; A.6 AI System Lifecycle; Cl.8 Operation
NIST AI RMF: AI 600-1 GenAI Profile; MS-2 Performance; MG-3 Documentation; MP-5 Performance
OWASP LLM: LLM01 Prompt Injection; LLM02 Insecure Output; LLM07 System Prompt Leakage; LLM09 Misinformation; LLM10 Unbounded Consumption
OWASP Agentic: ASI01 Unbounded Consumption; ASI05 Identity Exploitation; ASI07 Uncontrolled Cascading; ASI09 Operational Disruption

Audit Checklist

Quick-reference checklist items grouped by control.

Input Validation & Prompt Injection

  • Input validation pipeline is active on all AI model interfaces with structural and semantic checks
  • Prompt firewall or equivalent injection detection is deployed and ruleset is updated at least quarterly
  • Adversarial testing is conducted at least quarterly and findings are remediated within defined SLAs
  • Behavioral guardrails are tested against a maintained red team scenario library
  • Blocked prompt telemetry feeds into threat intelligence and detection rule updates

Output Handling & Content Safety

  • All AI outputs are encoded or escaped appropriately for their rendering context before delivery
  • Content safety filtering is active on all production AI output channels with defined category thresholds
  • Structured AI outputs are validated against schemas before downstream processing or execution
  • Output safety metrics are monitored in real time with alerting for threshold breaches
  • Monthly feedback loop demonstrates output quality improvements based on monitoring data

Unbounded Consumption & Cost Control

  • Rate limits and token quotas are enforced at the API gateway for all AI model endpoints
  • Real-time cost monitoring is active with budget threshold alerts configured and tested
  • Anomaly detection is operational with documented baselines and flagging thresholds
  • Automatic circuit breakers can suspend non-critical AI services during cost emergencies
  • Abuse response procedures are documented and include vendor cost dispute processes

System Prompt Protection

  • System prompts contain no embedded credentials, API keys, or access tokens
  • System prompts include explicit extraction-resistance instructions
  • Output monitoring detects and alerts on system prompt fragments in model responses
  • Quarterly extraction testing is conducted and findings are remediated within 30 days
  • Credential injection uses runtime secrets management rather than static configuration

Misinformation & Output Reliability

  • High-stakes AI use cases have mandatory human review workflows with defined reviewer roles
  • RAG or equivalent grounding mechanisms are deployed for factual AI applications
  • AI outputs in decision-making contexts include reliability indicators or uncertainty signals
  • Hallucination rate benchmarks are maintained and tracked over time per model and use case
  • Automated fact-checking is deployed for at least one high-stakes use case
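The input-validation items in the checklist above can be sketched as a minimal pre-model check combining structural and semantic rules. This is an illustrative sketch only, not a production prompt firewall: the pattern list, length bound, and function name are assumptions, standing in for the maintained, quarterly-updated ruleset the checklist requires.

```python
import re

# Illustrative deny-list; a real deployment maintains a much larger,
# regularly updated ruleset in a prompt firewall.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"reveal (your )?system prompt", re.I),
]
MAX_PROMPT_CHARS = 8_000  # structural bound; tune per use case


def validate_prompt(prompt: str) -> tuple[bool, str]:
    """Return (allowed, reason). Structural checks first, then semantic."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False, "prompt exceeds length limit"
    if any(ch in prompt for ch in ("\x00", "\x1b")):
        # Control characters are a common smuggling vector.
        return False, "control characters detected"
    for pat in INJECTION_PATTERNS:
        if pat.search(prompt):
            return False, f"injection pattern matched: {pat.pattern}"
    return True, "ok"
```

Rejections from a check like this are exactly the "blocked prompt telemetry" the checklist routes into threat intelligence and detection rule updates.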
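The output-handling items (schema validation before downstream processing, escaping for the rendering context) might look like the following sketch. The `summary`/`confidence` field names and the HTML rendering context are assumptions for illustration; real systems would typically enforce a full JSON Schema or pydantic model.

```python
import json
import html

# Hypothetical schema for a structured model response.
REQUIRED_FIELDS = {"summary": str, "confidence": float}


def validate_output(raw: str) -> dict:
    """Parse and validate a structured model output before downstream use."""
    data = json.loads(raw)  # raises on malformed JSON
    for field, ftype in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), ftype):
            raise ValueError(f"field {field!r} missing or wrong type")
    # Escape free text for an (assumed) HTML rendering context
    # before delivery, per the encoding checklist item.
    data["summary"] = html.escape(data["summary"])
    return data
```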
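The rate-limit and quota items can be illustrated with a token bucket. In production this enforcement normally lives at the API gateway, not in application code; the sketch below only shows the mechanism, and the capacity and refill values are placeholders.

```python
import time


class TokenBucket:
    """Minimal token-bucket limiter for per-client AI endpoint quotas."""

    def __init__(self, capacity: int, refill_per_sec: float):
        self.capacity = capacity
        self.tokens = float(capacity)
        self.refill_per_sec = refill_per_sec
        self.last = time.monotonic()

    def allow(self, cost: int = 1) -> bool:
        """Charge `cost` tokens (e.g. estimated LLM tokens) if available."""
        now = time.monotonic()
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

The same pattern extends to the circuit-breaker item: when cost monitoring trips a budget alert, non-critical callers can simply be given a bucket with zero refill.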
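Monitoring for system prompt fragments in model responses can be approximated with n-gram overlap, as in this sketch. The 5-word window is an assumed threshold; real monitoring would add normalization, hashing, and fuzzy matching to catch paraphrased leakage.

```python
def ngrams(text: str, n: int = 5) -> set[tuple[str, ...]]:
    """Lowercased word n-grams of `text`."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def leaks_system_prompt(response: str, system_prompt: str, n: int = 5) -> bool:
    """Flag responses sharing any n-word run with the system prompt.

    A crude verbatim-overlap check; alerts from it feed the output
    monitoring item in the checklist above.
    """
    return bool(ngrams(response, n) & ngrams(system_prompt, n))
```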
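A crude reliability indicator for grounded (RAG) answers can be derived from lexical overlap between the answer and the retrieved context, as sketched below. The word-overlap proxy, the label names, and the thresholds are all assumptions; production systems typically use NLI models or claim-level fact-checking rather than lexical overlap.

```python
def grounding_score(answer: str, context: str) -> float:
    """Fraction of answer content words that appear in the retrieved context.

    A rough proxy for groundedness, not a hallucination detector.
    """
    ans = {w.lower().strip(".,") for w in answer.split() if len(w) > 3}
    ctx = {w.lower().strip(".,") for w in context.split()}
    if not ans:
        return 1.0
    return len(ans & ctx) / len(ans)


def reliability_label(score: float) -> str:
    # Illustrative thresholds; calibrate against tracked hallucination
    # benchmarks per model and use case.
    if score >= 0.8:
        return "grounded"
    if score >= 0.5:
        return "low-confidence"
    return "unsupported"
```

A label like this is one way to satisfy the checklist item requiring reliability indicators or uncertainty signals on outputs used in decision-making contexts.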