# MODEL: AI Model Controls

**Owner:** ML Engineering Lead / Security Engineers / AppSec

Focus on securing the AI models themselves against adversarial attacks and ensuring output quality.

## Framework Mapping

Controls from each source framework that map to this domain.
| Framework | Mapped Controls |
|---|---|
| ISO 42001 | A.9 Robustness; A.6 AI System Lifecycle; Cl. 8 Operation |
| NIST AI RMF | AI 600-1 GenAI Profile; MS-2 Performance; MG-3 Documentation; MP-5 Performance |
| OWASP LLM | LLM01 Prompt Injection; LLM02 Insecure Output Handling; LLM07 System Prompt Leakage; LLM09 Misinformation; LLM10 Unbounded Consumption |
| OWASP Agentic | ASI01 Unbounded Consumption; ASI05 Identity Exploitation; ASI07 Uncontrolled Cascading; ASI09 Operational Disruption |
## Controls

5 controls across Tier 1 (essential) and Tier 2 (advanced).

| Control | Tier | ISO 42001 | NIST AI RMF | OWASP LLM | OWASP Agentic |
|---|---|---|---|---|---|
| Adversarial Input Defense | Tier 1 | A.9 | AI 600-1 | LLM01 | ASI05 |
| Model Output Sanitization | Tier 1 | A.9 | MS-2 | LLM02 | ASI06 |
| Adversarial Query Restriction and Cost Governance | Tier 1 | Cl. 8 | MG-3 | LLM10 | ASI01 |
| System Prompt Protection | Tier 2 | A.9 | AI 600-1 | LLM07 | ASI07 |
| AI Output Reliability and Hallucination Mitigation | Tier 2 | A.6 | MS-2 | LLM09 | ASI09 |
## Audit Checklist

Quick-reference checklist items grouped by control.
- ☐ Input validation pipeline is active on all AI model interfaces with structural and semantic checks
- ☐ Prompt firewall or equivalent injection detection is deployed and ruleset is updated at least quarterly
- ☐ Adversarial testing is conducted at least quarterly and findings are remediated within defined SLAs
- ☐ Behavioral guardrails are tested against a maintained red team scenario library
- ☐ Blocked prompt telemetry feeds into threat intelligence and detection rule updates
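The "structural and semantic checks" above can be sketched as a minimal prompt-screening gate. This is an illustrative sketch, not a production ruleset: the `INJECTION_PATTERNS` list, `MAX_PROMPT_CHARS` limit, and `screen_prompt` function are all hypothetical names; a real deployment would use a maintained, versioned ruleset (updated at least quarterly, per the checklist) and feed blocked prompts into telemetry.

```python
import re
import unicodedata

# Hypothetical ruleset for illustration; real rules would be maintained
# and versioned, and updated at least quarterly.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.I),
    re.compile(r"reveal (your )?system prompt", re.I),
    re.compile(r"you are now (in )?developer mode", re.I),
]

MAX_PROMPT_CHARS = 8_000  # structural limit; tune per model context window


def screen_prompt(prompt: str) -> tuple[bool, str]:
    """Return (allowed, reason); blocked prompts should emit telemetry."""
    # Structural check: normalize Unicode first so homoglyph or
    # fullwidth-character smuggling cannot dodge the pattern match.
    normalized = unicodedata.normalize("NFKC", prompt)
    if len(normalized) > MAX_PROMPT_CHARS:
        return False, "prompt exceeds length limit"
    # Semantic check: known injection phrasings.
    for pattern in INJECTION_PATTERNS:
        if pattern.search(normalized):
            return False, f"matched injection pattern: {pattern.pattern}"
    return True, "ok"
```

Pattern lists like this catch only known phrasings; the quarterly adversarial testing and red-team scenario library in the checklist exist precisely to find what such rules miss.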
- ☐ All AI outputs are encoded or escaped appropriately for their rendering context before delivery
- ☐ Content safety filtering is active on all production AI output channels with defined category thresholds
- ☐ Structured AI outputs are validated against schemas before downstream processing or execution
- ☐ Output safety metrics are monitored in real time with alerting for threshold breaches
- ☐ Monthly feedback loop demonstrates output quality improvements based on monitoring data
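Two of the items above, context-appropriate encoding and schema validation of structured outputs, can be sketched together. Assumptions: the `EXPECTED_FIELDS` schema and both function names are hypothetical; a real system would likely use a full JSON Schema validator rather than this minimal type check.

```python
import html
import json

# Hypothetical schema for a structured model response; a production system
# would typically validate against a formal JSON Schema instead.
EXPECTED_FIELDS = {"summary": str, "confidence": float}


def encode_for_html(model_output: str) -> str:
    """Escape model text for an HTML rendering context before delivery."""
    return html.escape(model_output, quote=True)


def validate_structured_output(raw: str) -> dict:
    """Parse and validate structured output before downstream processing."""
    data = json.loads(raw)  # raises on malformed JSON
    for field, expected_type in EXPECTED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"field {field!r} missing or not {expected_type.__name__}")
    return data
```

The key design point is that encoding is chosen per rendering context (HTML here; a shell, SQL, or markdown sink would each need its own escaping), and structured outputs are rejected, not repaired, when they fail validation.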
- ☐ Rate limits and token quotas are enforced at the API gateway for all AI model endpoints
- ☐ Real-time cost monitoring is active with budget threshold alerts configured and tested
- ☐ Anomaly detection is operational with documented baselines and flagging thresholds
- ☐ Automatic circuit breakers can suspend non-critical AI services during cost emergencies
- ☐ Abuse response procedures are documented and include vendor cost dispute processes
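The circuit-breaker item above can be sketched as a windowed spend tracker. This is a hypothetical sketch: the class name, budget window, and `critical` flag are illustrative; a real deployment would enforce quotas at the API gateway and feed spend figures from the provider's billing or usage reporting.

```python
import time


class CostCircuitBreaker:
    """Suspend non-critical AI calls once spend crosses a budget threshold.

    Illustrative sketch; production enforcement belongs at the API gateway,
    with spend data sourced from the vendor's usage reporting.
    """

    def __init__(self, budget_usd: float, window_seconds: float = 3600.0):
        self.budget_usd = budget_usd
        self.window_seconds = window_seconds
        self.window_start = time.monotonic()
        self.spent_usd = 0.0
        self.tripped = False

    def record_spend(self, cost_usd: float) -> None:
        now = time.monotonic()
        if now - self.window_start > self.window_seconds:
            # New budget window: reset accumulated spend and the breaker.
            self.window_start = now
            self.spent_usd = 0.0
            self.tripped = False
        self.spent_usd += cost_usd
        if self.spent_usd >= self.budget_usd:
            self.tripped = True  # budget-threshold alert would fire here

    def allow(self, critical: bool = False) -> bool:
        # Critical services stay up; non-critical calls are suspended.
        return critical or not self.tripped
```

The `critical` escape hatch matches the checklist wording: only non-critical services are suspended during a cost emergency, so the breaker degrades capacity rather than causing an outage.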
- ☐ System prompts contain no embedded credentials, API keys, or access tokens
- ☐ System prompts include explicit extraction-resistance instructions
- ☐ Output monitoring detects and alerts on system prompt fragments in model responses
- ☐ Quarterly extraction testing is conducted and findings are remediated within 30 days
- ☐ Credential injection uses runtime secrets management rather than static configuration
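The output-monitoring item above (detecting system prompt fragments in responses) can be sketched with two common techniques: a planted canary token and verbatim n-gram matching against the prompt text. Both the `CANARY` value and the 6-word fragment window are hypothetical parameters chosen for illustration.

```python
# Hypothetical leakage detector: flags a response if it echoes a planted
# canary marker or any sufficiently long verbatim run of the system prompt.

CANARY = "CANARY-7f3a"  # hypothetical planted marker, unique per deployment


def leaks_system_prompt(response: str, system_prompt: str,
                        fragment_words: int = 6) -> bool:
    """True if the response echoes the canary or a long prompt fragment."""
    if CANARY in response:
        return True
    words = system_prompt.split()
    # Slide a window over the system prompt; any verbatim run of
    # `fragment_words` consecutive words found in the response counts
    # as leakage and should raise an alert.
    for i in range(len(words) - fragment_words + 1):
        fragment = " ".join(words[i:i + fragment_words])
        if fragment in response:
            return True
    return False
```

Verbatim matching will miss paraphrased leakage, which is why the checklist pairs this monitoring with quarterly extraction testing rather than relying on detection alone.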
- ☐ High-stakes AI use cases have mandatory human review workflows with defined reviewer roles
- ☐ RAG or equivalent grounding mechanisms are deployed for factual AI applications
- ☐ AI outputs in decision-making contexts include reliability indicators or uncertainty signals
- ☐ Hallucination rate benchmarks are maintained and tracked over time per model and use case
- ☐ Automated fact-checking is deployed for at least one high-stakes use case
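The reliability-indicator item above can be sketched as a grounding check: score an answer by how much of it overlaps the retrieved source passages, and attach a coarse label instead of presenting the output bare. This is a deliberately naive word-overlap sketch; the function names, thresholds, and labels are all hypothetical, and a real system would use entailment models or claim-level fact checking.

```python
# Hypothetical reliability indicator: lexical overlap between an answer
# and its retrieved sources, mapped to a coarse label. A production
# system would use semantic entailment rather than word overlap.

def grounding_score(answer: str, sources: list[str]) -> float:
    """Fraction of answer words that appear in any retrieved source."""
    answer_words = [w.lower().strip(".,") for w in answer.split()]
    if not answer_words:
        return 0.0
    source_vocab = set()
    for passage in sources:
        source_vocab.update(w.lower().strip(".,") for w in passage.split())
    grounded = sum(1 for w in answer_words if w in source_vocab)
    return grounded / len(answer_words)


def label_reliability(answer: str, sources: list[str]) -> str:
    """Map grounding overlap to a coarse reliability indicator."""
    score = grounding_score(answer, sources)
    if score >= 0.8:
        return "grounded"
    if score >= 0.5:
        return "partially grounded"
    return "low confidence: review required"
```

Tracking these scores over time per model and use case is one way to maintain the hallucination-rate benchmarks the checklist calls for.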