Governing AI Agents in the Enterprise
92% of AI vendors claim broad data usage rights. 17% commit to regulatory compliance. Governance frameworks from NIST, OWASP, EU CRA, and Stanford CodeX.
Regulatory landscape
EU Cyber Resilience Act (machine-readable vulnerability data), NIST IR 8596 (AI cybersecurity framework), Singapore IMDA (first agentic AI governance framework), and the International AI Safety Report 2026.
Contract considerations
Stanford CodeX analyzed AI vendor agreements: 92% claim broad data usage rights, only 17% commit to regulatory compliance, 88% impose self-liability caps. Procurement teams need agent-specific contract language.
Why existing governance frameworks miss agent risk
Vendor risk management evaluates software as a static artifact. Agents are autonomous actors that make decisions, use tools, and interact with external services at runtime. The governance gap: no existing framework addresses what an agent does after deployment.
40% of agentic AI projects will be canceled by end of 2027 due to inadequate risk controls (Gartner, Jun 2025). The failures won't be technical. They'll be governance failures — organizations that deployed agents without processes to monitor agent behavior.
Regulatory landscape
EU Cyber Resilience Act
Article 13.6 requires manufacturers to share vulnerability information in machine-readable format. Products with digital elements — including AI agents — must produce structured vulnerability disclosures. Deadline: December 2027.
NIST IR 8596
AI Cybersecurity Framework Profile (Dec 2025). Maps AI supply chain security to existing NIST CSF functions. Covers model provenance, data integrity, and agent behavior monitoring. First U.S. federal framework addressing AI agent security.
Singapore IMDA framework
First governance framework for agentic AI (Jan 2026). Key principle: organizations are liable for their agents' behavior, even when using third-party tools. Agents inherit the accountability obligations of their deployers.
International AI Safety Report
100+ experts from 30+ countries (Feb 2026). Finding: AI agents identified 77% of vulnerabilities in a real cybersecurity competition. The same capability that makes agents useful makes them dangerous when compromised.
Industry frameworks
OWASP Top 10 for Agentic Applications (Dec 2025)
10 risk categories including ASI-04 (supply chain), ASI-07 (output handling), and ASI-08 (permissions). The "least agency" principle: agents should have minimum permissions needed. See OWASP agentic top 10 for the full mapping.
OWASP MCP Top 10
Dedicated project for Model Context Protocol risks: token mismanagement, shadow MCP servers, tool poisoning. See MCP server security for technical details.
Berkeley CLTC risk profile
Agentic AI Risk Management Standards Profile (late 2025). Maps agent-specific risks to existing risk management standards.
Forrester AEGIS
Six-domain security framework for autonomous AI systems. Covers identity, data, runtime, network, governance, and observability.
Contract gaps: Stanford CodeX findings
Source: Stanford CodeX FutureLaw Workshop, Jan 2025
92%
AI vendors claim broad data usage rights
17%
Commit to full regulatory compliance
88%
Impose self-liability caps
33%
Provide IP indemnification
Procurement teams need agent-specific contract language. Standard SaaS agreements don't address agent autonomy, tool usage, or behavioral accountability. The EU Product Liability Directive (deadline Dec 2026) explicitly includes AI as a "product" under strict liability.
What good governance looks like
Verification evidence per action
Every agent-generated change has a signed audit trail. Reviewers see what the agent did, what it tested, and how confident it is.
VEX statements for vulnerabilities
Machine-readable Vulnerability Exploitability eXchange documents. CRA Article 13.6 requires this format. XOR produces VEX for every triage.
SCITT provenance receipts
Supply Chain Integrity, Transparency, and Trust receipts. Cryptographic proof of what was scanned, when, and what passed.
Continuous monitoring, not one-time audit
Rug pull attacks invalidate point-in-time reviews. Agents that pass initial vetting can change behavior after deployment. Governance requires ongoing verification.
XOR's approach
XOR produces evidence for compliance. It does not certify compliance. The platform generates VEX statements, SCITT provenance receipts, and signed audit trails that map to CRA Article 13.6, NIST IR 8596, and OWASP Agentic Top 10 controls. Auditors get structured evidence. Compliance teams decide whether it meets their requirements.
See compliance evidence for artifact formats and agent compliance evidence for the IETF-based trace format.
Sources
- Stanford CodeX FutureLaw Workshop — AI Agents x Law (Jan 2025)
- NIST IR 8596 — AI Cybersecurity Framework Profile (Dec 2025)
- OWASP Top 10 for Agentic Applications (Dec 2025)
- OWASP MCP Top 10 (2025)
- EU Cyber Resilience Act — Article 13.6
- EU Product Liability Directive — AI as "product" (deadline Dec 2026)
- Singapore IMDA — Model AI Governance Framework for Agentic AI (Jan 2026)
- International AI Safety Report 2026 — 100+ experts, 30+ countries
- Berkeley CLTC — Agentic AI Risk Management Standards Profile
- Forrester AEGIS — Six-domain security framework for autonomous AI
- Gartner — 40% of agentic AI projects canceled by end 2027 (Jun 2025)
[NEXT STEPS]
Related pages
FAQ
Why don't existing governance frameworks cover agents?
Existing frameworks evaluate software as a static artifact. Agents are autonomous actors that make decisions, use tools, and interact with external services. Behavior risk requires different controls than code risk.
What does the EU Cyber Resilience Act require for agents?
CRA Article 13.6 requires manufacturers to share vulnerability information in machine-readable format. XOR produces VEX statements and signed audit trails that satisfy this requirement.
What is the Stanford CodeX finding on AI vendor contracts?
92% of AI vendors claim broad data usage rights. Only 17% commit to full regulatory compliance. 88% impose liability caps while only 38% cap customer liability (Stanford CodeX FutureLaw Workshop, Jan 2025).
Patch verification
XOR writes a verifier for each vulnerability, then tests agent-generated patches against it. If the fix passes, it ships. If not, the failure feeds back into the agent harness.
Automated vulnerability patching
AI agents generate fixes for known CVEs. XOR verifies each fix and feeds outcomes back into the agent harness so future patches improve.
Benchmark Results
62.7% pass rate. $2.64 per fix. Real data from 1,664 evaluations.
Benchmark Results
62.7% pass rate. $2.64 per fix. Real data from 1,664 evaluations.
Agent Cost Economics
Fix vulnerabilities for $2.64–$52 with agents. 100x cheaper than incident response. Real cost data.
Agent Configurations
13 agent-model configurations evaluated on real CVEs. Compare Claude Code, Codex, Gemini CLI, Cursor, and OpenCode.
Benchmark Methodology
How CVE-Agent-Bench evaluates 13 coding agents on 128 real vulnerabilities. Deterministic, reproducible, open methodology.
Agent Environment Security
AI agents run with real permissions. XOR verifies tool configurations, sandbox boundaries, and credential exposure.
Security Economics for Agentic Patching
Security economics for agentic patching. ROI models backed by verified pass/fail data and business-impact triage.
Validation Process
25 questions we ran against our own data before publishing. Challenges assumptions, explores implications, extends findings.
Cost Analysis
10 findings on what AI patching costs and whether it is worth buying. 1,664 evaluations analyzed.
Bug Complexity
128 vulnerabilities scored by difficulty. Floor = every agent fixes it. Ceiling = no agent can.
Agent Strategies
How different agents approach the same bug. Strategy matters as much as model capability.
Execution Metrics
Per-agent session data: turns, tool calls, tokens, and timing. See what happens inside an agent run.
Pricing Transparency
Every cost number has a source. Published pricing models, measurement methods, and provider rates.
Automated Vulnerability Patching and PR Review
Automated code review, fix generation, GitHub Actions hardening, safety checks, and learning feedback. One-click install on any GitHub repository.
Getting Started with XOR GitHub App
Install in 2 minutes. First result in 15. One-click GitHub App install, first auto-review walkthrough, and engineering KPI triad.
Platform Capabilities
One install. Seven capabilities. Prompt-driven. CVE autopatch, PR review, CI hardening, guardrail review, audit packets, and more.
Dependabot Verification
Dependabot bumps versions. XOR verifies they're safe to merge. Reachability analysis, EPSS/KEV enrichment, and structured verdicts.
Compliance Evidence
Machine-readable evidence for every triaged vulnerability. VEX statements, verification reports, and audit trails produced automatically.
Compatibility and Prerequisites
Languages, build systems, CI platforms, and repository types supported by XOR. What you need to get started.
Command Reference
Every @xor-hardener command on one page. /review, /describe, /ask, /patch_i, /issue_spec, /issue_implement, and more.
Continuous Learning from Verified Agent Runs
A signed record of every agent run. See what the agent did, verify it independently, and feed the data back so agents improve.
Signed Compliance Evidence for AI Agents
A tamper-proof record of every AI agent action. Produces evidence for SOC 2, EU AI Act, PCI DSS, and more. Built on open standards so auditors verify independently.
Compliance Evidence and Standards Alignment
How XOR signed audit trails produce evidence for SOC 2, EU AI Act, PCI DSS, NIST, and other compliance frameworks.
Agentic Third-Party Risk
33% of enterprise software will be agentic by 2028. 40% of those projects will be canceled due to governance failures. A risk overview for CTOs.
MCP Server Security
17 attack types across 4 surfaces. 7.2% of 1,899 open-source MCP servers contain vulnerabilities. Technical deep-dive with defense controls.
How Agents Get Attacked
20% jailbreak success rate. 42 seconds average. 90% of successful attacks leak data. Threat landscape grounded in published research.
OWASP Top 10 for Agentic Applications
The OWASP Agentic Top 10 mapped to real-world attack data and XOR capabilities. A reference page for security teams.
See which agents produce fixes that work
128 CVEs. 13 agents. 1,664 evaluations. Agents learn from every run.