[PATCHING]

Automated vulnerability patching

AI agents generate fixes for known CVEs. XOR verifies each fix and feeds outcomes back into the agent harness so future patches improve.

From CVE to verified fix

AI agents generate fixes for known security vulnerabilities. XOR writes a verifier for each one and tests the fix against it. The dataset covers 6,138+ vulnerabilities across 250+ open-source projects. So far 136 have been tested, each packaged with a verifier and a known-good fix to compare against. Results feed back into the agents.

Deterministic verification

Each vulnerability is packaged with a known-vulnerable environment, a test harness, and automated verification. Results are deterministic and reproducible. Failed attempts become learning data for the next run.

62.7%

Top agent pass rate

$2.64

Cheapest per fix

75%

Best possible (all agents combined)
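
One plausible reading of the combined figure is that a vulnerability counts as fixed when at least one agent's patch passes verification. The sketch below computes that union under this assumption. The `results` matrix layout and both function names are hypothetical illustrations, not part of XOR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: results[a * n_cves + c] is 1 if agent a's fix for
 * CVE c passed verification, 0 otherwise (row-major, one row per agent). */

/* Count CVEs fixed by at least one agent ("all agents combined"). */
size_t combined_passes(const unsigned char *results, size_t n_agents, size_t n_cves) {
    assert(results != NULL);
    size_t solved = 0;
    for (size_t c = 0; c < n_cves; c++) {
        for (size_t a = 0; a < n_agents; a++) {
            if (results[a * n_cves + c]) {
                solved++;
                break; /* one passing agent is enough for this CVE */
            }
        }
    }
    return solved;
}

/* Count CVEs fixed by a single agent, for comparison with the union. */
size_t agent_passes(const unsigned char *results, size_t n_cves, size_t agent) {
    assert(results != NULL);
    size_t solved = 0;
    for (size_t c = 0; c < n_cves; c++) {
        solved += results[agent * n_cves + c] ? 1 : 0;
    }
    return solved;
}
```

Dividing either count by the number of CVEs gives the corresponding pass rate. The union can never be lower than the best single agent, which is consistent with the combined figure exceeding the top agent's rate.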

How automated patching works

Detect

A vulnerability is found by your scanner (Snyk, Dependabot) or through manual triage.

Patch

A coding agent reads the vulnerable code, identifies the bug, and writes a fix.

Verify

In an isolated environment, XOR writes a verifier for the vulnerability, applies the fix, and runs the verifier against it. The outcome is binary: pass or fail.

Ship

If it passes, XOR opens a PR with the test report. If it fails, the bug stays open for human review.
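
The Verify and Ship steps above reduce to a binary decision, which can be sketched as follows. This is a minimal illustration, not XOR's implementation: the callback type, the toy verifier, and all function and enum names are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { VERIFY_PASS, VERIFY_FAIL } verify_result;
typedef enum { ACTION_OPEN_PR, ACTION_HUMAN_REVIEW } ship_action;

/* Hypothetical verifier callback: returns nonzero if the bug still
 * triggers after the patch is applied. */
typedef int (*bug_triggers_fn)(const char *patch);

/* Toy stand-in for a real verifier: treats any patch whose description
 * mentions a bounds check as fixing the bug. */
int toy_bug_triggers(const char *patch) {
    return strstr(patch, "bounds check") == NULL;
}

/* Verify step: the outcome is binary. */
verify_result verify_patch(bug_triggers_fn bug_triggers, const char *patch) {
    assert(bug_triggers != NULL && patch != NULL);
    return bug_triggers(patch) ? VERIFY_FAIL : VERIFY_PASS;
}

/* Ship step: a pass opens a PR, a fail leaves the bug for human review. */
ship_action ship_decision(verify_result result) {
    return result == VERIFY_PASS ? ACTION_OPEN_PR : ACTION_HUMAN_REVIEW;
}
```

Keeping the verifier behind a callback mirrors the idea that each vulnerability gets its own verifier while the pass/fail handling stays the same.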

Which agent for which vulnerability?

Different agents have different strengths. The best agent by accuracy isn't the cheapest. The cheapest isn't the fastest. Pick based on your priority.

Priority	Best agent	Pass rate	Cost/fix
Accuracy	claude-opus-4-6	62.7%	$87
Cost	gemini-3-pro	40.4%	$2.64
Balance	codex-gpt-5.2	52.2%	$6.38

See the full leaderboard →
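
One way to act on the table is to pick the most accurate agent that fits a cost ceiling. The sketch below hard-codes the three rows above. The struct, the `pick_agent` function, and the budget parameter are hypothetical, and real selection would likely weigh speed and vulnerability type as well.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *name;
    double pass_rate;    /* percent, from the table above */
    double cost_per_fix; /* USD, from the table above */
} agent_stats;

const agent_stats AGENTS[] = {
    {"claude-opus-4-6", 62.7, 87.00},
    {"gemini-3-pro",    40.4,  2.64},
    {"codex-gpt-5.2",   52.2,  6.38},
};
#define N_AGENTS (sizeof(AGENTS) / sizeof(AGENTS[0]))

/* Return the agent with the highest pass rate whose cost per fix does not
 * exceed the budget, or NULL if no agent fits. */
const agent_stats *pick_agent(double max_cost_per_fix) {
    assert(max_cost_per_fix >= 0.0);
    const agent_stats *best = NULL;
    for (size_t i = 0; i < N_AGENTS; i++) {
        if (AGENTS[i].cost_per_fix > max_cost_per_fix) {
            continue; /* over budget */
        }
        if (best == NULL || AGENTS[i].pass_rate > best->pass_rate) {
            best = &AGENTS[i];
        }
    }
    return best;
}
```

With a $10 ceiling this selects codex-gpt-5.2, matching the "Balance" row. With no effective ceiling it selects claude-opus-4-6, and with a $3 ceiling it selects gemini-3-pro.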

Before and after

// BEFORE: vulnerable function (buffer overflow)
void process_input(char *buf, size_t len) {
    char local[256];
    memcpy(local, buf, len); // no bounds check
}

// AFTER: agent-patched (bounds check added)
void process_input(char *buf, size_t len) {
    char local[256];
    if (len > sizeof(local)) len = sizeof(local);
    memcpy(local, buf, len);
}

$ xor verify --sample harfbuzz-11033

[PASS] — safety checks pass, bug no longer triggers ✓
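
To make the effect of the bounds check observable, the sketch below uses a variant of the patched function that returns the number of bytes it copied. The return value, the `process_input_checked` name, and the `LOCAL_SIZE` macro are additions for illustration. This is not how XOR's verifier works internally, only a small check of the clamp itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LOCAL_SIZE 256

/* Variant of the patched function that reports how many bytes were copied,
 * so a caller can observe the clamp. */
size_t process_input_checked(const char *buf, size_t len) {
    char local[LOCAL_SIZE];
    assert(buf != NULL);
    if (len > sizeof(local)) {
        len = sizeof(local); /* clamp oversized input to the buffer size */
    }
    memcpy(local, buf, len);
    return len;
}
```

Passing a 1024-byte input returns 256, while inputs of 256 bytes or fewer are copied in full. Note that clamping truncates oversized input silently. Depending on the caller, rejecting the input with an error may be the safer fix.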

Automate with GitHub App

Install the XOR GitHub App on your repos. When a coding agent opens a PR, XOR tests it automatically and posts the pass/fail result on the PR. The app is free for open-source projects.

Install on GitHub →

[NEXT STEPS]

Start patching

Install GitHub App →

Full leaderboard →

How verification works →

FAQ

How does automated patching work?

XOR dispatches an agent to write a fix for a known CVE. The agent generates a patch. XOR runs the patch against a verifier written for the specific vulnerability. If the fix passes, it ships.

Which agents can generate patches?

Any coding agent: Claude Code, Codex, Gemini CLI, Cursor, or custom agents. The GitHub App monitors the code change and runs verification automatically.

What happens if the patch fails?

Failed patches are rejected, and the bug stays open for human review. The failure data feeds back into the agent harness as a learning signal for the next run.

[RELATED TOPICS]

Patch verification

XOR writes a verifier for each vulnerability, then tests agent-generated patches against it. If the fix passes, it ships. If not, the failure feeds back into the agent harness.

Benchmark Results

62.7% pass rate. $2.64 per fix. Real data from 1,736 evaluations.

Agent Cost Economics

Fix vulnerabilities for $2.64–$87 with agents. 100x cheaper than incident response. Real cost data.

Agent Configurations

13 agent-model configurations evaluated on real CVEs. Compare Claude Code, Codex, Gemini CLI, Cursor, and OpenCode.

Benchmark Methodology

How CVE-Agent-Bench evaluates 13 coding agents on 136 real vulnerabilities. Deterministic, reproducible, open methodology.

Agent Environment Security

AI agents run with real permissions. XOR verifies tool configurations, sandbox boundaries, and credential exposure.

Security Economics for Agentic Patching

ROI models backed by verified pass/fail data and business-impact triage.

Automated Vulnerability Patching and PR Review

Automated code review, fix generation, GitHub Actions hardening, safety checks, and learning feedback. One-click install on any GitHub repository.

Continuous Learning from Verified Agent Runs

A signed record of every agent run. See what the agent did, verify it independently, and feed the data back so agents improve.

Signed Compliance Evidence for AI Agents

A tamper-proof record of every AI agent action. Produces evidence for SOC 2, EU AI Act, PCI DSS, and more. Built on open standards so auditors verify independently.

Compliance Evidence and Standards Alignment

How XOR's signed audit trails produce evidence for SOC 2, EU AI Act, PCI DSS, NIST, and other compliance frameworks.

See which agents produce fixes that work

136 CVEs. 13 agents. 1,736 evaluations. Agents learn from every run.