[PATCHING VERIFICATION PLATFORM]

Detect. Patch. Verify. Learn.

One loop that patches vulnerabilities, proves fixes work, and makes your agents smarter. 13 agents. 128 CVEs. Scaling to 6,138+.

verification
harfbuzz · CVE-2024-11033
Claude Opus 4.6 ● 14/14 tests
Best pass rate: 62.7%
Verified evaluations: 1,664
Cheapest verified fix: $2.64
[PATCH + VERIFY]

Automated patching with proof that fixes work.

How verification works

XOR detects the vulnerability, dispatches an agent to patch it, and writes a verifier to confirm the fix resolves the specific CVE. Best pass rate: 62.7%. 1,664 evaluations completed.

[CONTINUOUS LEARNING]

Agents improve with every verification cycle.

Learning data

Failed fixes are the primary learning signal. After every run, XOR upgrades the agent harness (system prompt, tools, and memory), so pass rates rise and costs fall over successive cycles. Cost per verified fix ranges from $2.64 to $52 across 13 configurations.

[AUDIT-READY]

Compliance evidence from every run.

Standards alignment

Every agent action is cryptographically signed and logged. Produces evidence for SOC 2, FedRAMP, EU AI Act, Cyber Resilience Act, and PCI DSS. Built on an open IETF Internet-Draft.

[GET STARTED]

Two interfaces. One verification engine.

GitHub App: automates PR review, fix generation, and CI hardening on your repos.

Agent Plugin: wraps your coding agent in a verification harness with secure skills and memory.

Choose one or both.

[HOW IT WORKS]

Detect. Patch. Verify. Learn. Repeat.

XOR detects the vulnerability, dispatches an agent to write a fix, tests the fix against a verifier it wrote for the specific CVE, records the result, and feeds the outcome back into the agent harness. Failed fixes teach agents what to avoid. Passing fixes expand the training set. Every cycle makes agents more accurate and cheaper.
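
The loop described above can be sketched in a few lines of Python. This is a minimal illustration, not XOR's implementation: `Harness`, `Outcome`, `run_cycle`, and the toy patch and verify functions are hypothetical names invented here, detection is assumed to have already produced a CVE ID, and the 9/14 failing result is made-up toy data (only the 14/14 figure appears on this page).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Harness:
    """Hypothetical harness state: a system prompt plus a memory of past outcomes."""
    system_prompt: str = "Fix the reported vulnerability."
    memory: List[str] = field(default_factory=list)


@dataclass
class Outcome:
    cve_id: str
    passed: bool
    tests_passed: int
    tests_total: int


def run_cycle(
    cve_id: str,
    harness: Harness,
    patch: Callable[[str, Harness], str],
    verify: Callable[[str, str], Tuple[int, int]],
) -> Outcome:
    """One patch -> verify -> learn pass for a CVE that has already been detected."""
    candidate = patch(cve_id, harness)                # [02] agent proposes a fix
    passed_count, total = verify(cve_id, candidate)   # [03] run the CVE-specific verifier
    outcome = Outcome(cve_id, passed_count == total, passed_count, total)
    # [04] record the result, pass or fail, so the next run can use it
    label = "PASS" if outcome.passed else "FAIL"
    harness.memory.append(f"{label} {cve_id}: {passed_count}/{total} tests")
    return outcome


# Toy stand-ins so the sketch runs end to end; a real agent and verifier replace these.
def toy_patch(cve_id: str, harness: Harness) -> str:
    # Pretend the agent only produces the better candidate once memory holds a failure.
    return "candidate_v2" if harness.memory else "candidate_v1"


def toy_verify(cve_id: str, candidate: str) -> Tuple[int, int]:
    return (14, 14) if candidate == "candidate_v2" else (9, 14)


harness = Harness()
first = run_cycle("CVE-2024-11033", harness, toy_patch, toy_verify)
second = run_cycle("CVE-2024-11033", harness, toy_patch, toy_verify)
print(first.passed, second.passed)  # False True
print(harness.memory)
```

The point of the sketch is the data flow: the failed first attempt is written to memory, and the second attempt reads that memory before patching. How a real harness turns failures into prompt or tool changes is not described on this page and is not modeled here.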

verification pipeline
[01] DETECT: Identify the CVE
[02] PATCH: Agent generates fix
[03] VERIFY: Test fix against CVE
[04] LEARN: Feed results back
Evidence
detect: CVE-2024-11033 harfbuzz
patch: claude-opus-4-6, 23 lines
verify: 14/14 tests pass
learn: Memory + harness updated
[CONTINUOUS LEARNING]
Pass rate: +12.8pp. Cost per fix: -$49.36.
Learning signal: 370 failures fed back.
Chart: pass rate over time (Baseline vs. Best Agent).
Latest: harfbuzz/harfbuzz, VERIFIED
[AGENT SKILLS]

Four skills. Every wrapped agent.

The Agent Plugin provides four core skills to every coding agent it wraps.

[Scan]

Identify vulnerabilities in the target codebase.

[Audit]

Verify agent tool configurations, sandbox boundaries, and credential exposure.

[Report]

Generate evidence reports with pass/fail outcomes and audit trails.

[Sign]

Cryptographically sign the verification record (COSE_Sign1).
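
To make the Sign skill concrete, here is a rough sketch of signing a verification record and detecting tampering. It is deliberately not COSE_Sign1: that format (RFC 9052) is CBOR-encoded and normally uses an asymmetric signature, while this stand-in uses JSON and HMAC-SHA256 so it runs on the Python standard library alone. The function names, the record fields, and the demo key are all assumptions made for illustration.

```python
import hashlib
import hmac
import json


def sign_record(key: bytes, record: dict) -> dict:
    """Serialize the record deterministically and attach an HMAC-SHA256 tag."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_record(key: bytes, envelope: dict) -> bool:
    """Recompute the tag over the payload and compare in constant time."""
    expected = hmac.new(
        key, envelope["payload"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return hmac.compare_digest(expected, envelope["tag"])


key = b"demo-key-not-for-production"  # hypothetical key, for the example only
record = {
    "cve": "CVE-2024-11033",
    "project": "harfbuzz",
    "tests_passed": 14,
    "tests_total": 14,
    "outcome": "pass",
}

envelope = sign_record(key, record)
print(verify_record(key, envelope))  # True

# Changing the recorded result after signing invalidates the tag.
tampered = dict(envelope)
tampered["payload"] = envelope["payload"].replace(
    '"tests_passed":14', '"tests_passed":13'
)
print(verify_record(key, tampered))  # False
```

A production record would swap the HMAC for a public-key signature so that auditors can verify it without holding the signing key, which is the property a COSE_Sign1 envelope is designed to provide.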

[WHAT YOU GET]

Three artifacts. Every run.

Evidence Report

Attached to every PR. Shows the bug, the fix, test results, and pass/fail outcome.

Signed Audit Trail

Cryptographically signed record of every action the agent took: every tool used, every file edited, and every reasoning step.

Benchmark Report

128 vulnerability test cases. 13 agents. 1,664 results. Pass rates, cost per fix, difficulty scores.
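
The headline figures in the Benchmark Report fit together arithmetically: 128 test cases run against 13 agents gives 1,664 results. The sketch below checks that and shows one plausible way the per-agent metrics could be computed. The helper names and the toy data are hypothetical, and the cost formula (total spend divided by passing fixes) is an assumption, since this page does not define how cost per fix is calculated.

```python
from typing import Dict, List

CASES = 128   # vulnerability test cases, from the report summary
AGENTS = 13   # agents, from the report summary

total_results = CASES * AGENTS
print(total_results)  # 1664


def pass_rate(outcomes: List[bool]) -> float:
    """Percentage of evaluations whose fix passed verification."""
    return 100.0 * sum(outcomes) / len(outcomes)


def cost_per_verified_fix(total_spend: float, outcomes: List[bool]) -> float:
    """Assumed definition: total spend divided by the number of passing fixes."""
    verified = sum(outcomes)
    if verified == 0:
        raise ValueError("no verified fixes, cost per fix is undefined")
    return total_spend / verified


def best_agent(results: Dict[str, List[bool]]) -> str:
    """Name of the agent with the highest pass rate."""
    return max(results, key=lambda name: pass_rate(results[name]))


# Toy data for two made-up agents over four cases; not taken from the benchmark.
toy = {
    "agent-a": [True, False, True, True],
    "agent-b": [True, False, False, False],
}
print(best_agent(toy))                            # agent-a
print(pass_rate(toy["agent-a"]))                  # 75.0
print(cost_per_verified_fix(12.0, toy["agent-a"]))  # 4.0
```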

No auto-merge

Every change requires verification. No shortcuts.

No unmonitored runs

If XOR can't observe the agent, it can't verify the output.

No claims without data

Every number on this page is from verified benchmarks.

Free benchmark report. GitHub App free for open source. Agent Plugin free during beta. No credit card.
READY TO START

$ xor patch --verify --learn

Book a demo
Verification Platform for AI Coding Agents | XOR