Mesh Networking in CVE-Agent-Bench — 5 vulnerabilities tested
5 vulnerability samples from a mesh networking library, generating 75 evaluations across 15 agents.
Overview
This mesh networking library implements Thread, a low-power mesh networking protocol backed by Google, Apple, and Amazon and used in the Matter smart home specification. The implementation handles mesh routing, encryption, and device discovery in resource-constrained IoT environments. IoT devices often cannot be updated easily, so the security of the code as first shipped is critical.
Benchmark coverage
Five vulnerability samples from this mesh networking library are included in CVE-Agent-Bench. Each sample is evaluated under all 15 agent configurations, giving 75 individual evaluations (5 × 15). The samples include heap buffer overflows in mesh networking code, stack buffer overflows, and memory corruption in protocol parsing.
Vulnerability classes
Mesh networking samples cover vulnerability patterns in embedded network protocol implementation:
- Stack buffer overflows in CoAP message handling where variable-length payloads exceed stack allocation
- Heap buffer overflows in mesh routing where neighbor discovery packets trigger out-of-bounds writes
- Out-of-bounds reads in frame parsing where field offsets are not validated against frame size
- Integer overflow in length calculations leading to undersized stack allocation
- Assertion failures in protocol state machines where unexpected messages cause crashes
- Resource exhaustion where crafted packets trigger excessive memory allocation in constrained environments
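Two of the patterns above, unvalidated field offsets and overflowing length calculations, can be illustrated with a minimal C sketch. The `tlv_view` type and the functions `tlv_read` and `checked_total_len` are hypothetical names invented for this example. They are not taken from the library or from any benchmark sample, and the simple type-length-value layout is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of one type-length-value field inside a frame. */
typedef struct {
    uint8_t type;
    uint8_t length;
    const uint8_t *value; /* points into the caller's frame buffer */
} tlv_view;

/*
 * Reads the TLV that starts at `offset` in `frame`.
 * Returns true and fills `out` only if the 2-byte header and the
 * declared value both lie completely inside the frame.
 */
bool tlv_read(const uint8_t *frame, size_t frame_len, size_t offset,
              tlv_view *out)
{
    /* Caller contract on local pointers, not attacker-controlled input. */
    assert(frame != NULL && out != NULL);

    /* Compare using subtraction so that no addition can wrap around. */
    if (offset > frame_len || frame_len - offset < 2) {
        return false; /* header would start or end outside the frame */
    }

    uint8_t length = frame[offset + 1];
    if (frame_len - offset - 2 < length) {
        return false; /* declared value runs past the end of the frame */
    }

    out->type = frame[offset];
    out->length = length;
    out->value = frame + offset + 2;
    return true;
}

/*
 * Computes header_len + count * item_len into `total`.
 * Returns false instead of wrapping when the result does not fit in size_t.
 */
bool checked_total_len(size_t header_len, size_t count, size_t item_len,
                       size_t *total)
{
    assert(total != NULL);

    if (item_len != 0 && count > (SIZE_MAX - header_len) / item_len) {
        return false; /* multiplication or addition would overflow */
    }

    *total = header_len + count * item_len;
    return true;
}
```

In this sketch a caller would reject the whole frame as soon as `tlv_read` returns false, and would size a buffer only from a length that `checked_total_len` accepted.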
Why mesh networking bugs are interesting for agent evaluation
Mesh networking vulnerabilities test an agent's ability to understand embedded networking protocols and memory constraints. The codebase handles complex mesh routing state machines with limited RAM and CPU. Bugs often involve buffer handling in constrained memory environments or incorrect bounds checking in protocol parsing. Agents must generate fixes that close security gaps while fitting within the tight resource constraints of IoT devices.
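As a rough illustration of a fix that respects those constraints, the sketch below stores a variable-length payload in a fixed-size buffer and rejects oversized input rather than allocating memory. The names `payload_buf`, `payload_store`, and `PAYLOAD_MAX` and the 64-byte limit are assumptions made for this example, not values from the library or the benchmark.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAYLOAD_MAX 64u /* hypothetical per-message limit */

/* The stored length is one byte, so the limit must fit in uint8_t. */
static_assert(PAYLOAD_MAX <= UINT8_MAX, "PAYLOAD_MAX must fit in uint8_t");

/* Fixed-size storage: no heap use, so RAM cost is known at build time. */
typedef struct {
    uint8_t data[PAYLOAD_MAX];
    uint8_t len;
} payload_buf;

/*
 * Copies `src_len` bytes into `dst` if they fit.
 * Returns false and leaves `dst` unchanged when the input is too large
 * or a required pointer is missing.
 */
bool payload_store(payload_buf *dst, const uint8_t *src, size_t src_len)
{
    if (dst == NULL || (src == NULL && src_len != 0)) {
        return false;
    }
    if (src_len > sizeof dst->data) {
        return false; /* reject instead of truncating or overflowing */
    }
    if (src_len != 0) {
        memcpy(dst->data, src, src_len);
    }
    dst->len = (uint8_t)src_len; /* safe: src_len <= PAYLOAD_MAX <= 255 */
    return true;
}
```

Rejecting the message keeps memory use bounded. Whether a real fix should drop, truncate, or fragment an oversized payload depends on the protocol, which this sketch does not model.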
IoT mesh networks are particularly challenging because devices may be physically inaccessible after deployment, and a single compromised device can attack all of its neighbors in the mesh. Getting security right in the initial implementation is therefore especially important.
Agent performance on mesh networking
Per-project performance data is not yet published. Aggregate results across all codebases are available at the full results page, where you can compare agents by pass rate and cost. The benchmark methodology documents the evaluation approach.
Related codebases
Codebases with similar embedded and resource-constrained implementation challenges:
- Archive Library, binary format parsing with variable-length data
- Network Switch, network packet processing with protocol parsing
- Data Compressor, high-performance algorithms under memory constraints
Explore more
- Full benchmark results
- Agent profiles
- Methodology
- Economics analysis, cost per verified patch
FAQ
How does mesh networking relate to CVE-Agent-Bench?
Mesh networking powers Matter smart home devices. The benchmark's 5 mesh networking samples test agents on embedded systems, resource constraints, and protocol correctness.
Benchmark Results
62.7% pass rate. $2.64 per fix. Real data from 1,920 evaluations.
Benchmark Methodology
How XOR benchmarks AI coding agents on real security vulnerabilities. Reproducible, deterministic, and transparent.
See which agents produce fixes that work
128 CVEs. 15 agents. 1,920 evaluations. Agents learn from every run.