Mar 23 | OASB v0.3.0: product-agnostic adapter interface, arp-guard vs llm-guard benchmark

Open Agent Security Benchmark

97,127 hosts discovered.
One benchmark emerged.

OASB is the open standard for AI agent security — 46 controls, 3 maturity levels, built from real-world data.

$ npx hackmyagent secure --benchmark oasb-1

Internet-wide scan data

The current state of AI agent security

HackMyAgent scanned the public internet for exposed AI agent infrastructure. The results informed which OASB controls matter most.

Hosts discovered: 97,127
Hosts scanned: 11,192
Vulnerable: 1,594
CLAUDE.md exposed: 1,190
MCP tools exposed: 645
Outdated endpoints: 5,042
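
To put the headline figures in proportion, here is a minimal Python sketch that turns the counts above into rates. It assumes the 1,594 vulnerable hosts are a subset of the 11,192 scanned hosts, which the page implies but does not state; the helper name `pct` is illustrative only.

```python
# Figures copied from the scan summary above.
hosts_discovered = 97_127
hosts_scanned = 11_192
hosts_vulnerable = 1_594


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Share of discovered hosts that were actually scanned.
scan_coverage = pct(hosts_scanned, hosts_discovered)

# Share of scanned hosts flagged as vulnerable
# (assumption: vulnerable hosts are counted among scanned hosts).
vulnerable_rate = pct(hosts_vulnerable, hosts_scanned)

print(f"Scanned {scan_coverage}% of discovered hosts")
print(f"{vulnerable_rate}% of scanned hosts were vulnerable")
```

Under that assumption, roughly one in nine discovered hosts was scanned, and about one in seven scanned hosts was vulnerable.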

Read the full research report

Three measurement systems

One benchmark, three specifications

OASB-1: Compliance

Check agent compliance

CIS Benchmarks for AI agents. 46 controls across 10 categories with L1/L2/L3 maturity levels. Answers: “Is your agent secure?”

Tests: 46 controls, L1/L2/L3
Analogous to: CIS Benchmarks
Audience: Agent developers, compliance teams
View OASB-1 specification
OASB-2: Governance

Govern agent behavior

Behavioral governance for AI agents. 72 controls across 9 domains with 4 agent tiers. Answers: “Does your agent behave correctly?”

Tests: 72 controls, 4 tiers
Analogous to: SOC 2 Trust Principles
Audience: Agent builders, governance teams
View OASB-2 specification
OASB Eval: Evaluation

Evaluate security tools

MITRE ATT&CK Evaluations for AI agent security tools. 222 attack scenarios across 10 MITRE ATLAS techniques. Answers: “Does your EDR catch this?”

Tests: 222 attack scenarios
Analogous to: MITRE ATT&CK Evaluations
Audience: Security tool vendors, evaluators
Explore OASB Eval

Your security team will ask what standard you are using.

Send them here.

OASB Eval

Verify your agent's security

Run the benchmark against your AI agent. Read the docs for CI/CD integration.

$ npx hackmyagent secure --benchmark oasb-1
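
For CI/CD, one possible approach is to wrap the command above in a small gate script. The Python sketch below is an illustration, not the documented integration: it uses only the command shown on this page, and it assumes the CLI exits with a non-zero status when the benchmark fails, which this page does not confirm. The function name `run_gate` is hypothetical.

```python
import subprocess
import sys
from typing import Sequence

# The exact command shown on this page; no additional flags are assumed.
BENCHMARK_CMD = ["npx", "hackmyagent", "secure", "--benchmark", "oasb-1"]


def run_gate(cmd: Sequence[str] = BENCHMARK_CMD) -> int:
    """Run the given command and return its exit code unchanged.

    Assumption: a non-zero exit code means the benchmark did not pass.
    """
    completed = subprocess.run(list(cmd), check=False)
    return completed.returncode
```

A pipeline step could then end with `sys.exit(run_gate())`, so the job fails whenever the command reports a non-zero status. Check the official docs for the supported CI/CD setup before relying on this.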