AgentGuard Cost Model
Your agent's API costs apply. ZeroLeaks sends ~105 requests; target model inference is billed to you.
Before running an AgentGuard scan, it is important to understand who pays for what.
Your Costs (Target Model)
Your agent's API costs apply. AgentGuard sends approximately 105 HTTP requests to your endpoint per scan. Each request triggers a real inference on your model using your provider and keys.
- Phase 1 — ~53 requests (30 extraction + 23 injection)
- Phase 2 — ~60+ requests (tool hijacking, indirect injection, authority, protocol, multi-turn, data leakage, legacy, dynamic probes)
Every request is a real user message to your agent. Your infrastructure (OpenAI, Anthropic, etc.) bills you for:
- Input tokens (system prompt + conversation + attack prompt)
- Output tokens (agent response)
To estimate the cost per scan, multiply your provider's per-token rates by the typical input and output token counts per request, then by the number of requests (~105). A scan can consume on the order of hundreds of thousands of tokens, depending on your agent's context size and response length.
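The estimate described above can be sketched in a few lines of Python. This is a rough illustration, not an official calculator: the function name `estimate_scan_cost`, the per-request token counts, and the prices are all hypothetical placeholders, so substitute your provider's actual rates and your agent's observed request sizes. Multi-turn probes may carry more conversation context than the flat average assumed here.

```python
def estimate_scan_cost(
    requests: int,
    input_tokens_per_request: int,
    output_tokens_per_request: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> dict:
    """Return estimated token totals and cost (USD) for one scan.

    Assumes every request uses the same average token counts, which is
    a simplification; real requests vary in size.
    """
    total_input = requests * input_tokens_per_request
    total_output = requests * output_tokens_per_request
    cost = (
        total_input * input_price_per_million
        + total_output * output_price_per_million
    ) / 1_000_000
    return {
        "input_tokens": total_input,
        "output_tokens": total_output,
        "total_tokens": total_input + total_output,
        "cost_usd": round(cost, 2),
    }


# Hypothetical inputs: ~105 requests (from this page), 2,000 input and
# 500 output tokens per request, $3 / $15 per million tokens.
estimate = estimate_scan_cost(
    requests=105,
    input_tokens_per_request=2000,
    output_tokens_per_request=500,
    input_price_per_million=3.00,
    output_price_per_million=15.00,
)
print(estimate)
```

With these placeholder numbers the scan consumes 262,500 tokens, which falls in the "hundreds of thousands" range mentioned above. A larger system prompt or a more expensive model raises the total proportionally.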
ZeroLeaks Costs (Attacker and Evaluator)
ZeroLeaks covers the cost of:
- Attacker model — Generates attack prompts (Strategist, Attacker, Mutator)
- Evaluator model — Analyzes agent responses for leakage and compliance (Claude)
These run on OpenRouter using ZeroLeaks' API key. You are not billed for attacker or evaluator inference.
Summary
| Component | Who pays | Notes |
|---|---|---|
| Target model (your agent) | You | ~105 requests, your provider/keys |
| Attacker model | ZeroLeaks | OpenRouter, ZeroLeaks key |
| Evaluator model | ZeroLeaks | Claude via OpenRouter, ZeroLeaks key |
Recommendations
- Estimate before scanning — Check your provider's pricing and typical request size. A scan may cost a few dollars or more depending on model and context.
- Use staging endpoints — Test against a staging or sandbox agent first if cost is a concern.
- Rate limits — Ensure your endpoint can handle ~105 requests within a few minutes. AgentGuard throttles between probes but does not batch.
- Idle agents — If your agent scales to zero, the first requests may be slow; timeouts are 45 seconds per request.
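As a quick sanity check for the rate-limit recommendation, the sketch below computes the average request rate a scan would need and compares it with a provider limit. The function names, the 3-minute scan window, and the example limits are assumptions for illustration; this page only states "a few minutes", and actual pacing depends on AgentGuard's throttling.

```python
def required_requests_per_minute(total_requests: int, scan_minutes: float) -> float:
    """Average request rate needed to finish the scan within the window."""
    return total_requests / scan_minutes


def fits_rate_limit(total_requests: int, scan_minutes: float, limit_rpm: float) -> bool:
    """True if the average scan rate stays within the endpoint's limit."""
    return required_requests_per_minute(total_requests, scan_minutes) <= limit_rpm


# Hypothetical: ~105 requests over an assumed 3-minute window.
rpm = required_requests_per_minute(105, 3)
print(rpm)
print(fits_rate_limit(105, 3, limit_rpm=60))  # assumed limit of 60 requests/min
print(fits_rate_limit(105, 3, limit_rpm=20))  # assumed limit of 20 requests/min
```

This only checks the average rate; short bursts can still exceed a limit that the average satisfies, so leave some headroom.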