ZeroLeaks
Getting Started

Run Your First Scan

Walk through running a security scan on your system prompt in under 5 minutes.

Run Your First Scan

This guide walks you through running your first ZeroLeaks security scan. The process takes a few minutes from start to finish.

Prerequisites

  • A ZeroLeaks account (see Create an account)
  • A system prompt you want to test (e.g., your AI assistant's instructions)

Step-by-Step

Paste your system prompt

In the dashboard, open the New Scan flow. Paste your full system prompt into the text area. The prompt is sent to the scan engine and used as the target for extraction and injection attacks.

Keep the prompt as close to production as possible. Redacting sensitive data is fine, but changing behavior may affect scan accuracy.

Choose scan type

Select one of three types:

  • Full (recommended): Runs both extraction and injection in parallel
  • Extraction: Tests prompt leakage only (30 adaptive turns)
  • Injection: Tests prompt injection only (23 probes across 8 types)

For most use cases, Full gives the best coverage in a single run. All scan types run in sandbox mode with tool execution testing, canary tokens, and kill chain detection when applicable.

Click Scan

Click Start Scan to queue the job. The scan runs on a background worker, so you can navigate away. The page will poll for completion and redirect you to the results when done.

Wait for results

Scan duration varies by type:

  • Full: ~8-20 minutes (both run in parallel)
  • Extraction: ~5-15 minutes
  • Injection: ~3-8 minutes

You can watch progress on the scan page or return later from the dashboard.

Optional: Scan Settings

Before starting, you can adjust:

  • Target model: The OpenRouter model used to simulate the target (default: Claude Sonnet 4.5)
  • Temperature: Affects attack variation (higher = more diverse probes)
  • Tools: Define your agent's tools in JSON for tool-specific attack probes

These are available in the collapsible Advanced section. Defaults work well for most prompts.

Next Steps

Once the scan completes, you will see the results page with your health score, vulnerability status, findings, and recommendations. Use the report to identify weak spots and apply hardening before deploying.

On this page