Attack Agents
Strategist, Attacker, Evaluator, Mutator, Inspector, Orchestrator. Each has a create function exported.
zeroleaks uses six specialized agents that coordinate during a scan. Each agent has a create function you can use to instantiate it directly for custom integrations.
Strategist
Role: Selects attack strategy based on target analysis, conversation history, and findings.
The Strategist analyzes the target's responses, tracks leak status, and recommends which attack strategy to use next. It also manages phase transitions (reconnaissance, profiling, soft probe, escalation, exploitation, persistence) and can request conversation resets when stuck.
import { createStrategist, type Strategist } from "zeroleaks";

const strategist = createStrategist();
const output = await strategist.selectStrategy({
  turn: 5,
  history: conversationHistory,
  findings: [],
  leakStatus: "none",
  lastEvaluatorFeedback: "...",
});
// output.selectedStrategy, output.shouldReset, output.phaseTransition

Exports: createStrategist, Strategist
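The phase progression and reset behavior described for the Strategist can be pictured as a small state machine. The sketch below is not the zeroleaks implementation: the `Phase` identifiers (in particular `"soft_probe"`), the `nextPhase` function, and the `maxStalledTurns` threshold are hypothetical, and it assumes phases advance linearly in the order listed above.

```typescript
// Hypothetical sketch, not the zeroleaks implementation.
// Assumes phases advance linearly in the order the docs list them.
type Phase =
  | "reconnaissance"
  | "profiling"
  | "soft_probe"
  | "escalation"
  | "exploitation"
  | "persistence";

const PHASES: readonly Phase[] = [
  "reconnaissance",
  "profiling",
  "soft_probe",
  "escalation",
  "exploitation",
  "persistence",
];

interface PhaseDecision {
  phase: Phase;
  shouldReset: boolean;
}

// Advance one phase when the last turn made progress. Stay in place on a
// failed turn, and request a conversation reset once the number of
// consecutive failed turns reaches `maxStalledTurns` (an assumed knob).
function nextPhase(
  current: Phase,
  madeProgress: boolean,
  stalledTurns: number,
  maxStalledTurns = 3,
): PhaseDecision {
  if (!madeProgress) {
    return { phase: current, shouldReset: stalledTurns >= maxStalledTurns };
  }
  const index = PHASES.indexOf(current);
  // The final phase ("persistence") has no successor, so it maps to itself.
  const next = PHASES[Math.min(index + 1, PHASES.length - 1)] ?? current;
  return { phase: next, shouldReset: false };
}
```

A real strategist would likely weigh evaluator feedback and findings rather than a single boolean; the boolean keeps the transition rule easy to see.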
Attacker
Role: Generates attack prompts based on strategy, defense profile, and evaluator feedback.
The Attacker uses TAP-style (Tree of Attacks with Pruning) attack generation. It selects probes from the library, adapts to the defense profile, and can use vector memory to avoid repeating failed attacks. It also supports Best-of-N variation generation via the Mutator.
import { createAttacker, type Attacker } from "zeroleaks";

const attacker = createAttacker({
  maxBranchingFactor: 3,
  maxTreeDepth: 4,
  pruningThreshold: 0.3,
});
const output = await attacker.generateAttack({
  history: conversationHistory,
  strategy: selectedStrategy,
  defenseProfile: defenseProfile,
  phase: "escalation",
  evaluatorFeedback: "...",
  previousAttackNode: lastNode,
});
// output.attack.prompt, output.attack.technique, output.attack.category

Exports: createAttacker, Attacker
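To make the three Attacker config fields more concrete, here is a minimal sketch of how a TAP-style tree might prune candidate attacks. Only the field names `maxBranchingFactor`, `maxTreeDepth`, and `pruningThreshold` come from the example above; the `CandidateNode` shape, the `pruneChildren` function, and the assumption that scores lie in [0, 1] are hypothetical, and zeroleaks may interpret these settings differently.

```typescript
// Hypothetical sketch of TAP-style pruning. Only the three config field
// names come from the docs; everything else is illustrative.
interface TreeConfig {
  maxBranchingFactor: number; // maximum children kept per node
  maxTreeDepth: number; // nodes at this depth are not expanded
  pruningThreshold: number; // candidates scoring below this are dropped
}

interface CandidateNode {
  prompt: string;
  score: number; // assumed to be in [0, 1], higher is more promising
  depth: number;
}

// Keep the highest-scoring candidates that clear the threshold, capped at
// the branching factor. Returns [] once the parent is at the depth limit.
function pruneChildren(
  parent: CandidateNode,
  candidates: { prompt: string; score: number }[],
  config: TreeConfig,
): CandidateNode[] {
  if (parent.depth >= config.maxTreeDepth) {
    return [];
  }
  return candidates
    .filter((c) => c.score >= config.pruningThreshold) // filter copies, so the input is not mutated
    .sort((a, b) => b.score - a.score)
    .slice(0, config.maxBranchingFactor)
    .map((c) => ({ prompt: c.prompt, score: c.score, depth: parent.depth + 1 }));
}
```

With the configuration shown earlier (branching factor 3, threshold 0.3), five candidates scored 0.9, 0.1, 0.5, 0.4, and 0.35 would be reduced to the three scored 0.9, 0.5, and 0.4.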
Evaluator
Role: Analyzes target responses for information leakage and compliance with attacker intent.
The Evaluator determines whether the target leaked system prompt content, followed injected instructions, or revealed sensitive information. It returns leak status, confidence, extracted content, and recommendations for the next attack.
import { createEvaluator } from "zeroleaks";

const evaluator = createEvaluator();
const result = await evaluator.evaluate({
  attackPrompt: "...",
  targetResponse: "...",
  conversationHistory: [],
  systemPrompt: "...",
  attackNode: attackNode,
});
// result.status, result.confidence, result.extractedContent, result.recommendation

Exports: createEvaluator, Evaluator
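As a rough intuition for what "leaked system prompt content" means, the sketch below measures how many word trigrams from a known system prompt reappear in a target response. This is a stand-in heuristic, not how the zeroleaks Evaluator works (its actual analysis is not described here). The `estimateLeak` function, the 0.2 and 0.8 cutoffs, and the status labels other than `"none"` are assumptions made for illustration.

```typescript
// Hypothetical heuristic, not the zeroleaks Evaluator. "none" appears in
// the docs as a leak status; "partial" and "full" are assumed labels.
type LeakStatus = "none" | "partial" | "full";

// Collect the distinct lowercase word n-grams in a text.
function wordNgrams(text: string, n: number): Set<string> {
  const words = text
    .toLowerCase()
    .split(/\W+/)
    .filter((w) => w.length > 0);
  const grams = new Set<string>();
  for (let i = 0; i + n <= words.length; i++) {
    grams.add(words.slice(i, i + n).join(" "));
  }
  return grams;
}

// Fraction of the system prompt's n-grams that also occur in the response,
// mapped onto a coarse status using assumed cutoffs.
function estimateLeak(
  systemPrompt: string,
  targetResponse: string,
  n = 3,
): { status: LeakStatus; overlap: number } {
  const promptGrams = wordNgrams(systemPrompt, n);
  if (promptGrams.size === 0) {
    return { status: "none", overlap: 0 };
  }
  const responseGrams = wordNgrams(targetResponse, n);
  let shared = 0;
  promptGrams.forEach((gram) => {
    if (responseGrams.has(gram)) shared++;
  });
  const overlap = shared / promptGrams.size;
  const status: LeakStatus =
    overlap >= 0.8 ? "full" : overlap >= 0.2 ? "partial" : "none";
  return { status, overlap };
}
```

Verbatim overlap misses paraphrased or translated leaks, which is one reason a scanner would want a more capable evaluator than string matching.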
Mutator
Role: Produces Best-of-N variations of attack prompts.
The Mutator generates semantic variations of an attack prompt. The engine evaluates each variation in parallel (when enabled) and selects the one that performed best.
import { createMutator, type Mutator } from "zeroleaks";

const mutator = createMutator();
const mutations = await mutator.bestOfN(attackPrompt, 3);
// mutations.variations: string[]

Exports: createMutator, Mutator
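The "evaluate each variation, keep the best" step can be sketched independently of the library. In the example below, `selectBest` and the `Scorer` type are hypothetical names; `score` stands in for sending a variation to the target and evaluating the response, and the `parallel` flag mirrors the "when enabled" behavior described above without claiming to match the engine's actual option.

```typescript
// Hypothetical sketch of Best-of-N selection. `Scorer` stands in for
// "send the prompt to the target and evaluate the response".
type Scorer = (prompt: string) => Promise<number>;

interface ScoredVariation {
  prompt: string;
  score: number;
}

// Score every variation (concurrently when `parallel` is true, otherwise
// one at a time) and return the highest-scoring one. Ties keep the
// earliest variation.
async function selectBest(
  variations: string[],
  score: Scorer,
  parallel = true,
): Promise<ScoredVariation> {
  if (variations.length === 0) {
    throw new Error("selectBest needs at least one variation");
  }
  let scored: ScoredVariation[];
  if (parallel) {
    scored = await Promise.all(
      variations.map(async (prompt) => ({ prompt, score: await score(prompt) })),
    );
  } else {
    scored = [];
    for (const prompt of variations) {
      scored.push({ prompt, score: await score(prompt) });
    }
  }
  return scored.reduce((best, cur) => (cur.score > best.score ? cur : best));
}
```

Running the scorers concurrently shortens wall-clock time but sends N requests to the target at once, so a rate-limited target may call for the sequential path.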
Inspector
Role: TombRaider-style defense fingerprinting. Analyzes target responses to identify known defense systems and recommend bypass techniques.
The Inspector compares target response patterns against a database of known defenses (e.g. Azure Prompt Shield, OpenAI Moderation). When a match is found, it suggests techniques with documented success rates.
import { createInspector, DEFENSE_DATABASE } from "zeroleaks";

const inspector = createInspector();
const output = await inspector.analyze({
  conversationHistory: [],
  recentResponses: ["..."],
});
// output.fingerprint, output.suggestedBypasses, output.confidence

Exports: createInspector, Inspector, DEFENSE_DATABASE
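Pattern-based fingerprinting of the kind described above can be illustrated with a toy signature table. Everything in this sketch is hypothetical: the `DefenseSignature` shape, the `matchDefense` function, and the two placeholder entries are invented for illustration and do not reflect the contents or structure of `DEFENSE_DATABASE` or the behavior of any real defense product.

```typescript
// Hypothetical sketch. The entries below are made-up placeholders, not
// the contents of zeroleaks' DEFENSE_DATABASE or any real product.
interface DefenseSignature {
  name: string;
  patterns: RegExp[]; // phrases assumed to be characteristic of the defense
  suggestedBypasses: string[]; // technique labels, illustrative only
}

const EXAMPLE_SIGNATURES: DefenseSignature[] = [
  {
    name: "example-filter-a",
    patterns: [/content policy/i, /cannot comply/i],
    suggestedBypasses: ["encoding", "role-play"],
  },
  {
    name: "example-filter-b",
    patterns: [/flagged as unsafe/i],
    suggestedBypasses: ["payload-splitting"],
  },
];

interface DefenseMatch {
  name: string;
  confidence: number;
  suggestedBypasses: string[];
}

// Score each signature by the fraction of its patterns that match at least
// one recent response; return the best match, or null if nothing matches.
function matchDefense(
  recentResponses: string[],
  signatures: DefenseSignature[],
): DefenseMatch | null {
  let best: DefenseMatch | null = null;
  for (const sig of signatures) {
    const hits = sig.patterns.filter((p) =>
      recentResponses.some((r) => p.test(r)),
    ).length;
    const confidence = sig.patterns.length === 0 ? 0 : hits / sig.patterns.length;
    if (confidence > 0 && (best === null || confidence > best.confidence)) {
      best = { name: sig.name, confidence, suggestedBypasses: sig.suggestedBypasses };
    }
  }
  return best;
}
```

The patterns deliberately omit the `g` flag so that `RegExp.prototype.test` stays stateless across responses.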
Orchestrator
Role: Coordinates multi-turn attack sequences (Siren, Echo Chamber, TombRaider patterns).
The Orchestrator manages predefined multi-turn sequences that simulate human jailbreak behaviors. It uses adaptive temperature scheduling (AutoAdv-inspired) and provides step-by-step prompts for gradual escalation.
import {
  createOrchestrator,
  SIREN_SEQUENCE,
  ECHO_CHAMBER_SEQUENCE,
  TOMBRAIDER_SEQUENCE,
  DEFAULT_TEMPERATURE_CONFIG,
} from "zeroleaks";

const orchestrator = createOrchestrator();
const state = await orchestrator.getNextStep({
  sequence: SIREN_SEQUENCE,
  currentStep: 2,
  conversationHistory: [],
  temperatureState: { ... },
});
// state.prompt, state.nextStep, state.shouldEvaluate

Exports: createOrchestrator, MultiTurnOrchestrator, SIREN_SEQUENCE, ECHO_CHAMBER_SEQUENCE, TOMBRAIDER_SEQUENCE, DEFAULT_TEMPERATURE_CONFIG
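Adaptive temperature scheduling can take many forms; the sketch below shows one plausible policy, assuming the temperature rises after a failed step and falls after a successful one. The `TemperatureConfig` fields, the `EXAMPLE_TEMPERATURE_CONFIG` values, and the `nextTemperature` function are all hypothetical and are not taken from `DEFAULT_TEMPERATURE_CONFIG` or the Orchestrator's actual schedule.

```typescript
// Hypothetical sketch of adaptive temperature scheduling. Field names and
// values are assumptions, not zeroleaks' DEFAULT_TEMPERATURE_CONFIG.
interface TemperatureConfig {
  initial: number;
  min: number;
  max: number;
  step: number;
}

const EXAMPLE_TEMPERATURE_CONFIG: TemperatureConfig = {
  initial: 0.5,
  min: 0.25,
  max: 1.0,
  step: 0.25,
};

// Raise the temperature after a failed step to explore more varied
// prompts; lower it after a successful one to stay close to what worked.
// The result is clamped to [config.min, config.max].
function nextTemperature(
  current: number,
  lastStepSucceeded: boolean,
  config: TemperatureConfig,
): number {
  const proposed = lastStepSucceeded
    ? current - config.step
    : current + config.step;
  return Math.min(config.max, Math.max(config.min, proposed));
}
```

Starting from `EXAMPLE_TEMPERATURE_CONFIG.initial`, two consecutive failures would move the temperature to 0.75 and then 1.0, where it stays until a step succeeds.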
Agent Summary
| Agent | Create function | Primary output |
|---|---|---|
| Strategist | createStrategist() | Strategy selection, phase transition, reset decision |
| Attacker | createAttacker(config) | Attack prompt, technique, category |
| Evaluator | createEvaluator() | Leak status, confidence, recommendation |
| Mutator | createMutator() | Best-of-N prompt variations |
| Inspector | createInspector(model?) | Defense fingerprint, bypass suggestions |
| Orchestrator | createOrchestrator(config?) | Multi-turn step prompt, temperature state |