What is @zeroleaks/shield
Runtime prompt security package for LLM applications. Hardening, injection detection, and output sanitization in under 5ms.
What is @zeroleaks/shield
@zeroleaks/shield is a runtime prompt security package for LLM applications. It adds defense-in-depth to your AI stack by hardening system prompts, detecting injection attempts in user input, and sanitizing model output for leaked prompt fragments. All operations complete in under 5ms and never mutate your objects.
Three Core Capabilities
harden — Injects security rules into system prompts to resist instruction override, role hijacking, and prompt extraction. Configurable persona anchoring and anti-extraction directives.
detect — Heuristic-based injection detection on user input. Normalizes Unicode (NFKC), bounds input length, and matches against 10 pattern categories. Default threshold is medium; default action on detection is block.
sanitize — N-gram matching to detect leaked system prompt fragments in model output. Redacts matches before returning responses. Configurable n-gram size and similarity threshold.
Provider Wrappers
Shield provides drop-in wrappers for popular LLM clients. Each wrapper applies hardening, detection, and sanitization automatically:
- OpenAI —
shieldOpenAI(client, options)wrapschat.completions.create - Anthropic —
shieldAnthropic(client, options)wrapsmessages.create - Groq —
shieldGroq(client, options)wrapschat.completions.create - Vercel AI SDK —
shieldMiddleware(options)orshieldLanguageModelMiddleware(options)withwrapLanguageModelforgenerateText/streamText
Design Principles
- Non-mutating — Caller objects are never mutated; copies are used internally
- Unicode normalization — Input is normalized with NFKC before detection
- Length bounds — Input and output are truncated to 1MB by default to avoid DoS
- Fast — Target execution time under 5ms for all operations
Next Steps
Installation
Install via npm or bun. Peer dependencies are optional.
harden()
Add security rules to system prompts.
detect()
Detect prompt injection in user input.
sanitize()
Detect and redact leaked prompt fragments in output.
Provider Wrappers
OpenAI, Anthropic, Groq, and Vercel AI SDK integrations.
Errors
ShieldError, InjectionDetectedError, LeakDetectedError.
Threat Model
What Shield catches and what it does not.