The LLM doesn't get direct access to your tools.
Gatekeeper sits between your AI agent and the real world — shell, filesystem, HTTP, memory, LLM calls. Every action is evaluated against policy. Allow, require human approval, or deny. Every decision — and every dollar — is logged. Works with Claude, GPT, local LLMs, and MCP servers.
npm install @runestone-labs/gatekeeper-client
Three decisions, not two
Binary allow-or-deny is why security tools get disabled. Any rule strict enough to catch real misuse also blocks legitimate work. Gatekeeper's third option is the one developers actually leave on.
Fast-path the safe 90%.
Your agent's normal reads, scoped writes, and allowlisted HTTP fly through without interruption. Sub-millisecond policy evaluation.
Pause for the risky 10%.
Sensitive actions generate an HMAC-signed, single-use approval URL. You click to allow. Replay-proof, 1-hour expiry, no SaaS in the middle.
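A minimal sketch of how a token scheme like this can work. The helper names, URL shape, and secret handling here are illustrative assumptions, not Gatekeeper's actual implementation:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical approval-token scheme: HMAC over (id, expiry),
// single-use enforced via a consumed-ID set.
const SECRET = "local-secret";      // assumption: loaded from local config, never a SaaS
const consumed = new Set<string>(); // replay protection: each token is spendable once

function sign(id: string, expiresAt: number): string {
  return createHmac("sha256", SECRET).update(`${id}.${expiresAt}`).digest("hex");
}

function mintApprovalUrl(id: string, now: number): string {
  const expiresAt = now + 60 * 60 * 1000; // 1-hour expiry, as described above
  return `http://localhost:8787/approve?id=${id}&exp=${expiresAt}&sig=${sign(id, expiresAt)}`;
}

function redeem(id: string, exp: number, sig: string, now: number): boolean {
  const expected = Buffer.from(sign(id, exp));
  const given = Buffer.from(sig);
  // constant-time compare; length guard because timingSafeEqual throws on mismatch
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return false;
  if (now > exp) return false;        // expired
  if (consumed.has(id)) return false; // replay
  consumed.add(id);
  return true;
}
```

A second redeem of the same token fails, and tampering with any query parameter invalidates the signature.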
Hard-block the obvious bad.
Dangerous shell patterns, path traversal, SSRF, disallowed extensions — rejected before the tool is ever touched. The agent can't retry its way through.
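The three-way decision above can be pictured as a single evaluation function. The rule shapes mirror the shell config shown later on this page, but the names and the choice of "risky" patterns are illustrative, not the real engine:

```typescript
type Decision = "allow" | "approve" | "deny";

// Illustrative policy: deny patterns hard-block, sensitive patterns pause
// for a human, and everything else inside the allowed cwd flies through.
const policy = {
  allowedCwdPrefixes: ["/workspace/"],
  denyPatterns: ["rm -rf", ":(){ :|:& };:"],
  approvePatterns: ["curl ", "git push"], // assumption: what counts as the "risky 10%"
};

function evaluateShell(cmd: string, cwd: string): Decision {
  if (policy.denyPatterns.some((p) => cmd.includes(p))) return "deny";
  if (!policy.allowedCwdPrefixes.some((p) => cwd.startsWith(p))) return "deny";
  if (policy.approvePatterns.some((p) => cmd.includes(p))) return "approve";
  return "allow"; // the safe 90%: no interruption
}
```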
Built for the actual threat model
Semantic tool rules, not regex.
Shell policy understands allowed_cwd_prefixes and dangerous-pattern lists.
File policy enforces allowed_paths + denied extensions. HTTP gets DNS & IP allowlists plus SSRF protection.
tools:
  shell.exec:
    allowed_cwd_prefixes: ["/workspace/"]
    deny_patterns: ["rm -rf", ":(){ :|:& };:"]
    decision: approve  # human sign-off required

Append-only audit log.

Every decision — allow, deny, approval requested, approval consumed, executed — written as JSONL with actor, args (secrets redacted), risk flags, and policy hash. Daily file by default. Optional Postgres sink.
{"ts":"2026-04-22T14:03:11Z",
 "actor":"agent:finance-sync",
 "tool":"http.request",
 "decision":"approved",
 "risk":["external-api"],
 "policy_hash":"sha256:a3f9..."}

Role-based principals.
Tag agents with roles (local_dev, prod, research_readonly).
Policies fork on role, so the same agent binary can be tightly caged in production and
loose on your laptop — without branching the agent code.
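One way to picture role forking: a base policy plus per-role overrides, resolved at evaluation time. The role names come from above; the merge logic is a sketch, not the real resolver:

```typescript
type Role = "local_dev" | "prod" | "research_readonly";

// Same agent binary, different cage: the base policy is strict,
// and each role selectively loosens or narrows it.
const base = { allowedTools: ["file.read"], requireApproval: true };

const overrides: Record<Role, Partial<typeof base>> = {
  local_dev: {
    allowedTools: ["file.read", "file.write", "shell.exec"],
    requireApproval: false, // loose on your laptop
  },
  prod: { allowedTools: ["file.read", "http.request"] }, // approvals stay on
  research_readonly: {}, // reads only, approvals on
};

function policyFor(role: Role) {
  return { ...base, ...overrides[role] };
}
```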
Framework-agnostic.
HTTP in, HTTP out. Works with the Anthropic Claude SDK, OpenAI SDK, LangChain, LlamaIndex,
custom agents, and MCP servers. One TypeScript client (@runestone-labs/gatekeeper-client)
for the most common surface.
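"HTTP in, HTTP out" means any framework that can make a request can ask for a decision. The endpoint path, port, and field names below are assumptions for illustration; the real wire format may differ:

```typescript
// Hypothetical evaluation request: actor + tool + args, POSTed as JSON.
interface EvaluateRequest {
  actor: string; // e.g. "agent:finance-sync"
  tool: string;  // e.g. "http.request"
  args: Record<string, unknown>;
}

function buildEvaluateRequest(actor: string, tool: string, args: Record<string, unknown>) {
  const body: EvaluateRequest = { actor, tool, args };
  return {
    url: "http://localhost:8787/evaluate", // assumption: self-hosted daemon address
    init: {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

// From any SDK or custom agent:
//   const { url, init } = buildEvaluateRequest("agent:demo", "shell.exec", { cmd: "ls" });
//   const decision = await (await fetch(url, init)).json();
```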
USD budgets & cost audit.
Every LLM call is priced (per-model input/output tokens, cache hits) and stamped into the
audit log. /usage + /llm-usage show actual dollars per agent, per
workflow, per day. Budgets ship dormant: observe first, set caps at ~2× p95, enforce
with confidence.
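Per-call pricing like this reduces to simple token arithmetic. The rates below are made-up placeholders, not real model prices, and the cached-input discount is an assumption for illustration:

```typescript
// Hypothetical per-million-token rates; real per-model prices would come from a table.
const rates = {
  "example-model": { inputUsd: 3.0, outputUsd: 15.0, cachedInputUsd: 0.3 },
};

function priceCall(
  model: keyof typeof rates,
  usage: { inputTokens: number; cachedInputTokens: number; outputTokens: number },
): number {
  const r = rates[model];
  // Cache hits are billed at the cheaper cached rate; only fresh input pays full price.
  const freshInput = usage.inputTokens - usage.cachedInputTokens;
  return (
    (freshInput * r.inputUsd +
      usage.cachedInputTokens * r.cachedInputUsd +
      usage.outputTokens * r.outputUsd) /
    1_000_000
  );
}
```

Stamping this number into each audit entry is what makes per-agent, per-day dollar reports possible.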
budgets:
  agent:finance-sync:
    daily_usd: 2.00
    decision: approve  # over-budget → human sign-off
  agent:research_*:
    monthly_usd: 50.00
    decision: deny

Memory, with boundaries.
Optional knowledge-graph layer: entities, episodes, relationships. Agents request memory operations just like any other tool — upserts and queries flow through the same policy engine, so you can scope which agent writes what, and audit every change.
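Because memory writes are just another tool call, scoping them can look like any other rule. The entity-prefix scoping below is an illustrative assumption about how "which agent writes what" might be expressed:

```typescript
// Assumption for illustration: each agent may only upsert entities
// under namespaces it has been granted.
const memoryScopes: Record<string, string[]> = {
  "agent:finance-sync": ["finance/"],
  "agent:research-1": ["research/", "papers/"],
};

function evaluateMemoryUpsert(actor: string, entityId: string): "allow" | "deny" {
  const prefixes = memoryScopes[actor] ?? [];
  return prefixes.some((p) => entityId.startsWith(p)) ? "allow" : "deny";
}
```

Every allowed upsert would then land in the same audit log as shell and HTTP decisions.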
Where Gatekeeper fits
What Gatekeeper is:
- A runtime enforcement boundary between agent and tool.
- Self-hosted. No SaaS proxy. Arguments stay on your disk.
- Auditable. Plain JSONL, cryptographic approval URLs, open source.
- Semantic. Understands shells, files, HTTP — not abstract ABAC.

What it isn't:
- Prompt-injection mitigation. Guardrails AI does that; it's a different layer.
- An LLM gateway or observability platform. Portkey and Langfuse do those.
- A replacement for OS-level sandboxing. Pair Gatekeeper with it.
- Magic. We document what we catch and what we don't.
Ready to put a boundary in front of your agents?
Early access is free. Design partners get weekly syncs, direct founder Slack, roadmap influence, and the hosted tier free at launch.