● v0.3.1 · Apache-2.0 · 268 tests · self-hosted

The LLM doesn't get direct access to your tools.

Gatekeeper sits between your AI agent and the real world — shell, filesystem, HTTP, memory, LLM calls. Every action is evaluated against policy. Allow, require human approval, or deny. Every decision — and every dollar — is logged. Works with Claude, GPT, local LLMs, and MCP servers.

$ npm install @runestone-labs/gatekeeper-client
YOUR AGENT (LLM: Claude · GPT · Llama · MCP)
  → GATEKEEPER (policy · approval · audit, on your hardware)
  → TOOLS (shell · files · HTTP · memory · LLM: the real world)

Three decisions, not two

Binary allow-or-deny is why security tools get disabled. Any rule strict enough to catch real misuse also blocks legitimate work. Gatekeeper's third option is the one developers actually leave on.

● ALLOW

Fast-path the safe 90%.

Your agent's normal reads, scoped writes, and allowlisted HTTP fly through without interruption. Sub-millisecond policy evaluation.

◐ APPROVE

Pause for the risky 10%.

Sensitive actions generate an HMAC-signed, single-use approval URL. You click to allow. Replay-proof, 1-hour expiry, no SaaS in the middle.
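The approval-URL mechanics described above (HMAC signature, single use, 1-hour expiry) can be sketched in a few lines. This is a minimal illustration, not Gatekeeper's actual implementation; all function and variable names here are hypothetical.

```typescript
import { createHmac, randomUUID, timingSafeEqual } from "node:crypto";

const SECRET = process.env.GATEKEEPER_HMAC_SECRET ?? "dev-only-secret";
const TTL_MS = 60 * 60 * 1000; // 1-hour expiry, as described above

// Payloads already redeemed; a single-use URL is rejected on replay.
const consumed = new Set<string>();

function sign(payload: string): string {
  return createHmac("sha256", SECRET).update(payload).digest("hex");
}

// Mint a signed approval URL for a pending action.
function mintApprovalUrl(actionId: string, now = Date.now()): string {
  const payload = `${actionId}.${randomUUID()}.${now + TTL_MS}`;
  return `/approve?payload=${payload}&sig=${sign(payload)}`;
}

// Verify on click: signature must match, token unexpired and unused.
function verifyApproval(url: string, now = Date.now()): boolean {
  const params = new URL(url, "http://localhost").searchParams;
  const payload = params.get("payload") ?? "";
  const sig = params.get("sig") ?? "";
  const expires = Number(payload.split(".")[2]);
  const expected = sign(payload);
  const sigOk =
    sig.length === expected.length &&
    timingSafeEqual(Buffer.from(sig), Buffer.from(expected));
  if (!sigOk || now > expires || consumed.has(payload)) return false;
  consumed.add(payload); // single use: burn the token
  return true;
}
```

Because the signature covers the action ID, nonce, and expiry together, tampering with any field invalidates the URL, and the consumed-set makes a captured URL worthless after first use.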

✕ DENY

Hard-block the obvious bad.

Dangerous shell patterns, path traversal, SSRF, disallowed extensions — rejected before the tool is ever touched. The agent can't retry its way through.

Built for the actual threat model

Semantic tool rules, not regex.

Shell policy understands allowed_cwd_prefixes and dangerous-pattern lists. File policy enforces allowed_paths + denied extensions. HTTP gets DNS & IP allowlists plus SSRF protection.

tools:
  shell.exec:
    allowed_cwd_prefixes: ["/workspace/"]
    deny_patterns: ["rm -rf", ":(){ :|:& };:"]
    decision: approve   # human sign-off required
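The same semantic style presumably extends to file and HTTP rules. A sketch of what that might look like, with rule names assumed for illustration rather than taken from documented configuration:

```yaml
# Hypothetical sketch: file/HTTP field names assumed, not documented API.
tools:
  file.write:
    allowed_paths: ["/workspace/"]
    denied_extensions: [".sh", ".exe"]
    decision: allow
  http.request:
    allowed_hosts: ["api.github.com"]
    block_private_ips: true    # SSRF guard
    decision: approve
```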

Append-only audit log.

Every decision — allow, deny, approval requested, approval consumed, executed — written as JSONL with actor, args (secrets redacted), risk flags, and policy hash. Daily file by default. Optional Postgres sink.

{"ts":"2026-04-22T14:03:11Z",
 "actor":"agent:finance-sync",
 "tool":"http.request",
 "decision":"approved",
 "risk":["external-api"],
 "policy_hash":"sha256:a3f9..."}

Role-based principals.

Tag agents with roles (local_dev, prod, research_readonly). Policies fork on role, so the same agent binary can be tightly caged in production and loose on your laptop — without branching the agent code.
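A sketch of how role-forked policy might look, using the role names from the paragraph above; the `roles:` structure and fields are assumptions for illustration:

```yaml
# Hypothetical sketch: structure assumed, not confirmed syntax.
roles:
  prod:
    tools:
      shell.exec: { decision: deny }
      file.write: { decision: approve }
  local_dev:
    tools:
      shell.exec:
        allowed_cwd_prefixes: ["/workspace/", "/tmp/"]
        decision: allow
```

The agent binary stays identical; only the role tag it presents changes which branch applies.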

Framework-agnostic.

HTTP in, HTTP out. Works with the Anthropic Claude SDK, OpenAI SDK, LangChain, LlamaIndex, custom agents, and MCP servers. One TypeScript client (@runestone-labs/gatekeeper-client) for the most common surface.
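Since the interface is plain HTTP, wiring an agent through the boundary needs no SDK at all. A minimal sketch, where the endpoint path and request/response shapes are assumptions, not Gatekeeper's documented wire format:

```typescript
// Hypothetical wire shapes; the real endpoint and fields may differ.
type Decision = "allow" | "approve" | "deny";

interface GateResponse {
  decision: Decision;
  approval_url?: string; // present when decision === "approve"
}

// Route every tool call through the boundary instead of executing it directly.
// The transport is injectable so any HTTP stack (or a test stub) works.
async function guardedCall(
  gatekeeperUrl: string,
  actor: string,
  tool: string,
  args: Record<string, unknown>,
  post: (url: string, body: unknown) => Promise<GateResponse> = async (url, body) => {
    const res = await fetch(url, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify(body),
    });
    return (await res.json()) as GateResponse;
  },
): Promise<GateResponse> {
  return post(`${gatekeeperUrl}/tools/invoke`, { actor, tool, args });
}
```

The agent only ever sees the decision; the tool itself runs (or doesn't) behind the boundary.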

USD budgets & cost audit.

Every LLM call is priced (per-model input/output tokens, cache hits) and stamped into the audit log. /usage + /llm-usage show actual dollars per agent, per workflow, per day. Budgets ship dormant: observe first, set caps at ~2× p95, enforce with confidence.

budgets:
  agent:finance-sync:
    daily_usd: 2.00
    decision: approve     # over-budget → human sign-off
  agent:research_*:
    monthly_usd: 50.00
    decision: deny

Memory, with boundaries.

Optional knowledge-graph layer: entities, episodes, relationships. Agents request memory operations just like any other tool — upserts and queries flow through the same policy engine, so you can scope which agent writes what, and audit every change.
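Since memory operations flow through the same policy engine, scoping them might look like this sketch; tool names and fields here are assumptions for illustration:

```yaml
# Hypothetical sketch: memory tool names and fields assumed.
tools:
  memory.upsert:
    allowed_actors: ["agent:research_*"]
    decision: approve      # writes need sign-off
  memory.query:
    decision: allow        # reads fast-path
```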

Where Gatekeeper fits

IS
  • A runtime enforcement boundary between agent and tool.
  • Self-hosted. No SaaS proxy. Arguments stay on your disk.
  • Auditable. Plain JSONL, cryptographic approval URLs, open source.
  • Semantic. Understands shells, files, HTTP — not abstract ABAC.
IS NOT
  • → Prompt-injection mitigation. Guardrails AI does that; different layer.
  • → An LLM gateway or observability platform. Portkey and Langfuse do those.
  • → A replacement for OS-level sandboxing. Pair with it.
  • → Magic. We document what we catch and what we don't.

Ready to put a boundary in front of your agents?

Early access is free. Design partners get weekly syncs, direct founder Slack, roadmap influence, and the hosted tier free at launch.