● v0.3.2 · Apache-2.0 · 359 tests · self-hosted

The LLM doesn't get direct access to your tools.

Gatekeeper sits between your AI agent and the real world — shell, filesystem, HTTP, memory, LLM calls. Every action is evaluated against policy. Allow, require human approval, or deny. Every decision — and every dollar — is logged. Works with Claude, GPT, local LLMs, and MCP servers.

$ npm install @runestone-labs/gatekeeper-client


YOUR AGENT: LLM (Claude · GPT · Llama · MCP)
→ GATEKEEPER: policy · approval · audit (on your hardware)
→ TOOLS: shell · files · HTTP · memory · LLM (the real world)

Three decisions, not two

Binary allow-or-deny is why security tools get disabled. Any rule strict enough to catch real misuse also blocks legitimate work. Gatekeeper's third option is the one developers actually leave on.

● ALLOW

Fast-path the safe 90%.

Your agent's normal reads, scoped writes, and allowlisted HTTP fly through without interruption. Sub-millisecond policy evaluation.

◐ APPROVE

Pause for the risky 10%.

Sensitive actions generate an HMAC-signed, single-use approval URL. You click to allow. Replay-proof, 1-hour expiry, no SaaS in the middle.

✕ DENY

Hard-block the obvious bad.

Dangerous shell patterns, path traversal, SSRF, disallowed extensions — rejected before the tool is ever touched. The agent can't retry its way through.
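
To make the APPROVE path concrete, here is a minimal sketch of how an HMAC-signed, single-use approval URL with a 1-hour expiry could work. It illustrates the general technique only and is not Gatekeeper's actual implementation: the URL layout, the parameter names (id, expires, sig), the in-memory consumed set, and both function names are assumptions.

```typescript
import { Buffer } from "node:buffer";
import { createHmac, randomUUID, timingSafeEqual } from "node:crypto";

// Hypothetical names throughout: a sketch of the technique,
// not Gatekeeper's real URL format or storage.
const TTL_MS = 60 * 60 * 1000; // 1-hour expiry
const consumed = new Set<string>(); // single-use registry (in-memory for the sketch)

function sign(secret: string, payload: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

function createApprovalUrl(baseUrl: string, secret: string, now: number = Date.now()): string {
  const id = randomUUID();
  const expires = now + TTL_MS;
  // The signature covers both the id and the expiry, so neither can be edited.
  const sig = sign(secret, `${id}.${expires}`);
  return `${baseUrl}/approve?id=${id}&expires=${expires}&sig=${sig}`;
}

function consumeApproval(url: string, secret: string, now: number = Date.now()): boolean {
  const params = new URL(url).searchParams;
  const id = params.get("id");
  const expires = params.get("expires");
  const sig = params.get("sig");
  if (!id || !expires || !sig) return false;

  const expected = Buffer.from(sign(secret, `${id}.${expires}`), "hex");
  const given = Buffer.from(sig, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return false;
  if (now > Number(expires)) return false; // expired
  if (consumed.has(id)) return false; // replayed
  consumed.add(id); // mark as used so a second click fails
  return true;
}
```

In this sketch, a link verifies once, fails on the second attempt, fails after the expiry, and fails when checked against a different secret. A real deployment would need the consumed-ID store to survive restarts.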

Built for the actual threat model

Semantic tool rules, not regex.

Shell policy understands allowed_cwd_prefixes and deny_patterns lists. File policy enforces allowed_paths and denied extensions. HTTP gets DNS and IP allowlists plus SSRF protection.

tools:
  shell.exec:
    allowed_cwd_prefixes: ["/workspace/"]
    deny_patterns: ["rm -rf", ":(){ :|:& };:"]
    decision: approve   # human sign-off required
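
As a rough illustration of how a rule like the one above might be evaluated, the sketch below checks the working directory first, then the deny patterns, and only then falls through to the configured decision. The evaluation order, the substring matching, and the names ShellCall and evaluateShell are assumptions made for this example, not a description of Gatekeeper's engine.

```typescript
type Decision = "allow" | "approve" | "deny";

interface ShellPolicy {
  allowed_cwd_prefixes: string[];
  deny_patterns: string[];
  decision: Decision; // outcome when the call passes the hard checks
}

// Hypothetical shape of an incoming shell.exec request.
interface ShellCall {
  command: string;
  cwd: string;
}

function evaluateShell(policy: ShellPolicy, call: ShellCall): { decision: Decision; reason: string } {
  // Add a trailing slash so "/workspace-evil" cannot match the "/workspace/" prefix.
  const cwd = call.cwd.endsWith("/") ? call.cwd : call.cwd + "/";
  if (!policy.allowed_cwd_prefixes.some((prefix) => cwd.startsWith(prefix))) {
    return { decision: "deny", reason: "cwd outside allowed prefixes" };
  }
  // Assumption: deny patterns are treated as plain substrings.
  const hit = policy.deny_patterns.find((pattern) => call.command.includes(pattern));
  if (hit !== undefined) {
    return { decision: "deny", reason: `matched deny pattern: ${hit}` };
  }
  return { decision: policy.decision, reason: "matched shell.exec rule" };
}

// Same values as the YAML rule above.
const shellExecPolicy: ShellPolicy = {
  allowed_cwd_prefixes: ["/workspace/"],
  deny_patterns: ["rm -rf", ":(){ :|:& };:"],
  decision: "approve",
};

console.log(evaluateShell(shellExecPolicy, { command: "npm test", cwd: "/workspace/app" }));
console.log(evaluateShell(shellExecPolicy, { command: "rm -rf /", cwd: "/workspace/app" }));
```

With this rule, an ordinary command inside /workspace/ is held for approval, while a command containing a deny pattern, or one run from outside the allowed prefix, is denied outright.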

Append-only audit log.

Every decision — allow, deny, approval requested, approval consumed, executed — written as JSONL with actor, args (secrets redacted), risk flags, and policy hash. Daily file by default. Optional Postgres sink.

{"ts":"2026-04-22T14:03:11Z",
 "actor":"agent:finance-sync",
 "tool":"http.request",
 "decision":"approved",
 "risk":["external-api"],
 "policy_hash":"sha256:a3f9..."}

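For readers who want to picture how such a line could be produced, here is a minimal sketch that redacts secret-looking arguments, hashes the policy text, and serializes one event per line. The key pattern used for redaction, the field layout beyond what the example above shows, and the function names are assumptions for illustration.

```typescript
import { createHash } from "node:crypto";

interface AuditEvent {
  ts: string;
  actor: string;
  tool: string;
  decision: string;
  args: unknown;
  risk: string[];
  policy_hash: string;
}

// Hypothetical heuristic: redact values whose key name looks like a secret.
const SECRET_KEY = /token|secret|password|api[_-]?key|authorization/i;

function redact(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[key] = SECRET_KEY.test(key) ? "[REDACTED]" : redact(inner);
    }
    return out;
  }
  return value;
}

// Hash of the policy text, so each event records which policy version decided it.
function policyHash(policyText: string): string {
  return "sha256:" + createHash("sha256").update(policyText).digest("hex");
}

// One JSON object per line (JSONL); the caller appends the line to the log file.
function toAuditLine(event: AuditEvent): string {
  return JSON.stringify({ ...event, args: redact(event.args) }) + "\n";
}

const line = toAuditLine({
  ts: "2026-04-22T14:03:11Z",
  actor: "agent:finance-sync",
  tool: "http.request",
  decision: "approved",
  args: { url: "https://api.example.com/v1", headers: { Authorization: "Bearer abc123" } },
  risk: ["external-api"],
  policy_hash: policyHash("tools:\n  http.request:\n    decision: approve\n"),
});
process.stdout.write(line);
```

Appending each line with an append-only file mode (for example, Node's fs.appendFileSync) keeps earlier records untouched by the writer; true tamper resistance would need more than this sketch shows.
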
Role-based principals.

Tag agents with roles (local_dev, prod, research_readonly). Policies fork on role, so the same agent binary can be tightly caged in production and loose on your laptop — without branching the agent code.
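
One way to picture role-forked policy is a lookup keyed first by role and then by tool. The sketch below is hypothetical: the specific decisions per role and the default-deny fallback are assumptions chosen for illustration, and only the role names and the tool names shell.exec and http.request come from this page.

```typescript
type Decision = "allow" | "approve" | "deny";
type Role = "local_dev" | "prod" | "research_readonly";

// Illustrative values only: same tools, different outcome per role.
const rolePolicy: Record<Role, Record<string, Decision>> = {
  local_dev: { "shell.exec": "allow", "http.request": "allow" },
  prod: { "shell.exec": "approve", "http.request": "approve" },
  research_readonly: { "shell.exec": "deny", "http.request": "allow" },
};

function decide(role: Role, tool: string): Decision {
  // Assumption: tools with no rule for the role are denied.
  return rolePolicy[role][tool] ?? "deny";
}

console.log(decide("local_dev", "shell.exec"));
console.log(decide("prod", "shell.exec"));
console.log(decide("research_readonly", "shell.exec"));
```

The agent code stays identical across environments; only the role attached to the principal changes which branch of the table applies.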

Framework-agnostic.

HTTP in, HTTP out. Works with the Anthropic Claude SDK, OpenAI SDK, LangChain, LlamaIndex, custom agents, and MCP servers. One TypeScript client (@runestone-labs/gatekeeper-client) for the most common surface.

USD budgets & cost audit.

Every LLM call is priced (per-model input/output tokens, cache hits) and stamped into the audit log. /usage and /llm-usage show actual dollars per agent, per workflow, per day. Budgets ship dormant: observe first, set caps at roughly 2× your observed p95 spend, then enforce with confidence.

budgets:
  agent:finance-sync:
    daily_usd: 2.00
    decision: approve     # over-budget → human sign-off
  agent:research_*:
    monthly_usd: 50.00
    decision: deny
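
The sketch below shows one way the pieces above could fit together: pricing a call from token counts, matching a principal against a wildcard budget key, and deriving a cap from observed spend. It is an assumption-laden illustration, not Gatekeeper's implementation. The per-million-token prices are placeholders, "*" is assumed to be a simple wildcard, p95 uses the nearest-rank method, and all function names are hypothetical.

```typescript
type Decision = "allow" | "approve" | "deny";

interface BudgetRule {
  daily_usd?: number;
  monthly_usd?: number;
  decision: "approve" | "deny"; // what happens once the budget is exceeded
}

// Placeholder prices in USD per million tokens (not real model pricing).
interface Price {
  inputPerMTok: number;
  outputPerMTok: number;
}

function callCostUsd(price: Price, inputTokens: number, outputTokens: number): number {
  return (inputTokens * price.inputPerMTok + outputTokens * price.outputPerMTok) / 1_000_000;
}

// Assumption: "*" in a budget key matches any run of characters.
function matchesPrincipal(pattern: string, principal: string): boolean {
  const escaped = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join(".*");
  return new RegExp(`^${escaped}$`).test(principal);
}

function evaluateBudget(
  rules: Record<string, BudgetRule>,
  principal: string,
  spent: { dailyUsd: number; monthlyUsd: number },
): Decision {
  for (const [pattern, rule] of Object.entries(rules)) {
    if (!matchesPrincipal(pattern, principal)) continue;
    const overDaily = rule.daily_usd !== undefined && spent.dailyUsd > rule.daily_usd;
    const overMonthly = rule.monthly_usd !== undefined && spent.monthlyUsd > rule.monthly_usd;
    if (overDaily || overMonthly) return rule.decision;
  }
  return "allow"; // under budget, or no budget configured
}

// Cap suggestion: 2 × the nearest-rank 95th percentile of observed daily spend.
function suggestCap(dailySpendUsd: number[]): number {
  if (dailySpendUsd.length === 0) throw new Error("need at least one observation");
  const sorted = [...dailySpendUsd].sort((a, b) => a - b);
  const rank = Math.ceil((95 * sorted.length) / 100);
  return 2 * sorted[rank - 1]!;
}

// Same values as the YAML budgets above.
const budgets: Record<string, BudgetRule> = {
  "agent:finance-sync": { daily_usd: 2.0, decision: "approve" },
  "agent:research_*": { monthly_usd: 50.0, decision: "deny" },
};

console.log(evaluateBudget(budgets, "agent:finance-sync", { dailyUsd: 2.5, monthlyUsd: 10 }));
console.log(evaluateBudget(budgets, "agent:research_web", { dailyUsd: 1, monthlyUsd: 51 }));
```

Running suggestCap over a few weeks of observed daily spend before turning enforcement on is the "observe first" step in miniature.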

Memory, with boundaries.

Optional knowledge-graph layer: entities, episodes, relationships. Agents request memory operations just like any other tool — upserts and queries flow through the same policy engine, so you can scope which agent writes what, and audit every change.

Where Gatekeeper fits

IS
  • A runtime enforcement boundary between agent and tool.
  • Self-hosted. No SaaS proxy. Arguments stay on your disk.
  • Auditable. Plain JSONL, cryptographic approval URLs, open source.
  • Semantic. Understands shells, files, HTTP — not abstract ABAC.
IS NOT
  • Prompt-injection mitigation. Guardrails AI does that; different layer.
  • An LLM gateway or observability platform. Portkey and Langfuse do those.
  • A replacement for OS-level sandboxing. Pair with it.
  • Magic. We document what we catch and what we don't.

vs. the alternatives

| Capability | Gatekeeper | MCP server alone | DIY policy code |
|---|---|---|---|
| Allow / approve / deny tri-state decision | Yes | Allow / deny only | If you build it |
| Append-only audit log | JSONL out of the box | No | If you build it |
| Semantic rules (shell / fs / http / llm) | Built in | Tool-shape only | If you build it |
| Per-principal cost budgets | Built in | No | If you build it |
| Self-hosted, no SaaS proxy | Yes | Yes | Yes |
| Framework-agnostic client | Any HTTP | MCP-only | Whatever you wrote |
| License | Apache-2.0 | Varies by server | Yours |

MCP servers expose tools; they don't decide whether the call should happen. Gatekeeper sits in front of any of them — including MCP servers you already run.
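
The idea of deciding before the tool runs can be sketched as a small wrapper around any tool handler. This is a generic, in-process illustration with hypothetical names (guard, Outcome); Gatekeeper itself is described here as an HTTP boundary, so treat this as a picture of the control flow rather than its API.

```typescript
type Decision = "allow" | "approve" | "deny";

type Outcome<R> =
  | { status: "executed"; result: R }
  | { status: "pending_approval" }
  | { status: "denied" };

// Wraps a tool so that a policy function is consulted before every call.
function guard<A, R>(tool: (args: A) => R, decide: (args: A) => Decision): (args: A) => Outcome<R> {
  return (args: A): Outcome<R> => {
    const decision = decide(args);
    if (decision === "deny") return { status: "denied" };
    if (decision === "approve") return { status: "pending_approval" };
    // Only an "allow" decision ever reaches the underlying tool.
    return { status: "executed", result: tool(args) };
  };
}

// Stand-in tool for the example: it only echoes the command it was given.
const runShell = (command: string): string => `ran: ${command}`;

const guardedShell = guard(runShell, (command: string): Decision => {
  if (command.includes("rm -rf")) return "deny";
  if (command.startsWith("git push")) return "approve";
  return "allow";
});

console.log(guardedShell("ls -la"));
console.log(guardedShell("git push origin main"));
console.log(guardedShell("rm -rf /"));
```

The point of the shape is that the tool function is never invoked on the deny or approve branches, so the agent cannot reach it by retrying.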

Common questions

What is Runestone Gatekeeper?

Runestone Gatekeeper is a self-hosted policy, approval, budget, and audit layer for AI agent tool calls. The agent calls Gatekeeper instead of directly calling shell, filesystem, HTTP, memory, or LLM tools.

Is Gatekeeper an MCP server?

Gatekeeper can sit in front of MCP servers, but it is not limited to MCP. It is an HTTP enforcement boundary that can protect MCP tools, SDK-based agents, custom agents, and local automation.

Does Gatekeeper stop prompt injection?

Gatekeeper is not a prompt-injection detector. It enforces runtime policy after the model asks to take an action, so risky shell, file, HTTP, memory, or model calls can be allowed, denied, or held for approval.

Where does Gatekeeper run?

Gatekeeper is self-hosted. You can run it locally or in your own environment so tool arguments, approval flow, and audit logs stay under your control.

What does Gatekeeper log?

Gatekeeper writes append-only audit events for requests, policy decisions, approvals, denials, executions, errors, risk flags, policy hashes, and LLM spend when budget tracking is enabled.

Pilot support

Useful policy for a real agent stack, not a generic demo.

The OSS package works today. The paid pilot/support path is for teams that need Gatekeeper tuned around their framework, approval flow, audit trail, and risk model before agents touch production systems.

Policy review

Threat model, YAML rules, principal roles, and approval gates matched to your workflow.

Integration pairing

Claude SDK, OpenAI SDK, MCP, LangChain, LlamaIndex, or custom HTTP agents.

Audit and budget setup

Append-only logs, spend tracking, and reviewable evidence for operational risk.

Feature feedback loop

One pragmatic gap from your stack gets scoped directly into the roadmap.

Ready to put a boundary in front of your agents?

Install the open-source client today, or ask for focused help turning it into a production policy boundary for your agents.