The LLM doesn't get direct access to your tools.
Gatekeeper sits between your AI agent and the real world — shell, filesystem, HTTP, memory, LLM calls. Every action is evaluated against policy. Allow, require human approval, or deny. Every decision — and every dollar — is logged. Works with Claude, GPT, local LLMs, and MCP servers.
$ npm install @runestone-labs/gatekeeper-client
Three decisions, not two
Binary allow-or-deny is why security tools get disabled. Any rule strict enough to catch real misuse also blocks legitimate work. Gatekeeper's third option is the one developers actually leave on.
Fast-path the safe 90%.
Your agent's normal reads, scoped writes, and allowlisted HTTP fly through without interruption. Sub-millisecond policy evaluation.
Pause for the risky 10%.
Sensitive actions generate an HMAC-signed, single-use approval URL. You click to allow. Replay-proof, 1-hour expiry, no SaaS in the middle.
Hard-block the obvious bad.
Dangerous shell patterns, path traversal, SSRF, disallowed extensions — rejected before the tool is ever touched. The agent can't retry its way through.
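To make the approval step concrete, here is a minimal sketch of how an HMAC-signed, single-use, expiring URL can work in general. It is not Gatekeeper's actual implementation: the `/approve` path, the `id`/`exp`/`sig` parameter names, the in-memory `consumed` set, and every function name are assumptions made for illustration.

```typescript
import { createHmac, randomUUID, timingSafeEqual } from "node:crypto";

// Hypothetical names throughout; only the 1-hour expiry comes from the text above.
const TTL_MS = 60 * 60 * 1000;

function sign(secret: string, id: string, expiresAt: number): string {
  return createHmac("sha256", secret).update(`${id}.${expiresAt}`).digest("hex");
}

// Build an approval URL carrying an id, an expiry, and an HMAC over both.
function createApprovalUrl(baseUrl: string, secret: string, now: number = Date.now()): string {
  const id = randomUUID();
  const expiresAt = now + TTL_MS;
  const url = new URL("/approve", baseUrl);
  url.searchParams.set("id", id);
  url.searchParams.set("exp", String(expiresAt));
  url.searchParams.set("sig", sign(secret, id, expiresAt));
  return url.toString();
}

// Ids that have already been used; a real service would persist this.
const consumed = new Set<string>();

type Verdict = "ok" | "bad-signature" | "expired" | "replayed";

function consumeApprovalUrl(rawUrl: string, secret: string, now: number = Date.now()): Verdict {
  const params = new URL(rawUrl).searchParams;
  const id = params.get("id") ?? "";
  const expiresAt = Number(params.get("exp"));
  const given = Buffer.from(params.get("sig") ?? "", "hex");
  const expected = Buffer.from(sign(secret, id, expiresAt), "hex");
  // Constant-time comparison; lengths must match before timingSafeEqual.
  if (given.length !== expected.length || !timingSafeEqual(given, expected)) return "bad-signature";
  if (now > expiresAt) return "expired";
  if (consumed.has(id)) return "replayed";
  consumed.add(id); // single use: a second click on the same URL is rejected
  return "ok";
}
```

Because the signature covers both the id and the expiry, editing either parameter invalidates the URL, and recording consumed ids is what makes a replayed click fail.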
Built for the actual threat model
Semantic tool rules, not regex.
Shell policy understands allowed_cwd_prefixes and dangerous-pattern lists.
File policy enforces allowed_paths + denied extensions. HTTP gets DNS & IP allowlists plus SSRF protection.
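As a rough illustration of the shell checks named here, the sketch below evaluates a command against `allowed_cwd_prefixes` and a deny-pattern list. The `ShellPolicy` interface and `evaluateShell` function are hypothetical, and plain substring matching stands in for whatever matching Gatekeeper actually performs.

```typescript
import { posix } from "node:path";

type Decision = "allow" | "approve" | "deny";

// Field names mirror the policy keys named above; the evaluator itself is hypothetical.
interface ShellPolicy {
  allowed_cwd_prefixes: string[];
  deny_patterns: string[];
  decision: Decision; // applied when no hard-deny rule matches
}

function evaluateShell(policy: ShellPolicy, command: string, cwd: string): Decision {
  // Normalize first so "/workspace/../etc" cannot pass a prefix check.
  const normalized = posix.resolve("/", cwd) + "/";
  const cwdAllowed = policy.allowed_cwd_prefixes.some((prefix) => normalized.startsWith(prefix));
  if (!cwdAllowed) return "deny";
  // Substring match against each deny pattern (an assumption of this sketch).
  if (policy.deny_patterns.some((pattern) => command.includes(pattern))) return "deny";
  return policy.decision;
}

const policy: ShellPolicy = {
  allowed_cwd_prefixes: ["/workspace/"],
  deny_patterns: ["rm -rf", ":(){ :|:& };:"],
  decision: "approve",
};

console.log(evaluateShell(policy, "ls -la", "/workspace/app")); // approve
console.log(evaluateShell(policy, "rm -rf /", "/workspace/app")); // deny
console.log(evaluateShell(policy, "ls", "/workspace/../etc")); // deny
```

Resolving the working directory before the prefix check is the important step: without it, a path-traversal `cwd` would appear to sit under the allowed prefix.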
tools:
  shell.exec:
    allowed_cwd_prefixes: ["/workspace/"]
    deny_patterns: ["rm -rf", ":(){ :|:& };:"]
    decision: approve # human sign-off required

Append-only audit log.
Every decision — allow, deny, approval requested, approval consumed, executed — written as JSONL with actor, args (secrets redacted), risk flags, and policy hash. Daily file by default. Optional Postgres sink.
{"ts":"2026-04-22T14:03:11Z",
 "actor":"agent:finance-sync",
 "tool":"http.request",
 "decision":"approved",
 "risk":["external-api"],
 "policy_hash":"sha256:a3f9..."}

Role-based principals.
Tag agents with roles (local_dev, prod, research_readonly).
Policies fork on role, so the same agent binary can be tightly caged in production and
loose on your laptop — without branching the agent code.
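One way to picture role-forked policy is a per-role decision table consulted at request time. The sketch below is an assumption-laden illustration: the table contents, the `decideForRoles` function, and the "most restrictive role wins" rule are all hypothetical, and only the three role names come from the text above.

```typescript
type Decision = "allow" | "approve" | "deny";
type Role = "local_dev" | "prod" | "research_readonly";

// Hypothetical table for a single tool: same agent code, different decision per role.
const shellExecByRole: Record<Role, Decision> = {
  local_dev: "allow",
  prod: "approve",
  research_readonly: "deny",
};

// When a principal carries several roles, the strictest decision wins
// (an assumption of this sketch, not documented Gatekeeper behavior).
const strictness: Record<Decision, number> = { allow: 0, approve: 1, deny: 2 };

function decideForRoles(roles: Role[]): Decision {
  // No roles means nothing was granted, so default to deny.
  if (roles.length === 0) return "deny";
  return roles
    .map((role) => shellExecByRole[role])
    .reduce((a, b) => (strictness[b] > strictness[a] ? b : a));
}

console.log(decideForRoles(["local_dev"])); // allow
console.log(decideForRoles(["local_dev", "prod"])); // approve
```

The agent never sees this table; changing how tightly it is caged means editing policy, not agent code.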
Framework-agnostic.
HTTP in, HTTP out. Works with the Anthropic Claude SDK, OpenAI SDK, LangChain, LlamaIndex,
custom agents, and MCP servers. One TypeScript client (@runestone-labs/gatekeeper-client)
for the most common surface.
USD budgets & cost audit.
Every LLM call is priced (per-model input/output tokens, cache hits) and stamped into the
audit log. /usage + /llm-usage show actual dollars per agent, per
workflow, per day. Budgets ship dormant: observe first, set caps at ~2× p95, enforce
with confidence.
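The "observe first, cap at roughly 2× p95" advice can be worked through numerically. This sketch computes a nearest-rank p95 over observed daily spend and doubles it; the function names are hypothetical, and the nearest-rank method is just one common way to define a percentile.

```typescript
type Decision = "allow" | "approve" | "deny";

// Nearest-rank percentile over observed values; pct is a whole number such as 95.
function percentile(values: number[], pct: number): number {
  if (values.length === 0) throw new Error("no observations yet");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((pct * sorted.length) / 100); // 1-based rank
  return sorted[Math.max(rank, 1) - 1];
}

// Suggested daily cap: twice the p95 of observed daily spend (USD).
function suggestDailyCap(dailySpendUsd: number[]): number {
  return 2 * percentile(dailySpendUsd, 95);
}

// Once a cap is enforced, spend above it triggers the configured decision.
function checkBudget(spentTodayUsd: number, dailyCapUsd: number, overBudget: Decision): Decision {
  return spentTodayUsd > dailyCapUsd ? overBudget : "allow";
}

// Twenty observed days costing $1 through $20: p95 is $19, so the suggested cap is $38.
const observed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20];
console.log(suggestDailyCap(observed)); // 38
```

Doubling the p95 leaves headroom for ordinary busy days while still catching a runaway loop, which is why a cap derived this way is unlikely to fire on normal traffic.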
budgets:
  agent:finance-sync:
    daily_usd: 2.00
    decision: approve # over-budget → human sign-off
  agent:research_*:
    monthly_usd: 50.00
    decision: deny

Memory, with boundaries.
Optional knowledge-graph layer: entities, episodes, relationships. Agents request memory operations just like any other tool — upserts and queries flow through the same policy engine, so you can scope which agent writes what, and audit every change.
Where Gatekeeper fits
Gatekeeper is:
- A runtime enforcement boundary between agent and tool.
- Self-hosted. No SaaS proxy. Arguments stay on your disk.
- Auditable. Plain JSONL, cryptographic approval URLs, open source.
- Semantic. Understands shells, files, and HTTP, not abstract ABAC.

Gatekeeper is not:
- Prompt-injection mitigation. Guardrails AI does that; it is a different layer.
- An LLM gateway or observability platform. Portkey and Langfuse do those.
- A replacement for OS-level sandboxing. Pair it with one.
- Magic. We document what we catch and what we don't.
vs. the alternatives
| Capability | Gatekeeper | MCP server alone | DIY policy code |
|---|---|---|---|
| Allow / approve / deny tri-state decision | Yes | Allow / deny only | If you build it |
| Append-only audit log | JSONL out of the box | No | If you build it |
| Semantic rules (shell / fs / http / llm) | Built in | Tool-shape only | If you build it |
| Per-principal cost budgets | Built in | No | If you build it |
| Self-hosted, no SaaS proxy | Yes | Yes | Yes |
| Framework-agnostic client | Any HTTP | MCP-only | Whatever you wrote |
| License | Apache-2.0 | Varies by server | Yours |
MCP servers expose tools; they don't decide whether the call should happen. Gatekeeper sits in front of any of them — including MCP servers you already run.
Common questions
What is Runestone Gatekeeper?
Runestone Gatekeeper is a self-hosted policy, approval, budget, and audit layer for AI agent tool calls. The agent calls Gatekeeper instead of directly calling shell, filesystem, HTTP, memory, or LLM tools.
Is Gatekeeper an MCP server?
Gatekeeper can sit in front of MCP servers, but it is not limited to MCP. It is an HTTP enforcement boundary that can protect MCP tools, SDK-based agents, custom agents, and local automation.
Does Gatekeeper stop prompt injection?
Gatekeeper is not a prompt-injection detector. It enforces runtime policy after the model asks to take an action, so risky shell, file, HTTP, memory, or model calls can be allowed, denied, or held for approval.
Where does Gatekeeper run?
Gatekeeper is self-hosted. You can run it locally or in your own environment so tool arguments, approval flow, and audit logs stay under your control.
What does Gatekeeper log?
Gatekeeper writes append-only audit events for requests, policy decisions, approvals, denials, executions, errors, risk flags, policy hashes, and LLM spend when budget tracking is enabled.
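For a sense of what one such event looks like on disk, the sketch below serializes an event to a single JSONL line with secret-looking arguments redacted. The `AuditEvent` shape is modeled on the fields this page lists, but the interface, the `SECRET_KEYS` pattern, and both function names are assumptions rather than Gatekeeper's real schema.

```typescript
// Hypothetical event shape, modeled on the fields listed on this page.
interface AuditEvent {
  ts: string;
  actor: string;
  tool: string;
  decision: string;
  args: Record<string, unknown>;
  risk: string[];
  policy_hash: string;
}

// Argument names treated as secrets; an assumption made for this sketch.
const SECRET_KEYS = /(token|secret|password|api[_-]?key|authorization)/i;

// Shallow redaction: replaces top-level values whose key looks like a secret.
function redact(args: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(args)) {
    out[key] = SECRET_KEYS.test(key) ? "[REDACTED]" : args[key];
  }
  return out;
}

// One event becomes one JSON line; appending lines never rewrites earlier ones.
function toJsonl(event: AuditEvent): string {
  return JSON.stringify({ ...event, args: redact(event.args) }) + "\n";
}

const line = toJsonl({
  ts: "2026-04-22T14:03:11Z",
  actor: "agent:finance-sync",
  tool: "http.request",
  decision: "approved",
  args: { url: "https://api.example.com", api_key: "abc123" },
  risk: ["external-api"],
  policy_hash: "sha256:a3f9...",
});
console.log(line);
```

A line like this can be appended to the day's file with Node's `fs.appendFileSync`, and because each line is standalone JSON, the log stays greppable and easy to load into other tools.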
Pilot support
Useful policy for a real agent stack, not a generic demo.
The OSS package works today. The paid pilot/support path is for teams that need Gatekeeper tuned around their framework, approval flow, audit trail, and risk model before agents touch production systems.
Policy review
Threat model, YAML rules, principal roles, and approval gates matched to your workflow.
Integration pairing
Claude SDK, OpenAI SDK, MCP, LangChain, LlamaIndex, or custom HTTP agents.
Audit and budget setup
Append-only logs, spend tracking, and reviewable evidence for operational risk.
Feature feedback loop
One pragmatic gap from your stack gets scoped directly into the roadmap.
Ready to put a boundary in front of your agents?
Install the open-source client today, or ask for focused help turning it into a production policy boundary for your agents.