USE CASES

Where Gatekeeper pays for itself.

Four workflows we actually run — not imagined personas. The patterns are the same whether your agent is Claude, a local Llama, or a small army of MCP servers.

01 Shell-using dev agents

Your coding agent has a shell. Don't give it root.

Claude Code, OpenAI's Codex, and a growing number of in-house dev agents run real commands on real filesystems. The failure modes aren't exotic — they're rm -rf in the wrong directory, curl | sh from a prompt-injected README, git push --force on the wrong branch.

What Gatekeeper does
  • Scope the shell to allowed_cwd_prefixes.
  • Hard-deny known dangerous patterns (rm -rf /, fork bombs, curl | sh).
  • Require approval for sudo, git push, anything touching ~/.ssh.
  • Log every command with cwd, env digest, exit code.
Real rule
shell.exec:
  allowed_cwd_prefixes:
    - "/workspace/"
  deny_patterns:
    - "rm -rf /"
    - "curl | sh"
    - ":(){ :|:& };:"
  approve_patterns:
    - "^sudo "
    - "^git push"
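
The rule above reads as a three-decision ladder: deny anything outside the allowed cwd prefixes or matching a deny pattern, pause on approve patterns, allow the rest. A minimal Python sketch of that evaluation order — function and constant names here are illustrative, not Gatekeeper's actual API:

```python
import re

# Policy values copied from the rule above.
ALLOWED_CWD_PREFIXES = ["/workspace/"]
DENY_PATTERNS = ["rm -rf /", "curl | sh", ":(){ :|:& };:"]  # literal substrings
APPROVE_PATTERNS = [r"^sudo ", r"^git push"]                # regexes

def check_shell(command: str, cwd: str) -> str:
    """Return 'deny', 'approve', or 'allow' for one shell invocation."""
    if not any(cwd.startswith(p) for p in ALLOWED_CWD_PREFIXES):
        return "deny"       # working directory out of scope
    if any(p in command for p in DENY_PATTERNS):
        return "deny"       # hard-denied dangerous pattern
    if any(re.search(p, command) for p in APPROVE_PATTERNS):
        return "approve"    # wait for a human
    return "allow"
```

Deny is checked before approve, so `sudo rm -rf /` is blocked outright rather than queued for a human.
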
02 Research agents with HTTP

Research agents love an open egress. Close it.

Give an agent a browser and a scraping tool and within a week it will try to hit your internal metadata service, an S3 bucket from a pasted link, or localhost:5432 because "it saw a reference in a log." Gatekeeper's HTTP policy closes those silently.

What Gatekeeper does
  • Allowlist external hosts (api.finnhub.io, *.polymarket.com).
  • Block private / link-local / loopback IPs (SSRF protection).
  • Deny metadata endpoints (169.254.169.254 and friends).
  • Rate-limit per tool + per principal.
  • Redact auth headers from audit log args automatically.
Real rule
http.request:
  allowed_hosts:
    - "api.finnhub.io"
    - "api.coinbase.com"
    - "*.polymarket.com"
  deny_private_ips: true
  deny_hosts:
    - "169.254.169.254"
  redact_headers:
    - "authorization"
    - "x-api-key"
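
The private-IP check is the heart of the SSRF protection: reject the explicit deny list and any private, loopback, or link-local address before consulting the allowlist. A stdlib-only sketch, with host patterns copied from the rule above (a production check would also re-verify every DNS-resolved address, to defeat rebinding):

```python
import fnmatch
import ipaddress

ALLOWED_HOSTS = ["api.finnhub.io", "api.coinbase.com", "*.polymarket.com"]
DENY_HOSTS = {"169.254.169.254"}  # cloud metadata endpoint

def host_allowed(host: str) -> bool:
    """Deny list first, then the private-IP screen, then the allowlist."""
    if host in DENY_HOSTS:
        return False
    try:
        ip = ipaddress.ip_address(host)  # URL contained an IP literal
        if ip.is_private or ip.is_loopback or ip.is_link_local:
            return False                 # RFC 1918, 127.0.0.0/8, 169.254.0.0/16
    except ValueError:
        pass                             # a hostname, not an IP literal
    return any(fnmatch.fnmatch(host, pat) for pat in ALLOWED_HOSTS)
```

Anything not on the allowlist — including `localhost:5432`'s host — simply never leaves the box.
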
03 High-sensitivity workflows

Some actions need a human. Gatekeeper is the pause.

Trades, irreversible writes, emails that go to real people, prod deploys. You don't want to disallow them — you want the agent to propose them and wait. Gatekeeper's approval flow is designed for exactly that loop.

What Gatekeeper does
  • Mark tools decision: approve — never allow, never deny by default.
  • Generate HMAC-signed, single-use approval URLs. 1-hour expiry.
  • Agent blocks (or returns pending and polls) until approval lands.
  • Approver identity + timestamp captured in the audit log.
  • Notify via your existing channel — ntfy, email, Slack webhook — no Gatekeeper SaaS required.
The flow
agent → GK: place_trade(NVDA, buy, 10)
GK   → policy: decision=approve
GK   → ntfy:  "approve trade"
you  → click signed URL
GK   → nonce consumed, sig valid
GK   → broker.execute(...)
GK   → audit:  approver=you, ok=true
agent ← receipt
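
The signed, single-use URL in that flow can be sketched with stdlib HMAC. Everything here is illustrative — the gk.example host, query parameter names, and in-memory nonce store are assumptions, not Gatekeeper's wire format:

```python
import hashlib
import hmac
import secrets
import time

SECRET = b"gatekeeper-signing-key"   # illustrative; load from secure config
PENDING: dict[str, float] = {}       # nonce -> expiry (epoch seconds)

def issue_approval_url(action_id: str, ttl_s: int = 3600) -> str:
    """Mint a single-use approval link with a 1-hour expiry."""
    nonce = secrets.token_urlsafe(16)
    PENDING[nonce] = time.time() + ttl_s
    msg = f"{action_id}:{nonce}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"https://gk.example/approve?action={action_id}&nonce={nonce}&sig={sig}"

def redeem(action_id: str, nonce: str, sig: str) -> bool:
    """Consume the nonce; a second click on the same link fails."""
    expiry = PENDING.pop(nonce, None)        # pop makes it single-use
    if expiry is None or time.time() > expiry:
        return False
    want = hmac.new(SECRET, f"{action_id}:{nonce}".encode(),
                    hashlib.sha256).hexdigest()
    return hmac.compare_digest(want, sig)    # constant-time compare
```

Popping the nonce before verifying means even a replayed valid signature lands on an empty slot.
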
04 Cost-bounded agent fleets

LLM calls are tool calls. Budget them.

Once you have a dozen agents running against Claude or GPT, "oh my god my bill" shows up before "oh my god they broke production." Gatekeeper prices every LLM call (tokens × model rate, with cache-hit credit) and caps by agent, workflow, or tool — with the same three-decision model as everything else. Over-budget can pause for approval instead of hard-denying.

What Gatekeeper does
  • Per-call USD cost stamped into audit log.
  • /usage + /llm-usage endpoints show spend by actor, workflow, model.
  • Rolling-window budgets: daily / weekly / monthly.
  • Over-budget decision configurable: allow (warn), approve (pause), deny (block).
  • Cache hits and tool-use premiums tracked separately, not hidden.
Real rule
budgets:
  agent:finance-sync:
    daily_usd: 2.00
    decision: approve

  workflow:nightly-scrape:
    daily_usd: 5.00
    decision: approve

  agent:research_*:
    monthly_usd: 50.00
    decision: deny
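
The pricing step — tokens × model rate, with cache-hit credit — reduces to a few lines of arithmetic. A sketch with invented rates (the real per-model rate table is whatever your provider charges; all names here are illustrative):

```python
# USD per million tokens -- rates invented for illustration only.
RATES_PER_MTOK = {
    "model-x": {"input": 3.00, "output": 15.00, "cached_input": 0.30},
}

def call_cost_usd(model: str, input_toks: int, output_toks: int,
                  cached_toks: int = 0) -> float:
    """Price one LLM call: fresh input + discounted cached input + output."""
    r = RATES_PER_MTOK[model]
    fresh = input_toks - cached_toks   # cache hits billed at the lower rate
    usd = (fresh * r["input"]
           + cached_toks * r["cached_input"]
           + output_toks * r["output"]) / 1_000_000
    return round(usd, 6)
```

Stamping this number into the audit log per call is what makes rolling daily/weekly/monthly windows cheap to compute.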

Don't see your agent workflow here?

Tell us what you're trying to put a boundary around. Design partners get policy templates tailored to their workflow and direct founder access while they set it up.