PILOT SUPPORT

Make Gatekeeper useful
in your stack.

Direct founder time, policy review, integration pairing, and a concrete feedback loop for teams putting agents into production, not just onto roadmaps.

● OPEN SOURCE

Don't need us. Just install it.

OSS works today. v0.3.2, Apache‑2.0, 359 tests, production‑tested. No signup required.

npm i @runestone-labs/gatekeeper-client

docker run -p 7337:7337 \
  ghcr.io/runestone-labs/gatekeeper:latest

Read the source →

Focused pilots
◆ PILOT/SUPPORT

Work directly with the founder. For six weeks.

If you've got a real agent workload and real stakes, we'll tune Gatekeeper to your stack, co-write practical policy, and turn the sharpest missing piece into roadmap-grade feedback.

What you get

Threat-model review
Map the concrete tools your agent can touch, the failure modes that matter, and the decisions that should be allow, approve, or deny.
Policy templates
Co-write YAML for your stack: MCP, Claude SDK, OpenAI SDK, LangChain, LlamaIndex, or custom HTTP agents.
Integration pairing
Work through the client wiring, local deployment shape, environment boundaries, and approval callback path.
Audit setup
Make the JSONL or Postgres trail useful enough for incident review, compliance questions, and operator handoff.
Budget controls
Turn LLM spend logging into practical per-agent or per-workflow limits without breaking normal work.
Roadmap-grade gap review
If Gatekeeper is missing something important for your workload, we scope it clearly instead of hand-waving around it.
Go-live review
Before production use, we review the policy, logs, approval path, and rollback plan against the original threat model.
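
To make the allow, approve, or deny split from the threat-model review concrete, here is a minimal Python sketch of first-match policy evaluation. The `Rule` shape, the tool names, and the `decide` helper are hypothetical and exist only for illustration; they are not Gatekeeper's actual policy schema or client API. The sketch assumes a default-deny posture, which is a common choice for this kind of layer rather than something this page specifies.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Hypothetical rule shape for illustration only; not Gatekeeper's real schema.
@dataclass(frozen=True)
class Rule:
    tool: str      # glob matched against the tool name, e.g. "db.*"
    target: str    # glob matched against the resource the tool touches
    decision: str  # "allow", "approve" (needs human sign-off), or "deny"

# Example rules (assumed names): order matters, the first match wins.
RULES = [
    Rule("db.query", "analytics/*", "allow"),
    Rule("db.query", "prod/*", "approve"),
    Rule("shell.exec", "*", "deny"),
]

def decide(tool: str, target: str, rules=RULES) -> str:
    """Return the decision of the first matching rule; deny if none match."""
    for rule in rules:
        if fnmatchcase(tool, rule.tool) and fnmatchcase(target, rule.target):
            return rule.decision
    return "deny"  # default-deny: tools nobody mapped are blocked
```

Calling `decide("db.query", "prod/users")` returns `"approve"` under these rules, while a tool that was never mapped, such as `decide("http.post", "api.example.com")`, falls through to `"deny"`. The useful part of a threat-model review is deciding which rows belong in that table.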

Is this for you?

● YES, IF
  • → Your agent already hits real systems — shell, prod APIs, databases, cloud.
  • → You've had "the incident," or you're enough of an engineer to know one is coming.
  • → Compliance (HIPAA / SOX / EU AI Act) is going to ask for an audit trail.
  • → You want to put a policy layer between your LLM and the real world and actually keep it turned on.
  • → You can give us 45 min/week for six weeks and an honest use case to observe.
○ PROBABLY NOT, IF
  • → You're evaluating agent-security vendors and want a 40-page comparison.
  • → Your agent only calls an LLM, no tools. You don't need Gatekeeper yet.
  • → You want a managed SaaS proxy that handles everything. We're local‑first.
  • → You want "AI safety" — philosophical framing, not enforcement. Look elsewhere.
  • → You don't have a concrete workload yet. Come back when you do.

How the six weeks go

  1. Week 0
    Discovery
    You describe the agent, the tools it can call, and the failure modes that would actually hurt.
  2. Week 1
    First policy
    We draft the first Gatekeeper policy and get it running locally against a representative workflow.
  3. Weeks 2–3
    Integration
    Wire the client, approval flow, audit sink, and budget logging into your dev environment.
  4. Weeks 4–5
    Hardening
    Tune false positives, review blocked actions, and close the biggest gaps found by real usage.
  5. Week 6
    Go-live review
    Decide whether the policy is ready for production, what still needs human approval, and what should stay out of scope.
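
The audit sink and budget logging wired up in weeks 2–3 can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the `BudgetLedger` class, the `audit` helper, and the record fields are hypothetical and are not Gatekeeper's real log format or API. Amounts are kept in integer cents to avoid floating-point drift.

```python
import io
import json
from collections import defaultdict
from datetime import datetime, timezone

class BudgetLedger:
    """Tracks spend per agent against one fixed limit, in integer cents."""

    def __init__(self, limit_cents: int):
        self.limit_cents = limit_cents
        self.spent = defaultdict(int)

    def charge(self, agent: str, cost_cents: int) -> bool:
        """Record the cost if it fits under the limit; return whether it did."""
        if self.spent[agent] + cost_cents > self.limit_cents:
            return False  # over budget: nothing is recorded
        self.spent[agent] += cost_cents
        return True

def audit(sink, agent: str, tool: str, decision: str, ts: str = None) -> dict:
    """Append one event to the sink as a single JSON line (JSONL)."""
    record = {
        "ts": ts or datetime.now(timezone.utc).isoformat(),
        "agent": agent,
        "tool": tool,
        "decision": decision,
    }
    sink.write(json.dumps(record, sort_keys=True) + "\n")
    return record

# Usage: an in-memory sink stands in for a log file opened in append mode.
ledger = BudgetLedger(limit_cents=500)
sink = io.StringIO()
for cost in (300, 300):
    ok = ledger.charge("billing-agent", cost)
    audit(sink, "billing-agent", "llm.call", "allow" if ok else "deny")
```

One JSON object per line keeps the trail greppable and easy to replay during incident review, and a per-agent ledger means one runaway workflow cannot spend another agent's budget.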

Apply.

Two fields. No drop-downs. I read every application personally and reply within 48 hours, even if the honest answer is that we're not a fit.

Specific beats polished. The more concrete the workload and the failure mode, the faster I can tell you whether this is a fit.

Prefer email? evan@runestonelabs.io

Focused pilots. First reply in 48 hours. No hard sell.