Allow, approve, deny — why binary policy fails for AI agents
Allow/deny is why people disable security tools. A third decision — pause for human approval — is the difference between a policy engine you leave on and one you wrap in try/except.
Every time someone wires an AI agent to real tools — shell, filesystem, HTTP, a production database — they hit the same wall within a week. They start with allow/deny. It breaks. They loosen it. It breaks in the other direction. Eventually they turn it off.
The problem isn’t the agent. It’s the decision space.
Why binary policy fails
Imagine you’re writing a policy for a coding agent’s shell access.
The safe thing is to deny everything. But then the agent can’t do its job. So you write
an allowlist: npm, pnpm, git status, ls, maybe grep. It works for a day.
Then someone asks the agent to “install the missing dependency,” and it needs apt install.
Now you need sudo. You add sudo apt install to the allowlist. The next week it needs
sudo apt update. You add that too. By month two, your allowlist is 40 entries long and
the last three are rules like sudo .* because you gave up.
Or you go the other way. Deny the obviously dangerous things: rm -rf /, curl | sh,
:(){ :|:& };:. Allow everything else. Then one day the agent decides to
chown -R your home directory, or mv ~/.ssh ~/.ssh.bak, and you realize your denylist
was a list of things you thought of, not a list of things an agent might do.
Both directions are the same failure. The binary doesn’t match the shape of the problem. Agent actions aren’t binary. Most of them are safe. A small minority are genuinely dangerous. A much larger minority — the interesting middle — is conditional. Safe in some contexts. Not in others. Maybe-safe if the human who owns the keyboard sees what’s about to happen and says “yeah, go ahead.”
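To make the denylist failure concrete, here is a minimal sketch of a binary check. The three regex patterns and the `binary_decision` helper are illustrative assumptions, not anything Gatekeeper ships. The point is only that whatever nobody wrote a pattern for falls through to allow.

```python
import re

# Hypothetical denylist covering the "obviously dangerous" commands above.
DENYLIST = [
    re.compile(r"rm\s+-rf\s+/(\s|$)"),          # rm -rf /
    re.compile(r"curl\b.*\|\s*sh\b"),           # curl ... | sh
    re.compile(r":\(\)\s*\{\s*:\|:&\s*\};:"),   # classic fork bomb
]


def binary_decision(command: str) -> str:
    """Return 'deny' if any denylist pattern matches, otherwise 'allow'."""
    if any(pattern.search(command) for pattern in DENYLIST):
        return "deny"
    return "allow"


COMMANDS = [
    "rm -rf /",
    "curl https://example.com/install.sh | sh",
    ":(){ :|:& };:",
    "chown -R nobody ~",       # nobody thought of this one
    "mv ~/.ssh ~/.ssh.bak",    # or this one
]

results = {command: binary_decision(command) for command in COMMANDS}

if __name__ == "__main__":
    for command, decision in results.items():
        print(f"{decision:5}  {command}")
```

The first three commands are denied. The last two are allowed, because a binary check has nowhere else to put them.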
The third decision
Gatekeeper’s policy has three outcomes, not two.
- Allow: the fast path. The action executes. Policy evaluation is sub-millisecond. Reads, scoped writes, allowlisted HTTP, routine memory operations.
- Deny: the hard wall. The action is refused, the agent gets a typed error, and the attempt is logged. Deny is for things that are never okay: path traversal, fork bombs, SSRF to metadata services, writes to .env or .pem files.
- Approve: the pause. The action is held. An HMAC-signed single-use URL is generated. A notification fires (ntfy, email, your own webhook). A human clicks, and the action resumes on the other side. No SaaS in the middle.
The approve decision is the one that changes the game. It lets you write policies that are actually strict, because “strict” no longer means “the agent is blocked and the human has to rewrite the policy at 2am.” Strict means “the human sees it before it happens.”
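As a rough sketch of what a three-outcome decision function can look like for the shell example: the `Decision` enum, the rule tuples, and the choice to route everything unmatched to approve are assumptions made for illustration, not Gatekeeper’s actual rule engine.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"      # fast path
    DENY = "deny"        # hard wall
    APPROVE = "approve"  # pause for a human


# Hypothetical rule sets for a coding agent's shell access.
NEVER_OK = (":(){ :|:& };:", "rm -rf /")                 # exact commands, always refused
READ_ONLY = (("ls",), ("grep",), ("git", "status"))      # command prefixes, always fine


def decide(command: str) -> Decision:
    """Deny the never-okay, allow the known-safe, hold everything else."""
    if command.strip() in NEVER_OK:
        return Decision.DENY
    tokens = tuple(command.split())
    if any(tokens[: len(prefix)] == prefix for prefix in READ_ONLY):
        return Decision.ALLOW
    # Assumption: the conditional middle goes to a human instead of being guessed at.
    return Decision.APPROVE


if __name__ == "__main__":
    for cmd in ("ls -la", "git status", "sudo apt install libpq-dev", "rm -rf /"):
        print(f"{decide(cmd).value:8} {cmd}")
```

With a third outcome available, the allowlist can stay short: sudo apt install no longer needs its own entry, it just lands in front of a person.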
What that looks like
Here’s the pattern we use for the finance-sync workflow in our own personal-assistant stack:
tools:
  http.request:
    allowed_hosts: ["api.finnhub.io", "api.coinbase.com"]
    deny_private_ips: true
    decision: allow        # read-only price fetch: fast path
  trade.place:
    decision: approve      # every trade: human pauses, checks, signs
    approval:
      expires_in: 3600
      notify: ["ntfy://trades-channel"]
  files.write:
    allowed_paths: ["/workspace/**"]
    deny_extensions: [".env", ".pem", ".key"]
    decision: allow        # scoped writes are fine
Three tools, three decisions, three different latency profiles. The agent doesn’t know or care — it makes the same HTTP call to Gatekeeper in every case. Gatekeeper decides what happens next.
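For readers who want to see how a policy like this could be evaluated, here is a minimal sketch. The `POLICY` dict mirrors the config above; the `evaluate` function, its argument names (`url`, `path`), and the rule that unknown tools are denied are assumptions for illustration. A real engine would also resolve hostnames before the private-IP check, which this sketch skips.

```python
import ipaddress
import posixpath
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Same policy as the config above, written as a Python dict.
POLICY = {
    "http.request": {
        "allowed_hosts": ["api.finnhub.io", "api.coinbase.com"],
        "deny_private_ips": True,
        "decision": "allow",
    },
    "trade.place": {"decision": "approve"},
    "files.write": {
        "allowed_paths": ["/workspace/**"],
        "deny_extensions": [".env", ".pem", ".key"],
        "decision": "allow",
    },
}


def is_private_ip(host: str) -> bool:
    """True for IP literals that are not globally routable (RFC 1918, link-local, loopback)."""
    try:
        return not ipaddress.ip_address(host).is_global
    except ValueError:
        return False  # a hostname, not an IP literal


def evaluate(tool: str, args: dict) -> str:
    rule = POLICY.get(tool)
    if rule is None:
        return "deny"  # assumption: tools with no rule are refused
    if tool == "http.request":
        host = urlsplit(args["url"]).hostname or ""
        if rule.get("deny_private_ips") and is_private_ip(host):
            return "deny"
        if host not in rule["allowed_hosts"]:
            return "deny"
    if tool == "files.write":
        path = posixpath.normpath(args["path"])  # collapses ../ before matching
        if any(path.endswith(ext) for ext in rule["deny_extensions"]):
            return "deny"
        if not any(fnmatchcase(path, pattern) for pattern in rule["allowed_paths"]):
            return "deny"
    return rule["decision"]


if __name__ == "__main__":
    print(evaluate("http.request", {"url": "https://api.finnhub.io/api/v1/quote"}))
    print(evaluate("trade.place", {"symbol": "AAPL", "qty": 1}))
    print(evaluate("files.write", {"path": "/workspace/.env"}))
```

The scope checks only ever tighten a decision to deny. Whatever survives them gets the decision the policy author wrote for that tool.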
Why this isn’t just “notifications with a step”
The thing that makes approve actually work, rather than act as a theatrical pause, is that Gatekeeper is the one holding the request. The agent isn’t polling a queue. It doesn’t know the tool was paused. It’s making a synchronous tool call (or awaiting an async promise), and the response lands when, and only when, you click. Replay the URL and the nonce check rejects it. Wait an hour and the request expires. Every state change is in the audit log, stamped with the policy hash that was active at the time.
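The replay and expiry behavior can be sketched with Python’s standard hmac module. Everything named here is an assumption for illustration: the `gatekeeper.example` host, the query parameter names, the in-memory `_pending` store, and the signing layout. Gatekeeper’s real URL format may differ.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional
from urllib.parse import parse_qsl, urlsplit

SECRET = secrets.token_bytes(32)   # hypothetical per-deployment signing key
_pending: dict = {}                # nonce -> expiry timestamp (in-memory for the sketch)


def _sign(request_id: str, nonce: str, expires: str) -> str:
    message = f"{request_id}.{nonce}.{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def issue_approval_url(request_id: str, expires_in: int = 3600,
                       now: Optional[float] = None) -> str:
    """Create a signed, single-use approval link for a held request."""
    now = time.time() if now is None else now
    nonce = secrets.token_urlsafe(16)
    expires = str(int(now) + expires_in)
    _pending[nonce] = int(expires)
    sig = _sign(request_id, nonce, expires)
    return (f"https://gatekeeper.example/approve"
            f"?id={request_id}&nonce={nonce}&exp={expires}&sig={sig}")


def redeem(url: str, now: Optional[float] = None) -> bool:
    """Return True exactly once per valid, unexpired, untampered link."""
    now = time.time() if now is None else now
    params = dict(parse_qsl(urlsplit(url).query))
    expected = _sign(params.get("id", ""), params.get("nonce", ""), params.get("exp", ""))
    if not hmac.compare_digest(expected, params.get("sig", "")):
        return False                      # tampered or forged
    if now > int(params["exp"]):
        return False                      # expired
    # pop() consumes the nonce, so a replay finds nothing to redeem.
    return _pending.pop(params["nonce"], None) is not None


if __name__ == "__main__":
    link = issue_approval_url("req-1")
    print(redeem(link))   # first click
    print(redeem(link))   # replay
```

The signature covers the request ID, nonce, and expiry together, so changing any one of them invalidates the link. The nonce store is what makes it single-use.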
That’s the bit that makes developers leave it on. Not the feature list. The fact that it doesn’t get in their way when it shouldn’t, and it catches them when it should.
If you’re building agents
You’ll hit this wall. You don’t have to build the three-decision model yourself — Gatekeeper is Apache-2.0 and does exactly this. But if you build your own, please give yourself a third option. Binary policy is why the 2010s turned into a decade of “security tools everyone disables.” Don’t do it again.
Want Gatekeeper for your agents? Request early access or read the source.