Engineering · 2026-08-19

How Pan gates dangerous tool calls

Pan never takes a destructive or external-network action without an explicit approval. The mechanism is smaller than it sounds — an in-memory store, a three-level classifier, and a blocking goroutine. Here is what the gating layer actually looks like.

The Pan agent is approval-first by construction. Every tool call that the agent loop classifies as dangerous or catastrophic is suspended until the operator resolves it through the HTTP API. The mechanism that makes this work is small enough to fit in one Go package — internal/approval in the pan-agent repo — and the design choices in that package are worth pulling out, because they're the difference between an approval system that operators trust and one they grow tired of.

The shape, in three pieces.

A classifier that decides whether the call needs approval at all. Classify(toolName, argumentsJSON) returns an ApprovalCheck carrying one of three levels: Safe, Dangerous, Catastrophic. Safe means no approval is required and the call goes through. Dangerous means the operator gets a one-click confirm. Catastrophic means the operator must type a confirmation phrase to proceed. The classifier dispatches by tool family (terminal, code-execution, filesystem, browser, plus tool-specific cases) and matches argument content against a pattern set that grades the request. A terminal call to ls is Safe; a terminal call to rm -rf / is Catastrophic; calls that fall between those extremes are Dangerous. The point is that the level is content-aware, not just a property of the tool's name. An earlier design that allowlisted "safe tools" and approval-gated "dangerous tools" turned out to be too coarse: most dangerous tools have safe uses, and the cost of false alarms on those uses was high enough that operators started rubber-stamping approvals.
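To make the content-aware grading concrete, here is a minimal sketch of a classifier with that shape. Only Classify, ApprovalCheck, and the three level names come from the description above; the two regular expressions, the pattern keys, the "command" argument field, and the choice to grade unknown tools and unparseable arguments as Dangerous are assumptions made for illustration, not the rules in internal/approval.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// Level grades how much operator friction a tool call needs.
type Level int

const (
	Safe Level = iota
	Dangerous
	Catastrophic
)

// ApprovalCheck is the classifier's verdict for one call.
type ApprovalCheck struct {
	Level       Level
	PatternKey  string
	Description string
}

// rule pairs a pattern with the verdict it produces (hypothetical rule set).
type rule struct {
	re    *regexp.Regexp
	check ApprovalCheck
}

// terminalRules is ordered most severe first, so the first match wins.
var terminalRules = []rule{
	{regexp.MustCompile(`\brm\s+-rf\s+/\s*$`),
		ApprovalCheck{Catastrophic, "terminal.rm_root", "recursive delete of /"}},
	{regexp.MustCompile(`\brm\s`),
		ApprovalCheck{Dangerous, "terminal.rm", "file deletion"}},
}

// Classify grades a call by its arguments, not only by the tool's name.
func Classify(toolName, argumentsJSON string) ApprovalCheck {
	if toolName != "terminal" {
		// Assumption: a tool with no rule set is gated rather than waved through.
		return ApprovalCheck{Dangerous, "tool.unclassified", "tool without a rule set"}
	}
	var args struct {
		Command string `json:"command"` // assumed argument field name
	}
	if err := json.Unmarshal([]byte(argumentsJSON), &args); err != nil {
		// Assumption: arguments that cannot be parsed are never treated as safe.
		return ApprovalCheck{Dangerous, "terminal.unparsed", "unreadable terminal arguments"}
	}
	for _, r := range terminalRules {
		if r.re.MatchString(args.Command) {
			return r.check
		}
	}
	return ApprovalCheck{Level: Safe}
}

func main() {
	for _, cmd := range []string{"ls", "rm -rf ./build", "rm -rf /"} {
		args, _ := json.Marshal(map[string]string{"command": cmd})
		c := Classify("terminal", string(args))
		fmt.Printf("%-16q level=%d key=%q\n", cmd, c.Level, c.PatternKey)
	}
}
```

The same tool name yields three different levels here, which is the property the allowlist design could not express.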

An in-memory Store that holds pending and recently-resolved approvals. Pending approvals live in a map keyed by approval ID. Resolved approvals move to a bounded ring buffer of the most recent 256, so Get keeps working for a while but the process's memory does not grow without bound. The store is the source of truth while the agent is running; nothing about it is persisted across restarts. That is intentional. An approval is a question to the operator about an action the agent wants to take right now; if the agent restarts, the question is gone, the action did not happen, and the world is in a coherent state. Persisting the approval queue across restarts would create a category of stale approvals that fire actions out of context, which is worse than the queue being empty.
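The pending map plus bounded ring can be sketched as follows. This is an illustration of the data structure described above, not the package's code: the method names Add and Resolve, the string-typed Status, and the side index used for lookups are assumptions, and the resolution channel is left out to keep the sketch focused on memory bounds.

```go
package main

import (
	"fmt"
	"sync"
)

// resolvedCap is the number of resolved approvals kept for lookup.
const resolvedCap = 256

// Approval is trimmed to the fields this sketch needs.
type Approval struct {
	ID     string
	Status string
}

// Store keeps pending approvals in a map and resolved ones in a fixed ring.
type Store struct {
	mu      sync.Mutex
	pending map[string]*Approval
	recent  map[string]*Approval   // index over the ring, for Get
	ring    [resolvedCap]*Approval // oldest entry is overwritten first
	next    int                    // next ring slot to overwrite
}

func NewStore() *Store {
	return &Store{pending: map[string]*Approval{}, recent: map[string]*Approval{}}
}

// Add registers a new pending approval.
func (s *Store) Add(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.pending[id] = &Approval{ID: id, Status: "pending"}
}

// Resolve moves an approval from the pending map into the ring,
// evicting whatever occupied that slot. It reports whether id was pending.
func (s *Store) Resolve(id, status string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	a, ok := s.pending[id]
	if !ok {
		return false
	}
	delete(s.pending, id)
	a.Status = status
	if old := s.ring[s.next]; old != nil {
		delete(s.recent, old.ID)
	}
	s.ring[s.next] = a
	s.recent[id] = a
	s.next = (s.next + 1) % resolvedCap
	return true
}

// Get returns a copy of a pending or recently resolved approval.
func (s *Store) Get(id string) (Approval, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if a, ok := s.pending[id]; ok {
		return *a, true
	}
	if a, ok := s.recent[id]; ok {
		return *a, true
	}
	return Approval{}, false
}

func main() {
	s := NewStore()
	for i := 0; i < 300; i++ {
		id := fmt.Sprintf("id-%d", i)
		s.Add(id)
		s.Resolve(id, "approved")
	}
	_, oldest := s.Get("id-0")
	_, newest := s.Get("id-299")
	fmt.Println(oldest, newest) // false true: the first 44 were evicted
}
```

Memory stays bounded at 256 resolved records regardless of how long the process runs, and nothing here touches disk.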

A channel-based wait. When the agent loop classifies a tool call as Dangerous or Catastrophic, it creates an Approval with Status: StatusPending, registers it in the store, and calls Wait(id, done). Wait selects on the approval's resolution channel and the calling goroutine's done channel. When the operator resolves the approval through the HTTP API (POST /v1/approvals/{id}), the store flips Status to StatusApproved or StatusRejected, sets ResolvedAt, and closes the resolution channel. Wait returns the new status, the agent loop proceeds (or aborts), and the resolved approval is moved into the ring buffer. There is no polling, no global lock held across the wait, and no way for a resolved approval to be missed by an in-flight Wait.
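The wait is easiest to see in code. The sketch below assumes a per-approval channel that is closed exactly once on resolution; Wait, the Status constants, and ResolvedAt follow the description above, while Create, Resolve, the boolean return values, and the single approvals map (the ring buffer is omitted) are simplifications for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type Status string

const (
	StatusPending  Status = "pending"
	StatusApproved Status = "approved"
	StatusRejected Status = "rejected"
)

type Approval struct {
	ID         string
	Status     Status
	ResolvedAt *int64        // unix millis, nil while pending
	resolved   chan struct{} // closed exactly once, on resolution
}

type Store struct {
	mu        sync.Mutex
	approvals map[string]*Approval
}

func NewStore() *Store { return &Store{approvals: map[string]*Approval{}} }

// Create registers a pending approval.
func (s *Store) Create(id string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.approvals[id] = &Approval{ID: id, Status: StatusPending, resolved: make(chan struct{})}
}

// Resolve flips the status, stamps ResolvedAt, and closes the channel.
// It reports false for unknown or already-resolved approvals.
func (s *Store) Resolve(id string, st Status) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	a, ok := s.approvals[id]
	if !ok || a.Status != StatusPending {
		return false
	}
	now := time.Now().UnixMilli()
	a.Status, a.ResolvedAt = st, &now
	close(a.resolved) // wakes every waiter; a later Wait returns at once
	return true
}

// Wait blocks until the approval is resolved or done is closed.
// The mutex is held only for the lookups, never across the select.
func (s *Store) Wait(id string, done <-chan struct{}) (Status, bool) {
	s.mu.Lock()
	a, ok := s.approvals[id]
	s.mu.Unlock()
	if !ok {
		return "", false
	}
	select {
	case <-a.resolved:
		s.mu.Lock()
		defer s.mu.Unlock()
		return a.Status, true
	case <-done:
		return StatusPending, false
	}
}

func main() {
	s := NewStore()
	s.Create("a1")
	go func() {
		time.Sleep(10 * time.Millisecond) // stands in for the operator's HTTP call
		s.Resolve("a1", StatusApproved)
	}()
	st, ok := s.Wait("a1", make(chan struct{}))
	fmt.Println(st, ok)
}
```

Closing the channel, rather than sending on it, is what rules out a missed wakeup: a receive on a closed channel returns immediately, so it does not matter whether Wait started before or after the resolution.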

The Approval record carries a small set of fields, and what they are not is as important as what they are.

type Approval struct {
    ID          string  // crypto/rand-derived hex
    SessionID   string  // which agent session
    ToolName    string  // which tool
    Arguments   string  // raw JSON, as the model emitted it
    Status      Status  // pending | approved | rejected
    CreatedAt   int64   // unix millis
    ResolvedAt  *int64  // unix millis, or nil if pending

    // Classifier output, populated at Create time.
    Level       Level
    PatternKey  string
    Description string
}

The record stores the call's arguments as the model emitted them. We considered normalising — pretty-printing JSON, extracting fields into structured columns — and decided not to. The point of the approval surface is to let the operator inspect what the agent is about to do; presenting a normalised view is a translation layer, and translation layers can hide the very thing the operator needs to see (an unusual escape, a suspicious null byte, an overlong path). The raw form is the truthful form.

The classifier carries a PatternKey and a Description alongside the level. The frontend uses PatternKey to render the right copy ("you are about to run rm -rf on a path outside your home directory" rather than "this is a Dangerous filesystem call"). The pattern-key vocabulary is small and stable enough that translations and accessible alternatives can be authored against it. Free-form descriptions composed from the model's text would be tempting and would, sooner or later, become an injection vector — the operator's screen would show whatever the model wanted them to read. The pattern key is from a closed set; the description is an English fallback. The frontend prefers the pattern key.
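A sketch of that preference order, written in Go for consistency with the rest of this post even though the real lookup lives in the frontend. The key name and the RenderCopy function are hypothetical; the one thing taken from the text above is that a known pattern key wins and the classifier's English description is only the fallback.

```go
package main

import "fmt"

// copyByKey maps the closed pattern-key vocabulary to reviewed,
// translatable copy. The key below is invented for illustration.
var copyByKey = map[string]string{
	"fs.rm_recursive_outside_home": "You are about to run rm -rf on a path outside your home directory.",
}

// RenderCopy prefers authored copy for a known pattern key and falls back
// to the classifier's description. Model-authored text never reaches this path.
func RenderCopy(patternKey, description string) string {
	if msg, ok := copyByKey[patternKey]; ok {
		return msg
	}
	return description
}

func main() {
	fmt.Println(RenderCopy("fs.rm_recursive_outside_home", "recursive delete"))
	fmt.Println(RenderCopy("some.unmapped_key", "recursive delete"))
}
```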

A few details that took us iteration.

ANSI escape stripping in the pattern matcher. A terminal command can carry ANSI escape sequences in its arguments. A pattern matcher that sees the raw bytes can be tricked by escapes that hide characters from the visible string. The classifier strips ANSI/VT escape sequences (ECMA-48: CSI, OSC, DCS, APC, PM, SOS, 8-bit C1 controls) before pattern matching, so the match operates on what the operator would have seen if the string were rendered. This is more boring than it sounds; the bug class it closes is exactly the kind that escapes review the first time around.
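A rough sketch of the stripping step, using a single regular expression over the 7-bit escape forms. It is not the classifier's implementation: the function name StripANSI is an assumption, and the 8-bit C1 controls mentioned above are deliberately left out because handling raw bytes above 0x7F in a UTF-8 string needs more care than a short example allows.

```go
package main

import (
	"fmt"
	"regexp"
)

// ansiEscape matches 7-bit ECMA-48 control sequences. Alternatives are
// tried in order, so the structured forms win over the two-byte catch-all.
var ansiEscape = regexp.MustCompile(
	`\x1b\[[0-?]*[ -/]*[@-~]` + // CSI: parameters, intermediates, final byte
		`|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)` + // OSC, ended by BEL or ST
		`|\x1b[PX^_][^\x1b]*\x1b\\` + // DCS, SOS, PM, APC, ended by ST
		`|\x1b[@-_]`) // any other two-byte escape

// StripANSI removes escape sequences so pattern matching sees the text
// a terminal would display.
func StripANSI(s string) string {
	return ansiEscape.ReplaceAllString(s, "")
}

func main() {
	// The reset sequence splits "rm" from its flags, so a naive
	// `rm\s+-rf` pattern would not match the raw bytes.
	raw := "rm\x1b[0m -rf /"
	fmt.Printf("%q -> %q\n", raw, StripANSI(raw))
}
```

Running the graded patterns against the stripped string instead of the raw one is what closes the gap between what the matcher sees and what the operator sees.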

No "automatic approval" mode in the gating package itself. Operators sometimes want a session to run with reduced friction — a long batch where they trust the agent to act on a defined class of approvals without re-prompting. That capability lives upstream of the approval store, not inside it. The store does what it does (gate, wait, resolve). A higher layer can decide, before calling the store, that a particular Dangerous-level call falls under a session-level pre-approval and skip the gate. The store is unaware of that decision; it sees only the calls that reach it. This separation keeps the gating package's correctness property simple: every approval that arrives is pending until resolved.
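One way to picture that upstream layer is a small predicate the agent loop consults before it ever touches the store. Everything in this sketch is hypothetical (the Check type, PreApprovals, NeedsGate, and the pattern keys), and the rule that Catastrophic calls can never be pre-approved is an assumption of the sketch rather than a documented guarantee.

```go
package main

import "fmt"

type Level int

const (
	Safe Level = iota
	Dangerous
	Catastrophic
)

// Check is the part of the classifier output this layer needs.
type Check struct {
	Level      Level
	PatternKey string
}

// PreApprovals is the set of pattern keys an operator has pre-approved
// for one session.
type PreApprovals map[string]bool

// NeedsGate decides whether a call must go through the approval store.
// The store never sees calls for which this returns false.
func NeedsGate(c Check, pre PreApprovals) bool {
	switch c.Level {
	case Safe:
		return false
	case Dangerous:
		return !pre[c.PatternKey]
	default:
		// Assumption: Catastrophic calls are always gated.
		return true
	}
}

func main() {
	pre := PreApprovals{"terminal.rm": true}
	fmt.Println(NeedsGate(Check{Dangerous, "terminal.rm"}, pre))         // false
	fmt.Println(NeedsGate(Check{Dangerous, "terminal.curl"}, pre))       // true
	fmt.Println(NeedsGate(Check{Catastrophic, "terminal.rm_root"}, pre)) // true
}
```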

The store's IDs come from a cryptographic random source. Sequential IDs are tempting (they sort, they're small, they're easy to debug). They are also enumerable: an operator who has resolved approval 4711 can reasonably guess that 4712 exists. We use 16 hex characters (8 bytes, 64 bits) from crypto/rand. Random IDs do not sort, but approvals can still be ordered by creation time through the CreatedAt field; the unguessability is worth the eight extra characters in URLs.
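Generating such an ID takes only a few lines with the standard library. The function name NewID is an assumption; the 8 random bytes follow from the 16 hex characters stated above.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// NewID returns 16 hex characters backed by 8 bytes from crypto/rand.
func NewID() (string, error) {
	var b [8]byte
	if _, err := rand.Read(b[:]); err != nil {
		return "", err
	}
	return hex.EncodeToString(b[:]), nil
}

func main() {
	id, err := NewID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id, len(id))
}
```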

What's deliberately not in this package: any persistence story, any audit log, any cross-session correlation, any cryptographic chaining. Those are different problems with different invariants. The approval store is small, in-memory, transient, and exactly what the lifetime of a single agent session needs. If a separate audit log of every action the agent ever asked to take is needed — and that is a reasonable thing to want — the right place for it is a layer above the gating package, consuming the resolution events and writing them where the operator's audit policy says they should land. We have prototypes; they're not in internal/approval and won't be.

The package is around five hundred lines of Go. It is one of the parts of pan-agent we're most willing to point new readers at, because it does one thing, it does it without ambition, and the size is the feature.
