
Gatekeeper is a control plane for multi-tenant SaaS: identity, API keys, quotas, entitlements, billing, webhooks, and audit behind one HTTP API. Every protected request funnels through a single access-decision engine before any route logic runs. That engine is fail-closed by construction, and the same core runs unchanged on Cloudflare Workers and on Bun.

One decision point

There is exactly one place where access is decided: the authorize() pipeline in the gatekeeper engine. Routes never call domain services directly for access control. They call a guard() facade that builds the engine and runs the pipeline, so isolation and quota rules cannot be skipped per route.

The pipeline takes an access request and returns a single decision: allow, deny, or error. An allow is only ever produced by running off the end of the pipeline. Any halt, thrown exception, or absent-but-required dependency becomes a deny or error instead. There is no path where a backend fault silently becomes an allow.
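The fail-closed rule described above can be sketched in a few lines. This is a minimal illustration, not the engine's code: the `Decision`, `Halt`, and `Step` types and the `runPipeline` name are hypothetical, the steps are synchronous for brevity, and mapping a thrown exception to status 503 is an assumption based on the 503 outcomes in the pipeline diagram.

```typescript
// Hypothetical shapes; the real engine's types are not shown in this document.
type Decision =
  | { kind: "allow" }
  | { kind: "deny"; status: number; reason: string }
  | { kind: "error"; status: number; reason: string };

// A step either halts with a non-allow decision or returns null to continue.
type Halt = Exclude<Decision, { kind: "allow" }>;
type Step<Req> = (req: Req) => Halt | null;

function runPipeline<Req>(steps: Step<Req>[], req: Req): Decision {
  try {
    for (const step of steps) {
      const halt = step(req);
      if (halt !== null) return halt; // any halt ends the run immediately
    }
  } catch {
    // A thrown exception is reported as an error, never as an allow.
    // The 503 status is an assumption for this sketch.
    return { kind: "error", status: 503, reason: "step_failed" };
  }
  // The only place an allow is constructed: past the last step.
  return { kind: "allow" };
}
```

Because the allow literal appears exactly once, after the loop, a reviewer can confirm the fail-closed property by reading one function rather than every step.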

The pipeline

The seven steps run in a fixed order. The cheapest gates front the expensive ones, state-mutating work runs last, and identity is established before anything that needs an actor.

flowchart TD
    A[authorize request] --> V[1 validate<br/>shape / tenant -> 400]
    V --> R[2 rateLimit<br/>brute-force gate -> 429 / 503]
    R --> I[3 identity<br/>resolve credential -> 401 / 503]
    I --> B[4 tenant-binding<br/>cross-tenant guard -> 403]
    B --> Z[5 authorize<br/>policy + role / perm -> 401 / 403 / 503]
    Z --> E[6 entitlement<br/>plan feature gate -> 402 / 503]
    E --> Q[7 quota<br/>usage gate, atomic -> 402 / 503]
    Q --> OK[allow<br/>actor + tenantRole + quota]
    V -.halt.-> D[deny / error]
    R -.halt.-> D
    I -.halt.-> D
    B -.halt.-> D
    Z -.halt.-> D
    E -.halt.-> D
    Q -.halt.-> D
| Step | Gate | Why here |
| --- | --- | --- |
| 1 | validate | Reject malformed input before spending any backend call. |
| 2 | rateLimit | Runs before identity so brute force on the credential itself is throttled. Cheapest gate, fronts everything. |
| 3 | identity | Establishes the actor. Everything after needs it. Anonymous is allowed to continue; the policy step decides. |
| 4 | tenant-binding | An API-key actor pointed at a tenant other than its own is denied before any role lookup. |
| 5 | authorize | The actual "may this actor do this" check, including membership and permission backend calls. |
| 6 | entitlement | Plan-level feature gating. Runs only when the request carries an entitlement spec. |
| 7 | quota | Usage metering. Last because it mutates state - it must only fire once everything else has passed. |

The optional gates (rate limit, entitlement, quota) execute only when the request carries the matching spec, so the common authenticated path stays cheap.
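To make the ordering and the optional gates concrete, here is a small sketch that computes which gates would run for a given request. The gate names come from the table above; the `AccessRequest` shape, its field names, and the `planGates` function are hypothetical and only illustrate the "runs when the spec is present" rule.

```typescript
// Hypothetical request shape: each optional spec switches its gate on.
interface AccessRequest {
  tenantId: string;
  action: string;
  rateLimit?: { key: string };
  entitlement?: { feature: string };
  quota?: { metric: string; amount: number };
}

type Gate =
  | "validate" | "rateLimit" | "identity" | "tenant-binding"
  | "authorize" | "entitlement" | "quota";

// Returns the gates that would run for this request, in pipeline order.
function planGates(req: AccessRequest): Gate[] {
  const plan: Array<[Gate, boolean]> = [
    ["validate", true],
    ["rateLimit", req.rateLimit !== undefined],
    ["identity", true],
    ["tenant-binding", true],
    ["authorize", true],
    ["entitlement", req.entitlement !== undefined],
    ["quota", req.quota !== undefined], // state-mutating, so it stays last
  ];
  return plan.filter(([, enabled]) => enabled).map(([gate]) => gate);
}
```

A request with no specs plans four gates (validate, identity, tenant-binding, authorize), which is the cheap common path the paragraph above describes.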

Engine guarantees

The engine wraps the pipeline with defense in depth:

Runtime-portable by design

Gatekeeper is hexagonal: the domain logic depends only on interfaces, and concrete adapters are wired in at one composition root. No domain package imports a Cloudflare or Bun type. The runtime entrypoint builds a single portable dependency bundle and forwards it down.
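As an illustration of the ports-and-adapters split, the sketch below shows a domain function that sees only an interface, plus an in-memory adapter wired in at a composition root. Every name here (`UsageCounterPort`, `Deps`, `consumeQuota`, `inMemoryUsageCounter`) is hypothetical, and the port is synchronous to keep the example short; a real adapter would most likely be async. The 402 status follows the quota gate in the pipeline diagram.

```typescript
// Port: the only thing the domain function is allowed to depend on.
interface UsageCounterPort {
  // Adds `amount` only if the result stays within `limit`. Returns the new
  // total, or null when the add was refused. Check and add are one operation.
  addIfWithin(key: string, amount: number, limit: number): number | null;
}

// The portable dependency bundle (reduced to one port for this sketch).
interface Deps {
  usage: UsageCounterPort;
}

type QuotaResult = { ok: true; used: number } | { ok: false; status: 402 };

// Domain logic: no runtime-specific type appears in this signature or body.
function consumeQuota(
  deps: Deps,
  tenantId: string,
  metric: string,
  amount: number,
  limit: number,
): QuotaResult {
  const used = deps.usage.addIfWithin(`${tenantId}:${metric}`, amount, limit);
  return used === null ? { ok: false, status: 402 } : { ok: true, used };
}

// Adapter: an in-memory implementation, suitable for tests or local runs.
function inMemoryUsageCounter(): UsageCounterPort {
  const totals = new Map<string, number>();
  return {
    addIfWithin(key, amount, limit) {
      const next = (totals.get(key) ?? 0) + amount;
      if (next > limit) return null; // refuse without mutating state
      totals.set(key, next);
      return next;
    },
  };
}
```

Swapping runtimes then means passing a different `Deps` object to the same `consumeQuota`; the domain function itself does not change.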

 L4  clients        TypeScript SDK / Python SDK
                          | HTTP
 L3  composition    apps/api  (routes -> guard() -> service factories -> stores)
                          | authorize()
 L2  engine         the decision pipeline + ports
                          | service interfaces
 L1  domain         auth, apikeys, quota, permissions, entitlements,
     services       billing, payments, webhooks, audit, jobs, ratelimit
                          | foundation interfaces
 L0  foundation     common, crypto, database, events, observability

The same engine runs on two runtimes. The only difference is which adapters the entrypoint constructs:

| Dependency | Workers | Bun |
| --- | --- | --- |
| Database | D1 adapter | bun:sqlite adapter |
| Cache | KV namespace | in-memory or Redis |
| Audit queue | Cloudflare Queue + DLQ | inline deferred write |
| Client IP | CF-Connecting-IP | socket peer address |
| Rate limiters | native [[ratelimits]] bindings | cache-backed fixed-window limiter |
| Usage counter | USAGE_COUNTER Durable Object | atomic SQL path |
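
For readers unfamiliar with the term, a cache-backed fixed-window limiter like the one named in the Bun column can be sketched as follows. This is a generic illustration of the technique, not Gatekeeper's adapter: the `CachePort` interface and the `fixedWindowLimit` function are hypothetical, the cache is synchronous, and the read-then-write here is not atomic, which a production limiter would need to address.

```typescript
// Hypothetical cache port. A real cache (in-memory or Redis) would likely be
// async and would expire old window keys on its own.
interface CachePort {
  get(key: string): number | undefined;
  set(key: string, value: number): void;
}

interface LimitResult {
  allowed: boolean;
  retryAfterSeconds: number; // 0 when allowed
}

function fixedWindowLimit(
  cache: CachePort,
  key: string,
  limit: number,
  windowMs: number,
  nowMs: number,
): LimitResult {
  // Every timestamp inside the same window maps to the same window start.
  const windowStart = Math.floor(nowMs / windowMs) * windowMs;
  const bucket = `${key}:${windowStart}`; // one counter per key per window
  const count = (cache.get(bucket) ?? 0) + 1;
  cache.set(bucket, count);
  if (count <= limit) return { allowed: true, retryAfterSeconds: 0 };
  // Denied: the caller may retry once the current window has ended.
  const retryAfterSeconds = Math.ceil((windowStart + windowMs - nowMs) / 1000);
  return { allowed: false, retryAfterSeconds };
}
```

Passing `nowMs` in as a parameter, rather than reading the clock inside the function, keeps the limiter deterministic and easy to test.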

Adding a new runtime (Node, Lambda, Vercel) is a thin entrypoint, not a fork: construct the dependency bundle from the platform's primitives, then forward the request to the app.

How a decision reaches HTTP

The engine returns a transport-neutral decision. The guard() facade in the API app turns it into an HTTP response: it maps the decision's status to HTTP, optionally remaps a 403 to 404 to hide resource existence, and sets Retry-After on rate-limited responses. The engine package never reads transports and never writes HTTP. That boundary lives entirely in the app.
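
The translation step described above might look roughly like this. It is a sketch under stated assumptions: the `Decision` and `HttpResult` shapes, the `hideExistence` option, and the `toHttp` name are all hypothetical, and treating 429 as the rate-limited status follows the pipeline diagram rather than any code shown here.

```typescript
// Hypothetical transport-neutral decision; the engine's real type is not
// shown in this document.
interface Decision {
  kind: "allow" | "deny" | "error";
  status: number;             // engine-level status, e.g. 200, 403, 429
  retryAfterSeconds?: number; // assumed to be set by the rate-limit gate
}

interface HttpResult {
  status: number;
  headers: Record<string, string>;
}

function toHttp(
  decision: Decision,
  opts: { hideExistence?: boolean } = {},
): HttpResult {
  const headers: Record<string, string> = {};
  let status = decision.status;
  // Optionally hide that a resource exists by answering 404 instead of 403.
  if (opts.hideExistence && status === 403) status = 404;
  // Rate-limited responses tell the client when to come back.
  if (status === 429 && decision.retryAfterSeconds !== undefined) {
    headers["Retry-After"] = String(decision.retryAfterSeconds);
  }
  return { status, headers };
}
```

Keeping this function in the app layer means the engine's decision type never has to mention headers or status remapping.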