How chukei speaks Snowflake's wire protocol without breaking a single client
Drivers change one hostname and nothing else — so chukei has to be byte-faithful on the wire, invisible on latency, and incapable of breaking a query even when its own code panics.
The pitch for chukei sounds almost too clean: change one hostname, keep your SQL and your credentials, and start cutting Snowflake compute. The catch is that the simplicity lives entirely on your side. To make a driver believe it is talking to Snowflake when it is actually talking to a Rust proxy in your VPC, chukei has to speak Snowflake’s wire protocol faithfully enough that nothing — not the JDBC driver, not snowflake-connector-python, not dbt, not a BI tool — can tell the difference.
This post is about how that works: where the proxy sits, how authentication passes straight through it, how results stream back, why the overhead is measured in single-digit milliseconds, and — most importantly — why a bug in chukei still can’t break one of your queries.
01 / WHERE IT SITS
One hostname, the whole path
A Snowflake client opens an HTTPS session to an account endpoint, logs in, and then exchanges JSON request/response bodies over that session for the lifetime of its work. chukei inserts itself as a transparent wire-protocol proxy on that path. The driver resolves your-account.chukei.internal instead of the Snowflake hostname; everything above the connection string is untouched.
The point of the proxy is not to be clever in the path. It is to be a place where chukei can decide, per query, whether Snowflake needs to run at all — without the client ever participating in that decision.
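As a rough illustration of how small the client-side change is, the sketch below rewrites only the host part of a connection URL and leaves every other character alone. The `.snowflakecomputing.com` suffix is Snowflake's usual account hostname; the `.chukei.internal` suffix simply mirrors the placeholder above, and `point_at_proxy` is a hypothetical helper for illustration, not part of chukei.

```rust
/// Snowflake's usual account hostname suffix.
const SNOWFLAKE_SUFFIX: &str = ".snowflakecomputing.com";
/// Assumed proxy suffix, taken from the placeholder hostname in the text.
const PROXY_SUFFIX: &str = ".chukei.internal";

/// Hypothetical helper: swap the Snowflake host suffix for the proxy suffix.
/// Scheme, account name, path, and query parameters are copied unchanged.
/// Returns `None` if the URL does not point at a Snowflake hostname.
pub fn point_at_proxy(conn_url: &str) -> Option<String> {
    let idx = conn_url.find(SNOWFLAKE_SUFFIX)?;
    let mut out = String::with_capacity(conn_url.len());
    out.push_str(&conn_url[..idx]);
    out.push_str(PROXY_SUFFIX);
    out.push_str(&conn_url[idx + SNOWFLAKE_SUFFIX.len()..]);
    Some(out)
}
```

Given `jdbc:snowflake://your-account.snowflakecomputing.com/?warehouse=BI_WH`, the helper changes the hostname and nothing after it, which is the whole migration from the driver's point of view.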
02 / AUTH PASSES THROUGH
Credentials stay with the driver
The single most common worry about anything in the query path is credentials. chukei’s answer is that it never owns them. The login exchange — whatever the client uses: password, key-pair JWT, OAuth, or an externally-brokered token — is forwarded to Snowflake and validated by Snowflake. chukei is not an identity provider and does not re-implement Snowflake’s auth.
Credentials are never persisted or logged. Session tokens chukei needs to correlate a request with its Snowflake session live in memory only, for the life of the session. There is no credential store, no token file, no log line that contains a secret. If chukei restarts, sessions re-authenticate through it exactly as they would directly against Snowflake.
Because auth is Snowflake’s to grant and revoke, your existing RBAC, network policies, and key rotation keep working unchanged. chukei sees the shape of traffic — which user, app, team, or dbt model issued a query — which is enough to attribute cost on the wire without anyone tagging anything, and without ever holding a credential it could leak.
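One way to make "memory only, never logged" hard to violate by accident is to wrap the token in a type whose `Debug` output is redacted and to keep sessions in a plain in-memory map. The sketch below assumes that approach; `SessionToken`, `Attribution`, `Session`, and `SessionTable` are hypothetical names, and chukei's real data structures may look quite different.

```rust
use std::collections::HashMap;
use std::fmt;

/// A session token that can be held in memory but never printed.
pub struct SessionToken(String);

impl SessionToken {
    pub fn new(raw: impl Into<String>) -> Self {
        SessionToken(raw.into())
    }
    /// Explicit accessor, intended only for forwarding to Snowflake.
    pub fn expose(&self) -> &str {
        &self.0
    }
}

// Debug output is redacted, so `{:?}` in a log line cannot leak the secret.
impl fmt::Debug for SessionToken {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("SessionToken([REDACTED])")
    }
}

/// The "shape of traffic" used for cost attribution; holds no secrets.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Attribution {
    pub user: String,
    pub app: String,
}

#[derive(Debug)]
pub struct Session {
    pub token: SessionToken,
    pub attribution: Attribution,
}

/// In-memory only: nothing is written to disk, so a restart drops every
/// session and clients re-authenticate against Snowflake.
#[derive(Debug, Default)]
pub struct SessionTable {
    sessions: HashMap<u64, Session>,
}

impl SessionTable {
    pub fn insert(&mut self, id: u64, session: Session) {
        self.sessions.insert(id, session);
    }
    pub fn get(&self, id: u64) -> Option<&Session> {
        self.sessions.get(&id)
    }
    /// Drop the session (and its token) when the client logs out.
    pub fn end(&mut self, id: u64) {
        self.sessions.remove(&id);
    }
}
```

Because the token has no `Display` impl and a redacted `Debug`, formatting the whole table for diagnostics shows who issued traffic but not the secret that authorised it.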
03 / RESULTS STREAM
Big answers don’t buffer through
Snowflake returns small result sets inline and large ones as a set of chunked result files the client fetches directly. chukei honours that. For a passthrough query, response bodies and chunk pointers flow back to the driver as Snowflake emitted them: chukei does not re-encode, re-chunk, or buffer a whole result set in memory to inspect it. That is both a latency property and a safety property. The proxy stays cheap because it never holds a full result in memory, and large or chunked result shapes are exactly the class it refuses to cache.
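The "no whole-result buffering" property can be sketched as a relay loop that moves bytes through a small fixed-size buffer. This is a minimal synchronous illustration using only the standard library; `relay` is a hypothetical name, and a production proxy would presumably use async I/O rather than blocking reads.

```rust
use std::io::{self, Read, Write};

/// Copy bytes from `upstream` to `client` through a fixed-size buffer.
/// Memory use is bounded by `buf_size`, not by the size of the response,
/// and the bytes are forwarded exactly as they were read.
pub fn relay<R: Read, W: Write>(
    upstream: &mut R,
    client: &mut W,
    buf_size: usize,
) -> io::Result<u64> {
    let mut buf = vec![0u8; buf_size.max(1)];
    let mut total = 0u64;
    loop {
        let n = match upstream.read(&mut buf) {
            Ok(0) => break, // upstream finished
            Ok(n) => n,
            // Retry reads interrupted by a signal instead of failing.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        client.write_all(&buf[..n])?;
        total += n as u64;
    }
    client.flush()?;
    Ok(total)
}
```

With a 64-byte buffer and a 10,000-byte body, the client receives the same 10,000 bytes in the same order while the relay never holds more than 64 of them at once.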
A served cache hit is the inverse case: a deterministic repeated read whose result chukei already holds and has continuously double-checked against live Snowflake. There the proxy returns the cached body in the same wire format the driver expects, so the client deserialises it identically — it never learns the warehouse stayed asleep.
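A minimal sketch of "held and continuously double-checked" might look like the following. It assumes an exact-SQL-text key, a maximum time an entry may go unverified, and eviction on any mismatch with a live answer; `ResultCache` and all of its methods are hypothetical, and the source does not describe chukei's actual cache keying or verification schedule.

```rust
use std::collections::HashMap;
use std::time::Duration;

pub struct CacheEntry {
    pub body: Vec<u8>,
    /// Time elapsed since this body last matched a live Snowflake answer.
    pub since_verified: Duration,
}

pub struct ResultCache {
    /// Assumed policy knob: entries older than this are not served.
    max_unverified: Duration,
    entries: HashMap<String, CacheEntry>,
}

impl ResultCache {
    pub fn new(max_unverified: Duration) -> Self {
        ResultCache { max_unverified, entries: HashMap::new() }
    }

    /// Store a body that just came from Snowflake; it counts as verified now.
    pub fn store(&mut self, sql: &str, body: Vec<u8>) {
        let entry = CacheEntry { body, since_verified: Duration::ZERO };
        self.entries.insert(sql.to_string(), entry);
    }

    /// Advance every entry's age (stands in for wall-clock time passing).
    pub fn age(&mut self, elapsed: Duration) {
        for entry in self.entries.values_mut() {
            entry.since_verified = entry.since_verified.saturating_add(elapsed);
        }
    }

    /// Compare against a live answer: a match refreshes the entry,
    /// a mismatch (or a missing entry) evicts and reports `false`.
    pub fn verify(&mut self, sql: &str, live_body: &[u8]) -> bool {
        let matches = match self.entries.get_mut(sql) {
            Some(entry) if entry.body.as_slice() == live_body => {
                entry.since_verified = Duration::ZERO;
                true
            }
            _ => false,
        };
        if !matches {
            self.entries.remove(sql);
        }
        matches
    }

    /// Serve only recently verified entries; anything else is a miss.
    pub fn lookup(&self, sql: &str) -> Option<&[u8]> {
        self.entries
            .get(sql)
            .filter(|e| e.since_verified <= self.max_unverified)
            .map(|e| e.body.as_slice())
    }
}
```

The design choice worth noting is that every uncertain state (stale, mismatched, unknown) collapses into a miss, so the only bodies that can be served are ones that recently agreed with Snowflake byte for byte.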
The driver can’t tell a verified cache hit from a fresh Snowflake answer, because on the wire there is nothing to tell apart.
— on the streaming path
04 / THE BUDGET
Milliseconds, measured
The hot path is deliberately boring. It is deterministic Rust — parse the request enough to classify it, check a few in-memory structures, and either serve or forward. There is no LLM in the path, no network round-trip to a model, no analysis step that could stall a query. That is what keeps the overhead inside budget.
| Property | Target / measured |
|---|---|
| Added latency, passthrough | ~2 ms p99 (measured) |
| Overhead budget (hard cap) | +5 ms (target) |
| Decision engine | deterministic, in-process |
| LLM on the query path | none |
Those numbers are measured against the proxy, not promised by it: ~2 ms at p99 on the passthrough path, against a +5 ms budget we treat as a hard ceiling. The moment a decision can’t be made cheaply and safely, chukei stops trying to be smart and just forwards — which is the whole design.
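For readers who want to check a figure like this against their own traffic, here is a small sketch of a nearest-rank p99 compared with the +5 ms ceiling. The percentile method and the names `percentile` and `within_budget` are assumptions for illustration; the source does not say how chukei computes its p99.

```rust
use std::time::Duration;

/// The hard ceiling on added latency quoted in the table above.
pub const OVERHEAD_BUDGET: Duration = Duration::from_millis(5);

/// Nearest-rank percentile: the smallest sample such that at least
/// `pct` percent of all samples are less than or equal to it.
/// Returns `None` for an empty slice or a `pct` outside 1..=100.
pub fn percentile(samples: &[Duration], pct: u32) -> Option<Duration> {
    if samples.is_empty() || pct == 0 || pct > 100 {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort();
    // 1-based rank = ceil(pct / 100 * n), computed in integer arithmetic.
    let rank = (sorted.len() * pct as usize + 99) / 100;
    Some(sorted[rank - 1])
}

/// True when the p99 of the measured overhead samples fits the budget.
pub fn within_budget(samples: &[Duration]) -> bool {
    matches!(percentile(samples, 99), Some(p99) if p99 <= OVERHEAD_BUDGET)
}
```

Note that p99 tolerates a single slow request in a hundred but not two: with 98 samples at 1 ms and two at 9 ms, the 99th-ranked sample is 9 ms and the check fails.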
05 / FAIL OPEN
A bug in chukei can’t break your query
This is the guarantee everything else is built to protect. chukei fails open. If the parser hits SQL it doesn’t fully understand, if a cache lookup misses, if a query is non-deterministic, if it’s a write, if the result shape is unsafe to cache, or if chukei’s own code panics — the request degrades to a byte-identical passthrough to Snowflake. The worst case is that you paid for a proxy hop measured in milliseconds and got exactly the answer Snowflake would have given you anyway.
# fail-open decision, conceptually, on every request:
classify(request)
├─ deterministic repeat + verified fresh cache → serve cached body
├─ safe to optimise (equivalence-tested) → apply, then forward
└─ anything else / parse error / panic → PASSTHROUGH (verbatim)
# writes, non-deterministic SQL, chunked/large results, unknown shapes
# all land in the last branch by design — "when in doubt, miss".
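The same decision tree can be sketched in Rust, with `std::panic::catch_unwind` standing in for the "even a panic" branch. Everything here is a hypothetical illustration under stated assumptions: the `Request` fields, the `Decision` variants, and the `/*boom*/` marker that simulates a classifier bug are invented for the example, and catching panics this way only works when the binary is built with unwinding panics (the Rust default).

```rust
use std::panic::{self, AssertUnwindSafe};

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Decision {
    ServeCached(Vec<u8>),
    OptimiseThenForward,
    Passthrough,
}

/// Hypothetical, pre-digested view of a request.
#[derive(Debug, Clone, Default)]
pub struct Request {
    pub sql: String,
    pub is_write: bool,
    pub deterministic: bool,
    /// Present only for a repeat whose cached body is verified fresh.
    pub verified_cached_body: Option<Vec<u8>>,
    pub rewrite_equivalence_tested: bool,
}

/// The "clever" step. It may return an error or even panic.
fn classify(req: &Request) -> Result<Decision, String> {
    if req.sql.trim().is_empty() {
        return Err("unparseable SQL".to_string());
    }
    if req.sql.contains("/*boom*/") {
        panic!("simulated classifier bug");
    }
    if req.is_write || !req.deterministic {
        return Ok(Decision::Passthrough);
    }
    if let Some(body) = &req.verified_cached_body {
        return Ok(Decision::ServeCached(body.clone()));
    }
    if req.rewrite_equivalence_tested {
        return Ok(Decision::OptimiseThenForward);
    }
    Ok(Decision::Passthrough) // when in doubt, miss
}

/// Fail-open wrapper: a parse error or a panic inside `classify`
/// degrades to `Passthrough` instead of failing the request.
pub fn decide(req: &Request) -> Decision {
    // AssertUnwindSafe is acceptable here because `req` is only read.
    match panic::catch_unwind(AssertUnwindSafe(|| classify(req))) {
        Ok(Ok(decision)) => decision,
        Ok(Err(_)) | Err(_) => Decision::Passthrough,
    }
}
```

The point of the wrapper is that callers only ever see a `Decision`: there is no error type to propagate to the client, because every failure mode has already been mapped to forwarding the request verbatim.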
The proxy is false-positive-intolerant on purpose: it would rather miss a saving than serve a wrong byte. A cache that is ever wrong is worse than no cache, so the bar to act is high and the fallback is always the unmodified Snowflake answer. That is why we can put chukei in the production query path of a read-heavy BI account without asking anyone to change their SQL, their credentials, or their trust model — only one hostname.
Key takeaways
- chukei speaks Snowflake’s wire protocol faithfully, so drivers change one hostname and nothing else — SQL and credentials stay with the existing driver.
- Auth passes straight through to Snowflake; chukei never persists or logs credentials, and session tokens live in memory only.
- The hot path is deterministic Rust with no LLM, holding ~2 ms p99 overhead against a +5 ms budget — measured, not promised.
- Fail open is the contract: parse errors, cache misses, writes, non-deterministic SQL, unsafe result shapes, and even a panic all degrade to a byte-identical passthrough. chukei never breaks a query.
chukei is Apache-2.0 and self-hosted — the proxy runs in your own VPC, and the behaviour described here is in the repository for you to read, not take on faith. If you want to see the wire-level decision path for yourself before anything touches production, start with the replay simulator: it projects savings from your query history offline, without installing a thing in the path.
Builds the Rust wire-protocol core of chukei. Spends his time making sure the proxy adds milliseconds, never breakage.