Snowflake Query Result Cache: How It Stays Correct
Snowflake's native result cache is fast but narrow. A verified proxy cache survives across warehouses and sessions — and is double-checked against live Snowflake on every hit.
Snowflake’s query result cache stores the result of a query for 24 hours and serves it again — for free, with no warehouse — if you re-run the exact same query against unchanged data. It is genuinely useful, but it is narrow: per-account, easily invalidated, and blind to queries that are logically identical but textually different. A verified proxy cache fills that gap, with one hard rule: it would rather miss than ever return a wrong answer.
The reason caching matters for cost is simple. Most of a BI-heavy Snowflake bill is the same reads, recomputed — dashboards and scheduled reports asking questions that were already answered that morning. The cheapest credit is the one you never spend. The hard part is not storing answers; it is being certain a stored answer still matches Snowflake. Get that wrong once in analytics and you have quietly shipped bad numbers to a finance dashboard.
01 / THE NATIVE CACHE
How does Snowflake result caching work?
Snowflake maintains a result cache in the cloud services layer, shared across the whole account. When a query arrives, Snowflake checks whether it has a stored result for that exact query. If it does, and the inputs are unchanged, it returns the stored rows in milliseconds without resuming a warehouse, so the query is effectively free. The entry lives for 24 hours, and each reuse resets that 24-hour clock, up to a maximum of 31 days after the query first ran.
That is the good part. The limits are where teams get surprised:
| Native cache invalidates when… | Why it bites |
|---|---|
| Any underlying table changes | One write to a wide table evicts every dependent result |
| The query text differs | SELECT * vs an explicit column list is a different query |
| A different role runs it | Reuse requires the new role to hold privileges on every underlying table, so analysts with narrower roles miss each other’s results |
| The query is non-deterministic | CURRENT_TIMESTAMP(), RANDOM(), etc. are never cached |
| 24 hours pass without a reuse | Entries expire on a timer even when the underlying data has not changed |
So the native cache is best at protecting back-to-back identical runs. It is weakest exactly where dashboard fleets live: many tools, many roles, slightly different SQL, and refresh schedules that do not line up with when the data actually changes. We go deeper on the SQL-shape side of this in Snowflake query optimization.
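The rules in the table can be made concrete with a toy model. The sketch below is not Snowflake's implementation; `ToyResultCache`, `table_version`, and the other names are hypothetical stand-ins, and it only models three of the rules: exact-text keys, invalidation on data change, and a 24-hour timer that each reuse resets up to a 31-day cap.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

TTL = timedelta(hours=24)          # retention after each use
MAX_LIFETIME = timedelta(days=31)  # hard cap counted from the first run


@dataclass
class Entry:
    rows: list
    table_version: int  # hypothetical stand-in for "underlying data unchanged"
    created: datetime
    last_used: datetime


class ToyResultCache:
    """Toy model of the rules in the table; not Snowflake's implementation."""

    def __init__(self):
        self._entries = {}

    def put(self, sql, rows, table_version, now):
        # Keyed on the exact query text: no normalization at all.
        self._entries[sql] = Entry(rows, table_version, now, now)

    def get(self, sql, table_version, now):
        entry = self._entries.get(sql)
        if entry is None:
            return None  # different text means a different key
        expired = (now - entry.last_used > TTL
                   or now - entry.created > MAX_LIFETIME)
        if expired or entry.table_version != table_version:
            del self._entries[sql]
            return None
        entry.last_used = now  # each reuse resets the 24-hour clock
        return entry.rows


t0 = datetime(2024, 1, 1, 9, 0)
cache = ToyResultCache()
cache.put("SELECT * FROM t", [(1,)], table_version=1, now=t0)
hit = cache.get("SELECT * FROM t", 1, t0 + timedelta(hours=1))        # same text, same data
text_miss = cache.get("SELECT id FROM t", 1, t0 + timedelta(hours=1))  # text differs
data_miss = cache.get("SELECT * FROM t", 2, t0 + timedelta(hours=2))   # table changed
```

Here `hit` holds the stored rows, while `text_miss` and `data_miss` are both `None`: a cosmetic rewrite and a single write each cost a full recomputation.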
02 / THE STAKES
Why a wrong cached answer is unacceptable
A cache is a bet that a stored answer is still correct. For a web page, a stale cache hit is a cosmetic bug. For analytics, it is a data integrity failure — a number on a board that does not match the warehouse. Nobody notices until a quarterly figure is questioned, and by then trust in the whole pipeline is gone.
A cache that is fast but occasionally wrong is worse than no cache at all, because you stop being able to trust the fast path.
— the design constraint
This is why chukei’s caching strategy starts from intolerance, not speed. The question is never “can we serve this faster?” It is “can we prove this answer still equals what Snowflake would return right now?” If the proof is anything short of certain, the cache steps aside.
03 / THE VERIFIED CACHE
A false-positive-intolerant proxy cache
chukei is a transparent wire-protocol proxy that sits in the Snowflake query path inside your own VPC. Drivers change one hostname; SQL and credentials stay exactly where they are. Because it sees every query on the wire, it can run a result cache that is broader than the native one — surviving across warehouses and sessions — while being far stricter about correctness.
Two properties make that safe:
- Continuous double-checking. Cache entries are re-validated against live Snowflake, so a served hit reflects the current state of the underlying data, not just a 24-hour window. Freshness is verified, not assumed.
- Deterministic, no LLM on the hot path. The decision to serve or miss is made by deterministic logic, not a model guessing. The hot path adds about 2 ms of overhead at p99, within a +5 ms budget.
False-positive-intolerant by design. When chukei cannot prove an entry is still correct, it does not serve it. There is no “probably fine” tier. The worst case for the cache is a miss — never a wrong answer.
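The "serve only what can be proven" rule can be sketched in a few lines. This is an illustration under stated assumptions, not chukei's actual mechanism, which this article does not spell out: `VerifiedCache` and `live_token` are hypothetical names, and `live_token` stands for some cheap probe against live Snowflake that returns a value which changes whenever the query's inputs change.

```python
from typing import Callable, Optional


class VerifiedCache:
    """Serve a stored result only when freshness is positively confirmed."""

    def __init__(self, live_token: Callable[[str], str]):
        # live_token is a hypothetical probe: it asks the warehouse for a
        # value that changes whenever the inputs of the query change.
        self._live_token = live_token
        self._entries = {}  # sql -> (token_at_store_time, rows)

    def store(self, sql: str, rows: list) -> None:
        try:
            token = self._live_token(sql)
        except Exception:
            return  # no baseline can be established, so do not cache
        self._entries[sql] = (token, rows)

    def lookup(self, sql: str) -> Optional[list]:
        entry = self._entries.get(sql)
        if entry is None:
            return None
        stored_token, rows = entry
        try:
            current = self._live_token(sql)
        except Exception:
            return None  # verification failed: treat as a miss
        if current != stored_token:
            del self._entries[sql]  # provably stale: evict and miss
            return None
        return rows
```

Every branch that cannot confirm freshness returns `None`, so the caller falls back to running the query on Snowflake. A real implementation would also need to capture the token consistently with the result it describes; this sketch ignores that race.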
04 / THE GUARDRAILS
What it never caches, and how it fails open
A correct cache is defined as much by what it refuses as by what it serves. chukei never caches:
- Writes: any INSERT, UPDATE, MERGE, DDL, or anything that mutates state.
- Non-deterministic queries: CURRENT_TIMESTAMP(), RANDOM(), UUID_STRING(), session-dependent functions.
- Chunked or large results: result shapes that cannot be verified cheaply are passed straight through.
Everything outside the provably-safe set degrades to a byte-identical passthrough to Snowflake. Parse errors, cache misses, unsafe result shapes, and any internal uncertainty all fail open — the query runs on Snowflake exactly as it would have without the proxy. Credentials are never persisted or logged; session tokens live in memory only.
```sql
-- Force fresh execution in Snowflake (bypass the native result cache):
ALTER SESSION SET USE_CACHED_RESULT = FALSE;

-- A deterministic repeated read like this is the ideal verified-cache candidate:
SELECT region, SUM(amount) AS revenue
FROM analytics.sales
WHERE order_date >= DATEADD('day', -30, CURRENT_DATE())
GROUP BY region;

-- This one is non-deterministic, so chukei (and Snowflake) will never cache it:
SELECT CURRENT_TIMESTAMP() AS generated_at, COUNT(*) FROM analytics.sales;
```
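The three statements above suggest what a conservative gate could look like. The following is a minimal sketch, not chukei's parser: `classify`, the keyword lists, and the `max_bytes` threshold are all assumptions made for illustration, and the lists are deliberately incomplete. `CURRENT_DATE` is left off the volatile list only because the example above treats such a query as a candidate.

```python
import re

# Only statements that start as reads are considered at all.
_READ_START = re.compile(r"^\s*(SELECT|WITH)\b", re.IGNORECASE)

# Any mutating keyword anywhere in the text disqualifies the statement.
# This over-rejects (for example, a column named "update"), which only costs a miss.
_MUTATING = re.compile(
    r"\b(INSERT|UPDATE|DELETE|MERGE|CREATE|ALTER|DROP|TRUNCATE|COPY|CALL|GRANT|REVOKE)\b",
    re.IGNORECASE,
)

# Illustrative, incomplete list of functions whose value changes between
# runs or sessions.
_VOLATILE = re.compile(
    r"\b(CURRENT_TIMESTAMP|CURRENT_TIME|RANDOM|UUID_STRING|"
    r"CURRENT_USER|CURRENT_ROLE|CURRENT_SESSION)\b",
    re.IGNORECASE,
)


def classify(sql, result_bytes=0, chunked=False, max_bytes=1_000_000):
    """Return "cacheable" or "passthrough"; any doubt means passthrough."""
    try:
        if not _READ_START.match(sql):
            return "passthrough"  # writes, DDL, session commands
        if _MUTATING.search(sql) or _VOLATILE.search(sql):
            return "passthrough"  # mutating or non-deterministic
        if chunked or result_bytes > max_bytes:
            return "passthrough"  # result shape too costly to verify
        return "cacheable"
    except Exception:
        return "passthrough"  # fail open on any internal error
```

The design choice worth noting is the direction of the errors: every shortcut in this gate can only turn a cacheable query into a passthrough, never the reverse.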
05 / THE EVIDENCE
The soak: ~120k queries, ~60k hits, 0 mismatches
Claims about cache correctness are worthless without a soak test. chukei’s soak ran roughly 120,000 queries through the proxy against live Snowflake. About 60,000 were served from the verified cache, and every single served result was checked against what Snowflake would have returned. The mismatch count was zero.
Zero is the only acceptable number here. A verified cache that produced even a handful of stale answers across 60k hits would not be a cost optimization — it would be a liability. The whole point is that the fast path stays trustworthy, so caching becomes a credit you can actually bank rather than a risk you have to audit. For how this rolls up into a wider cost program, see Snowflake cost optimization.
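For readers who want to run the same kind of check on their own workload, the comparison loop is simple to sketch. This is not chukei's soak harness; `run_soak`, `cached_lookup`, and `run_live` are hypothetical names. It assumes rows arrive as sequences of hashable values and compares them as multisets, which ignores row order; queries with ORDER BY would need an ordered comparison instead.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SoakReport:
    queries: int = 0
    hits: int = 0
    mismatches: int = 0

    @property
    def hit_rate(self) -> float:
        return self.hits / self.queries if self.queries else 0.0


def same_rows(a, b) -> bool:
    # Multiset comparison: row order is ignored, duplicate rows are not.
    return Counter(map(tuple, a)) == Counter(map(tuple, b))


def run_soak(queries, cached_lookup, run_live) -> SoakReport:
    """Replay queries, checking every cache hit against a live execution."""
    report = SoakReport()
    for sql in queries:
        report.queries += 1
        cached = cached_lookup(sql)
        if cached is None:
            continue  # miss: nothing was served from cache, nothing to verify
        report.hits += 1
        if not same_rows(cached, run_live(sql)):
            report.mismatches += 1
    return report
```

With the figures reported above (about 60,000 hits out of about 120,000 queries), `hit_rate` would come out at roughly 0.5, and the only acceptable value for `mismatches` is 0.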
Key takeaways
- Snowflake’s native result cache is free and fast but narrow: per-account, 24-hour, and invalidated by data changes, role, non-determinism, or even a different query string.
- A verified proxy cache survives across warehouses and sessions and re-checks every hit against live Snowflake — broader coverage, stricter correctness.
- chukei is false-positive-intolerant: it never caches writes, non-deterministic queries, or large/chunked results, and fails open to a byte-identical passthrough.
- The soak backs this up: ~120k queries, ~60k cache hits, 0 mismatches, on a deterministic hot path with no LLM.
chukei is Apache-2.0 and self-hosted: the cache runs in your VPC, your SQL and credentials never move, and you can read exactly how a cache decision is made. The full caching design — verification, invalidation, and the safety set — is documented at docs.chukei.dev. Point one read-heavy workload at it, watch the hit rate, and check the mismatch count yourself.
Frequently asked questions
- How does Snowflake result caching work?
- Snowflake keeps a per-account result cache for 24 hours. If you re-run a syntactically identical query and the underlying data, role, and session context are unchanged, Snowflake returns the stored result without spinning up a warehouse — so the query costs nothing in compute.
- Why does my Snowflake cache miss?
- The native result cache invalidates on almost any change: a write to any underlying table, a different role, a non-deterministic function, or even a cosmetically different query string (SELECT * vs an explicit column list). After 24 hours the entry expires regardless.
- Can you cache Snowflake results across sessions?
- Snowflake's own result cache is per-account and survives across sessions for 24 hours, but it is invalidated easily. A verified proxy cache like chukei's sits in front of Snowflake and can serve the same deterministic read across warehouses and sessions, re-checking freshness against live Snowflake on each hit.
- Is cached data ever stale or wrong?
- It should never be. chukei's cache is false-positive-intolerant: it never caches writes, non-deterministic queries, or chunked/large results, and it continuously double-checks entries against live Snowflake. When in doubt it misses and fails open to a verbatim passthrough.
- How do I clear the cache in Snowflake?
- You can disable the result cache for a session with ALTER SESSION SET USE_CACHED_RESULT = FALSE, which forces fresh execution. Snowflake also invalidates entries automatically on data changes and after 24 hours.
Works on the cost-modelling and replay engine at OSO. Previously spent too long staring at Snowflake bills that nobody could explain.