ALL POSTS Engineering

Snowflake Query Optimization: Cut Compute Without Rewriting Dashboards

Two ways to make Snowflake queries cheaper — tune them by hand, or apply equivalence-tested rewrites on the wire — and why determinism is the whole trust argument.

DK Dan KowalskiSystems engineering, OSO Jun 17, 2026 10 min read

Snowflake query optimization is the work of making a query return the same answer while scanning and shuffling less data — because in Snowflake you pay for compute-seconds, and compute-seconds are mostly spent reading micro-partitions you didn’t need. There are two ways to get there: tune the SQL by hand (prune columns, push predicates, cluster, right-size warehouses), or apply deterministic, equivalence-tested rewrites on the wire so expensive shapes get cheaper without anyone touching a dashboard. This guide covers both.

Most teams reach for a bigger warehouse first. That makes a query finish faster, but it rarely makes it cheaper — a 2× warehouse costs 2× the credits per second, so unless it cuts wall-clock time by more than half you’ve spent more, not less. Real Snowflake performance tuning starts one level down: how much data does the query actually have to touch?

01 / WHAT MAKES A QUERY EXPENSIVEWhat makes a Snowflake query expensive?

Snowflake stores tables as immutable, columnar micro-partitions. A query is cheap when the optimizer can prune — skip whole partitions it can prove are irrelevant — and expensive when it can’t. Four things drive cost up:

  • Reading columns you don’t return. SELECT * against a wide table forces scans of columns the consumer never uses. Columnar storage means each unused column is wasted I/O on every run.
  • Filtering too late. A WHERE clause evaluated after a join makes Snowflake materialise rows it will immediately discard.
  • Poor clustering. If the column you filter on isn’t correlated with micro-partition order, Snowflake can’t prune and falls back to a full scan.
  • Spilling. When the working set exceeds the warehouse’s memory, Snowflake spills to local — then remote — disk, which is dramatically slower and shows up as bytes spilled in the query profile.

You can see all of this in ACCOUNT_USAGE. This finds the queries doing the most scanning and spilling — your optimization shortlist:

SELECT
  query_id,
  LEFT(query_text, 80)                     AS sql_preview,
  bytes_scanned / POWER(1024, 3)           AS gb_scanned,
  partitions_scanned,
  partitions_total,
  bytes_spilled_to_local_storage  / POWER(1024, 3) AS gb_spill_local,
  bytes_spilled_to_remote_storage / POWER(1024, 3) AS gb_spill_remote
FROM snowflake.account_usage.query_history
WHERE start_time > DATEADD('day', -7, CURRENT_TIMESTAMP())
  AND partitions_total > 0
ORDER BY bytes_scanned DESC
LIMIT 50;

When partitions_scanned is close to partitions_total, the optimizer isn’t pruning — that’s a clustering or predicate problem. When the spill columns are non-zero, the warehouse is undersized for that query’s working set.

02 / TUNE IT BY HANDHow do I optimize a Snowflake query manually?

Manual optimization is still the highest-leverage thing you can do, because it fixes the query at the source. Four moves cover most of the wins.

Prune the projection. Return only the columns the consumer reads.

-- before: dashboard renders order_id, amount, ts — but scans every column
SELECT *
FROM analytics.fct_orders
WHERE order_date = CURRENT_DATE;

-- after: narrower scan, narrower shuffle, identical result for this consumer
SELECT order_id, amount, ts
FROM analytics.fct_orders
WHERE order_date = CURRENT_DATE;

Push predicates down. Filter before you join, not after, so rows are eliminated before the expensive step.

-- before: join first, filter the joined result
SELECT o.order_id, c.region
FROM analytics.fct_orders o
JOIN analytics.dim_customer c ON c.customer_id = o.customer_id
WHERE o.order_date = CURRENT_DATE;

-- after: filter the fact table first, then join far fewer rows
SELECT o.order_id, c.region
FROM (
  SELECT order_id, customer_id
  FROM analytics.fct_orders
  WHERE order_date = CURRENT_DATE
) o
JOIN analytics.dim_customer c ON c.customer_id = o.customer_id;

Cluster on what you filter. For large tables queried on a predictable column, a clustering key lets Snowflake prune partitions instead of scanning them.

ALTER TABLE analytics.fct_orders CLUSTER BY (order_date);

Size warehouses to avoid spill. If the profile shows remote spill, the fix is a larger warehouse for that workload — not for everything. Right-sizing and idle behaviour are their own discipline; see how to reduce Snowflake costs for the warehouse side of the bill.

Manual tuning has a ceiling. It fixes queries you own and remember to revisit. It does nothing for the SELECT * a BI tool generates on every refresh, the redundant DISTINCT in a vendor connector, or the thousand machine-emitted queries no human will ever hand-edit. That’s the gap in-path rewriting fills.

03 / REWRITE IT ON THE WIREHow do I optimize queries I can’t edit?

chukei is an open source, Apache-2.0 cost optimization engine for Snowflake — a transparent wire-protocol proxy that runs in your own VPC. Drivers change one hostname; SQL and credentials stay where they are. One of its six levers is deterministic SQL rewriting: it parses each query into an abstract syntax tree, walks that tree against a fixed pack of equivalence-tested rules, and rewrites the expensive shapes inline — the same SELECT * pruning and predicate pushdown above, applied automatically to queries no human will ever touch.

A query AST walked against the deterministic rule-pack. Each rule carries an equivalence proof and fires only when its preconditions hold; anything unproven is forwarded verbatim.

Each rule is a pure function from one AST to another, paired with a proof — a property test over generated inputs — that both trees produce identical result sets under Snowflake semantics. If a rule’s preconditions hold, it fires and the rewritten SQL is forwarded; if they don’t, the original query passes through untouched. The rules target the same waste manual tuning does:

RuleWhat it doesSkips when
prune-select-starReplace SELECT * with the consumed columnsFull column set is observed downstream
push-predicatePush WHERE filters below joinsPredicate references a post-join expression
drop-redundant-distinctRemove DISTINCT a key already guaranteesUniqueness can’t be proven from constraints
collapse-nested-subqueryFlatten equivalent nested selectsAny layer is non-deterministic
eliminate-no-op-castDrop a cast to a column’s existing typeCast changes precision or semantics

Rewriting acts on query shape, so it stacks with the other levers rather than overlapping them — most notably with verified query result caching, which eliminates repeated reads while rewriting makes each read cheaper.

04 / WHY NO LLM ON THE HOT PATHWhy determinism is the trust argument

The crowded end of this market sells LLM-based optimizers: feed the query to a model, get a “faster” version back. That can surface clever rewrites a fixed rule-pack never will. But it inherits the model’s failure mode — the output is plausible, not proven. A model that’s right 99% of the time is catastrophic inline, because the 1% silently returns wrong numbers to a dashboard nobody re-checks, and you can’t tell which 1%.

A rewrite engine you cannot fully trust is one you have to fully audit — which costs more than the compute it saves.

— the design constraint

chukei makes the opposite trade on purpose. The rule-pack is smaller than what a model could propose, but every rule is closed-form and equivalence-tested, so the rewrite is correct by construction. The same query produces the same rewrite every time, on every node, forever.

No LLM on the hot path — by invariant, not preference. The rewrite lever runs in the deterministic query path at ~2 ms p99 overhead inside a +5 ms budget. An inference call can’t meet that budget and can’t offer a proof, so it is never in the loop. LLM-based tools sit beside your warehouse and advise; chukei sits in front of it and acts only when it can prove the rewrite is safe.

This is not anti-AI; it’s the right tool in the right place. Use a model in your editor to propose rewrites you then review and ship. Keep it off the wire, where there’s no human in the loop and a wrong answer is invisible.

05 / VERIFY EQUIVALENCECan query optimization be automated safely?

Yes — if it fails open. Every lever in chukei shares one invariant: when in doubt, do nothing. A parse error, an unrecognised dialect feature, a function that can’t be proven deterministic, a precondition that doesn’t hold — any of these ends the rewrite attempt and the original SQL goes to Snowflake verbatim.

chukei rewrite · per-query decision
  parse ok? ───────────────────── no ──▶ passthrough (verbatim)
   │ yes
  rule matches? ───────────────── no ──▶ passthrough (verbatim)
   │ yes
  equivalence preconditions hold? ─ no ──▶ passthrough (verbatim)
   │ yes
  apply rewrite ──▶ forward rewritten SQL

The worst case is that chukei declines to optimise a query and you pay exactly what you’d have paid without it. There’s no failure mode where a rewrite corrupts a result, because a rewrite that can’t be proven equivalent is, by definition, never applied. Before any of it touches production, the replay simulator projects the saving from your own QUERY_HISTORY offline — rewriting contributes its share of the 15–30% band we ask every team to validate, a target to confirm with replay, never a guarantee. For the full programme that sits around it, start with the cornerstone guide to Snowflake cost optimization.

Key takeaways

  • Snowflake query cost is driven by how much data you scan — prune SELECT *, push predicates early, cluster on filter columns, and size warehouses to avoid spill.
  • Manual tuning is highest-leverage but has a ceiling: it can’t touch the machine-generated queries from BI tools and connectors.
  • chukei rewrites those on the wire by walking an AST against a deterministic, equivalence-tested rule-pack — same input, same rewrite, every time.
  • No LLM on the hot path. Determinism is the trust argument vs plausible-but-unproven AI optimizers, and it fits the ~2 ms p99 / +5 ms budget.
  • The lever is fail-open: unproven rewrites are never applied, so the worst case is no rewrite — never a wrong answer.

The full rule-pack, its property tests, and the equivalence proofs are in the repository — every rule ships with the test that keeps it honest. If you have a query shape you think we should rewrite safely, open an issue with the redacted SQL and we’ll tell you whether it can be proven equivalent.

Frequently asked questions

How do I optimize a Snowflake query?
Stop scanning data you don't return. Select only the columns the consumer uses, push WHERE filters as early as possible, cluster large tables on the columns you filter on, and size warehouses so queries don't spill to local or remote disk. Each of these reduces the micro-partitions Snowflake has to read, which is what you pay for.
How do I speed up a slow Snowflake query?
Check the query profile for spilling and full table scans. Spilling to remote disk means the warehouse is too small for the working set; a full scan usually means a missing predicate or poor clustering. Fixing pruning typically helps more than throwing a larger warehouse at the problem.
Can Snowflake query optimization be automated safely?
Yes, if the rewrites are deterministic and equivalence-tested. chukei applies a fixed AST rule-pack inline and fires a rule only when it can prove the rewrite returns an identical result set; otherwise it forwards the original SQL verbatim. It never uses an LLM on the query path, so the result is the same every time.
What is an equivalence-tested rewrite?
A rewrite that ships with a proof — encoded as a property test over generated inputs — that the transformed query produces the same result set as the original under Snowflake semantics. If the preconditions for that proof don't hold on a given query, the rule is not applied.
Does Snowflake query optimization require changing my dashboards?
No. Manual tuning happens in your SQL and models, but in-path rewrites with chukei need no changes to dashboards, BI tools, or dbt models — drivers change one hostname and chukei rewrites expensive shapes on the wire, falling open to passthrough whenever it can't prove a rewrite safe.
DK
Dan Kowalski

Builds the Rust wire-protocol core of chukei. Spends his time making sure the proxy adds milliseconds, never breakage.

SnowflakeSQLPerformanceDeterminism