
Snowflake warehouse sizing & auto-suspend: stop paying for idle

How warehouse sizes map to credits, how to right-size without breaking SLAs, and why a fixed 60-second auto-suspend timer leaves money on the table when a forecast can replace it.

Amara Okoye · Reliability, OSO · Jun 18, 2026 · 9 min read

Snowflake warehouse sizing is a guess, and the auto-suspend timeout you set next to it is a second guess stacked on the first. Size sets how fast credits burn; auto-suspend sets how long they keep burning after the work stops. Get either wrong and you pay for compute nobody used. This post covers how sizes map to credits, how to right-size without breaking SLAs, and how to replace the static idle timer with a forecast.

The two levers are independent. Size is how many credits per hour a warehouse burns while it runs. Auto-suspend is how long it keeps running once the last query finishes. Most teams tune size carefully and then leave a default 60- or 600-second idle timer in place — which means a perfectly right-sized warehouse can still bleed credits every time it sits warm between queries.

01 / SIZES AND CREDITS
How does warehouse size map to credits?

Every step up the size ladder doubles the credit burn rate. An XS warehouse costs 1 credit per hour; a 4XL costs 128. Billing is per-second after a 60-second minimum on each resume, so the cost of a query in credits is roughly the hourly rate × seconds it ran ÷ 3600, plus whatever idle time you pay for before auto-suspend kicks in.
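The billing rule above can be sketched in a few lines. This is a minimal illustration, assuming the standard per-size rates listed in this post and a single resume-to-suspend stretch; the function name and the 600-second idle figure are illustrative choices, not part of any Snowflake or chukei API.

```python
# Credits per hour by warehouse size (the rate doubles at each step).
CREDITS_PER_HOUR = {
    "XS": 1, "S": 2, "M": 4, "L": 8,
    "XL": 16, "2XL": 32, "3XL": 64, "4XL": 128,
}

MIN_BILLED_SECONDS = 60  # minimum charged each time a warehouse resumes


def billed_credits(size: str, running_seconds: float) -> float:
    """Credits for one resume-to-suspend stretch lasting `running_seconds`."""
    billed = max(running_seconds, MIN_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed / 3600


# A 20-second query on an XS that then idles for 600 seconds before suspending.
with_idle = billed_credits("XS", 20 + 600)

# The same 20 seconds of work if the warehouse suspended immediately
# (the 60-second minimum still applies).
without_idle = billed_credits("XS", 20)

print(f"with idle: {with_idle:.4f} credits, without: {without_idle:.4f} credits")
```

In this example the idle tail, not the query, accounts for most of the bill, which is the pattern the rest of the post is about.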

Size   Credits / hour   Relative cost   Typical fit
XS     1                1×              BI dashboards, repeated reads
S      2                2×              Reporting, light analyst queries
M      4                4×              dbt models, moderate transforms
L      8                8×              Heavy transforms, large joins
XL     16               16×             Large batch, big aggregations
2XL    32               32×             Very large workloads
3XL    64               64×             Bulk / exceptional jobs
4XL    128              128×            Rarely justified continuously

Because the rate doubles each step, the obvious advice — “just size up if it’s slow” — is expensive advice. Doubling the size to halve a query’s runtime is cost-neutral at best, and only if the query actually parallelises. For the read-heavy BI and reporting workloads chukei targets, an XS or S warehouse is usually the right home; the bigger sizes earn their keep on genuine transforms.
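The break-even arithmetic behind that claim is short enough to write down. The sketch below assumes runtime is the only thing that changes when you size up and ignores idle time and the 60-second minimum; the function name and the 1.3× speedup are hypothetical.

```python
def relative_cost_after_size_up(steps: int, speedup: float) -> float:
    """Cost of a run after sizing up, relative to the original size (1.0 = unchanged).

    The credit rate multiplies by 2**steps; the runtime divides by `speedup`.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return (2 ** steps) / speedup


# One size up with a perfect 2x speedup: cost-neutral.
print(relative_cost_after_size_up(steps=1, speedup=2.0))

# One size up where the query only gets 1.3x faster: noticeably more expensive.
print(round(relative_cost_after_size_up(steps=1, speedup=1.3), 2))
```

Any speedup below 2× per step means the larger warehouse costs more for the same work, so the size-up has to be justified by the SLA rather than by the bill.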

02 / RIGHT-SIZING
What size Snowflake warehouse do I need?

Right-sizing without breaking SLAs comes down to two signals you can read straight from QUERY_HISTORY: disk spilling and queuing. If queries spill to local or remote storage, the warehouse is too small for the working set and a size up will pay for itself. If queries queue, you need more clusters (or a multi-cluster warehouse), not a bigger one. Absent both, sizing down usually saves credits without hurting latency.

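-- Find warehouses that spill to disk, queue, or run long: candidates to re-size
SELECT
  warehouse_name,
  warehouse_size,
  COUNT(*)                                         AS queries,
  ROUND(AVG(total_elapsed_time) / 1000, 1)         AS avg_secs,             -- total_elapsed_time is in milliseconds
  SUM(bytes_spilled_to_remote_storage)             AS remote_spill_bytes,
  SUM(bytes_spilled_to_local_storage)              AS local_spill_bytes,
  ROUND(SUM(queued_overload_time) / 1000, 1)       AS queued_overload_secs  -- time queued because the warehouse was overloaded
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
  AND warehouse_name IS NOT NULL                   -- skip queries that never ran on a warehouse
GROUP BY 1, 2
ORDER BY remote_spill_bytes DESC;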

Warehouses with heavy remote spill are under-sized; warehouses with near-zero spill and short runtimes are candidates to size down. Do this per warehouse and per role — a single account-wide default is the thing that put you here.
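The triage rules in this section can be turned into a small classifier over per-warehouse totals. This is a sketch under stated assumptions: the field names are illustrative, `queued_overload_secs` is assumed to be the summed `queued_overload_time` from QUERY_HISTORY converted to seconds, and the 5-second "short runtime" cutoff is a placeholder you would tune to your own SLA.

```python
from dataclasses import dataclass


@dataclass
class WarehouseStats:
    name: str
    remote_spill_bytes: int
    local_spill_bytes: int
    queued_overload_secs: float
    avg_secs: float


def recommend(stats: WarehouseStats, short_runtime_secs: float = 5.0) -> str:
    """Map spill and queue signals to a sizing action (thresholds are assumptions)."""
    if stats.remote_spill_bytes > 0:
        return "size up"        # working set does not fit at this size
    if stats.queued_overload_secs > 0:
        return "add clusters"   # a concurrency problem, not a size problem
    if stats.local_spill_bytes == 0 and stats.avg_secs <= short_runtime_secs:
        return "try sizing down"
    return "leave as is"


# Hypothetical warehouses for illustration.
for wh in [
    WarehouseStats("ETL_WH", 5_000_000_000, 0, 0.0, 42.0),
    WarehouseStats("ADHOC_WH", 0, 0, 310.5, 8.0),
    WarehouseStats("BI_WH", 0, 0, 0.0, 1.2),
]:
    print(wh.name, "->", recommend(wh))
```

Checking remote spill first reflects the ordering in the text: spill is the signal that a bigger size pays for itself, while queuing points at cluster count.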

Right-sizing is half the job. Once a warehouse is the correct size, the remaining waste is time it spends warm but idle. That is the auto-suspend problem, and a fixed timer is the wrong tool for it.

03 / THE IDLE-TIMEOUT TRAP
What’s the best auto-suspend timeout?

There isn’t one. A static timeout — 60 seconds, 300, 600 — is a single number applied to a query stream that doesn’t arrive on a single rhythm. Set it short and you suspend in the gap between two queries of the same burst, paying a cold resume (and the 60-second minimum) to do work you were about to do anyway. Set it long and you pay for the warehouse to sit warm through every lull.

Figure: A static idle timer vs the real query arrival pattern. The fixed 60s timeout (amber) suspends inside a burst, then has to resume; the gaps where the warehouse is genuinely idle (grey) are where the credits actually leak.

The amber block is the trap: the timer holds the warehouse warm right after a burst and fails to predict the long idle gap that follows. One static number cannot be right for both the within-burst pause and the between-burst lull, because they are different distributions.
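The trade-off can be reproduced with a small replay over query start and end times. The sketch below is a simplified model, not Snowflake's scheduler: it assumes sorted, non-overlapping queries, treats the warehouse as suspending exactly `timeout_s` after the last query ends, and applies the 60-second minimum to every resume. The two-burst arrival pattern is made up for illustration.

```python
def simulate_static_timeout(queries, timeout_s, min_billed_s=60):
    """Replay sorted, non-overlapping (start, end) query times in seconds.

    Returns (billed_seconds, resumes) for a fixed auto-suspend timeout.
    """
    billed = 0.0
    resumes = 0
    stretch_start = None  # when the current running stretch began
    prev_end = None       # when the previous query finished
    for start, end in queries:
        if stretch_start is None:
            stretch_start = start
            resumes += 1
        elif start - prev_end > timeout_s:
            # The warehouse suspended timeout_s after the previous query:
            # close that stretch, then pay a fresh resume for this query.
            billed += max(prev_end + timeout_s - stretch_start, min_billed_s)
            stretch_start = start
            resumes += 1
        prev_end = end
    if stretch_start is not None:
        billed += max(prev_end + timeout_s - stretch_start, min_billed_s)
    return billed, resumes


# Two bursts of 10-second queries, 30 seconds apart within a burst,
# separated by a lull of roughly half an hour.
queries = [(0, 10), (40, 50), (80, 90), (1800, 1810), (1840, 1850)]

for timeout in (20, 30, 600):
    print(timeout, simulate_static_timeout(queries, timeout))
# 20  -> (300.0, 5)   suspends inside each burst, pays the minimum five times
# 30  -> (200.0, 2)   rides through the bursts, suspends in the lull
# 600 -> (1340.0, 2)  stays warm for ten minutes after each burst
```

On this toy stream the 20-second timer costs more than the 30-second one despite being shorter, and the 600-second timer costs several times more than either. The value that happens to work here depends entirely on the burst spacing, which is why a single static number does not generalise.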

A static idle timer is a single guess applied to a stream that has at least two rhythms. It is wrong in both directions at once.

— the auto-suspend problem in one line

04 / FROM TIMER TO FORECAST
Replacing the guess with a Poisson idle model

chukei sits in the Snowflake query path as a transparent wire-protocol proxy, so it sees the actual arrival times of queries per warehouse and per role. Instead of a fixed timeout, it models query arrivals as a Poisson process, which makes the idle gaps between queries exponentially distributed, and forecasts how likely the next query is to arrive in the next few seconds. When the probability of imminent work drops below a threshold, it treats the warehouse as genuinely idle (not just pausing mid-burst) and recommends an early suspend. This is plain statistics on arrival times: deterministic, no LLM anywhere near the decision.

Poisson, not Holt-Winters; suggest-only, not auto-pilot. chukei forecasts idle windows from query arrival rates and starts in suggest-only mode — it shows you the suspends it would recommend before it is allowed to act. In simulation, the model captured ~94% of the modelled idle savings versus a naive fixed timer.
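For readers who want the statistics spelled out: under a Poisson process with rate λ queries per second, the probability of at least one arrival in the next t seconds is 1 − exp(−λt). The sketch below shows that calculation only. It is not chukei's implementation; the function names, the choice of estimating λ from a window of recent arrivals, and the 60-second horizon are assumptions. The 0.05 threshold mirrors the sample output later in this post.

```python
import math


def arrival_rate(arrival_times):
    """Estimate the rate (queries per second) from sorted arrival timestamps."""
    if len(arrival_times) < 2:
        raise ValueError("need at least two arrivals to estimate a rate")
    span = arrival_times[-1] - arrival_times[0]
    if span <= 0:
        raise ValueError("arrival timestamps must span a positive interval")
    return (len(arrival_times) - 1) / span


def p_next_within(rate, horizon_s):
    """P(at least one arrival within horizon_s) for a Poisson process."""
    return 1.0 - math.exp(-rate * horizon_s)


def should_suggest_suspend(recent_arrivals, horizon_s=60.0, threshold=0.05):
    """Suggest a suspend when imminent work is unlikely."""
    return p_next_within(arrival_rate(recent_arrivals), horizon_s) < threshold


busy = [0, 30, 60, 90, 120]          # one query every 30 seconds
quiet = [0, 3600, 7200, 10800]       # one query per hour

print(round(p_next_within(arrival_rate(busy), 60), 3), should_suggest_suspend(busy))
print(round(p_next_within(arrival_rate(quiet), 60), 3), should_suggest_suspend(quiet))
```

A single global rate is the simplest possible estimator. Distinguishing a mid-burst pause from a real lull requires the rate to track recent activity, so the window used for `recent_arrivals` matters as much as the formula.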

In suggest-only mode chukei emits a recommendation per warehouse — the suspend it would have triggered, the credits it would have saved, and the confidence — so you can audit the model against your own SLAs before handing it the lever:

# Show the suspends chukei would recommend, without enforcing anything
chukei suspend suggest --warehouse BI_WH --window 30d

warehouse   BI_WH  (size XS · 1 credit/hr)
─────────────────────────────────────────────
model       Poisson idle · suggest-only (no enforcement)
idle gaps   1,884 windows over 30 days
current     auto_suspend = 600s   (static)
suggested   suspend after ~72s idle · p(next query <72s) < 0.05
captured    ~94% of modelled idle savings vs fixed timer
projected   idle credits within the 15–30% target band

 promote per warehouse/role with:  chukei suspend enforce --warehouse BI_WH

05 / ENFORCING SAFELY
Enforce per warehouse and per role

Suggest-only is the default for a reason: you validate the model against the warehouses you trust least before it touches anything. When the recommendations hold up, you promote enforcement one warehouse and one role at a time — a low-stakes reporting warehouse first, a latency-sensitive interactive one later or never. Enforcement is scoped, never account-wide-by-default, and like every chukei behaviour it fails open: if the proxy is unsure or unavailable, the warehouse keeps Snowflake’s own native auto-suspend and nothing breaks.
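The scoping and fail-open rules described above amount to a small decision table. The sketch below is a hypothetical illustration of that logic, not chukei's code or configuration format: the `EnforcementPolicy` type, the `decide` function, the returned labels, and the role names are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class EnforcementPolicy:
    # (warehouse, role) pairs explicitly promoted from suggest-only to enforce.
    enforced: Set[Tuple[str, str]] = field(default_factory=set)


def decide(policy: EnforcementPolicy, warehouse: str, role: str,
           proxy_healthy: bool, p_next_query: Optional[float],
           threshold: float = 0.05) -> str:
    """Return the action for one warehouse/role at one point in time."""
    if not proxy_healthy or p_next_query is None:
        # Fail open: do nothing and leave Snowflake's native timer in charge.
        return "native-auto-suspend"
    if p_next_query >= threshold:
        return "keep-warm"
    if (warehouse, role) in policy.enforced:
        return "suspend"
    return "suggest-suspend"  # default: report only, never act


policy = EnforcementPolicy(enforced={("BI_WH", "REPORTING")})

print(decide(policy, "BI_WH", "REPORTING", True, 0.01))   # promoted pair, idle
print(decide(policy, "BI_WH", "ANALYST", True, 0.01))     # not promoted
print(decide(policy, "BI_WH", "REPORTING", False, 0.01))  # proxy unavailable
```

Putting the health check first means an unavailable proxy or a missing forecast can never result in a suspend, which is the property the fail-open guarantee depends on.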

The 15–30% savings band is a target to validate, not a guarantee: the real number depends on how bursty your workload is and how loose your current timers are. The replay simulator, covered in our post on how to reduce Snowflake costs, lets you project the suspend savings from a QUERY_HISTORY export before any of this touches your account. For the full picture (sizing, suspend, caching, and attribution together), start at the cornerstone guide to Snowflake cost optimization.

Key takeaways

  • Warehouse credits double every size step (XS 1/hr → 4XL 128/hr); size up when queries spill, add clusters when they queue, and size down when neither happens.
  • Right-sizing is half the job — a correctly sized warehouse still leaks credits sitting warm and idle between queries.
  • A static auto-suspend timer is a guess that is wrong in both directions: too short suspends mid-burst, too long pays for idle.
  • chukei replaces the guess with a Poisson idle forecast — suggest-only first, capturing ~94% of modelled savings in simulation, enforced per warehouse and role, always fail-open.
  • chukei is Apache-2.0, self-hosted, Snowflake-only, deterministic, with no LLM on the decision path.

Want to see the suspend suggestions for your own warehouses before changing a single timeout? The Poisson idle model and the replay simulator ship in the Apache-2.0 release — the docs walk through running chukei suspend suggest against your account and reading the projected savings.

Frequently asked questions

What size Snowflake warehouse do I need?
Start with the smallest size that meets your latency SLA, then size up only when queries spill to disk; if they queue instead, add clusters rather than size. For most BI and reporting workloads an XS or S warehouse is enough; large transforms may justify L or bigger. Size is a per-warehouse decision, not an account-wide default.

How does Snowflake auto-suspend work?
Auto-suspend stops a warehouse after it has been idle for a fixed number of seconds, so you stop paying per-second credits until the next query resumes it. The timeout is a static guess you set per warehouse; it does not adapt to your actual query arrival pattern.

What's the best auto-suspend timeout?
There is no single best value: a short timeout (e.g. 60s) saves the most idle compute but risks cold resumes mid-burst, while a long one wastes credits between queries. The right timeout depends on how queries actually arrive, which is why chukei models idle windows with a Poisson process instead of using one fixed number.

How many credits does each Snowflake warehouse size use?
Credit consumption doubles with each size step: XS uses 1 credit/hour, S uses 2, M uses 4, L uses 8, XL 16, 2XL 32, 3XL 64, 4XL 128. Billing is per-second with a 60-second minimum on resume, so idle time between queries is pure waste.

Amara Okoye

Owns the fail-open guarantees and the idle-suspend modelling. Believes the safest optimisation is the one that degrades to passthrough.

Snowflake · FinOps · Warehouses · Auto-Suspend