Snowflake warehouse sizing is a guess, and the auto-suspend timeout you set next to it is a second guess stacked on the first. Size sets how fast credits burn; auto-suspend sets how long they keep burning after the work stops. Get either wrong and you pay for compute nobody used. This post covers how sizes map to credits, how to right-size without breaking SLAs, and how to replace the static idle timer with a forecast.
The two levers are independent. Size is how many credits per hour a warehouse burns while it runs. Auto-suspend is how long it keeps running once the last query finishes. Most teams tune size carefully and then leave a default 60- or 600-second idle timer in place — which means a perfectly right-sized warehouse can still bleed credits every time it sits warm between queries.
01 / SIZES AND CREDITS
How does warehouse size map to credits?
Every step up the size ladder doubles the credit burn rate. An XS warehouse costs 1 credit per hour; a 4XL costs 128. Billing is per-second after a 60-second minimum on each resume, so the cost of a query is roughly the hourly rate × the seconds it ran ÷ 3,600, plus whatever idle time you pay for before auto-suspend kicks in.
| Size | Credits / hour | Relative cost | Typical fit |
|---|---|---|---|
| XS | 1 | 1× | BI dashboards, repeated reads |
| S | 2 | 2× | Reporting, light analyst queries |
| M | 4 | 4× | dbt models, moderate transforms |
| L | 8 | 8× | Heavy transforms, large joins |
| XL | 16 | 16× | Large batch, big aggregations |
| 2XL | 32 | 32× | Very large workloads |
| 3XL | 64 | 64× | Bulk / exceptional jobs |
| 4XL | 128 | 128× | Rarely justified continuously |
Because the rate doubles each step, the obvious advice — “just size up if it’s slow” — is expensive advice. Doubling the size to halve a query’s runtime is cost-neutral at best, and only if the query actually parallelises. For the read-heavy BI and reporting workloads chukei targets, an XS or S warehouse is usually the right home; the bigger sizes earn their keep on genuine transforms.
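The arithmetic behind that claim can be sketched in a few lines. This is a minimal illustration, assuming the rates from the table above and the 60-second minimum per resume; the function name `run_credits` and the example durations are hypothetical.

```python
# Credits per hour by warehouse size, taken from the table above.
CREDITS_PER_HOUR = {
    "XS": 1, "S": 2, "M": 4, "L": 8,
    "XL": 16, "2XL": 32, "3XL": 64, "4XL": 128,
}

MIN_BILLED_SECONDS = 60  # minimum charge each time a warehouse resumes


def run_credits(size: str, running_seconds: float) -> float:
    """Credits billed for one resume-to-suspend stretch of a warehouse."""
    billed = max(running_seconds, MIN_BILLED_SECONDS)
    return CREDITS_PER_HOUR[size] * billed / 3600


# Best case for sizing up: a 10-minute job on M that parallelises perfectly
# and finishes in 5 minutes on L. The credit cost is identical.
print(run_credits("M", 600), run_credits("L", 300))

# A 20-second query on XS: alone it is billed the 60-second minimum;
# followed by a 600-second idle timer it is billed for 620 seconds.
print(run_credits("XS", 20), run_credits("XS", 20 + 600))
```

If the query does not parallelise, the larger size finishes no faster and simply costs twice as much for the same wall-clock time.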
02 / RIGHT-SIZING
What size Snowflake warehouse do I need?
Right-sizing without breaking SLAs comes down to two signals you can read straight from QUERY_HISTORY: disk spilling and queuing. If queries spill to local or remote storage, the warehouse is too small for the working set and a size up will pay for itself. If queries queue, you need more clusters (a multi-cluster warehouse), not a bigger one. Absent both, sizing down usually saves credits without costing latency.
-- Find warehouses that spill to disk, queue, or run long: candidates to re-size
SELECT
    warehouse_name,
    warehouse_size,
    COUNT(*) AS queries,
    ROUND(AVG(total_elapsed_time) / 1000, 1) AS avg_secs,            -- source column is in milliseconds
    ROUND(AVG(queued_overload_time) / 1000, 1) AS avg_queued_secs,   -- time spent waiting on a busy warehouse
    SUM(bytes_spilled_to_remote_storage) AS remote_spill_bytes,
    SUM(bytes_spilled_to_local_storage) AS local_spill_bytes
FROM snowflake.account_usage.query_history
WHERE start_time >= DATEADD('day', -30, CURRENT_TIMESTAMP())
  AND warehouse_name IS NOT NULL  -- skip queries that never ran on a warehouse
GROUP BY 1, 2
ORDER BY remote_spill_bytes DESC;
Warehouses with heavy remote spill are under-sized; warehouses with near-zero spill and short runtimes are candidates to size down. Do this per warehouse and per role — a single account-wide default is the thing that put you here.
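The decision rule described in this section can be written down as a small function. This is a sketch only: the thresholds are placeholder values to tune per workload, the names `WarehouseStats` and `sizing_hints` are hypothetical, and `avg_queued_ms` is an assumed input (for example, an average of the `queued_overload_time` column in QUERY_HISTORY).

```python
from dataclasses import dataclass


@dataclass
class WarehouseStats:
    remote_spill_bytes: int
    local_spill_bytes: int
    avg_queued_ms: float  # assumed input, e.g. AVG(queued_overload_time)


def sizing_hints(
    stats: WarehouseStats,
    spill_bytes_threshold: int = 1_000_000_000,  # placeholder: 1 GB over the period
    queue_ms_threshold: float = 1_000.0,         # placeholder: 1 s average queueing
) -> list[str]:
    """Map spill and queue signals to the actions described in the text."""
    hints = []
    if stats.remote_spill_bytes + stats.local_spill_bytes > spill_bytes_threshold:
        hints.append("size up")          # working set does not fit
    if stats.avg_queued_ms > queue_ms_threshold:
        hints.append("add clusters")     # concurrency problem, not a size problem
    if not hints:
        hints.append("try sizing down")  # neither signal present
    return hints


print(sizing_hints(WarehouseStats(50_000_000_000, 0, 0.0)))
print(sizing_hints(WarehouseStats(0, 0, 4_500.0)))
print(sizing_hints(WarehouseStats(0, 0, 0.0)))
```

A warehouse can trip both signals at once, which is why the function returns a list rather than a single verdict.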
Right-sizing is half the job. Once a warehouse is the correct size, the remaining waste is time it spends warm but idle. That is the auto-suspend problem, and a fixed timer is the wrong tool for it.
03 / THE IDLE-TIMEOUT TRAP
What’s the best auto-suspend timeout?
There isn’t one. A static timeout — 60 seconds, 300, 600 — is a single number applied to a query stream that doesn’t arrive on a single rhythm. Set it short and you suspend in the gap between two queries of the same burst, paying a cold resume (and the 60-second minimum) to do work you were about to do anyway. Set it long and you pay for the warehouse to sit warm through every lull.
The trap is that the timer holds the warehouse warm right after a burst and cannot anticipate the long idle gap that follows. One static number cannot be right for both the within-burst pause and the between-burst lull, because they are different distributions.
A static idle timer is a single guess applied to a stream that has at least two rhythms. It is wrong in both directions at once.
— the auto-suspend problem in one line
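To make the two failure modes concrete, here is a small replay of a fixed timer over a list of idle gaps. It is a simplified sketch: the gap values are invented for illustration, the function name `static_timer_cost` is hypothetical, and the 60-second billing minimum on each resume is ignored to keep the accounting readable.

```python
def static_timer_cost(gaps_seconds, timeout_seconds):
    """Return (idle_seconds_paid, cold_resumes) for a fixed auto-suspend timer.

    gaps_seconds: idle time between the end of one query and the start of the next.
    """
    idle_paid = 0.0
    cold_resumes = 0
    for gap in gaps_seconds:
        if gap > timeout_seconds:
            idle_paid += timeout_seconds  # warm until the timer fires
            cold_resumes += 1             # the next query must resume the warehouse
        else:
            idle_paid += gap              # stayed warm through the whole gap
    return idle_paid, cold_resumes


# Invented stream: short pauses inside bursts, 30-minute lulls between them.
gaps = [20, 25, 15, 1800, 20, 30, 12, 1800]

for timeout in (10, 60, 600):
    idle, resumes = static_timer_cost(gaps, timeout)
    print(f"timeout={timeout:>3}s  idle paid={idle:>6.0f}s  cold resumes={resumes}")
```

On this stream the 10-second timer suspends inside every burst, while the 600-second timer pays for 20 minutes of warm idle across the two lulls. Real gap distributions are messier than this toy list, which is what makes any single value hard to pick.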
04 / FROM TIMER TO FORECAST
Replacing the guess with a Poisson idle model
chukei sits in the Snowflake query path as a transparent wire-protocol proxy, so it sees the actual arrival times of queries per warehouse and per role. Instead of a fixed timeout, it models the idle gaps as a Poisson process and forecasts how likely the next query is to arrive in the next few seconds. When the probability of imminent work drops below a threshold, it knows the warehouse is genuinely idle — not just pausing mid-burst — and recommends an early suspend. This is plain statistics on arrival times: deterministic, no LLM anywhere near the decision.
Poisson, not Holt-Winters; suggest-only, not auto-pilot. chukei forecasts idle windows from query arrival rates and starts in suggest-only mode — it shows you the suspends it would recommend before it is allowed to act. In simulation, the model captured ~94% of the modelled idle savings versus a naive fixed timer.
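The core calculation behind a Poisson idle forecast fits in a few lines. For a Poisson process with rate λ, the probability of at least one arrival in the next t seconds is 1 − exp(−λt). The sketch below is not chukei's implementation: the sliding-window rate estimate, the function names, and the window, horizon, and threshold values are all assumptions chosen for illustration.

```python
import math


def arrival_rate(arrival_times, now, window_seconds):
    """Estimate lambda (queries per second) from arrivals in the trailing window."""
    recent = [t for t in arrival_times if now - window_seconds <= t <= now]
    return len(recent) / window_seconds


def p_next_query_within(rate_per_second, horizon_seconds):
    """P(at least one arrival within the horizon) = 1 - exp(-lambda * t)."""
    return 1.0 - math.exp(-rate_per_second * horizon_seconds)


def should_suggest_suspend(arrival_times, now, window_seconds=60.0,
                           horizon_seconds=10.0, threshold=0.05):
    """Suggest a suspend once imminent work becomes unlikely (assumed rule)."""
    rate = arrival_rate(arrival_times, now, window_seconds)
    return p_next_query_within(rate, horizon_seconds) < threshold


# Invented burst: one query every 5 seconds for a minute, then silence.
arrivals = [float(t) for t in range(0, 60, 5)]

for now in (60.0, 115.0, 200.0):
    print(now, should_suggest_suspend(arrivals, now))
```

Because the rate is re-estimated over a trailing window, it decays as the burst ages out of that window: right after the burst the forecast still expects work, and only once the window has emptied does the suggestion flip to suspend.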
In suggest-only mode chukei emits a recommendation per warehouse — the suspend it would have triggered, the credits it would have saved, and the confidence — so you can audit the model against your own SLAs before handing it the lever:
# Show the suspends chukei would recommend, without enforcing anything
chukei suspend suggest --warehouse BI_WH --window 30d
warehouse BI_WH (size XS · 1 credit/hr)
─────────────────────────────────────────────
model Poisson idle · suggest-only (no enforcement)
idle gaps 1,884 windows over 30 days
current auto_suspend = 600s (static)
suggested suspend after ~72s idle · p(next query <72s) < 0.05
captured ~94% of modelled idle savings vs fixed timer
projected ↓ idle credits — within the 15–30% target band
→ promote per warehouse/role with: chukei suspend enforce --warehouse BI_WH
05 / ENFORCING SAFELY
Enforce per warehouse and per role
Suggest-only is the default for a reason: you validate the model against the warehouses you trust least before it touches anything. When the recommendations hold up, you promote enforcement one warehouse and one role at a time — a low-stakes reporting warehouse first, a latency-sensitive interactive one later or never. Enforcement is scoped, never account-wide-by-default, and like every chukei behaviour it fails open: if the proxy is unsure or unavailable, the warehouse keeps Snowflake’s own native auto-suspend and nothing breaks.
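The scoping and fail-open behaviour described above amounts to a gate in front of every suspend. The following is a hypothetical sketch of that gate, not chukei's code: the function name `decide`, the scope representation, and the threshold are all assumptions.

```python
from typing import Optional, Set, Tuple


def decide(
    warehouse: str,
    role: str,
    enforced_scopes: Set[Tuple[str, str]],
    p_next_query: Optional[float],
    threshold: float = 0.05,
) -> str:
    """Return 'suspend' or 'native'. Anything uncertain falls back to 'native'."""
    if (warehouse, role) not in enforced_scopes:
        return "native"  # not promoted: leave Snowflake's own auto-suspend in charge
    if p_next_query is None:
        return "native"  # forecast unavailable: fail open
    return "suspend" if p_next_query < threshold else "native"


enforced = {("BI_WH", "REPORTING")}  # hypothetical promoted scope

print(decide("BI_WH", "REPORTING", enforced, 0.01))  # promoted and idle
print(decide("BI_WH", "ANALYST", enforced, 0.01))    # role not promoted
print(decide("BI_WH", "REPORTING", enforced, None))  # no forecast available
```

The design choice worth noting is that every path except the fully validated one returns the native behaviour, so a missing forecast or an unpromoted scope can never cause an early suspend.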
The 15–30% savings band is a target to validate, not a guarantee — the real
number depends on how bursty your workload is and how loose your current timers
are. The replay simulator we cover in
how to reduce Snowflake costs lets you
project the suspend savings from a QUERY_HISTORY export before any of this
touches your account. For the full picture — sizing, suspend, caching, and
attribution together — start at the cornerstone guide to
Snowflake cost optimization.
Key takeaways
- Warehouse credits double every size step (XS 1/hr → 4XL 128/hr); size up only when queries spill or queue, and size down when they don’t.
- Right-sizing is half the job — a correctly sized warehouse still leaks credits sitting warm and idle between queries.
- A static auto-suspend timer is a guess that is wrong in both directions: too short suspends mid-burst, too long pays for idle.
- chukei replaces the guess with a Poisson idle forecast — suggest-only first, capturing ~94% of modelled savings in simulation, enforced per warehouse and role, always fail-open.
- chukei is Apache-2.0, self-hosted, Snowflake-only, deterministic, with no LLM on the decision path.
Want to see the suspend suggestions for your own warehouses before changing a
single timeout? The Poisson idle model and the replay simulator ship in the
Apache-2.0 release — the docs walk through running
chukei suspend suggest against your account and reading the projected savings.
Frequently asked questions
- What size Snowflake warehouse do I need?
- Start with the smallest size that meets your latency SLA, then size up only when queries spill to remote disk or queue. For most BI and reporting workloads an XS or S warehouse is enough; large transforms may justify L or bigger. Size is a per-warehouse decision, not an account-wide default.
- How does Snowflake auto-suspend work?
- Auto-suspend stops a warehouse after it has been idle for a fixed number of seconds, so you stop paying per-second credits until the next query resumes it. The timeout is a static guess you set per warehouse — it does not adapt to your actual query arrival pattern.
- What's the best auto-suspend timeout?
- There is no single best value: a short timeout (e.g. 60s) saves the most idle compute but risks cold resumes mid-burst, while a long one wastes credits between queries. The right timeout depends on how queries actually arrive — which is why chukei models idle windows with a Poisson process instead of using one fixed number.
- How many credits does each Snowflake warehouse size use?
- Credit consumption doubles with each size step: XS uses 1 credit/hour, S uses 2, M uses 4, L uses 8, XL 16, 2XL 32, 3XL 64, 4XL 128. Billing is per-second with a 60-second minimum on resume, so idle time between queries is pure waste.