// FOUNDATIONS

ICE score

A back-of-envelope prioritization framework for experiment ideas: rate each on Impact, Confidence, and Ease from 1 to 10, average or multiply the three, and sort by score. The top of the list is the next experiment to run.

// what it is

ICE is the lightest-weight way to triage an experiment backlog. Impact is the size of the win if the test succeeds. Confidence is how sure you are the test will succeed. Ease is the inverse of effort — how cheap is it to run? Score each of the three from 1 to 10, combine them, and the top of the list is the next thing to test.
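The mechanics are simple enough to sketch in a few lines. A minimal version (the function name and the multiply-vs-average switch are illustrative choices, not part of the framework):

```python
# Minimal ICE scorer: each idea gets 1-10 ratings for impact,
# confidence, and ease; the combined score ranks the backlog.
def ice_score(impact: int, confidence: int, ease: int,
              multiply: bool = False) -> float:
    for v in (impact, confidence, ease):
        if not 1 <= v <= 10:
            raise ValueError("ICE ratings run from 1 to 10")
    if multiply:
        return impact * confidence * ease      # 1..1000, spreads out the tail
    return (impact + confidence + ease) / 3    # 1..10, easy to eyeball

print(ice_score(9, 5, 6))  # ≈ 6.7 — high impact, middling confidence
```

Whether you average or multiply matters less than picking one and applying it to every idea; multiplication just punishes a single low rating harder.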

ICE is not a precision instrument; it is a forcing function. The act of scoring 20 ideas makes you notice that half of them are low-impact, half of the rest are low-confidence, and exactly two are both high-impact and easy. The value is making the comparison explicit so the loudest stakeholder does not just pick.

// when this matters

When to use it

Score the backlog any time you have more than five experiment ideas competing for the next slot. ICE forces a comparison where instinct otherwise picks the loudest idea. Re-score quarterly as confidence shifts.

// deeper

What this looks like in practice

The three letters trade off against each other in revealing ways. High-impact ideas tend to be low-confidence — you are doing something new. High-confidence ideas tend to be low-impact — you have already done the work to know the answer. High-ease ideas tend to be both low-impact and high-confidence — you can already do them in your sleep. The ICE score surfaces the tradeoff so the team picks deliberately instead of by reflex.

ICE is the startup-friendly cousin of RICE (Reach × Impact × Confidence ÷ Effort) and PIE (Potential × Importance × Ease). Use ICE when reach is hard to estimate, which is most early-stage products. Switch to RICE once your audience math is real and the reach numbers stop being guesses. The frameworks all answer the same question — what should we test next, given finite hours.
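The RICE variant mentioned above can be sketched the same way. This assumes the common convention of reach as people per quarter, confidence as a fraction, and effort as person-weeks; the units are a choice, not a rule:

```python
# RICE swaps Ease for explicit Reach and Effort — useful once your
# audience numbers are real rather than guesses.
def rice_score(reach: float, impact: float,
               confidence: float, effort: float) -> float:
    return reach * impact * confidence / effort

# 4000 people/quarter, impact 2, 80% confidence, 3 person-weeks:
print(rice_score(4000, 2, 0.8, 3))  # ≈ 2133
```

Note that reach and effort enter as raw quantities, not 1-to-10 ratings — which is exactly why RICE only works once those quantities stop being vibes.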

The numbers in ICE are not real numbers; they are 1-to-10 vibes. That sounds bad but is actually the point. The framework is valuable because it makes the team's intuitions comparable, not because the math is rigorous. If your team starts arguing about whether something is a 7 or an 8, the framework is working. If they are arguing about whether to multiply or average, they are missing the point.

// example

A worked example


"Test a new pricing page": Impact 9, Confidence 5, Ease 6 → score 6.7. "Test a new email subject line": Impact 3, Confidence 8, Ease 9 → score 6.7. Roughly equivalent priority, so run whichever is fresher — the pricing test wins on lift potential, the email wins on velocity.
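Scoring the two ideas above as a small ranked backlog (averaged scores, rounded to one decimal as in the example):

```python
# The two ideas from the worked example, as (impact, confidence, ease).
backlog = {
    "new pricing page":       (9, 5, 6),
    "new email subject line": (3, 8, 9),
}
scores = {name: round(sum(ice) / 3, 1) for name, ice in backlog.items()}
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
print(ranked)  # both land at 6.7 — a tie, as the example notes
```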

// pitfalls

Common mistakes

  • Treating ICE scores as precise. ICE is for ranking, not measurement. A 7.8 is not meaningfully better than a 7.6; both belong in the top tier and either is a fine pick.
  • Scoring alone. If one person scores the backlog, the scores reflect their biases. Score as a small group; the disagreements are the value, not the noise.
  • Re-scoring constantly. Re-score quarterly, or when something material changes. Weekly re-scoring turns ICE into a meeting that picks an experiment instead of a tool that does.

// related

Related terms

Pick a hypothesis. Vocabulary done.

The fastest way to learn this vocabulary is to commit one experiment. The contract takes about five minutes to write.