Guardrail metric
A secondary metric you commit to monitoring during an experiment to make sure the variant is not winning on the primary metric by breaking something else. The "do no harm" check on a contract.
A guardrail metric answers "what could this experiment break that I want to make sure I catch?" Every contract has a primary metric — the number the experiment is trying to move. Guardrails are the other numbers you commit to watching to make sure the win is real and not coming at the expense of revenue, retention, or experience.
Guardrails are most useful when the primary metric is upstream of business outcomes. A signup-rate test should guardrail downstream activation. A price-cut test should guardrail revenue per visitor. An outbound-volume test should guardrail reply quality and unsubscribe rate. Without guardrails, a team can ship a string of "wins" that move the headline number while quietly tanking the business.
When to use it
Set guardrails any time the primary metric is upstream of revenue, retention, or experience — which is most of the time. Two to three guardrails are usually enough; more turns the verdict into a debate.
What this looks like in practice
Guardrails are not just "more metrics to watch." They are pre-committed kill conditions: if the guardrail moves the wrong way by more than X, the verdict is kill regardless of how the primary metric performed. Without that commitment, guardrails become post-hoc explanations for shipping anyway. Specify the breach threshold in the contract, the same way you specify the primary metric's kill threshold.
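The kill-condition logic above can be sketched in a few lines. This is a minimal illustration, not a prescribed implementation; the names `Guardrail`, `verdict`, and the metric keys are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Guardrail:
    """A pre-committed kill condition, written into the contract before launch."""
    name: str
    max_drop_pct: float  # breach threshold: more than this drop means kill

def verdict(primary_lift_pct: float, primary_kill_pct: float,
            guardrail_deltas: dict[str, float],
            guardrails: list[Guardrail]) -> str:
    # A guardrail breach overrides the primary metric, no matter how big the lift.
    for g in guardrails:
        if guardrail_deltas.get(g.name, 0.0) < -g.max_drop_pct:
            return f"kill (guardrail breached: {g.name})"
    # Only if all guardrails hold does the primary metric decide.
    return "ship" if primary_lift_pct >= primary_kill_pct else "kill"
```

The point of writing it this way: the guardrail check comes first, and the primary metric never gets a vote when a guardrail is breached.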
Pick guardrails that are downstream of the primary metric or laterally affected by the variant. A signup-rate test guardrails activation, revenue per visitor, and unsubscribe rate. A page-speed test guardrails bounce rate and conversion. Bad guardrails are upstream metrics (you cannot move them with this test) or unrelated ones (they create false alarms that derail decisions).
Guardrail breaches are common and often educational. A test that "loses" because the primary metric won but a guardrail blew up is a better verdict than one that "wins" on the primary and silently breaks retention three months later. The archive entry should record the guardrail outcome as part of the verdict, not as an afterthought tacked on after a complaint.
A worked example
A pricing experiment that cuts the monthly price by 30% might lift signup rate (primary metric) by 50%. The contract's guardrail metric — revenue per visitor — would have to stay flat or improve, or the "win" is a loss in disguise. If revenue per visitor drops more than 5%, the verdict is kill regardless of the signup-rate lift.
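The arithmetic behind that verdict is worth making explicit. The numbers below are illustrative, assuming revenue per visitor is simply price times signup rate:

```python
# Hypothetical numbers for the pricing example: 30% price cut, 50% signup lift.
old_price, new_price = 100.0, 70.0
old_signup, new_signup = 0.02, 0.03

# Guardrail metric: revenue per visitor = price * signup rate.
rpv_old = old_price * old_signup
rpv_new = new_price * new_signup
change_pct = (rpv_new - rpv_old) / rpv_old * 100  # ≈ +5%

# The pre-committed breach threshold from the contract: kill if RPV drops > 5%.
outcome = "kill" if change_pct < -5.0 else "primary metric decides"
```

Note how tight the margin is: even a 50% signup lift only just covers a 30% price cut. A 25% lift instead (signup rate 0.025) would put revenue per visitor at 1.75, a 12.5% drop, and the guardrail would kill the test despite a "winning" primary metric.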
Common mistakes
- Choosing guardrails after the data arrives. Like every other part of the contract, guardrails must be committed before launch. Post-hoc guardrails are rationalization, not protection.
- Setting guardrails so loose they never trigger. A guardrail with a 50% breach threshold is decorative. Anchor it to a real cost — "if revenue per visitor drops more than 5%, kill."
- Having too many guardrails. Five guardrails create five debates and no decisions. Two or three with sharp thresholds beat ten with vague ones.
The fastest way to learn this vocabulary is to commit one experiment. The contract takes about five minutes to write.