Define rollback metrics and automated reversal triggers before deploying — make every deployment a two-way door
When implementing feature flags, canary deployments, or A/B tests, treat the deployment decision as a two-way door by defining rollback metrics and automated reversal triggers before deployment.
Why This Is a Rule
Software deployment is a natural laboratory for the one-way/two-way door framework (Classify every decision as one-way or two-way door before deliberating — minutes for reversible, days for irreversible). Feature flags, canary deployments, and A/B tests are specifically designed to convert one-way doors (ship code to all users) into two-way doors (ship code to a subset with automatic rollback). But the reversibility these tools enable only works if the reversal criteria are defined before deployment — not after you're staring at a production incident trying to decide whether things are "bad enough" to roll back.
Pre-defined rollback metrics ("error rate >2%," "p95 latency >500ms," "conversion rate drops >5%") remove human judgment from the reversal decision at the moment when judgment is most compromised — during a crisis with stakeholders watching. Automated reversal triggers go further: they execute the rollback without requiring anyone to make a decision under pressure. The deployment decision becomes genuinely two-way because reversal happens automatically when conditions warrant.
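As an illustration, the three example thresholds above can be written down as data before the deployment ships, so that the rollback check becomes a lookup rather than a judgment call. This is a minimal sketch, not a prescribed implementation: the names `RollbackCriterion`, `CRITERIA`, `breached_criteria`, and `should_roll_back` are hypothetical, as are the metric keys. It assumes the monitoring system can supply a snapshot of current values as a dictionary, with conversion expressed as a relative change against the baseline.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RollbackCriterion:
    """One pre-agreed rollback condition (hypothetical structure)."""
    metric: str       # key expected in the metrics snapshot
    threshold: float  # limit agreed on before deployment
    direction: str    # "above": breach when value > threshold; "below": value < threshold

    def breached(self, value: float) -> bool:
        if self.direction == "above":
            return value > self.threshold
        return value < self.threshold


# The example thresholds from the text, fixed before deployment.
CRITERIA = [
    RollbackCriterion("error_rate", 0.02, "above"),          # error rate > 2%
    RollbackCriterion("p95_latency_ms", 500.0, "above"),     # p95 latency > 500 ms
    RollbackCriterion("conversion_change", -0.05, "below"),  # conversion drops > 5%
]


def breached_criteria(snapshot: dict[str, float],
                      criteria: list[RollbackCriterion] = CRITERIA) -> list[str]:
    """Return the names of all criteria breached by the snapshot.

    Assumption: a metric missing from the snapshot is skipped, not treated
    as a breach. A real pipeline may prefer to fail closed instead.
    """
    return [c.metric for c in criteria
            if c.metric in snapshot and c.breached(snapshot[c.metric])]


def should_roll_back(snapshot: dict[str, float]) -> bool:
    """True when at least one pre-defined criterion is breached."""
    return bool(breached_criteria(snapshot))
```

Because the criteria are plain data, they can be reviewed alongside the deployment plan, and the check itself leaves no room for "maybe it'll stabilize."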
Without pre-defined criteria, the deployment becomes a de facto one-way door despite the technical capability to roll back. Teams delay rollback because "maybe it'll stabilize," "we've invested too much to revert," or "the metrics look bad but maybe it's normal variance." These are escalation-of-commitment biases (see "After irreversible commitments, schedule external reviews with pre-defined criteria: escalation of commitment corrupts self-assessment") operating in real time. Pre-defined criteria neutralize them because the rollback question is settled before anyone is invested in the answer.
When This Fires
- Before any deployment that uses feature flags, canary processes, or A/B testing
- When designing deployment pipelines and wanting to embed reversibility
- When a deployment goes sideways and no one can decide whether to roll back — this rule should have fired earlier
- When applying the rule "Before committing to a one-way door, search for ways to make it two-way: trial periods, exit clauses, pilots, or phased rollouts" (restructure one-way doors) to software engineering decisions
Common Failure Mode
Deploying with feature flags but no rollback criteria: "We can always roll back if it goes badly." Without a definition of "badly," you'll debate whether the current state is bad enough while the incident escalates. The feature flag is technically a two-way door, but operationally it's a one-way door because the reversal decision requires real-time deliberation under crisis conditions.
The Protocol
(1) Before deployment, define rollback metrics: which metrics, what thresholds, over what time window. Be specific: "error rate >2% sustained for >5 minutes" not "if errors increase significantly."
(2) Where possible, automate the reversal: set up monitoring that triggers automatic rollback when thresholds are breached.
(3) Define the observation window: how long after deployment before evaluating? Minutes for canary, hours for phased rollout, days for A/B tests.
(4) During the observation window, watch defined metrics only. Don't get distracted by metrics that weren't in the pre-defined set.
(5) If rollback triggers fire → roll back immediately. Do not override automated rollback without a specific, documented reason. The whole point of pre-defining criteria was to prevent crisis-state judgment from overriding rational analysis.
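The "sustained for >5 minutes" wording in step (1) and the observation window in step (3) can be made concrete with a small sketch. It rests on several assumptions: metric samples arrive as `(seconds_since_deploy, value)` pairs in ascending time order, and `SustainedThreshold`, `first_sustained_breach`, `observe`, and the `roll_back` callback are hypothetical names rather than part of any particular monitoring or deployment tool.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class SustainedThreshold:
    """Hypothetical rule such as 'error rate > 2% sustained for > 5 minutes'."""
    metric: str
    threshold: float
    sustain_seconds: float


def first_sustained_breach(samples: Iterable[tuple[float, float]],
                           rule: SustainedThreshold) -> Optional[float]:
    """Return the timestamp at which the rule fires, or None if it never does.

    A single spike does not fire the rule; the value must stay above the
    threshold for longer than rule.sustain_seconds without interruption.
    """
    breach_start: Optional[float] = None
    for ts, value in samples:
        if value > rule.threshold:
            if breach_start is None:
                breach_start = ts  # start of a continuous breach
            if ts - breach_start > rule.sustain_seconds:
                return ts
        else:
            breach_start = None  # breach interrupted; reset the clock
    return None


def observe(samples: Iterable[tuple[float, float]],
            rule: SustainedThreshold,
            observation_seconds: float,
            roll_back: Callable[[], None]) -> str:
    """Evaluate only samples inside the observation window, then act.

    If the pre-defined rule fires, roll_back() is called immediately,
    with no human decision in the loop (step 5).
    """
    window = [(ts, v) for ts, v in samples if ts <= observation_seconds]
    if first_sustained_breach(window, rule) is not None:
        roll_back()
        return "rolled_back"
    return "kept"
```

For example, with `SustainedThreshold("error_rate", 0.02, 300)` and one sample per minute, a single one-minute spike leaves the deployment in place, while an error rate that stays above 2% from the first sample triggers the callback at the 360-second sample. In a real pipeline the callback would flip the feature flag off or shift canary traffic back; that wiring depends on the tooling in use and is left out here.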