Pre-commit to prediction failure thresholds (X out of Y) that trigger automatic schema review
For each schema, set a prediction failure threshold as a numeric ratio (X failures out of the last Y predictions) before observing any prediction outcomes. Crossing the threshold triggers a review.
Why This Is a Rule
Without pre-set thresholds, schema review is triggered by subjective feeling: "I think this schema isn't working anymore." Subjective triggers are unreliable because they're biased by recency (the last failure feels like a pattern) and by emotional investment (heavily invested schemas resist review regardless of failure rate).
Pre-set numeric thresholds convert review triggers from subjective to objective: "If 3 of the last 10 predictions from this schema fail, trigger a review." The threshold is set before outcomes are observed, which prevents post-hoc rationalization. When the threshold is crossed, review is automatic, not a judgment call.
The ratio format (X out of Y recent predictions) is better than absolute counts because it accounts for prediction frequency. A schema that makes 50 predictions per month with 5 failures (10%) is performing very differently from one that makes 5 predictions per month with 3 failures (60%), even though the absolute failure counts (5 and 3) look similar and the raw count alone would make the first schema look worse.
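The arithmetic behind that comparison can be checked with a minimal sketch. The function name `failure_rate` is hypothetical and chosen only for illustration; the numbers are the ones from the paragraph above.

```python
def failure_rate(failures: int, predictions: int) -> float:
    """Return the fraction of predictions that failed (hypothetical helper)."""
    if predictions <= 0:
        raise ValueError("predictions must be positive")
    if not 0 <= failures <= predictions:
        raise ValueError("failures must be between 0 and predictions")
    return failures / predictions


# Figures taken from the text: 5 of 50 versus 3 of 5.
busy_schema = failure_rate(failures=5, predictions=50)
quiet_schema = failure_rate(failures=3, predictions=5)

print(f"busy schema:  {busy_schema:.0%}")   # 10%
print(f"quiet schema: {quiet_schema:.0%}")  # 60%
```

Ranking by raw failure count would put the busy schema first (5 versus 3); ranking by ratio reverses the order.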
When This Fires
- When setting up monitoring for any operational schema
- During schema documentation when defining review triggers
- When you want to remove subjectivity from the "when to review?" question
- After a schema failure when deciding whether to review or continue
Common Failure Mode
Setting thresholds after seeing the data: "My failure rate is 25%, so I'll set the threshold at 30%." This accommodates the current failure rate rather than challenging it. The threshold should be set based on what failure rate is acceptable for the schema's purpose, not based on what the current rate happens to be.
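One way to make this failure mode harder to fall into is to have the tracking tool refuse threshold changes once any outcome has been logged. The sketch below is only an illustration under that assumption; `SchemaLog` and its methods are hypothetical names, not part of any existing tool.

```python
class SchemaLog:
    """Hypothetical log that locks the threshold once outcomes exist."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.threshold = None   # (X failures, Y recent predictions)
        self.outcomes = []      # True means the prediction failed

    def set_threshold(self, failures: int, window: int) -> None:
        # Refuse to set or move the threshold after data has been seen.
        if self.outcomes:
            raise RuntimeError(
                "threshold must be set before outcomes are observed"
            )
        if not 0 < failures <= window:
            raise ValueError("need 0 < failures <= window")
        self.threshold = (failures, window)

    def record(self, failed: bool) -> None:
        # Outcomes cannot be logged until a threshold is committed.
        if self.threshold is None:
            raise RuntimeError("commit to a threshold before recording")
        self.outcomes.append(failed)


log = SchemaLog("example schema")
log.set_threshold(failures=3, window=10)   # committed up front
log.record(failed=True)

try:
    log.set_threshold(failures=5, window=10)  # attempt to loosen afterwards
except RuntimeError as err:
    print(f"rejected: {err}")
```

The guard does not decide what failure rate is acceptable; it only enforces the ordering, so the threshold cannot be fitted to the observed rate.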
The Protocol
For each monitored schema:
(1) Before observing any outcomes, set a threshold: "I will trigger full review if [X] of the last [Y] predictions fail."
(2) Start from a reasonable default of 3 of 10 (a 30% failure rate triggers review), then adjust for schema criticality: high-stakes schemas might trigger at 2 of 10.
(3) Track predictions and outcomes. When the threshold is crossed, review is automatic, not "let me think about whether to review."
The pre-commitment is what prevents motivated reasoning from overriding the signal.
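The protocol can be sketched as a small rolling-window monitor. This is a minimal illustration, not a prescribed implementation: the class name `SchemaMonitor` and its methods are hypothetical, and the defaults mirror the 3-of-10 figure from the text.

```python
from collections import deque


class SchemaMonitor:
    """Hypothetical monitor: review is due at X failures in the last Y."""

    def __init__(self, name: str, max_failures: int = 3, window: int = 10):
        if not 0 < max_failures <= window:
            raise ValueError("need 0 < max_failures <= window")
        self.name = name
        self.max_failures = max_failures   # X, fixed at construction
        self.window = window               # Y, fixed at construction
        # deque(maxlen=...) silently drops the oldest outcome when full.
        self._recent = deque(maxlen=window)

    def record(self, prediction_held: bool) -> bool:
        """Log one outcome and report whether a review is now due."""
        self._recent.append(prediction_held)
        return self.review_due()

    def review_due(self) -> bool:
        failures = sum(1 for held in self._recent if not held)
        return failures >= self.max_failures


monitor = SchemaMonitor("example schema")  # default: 3 of 10
for held in [True, False, True, False, True, False]:
    if monitor.record(held):
        print(f"Review '{monitor.name}' now: threshold crossed.")
```

One design choice to note: the check fires as soon as X failures have accumulated, even if fewer than Y predictions have been recorded, on the assumption that those failures would still count against the last Y once the window fills. High-stakes schemas could be constructed with `max_failures=2`.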