Concepts

The irreducible epistemic atoms underlying the curriculum. 4,828 atoms across 8 types and 2 molecules

After 10+ post-action reviews, analyze in aggregate — patterns across unrelated tasks reveal systemic tendencies, not isolated errors

After accumulating 10+ post-action reviews, analyze them in aggregate to identify structural causes appearing across multiple unrelated tasks—these recurring patterns indicate systemic tendencies requiring architectural fixes not isolated corrections.

After 10+ post-action reviews, analyze in aggregate — patterns across unrelated tasks reveal systemic tendencies, not isolated errors

Add verification checkpoints at every coupling point where one process consumes another's output unreviewed — break cascade chains

Design three operating modes for every important process: full, reduced, and minimal — degrade fidelity but never lose continuity

Define triggers for mode transitions in both directions — especially the return path from degraded back to full

Test degraded modes during calm periods — if they don't work when you're resourced, they'll certainly fail under stress

Occasionally run processes in degraded mode when not required — rehearse partial failure so mode-switching is familiar, not frightening

Document five recovery components for every important process: failure mode, detection trigger, recovery steps, time target, verification

Set RTO and RPO before failure, not during crisis — define maximum acceptable downtime and data loss while thinking clearly

Log every mistake for 30 days with date, event, and conditions — no analysis, just raw data for pattern detection

Automate pattern-based error detection before manual review — reserve human attention for contextual judgment tools can't handle

Never rely on human vigilance for low-frequency monitoring over 30+ minutes — attention degrades predictably regardless of motivation

Detect errors at the earliest catchable point — correction cost increases exponentially with downstream propagation distance

Multiply direct correction time by 3x for true cost — context-switching, opportunity cost, and verification overhead are invisible

When correction exceeds 20% of weekly capacity, shift investment from faster fixes to upstream prevention

Within 24 hours of an error, write one mechanistic sentence + the incorrect assumption it revealed — before memory reconstructs

Every active goal needs an error budget — define acceptable misses per period to convert brittle perfection into resilient tolerance

Three-tier error budgets: green (no action), yellow (investigate), red (halt and redesign) — with pre-committed responses per zone

Decompose error handling into four independent subsystems: detection, diagnosis, correction, learning — partial failure shouldn't disable the whole system

After one cycle, review the corrector itself — did the correction actually prevent the target error, or does the corrector need correcting?

Find the leading indicator that appears days before recurring errors manifest — hours of warning isn't enough for prevention

Define non-overlapping agent scopes — scope collisions are architecture problems, not willpower failures

Assign exactly one agent as accountable for each decision — consulted agents advise, but only one has final authority

Agent priority orderings must be context-specific — which agent wins depends on the time, capacity state, and situation

Map agent dependencies and arrange in topological order — no agent executes before its required inputs are available

After 10+ post-action reviews, analyze in aggregate — patterns across unrelated tasks reveal systemic tendencies, not isolated errors

Add verification checkpoints at every coupling point where one process consumes another's output unreviewed — break cascade chains

Design three operating modes for every important process: full, reduced, and minimal — degrade fidelity but never lose continuity

Define triggers for mode transitions in both directions — especially the return path from degraded back to full

Test degraded modes during calm periods — if they don't work when you're resourced, they'll certainly fail under stress

Occasionally run processes in degraded mode when not required — rehearse partial failure so mode-switching is familiar, not frightening

Document five recovery components for every important process: failure mode, detection trigger, recovery steps, time target, verification

Set RTO and RPO before failure, not during crisis — define maximum acceptable downtime and data loss while thinking clearly

Log every mistake for 30 days with date, event, and conditions — no analysis, just raw data for pattern detection

Automate pattern-based error detection before manual review — reserve human attention for contextual judgment tools can't handle

Never rely on human vigilance for low-frequency monitoring over 30+ minutes — attention degrades predictably regardless of motivation

Detect errors at the earliest catchable point — correction cost increases exponentially with downstream propagation distance

Multiply direct correction time by 3x for true cost — context-switching, opportunity cost, and verification overhead are invisible

When correction exceeds 20% of weekly capacity, shift investment from faster fixes to upstream prevention

Within 24 hours of an error, write one mechanistic sentence + the incorrect assumption it revealed — before memory reconstructs

Every active goal needs an error budget — define acceptable misses per period to convert brittle perfection into resilient tolerance

Three-tier error budgets: green (no action), yellow (investigate), red (halt and redesign) — with pre-committed responses per zone

Decompose error handling into four independent subsystems: detection, diagnosis, correction, learning — partial failure shouldn't disable the whole system

After one cycle, review the corrector itself — did the correction actually prevent the target error, or does the corrector need correcting?

Find the leading indicator that appears days before recurring errors manifest — hours of warning isn't enough for prevention

Define non-overlapping agent scopes — scope collisions are architecture problems, not willpower failures

Assign exactly one agent as accountable for each decision — consulted agents advise, but only one has final authority

Agent priority orderings must be context-specific — which agent wins depends on the time, capacity state, and situation

Map agent dependencies and arrange in topological order — no agent executes before its required inputs are available