The bug you never found
Every programmer knows the nightmare: a system that produces wrong outputs, but passes every test. The code compiles. The dashboard is green. Users are complaining, but the monitoring shows nothing abnormal. The error exists — it has existed for weeks — but the system has no mechanism to detect it. And so correction never begins. Not because the team lacks skill. Not because the error is inherently unfixable. But because nothing in the entire infrastructure is looking for it.
The previous lesson established that all systems produce errors. This lesson makes the harder claim: the bottleneck in error correction is almost never the correction itself. It is the detection. The gap between a system that makes errors and a system that fixes them is not a gap in repair capability. It is a gap in sensing. You cannot fix what you cannot see, and most systems — cognitive, organizational, technical — are radically under-instrumented for seeing their own failures.
Your brain already knows this: the error-related negativity
Your brain has dedicated neural hardware for error detection, and neuroscience has been mapping it since the early 1990s.
In 1991, two independent research groups — Falkenstein and colleagues in Dortmund and Gehring and co-workers in Illinois — simultaneously discovered a distinct electrical signal in the brain that fires approximately 50 to 100 milliseconds after a person commits an error in a speeded response task. They called it the error-related negativity (ERN). The signal is generated primarily in the anterior cingulate cortex (ACC), a region of the medial prefrontal cortex that sits at the intersection of cognitive control and performance monitoring (Gehring et al., 1993; Falkenstein et al., 1991).
The critical finding is in the timing. The ERN fires before the person is consciously aware of having made an error. Your brain detects the error before "you" do. In many cases, corrective motor responses — the hand pulling back from the wrong key — begin before the conscious registration of the mistake. Detection is not just prior to correction in principle. It is prior in the literal neural sequence.
Patients with damage to the anterior cingulate cortex show diminished ERN signals and correspondingly impaired error detection. They still make errors at normal rates. They simply cannot detect them. And because they cannot detect them, they cannot correct them. The error rate stays constant. Not because repair is impossible, but because the detection mechanism is broken (Swick & Turken, 2002).
This is the neuroscience encoding of the lesson's primitive: the correction infrastructure is useless without the detection infrastructure. The ACC is the brain's error sensor. Damage the sensor and the entire downstream correction pipeline goes dark.
Reason's taxonomy: you detect slips differently than mistakes
James Reason, in his landmark Human Error (1990), drew a distinction that reshapes how you think about detection. Reason identified two fundamentally different categories of error: slips and mistakes. Slips occur when you have the right plan but execute it wrong — you meant to type "the" but typed "teh." Mistakes occur when the plan itself is wrong — you used a formula that does not apply to this problem, or you made a decision based on an assumption that was false.
The detection mechanisms for each are radically different. Slips are detected by what Reason called "auto-control processes" — rapid, largely automatic comparisons between intended action and actual action. Your fingers feel wrong on the keyboard. The word looks wrong on the screen. Detection is fast, often unconscious, and successful roughly 90% of the time in healthy individuals.
Mistakes are far harder to detect because the detection mechanism must operate at a higher level. You are not comparing action to intention — the action matched the intention perfectly. You are comparing the outcome to reality, which requires external feedback, domain knowledge, and often the passage of time. A manager who implements a flawed strategy executes the plan perfectly. The error is in the plan. Detection requires observing that the results diverge from expectations, and that requires both a clear expectation and a measurement system that can register the divergence.
Reason's framework reveals that investing in error detection means investing in two separate systems: a fast, automatic system for catching execution failures, and a slower, deliberate system for catching failures in reasoning and judgment.
Information theory: detection is a prerequisite by mathematical law
Claude Shannon's foundational work in information theory (1948) formalized a version of this principle that applies far beyond human cognition. In Shannon's framework, any message transmitted through a noisy channel will accumulate errors. The question is what you do about it.
Shannon showed that error-correcting codes work in two stages, and the stages are not interchangeable. First, the receiver must detect that an error has occurred. Only then can the receiver attempt to correct it. A simple parity check bit — one extra bit appended to a data string — can detect single-bit errors but cannot correct them. A Hamming code adds more redundancy, enabling both detection and correction, but the detection capacity must always equal or exceed the correction capacity. You can detect errors you cannot correct. You can never correct errors you cannot detect.
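The detect-but-cannot-correct property of a parity bit is easy to see in code. The sketch below is an illustrative construction, not drawn from Shannon's paper: an even-parity check notices that some single bit flipped, but carries no information about which bit, so correction is impossible.

```python
# A one-bit even-parity check: enough to detect a single flipped bit,
# but not to locate it, so no correction is possible.

def add_parity(bits: str) -> str:
    """Append an even-parity bit so the total count of 1s is even."""
    return bits + str(bits.count("1") % 2)

def parity_ok(word: str) -> bool:
    """True if the received word still has even parity (no error detected)."""
    return word.count("1") % 2 == 0

sent = add_parity("1011")                                          # "10111"
received = sent[:2] + ("0" if sent[2] == "1" else "1") + sent[3:]  # flip bit 2

print(parity_ok(sent), parity_ok(received))  # True False
```

The check knows *that* the received word is wrong, but a single parity bit cannot say *where*, which is exactly the gap between detection and correction.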
Richard Hamming, building on Shannon's work, made this hierarchy explicit in his error-correcting codes (Hamming, 1950). A code with minimum distance d can detect up to d−1 errors and correct up to ⌊(d−1)/2⌋ errors. Detection always covers at least as wide a range as correction, and for any code with d ≥ 2 it covers a strictly wider one. The mathematics is unambiguous: detection is the outer boundary, and correction operates within it.
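The bounds can be verified on a toy example. This sketch (an illustrative construction, not Hamming's) uses the 3-bit repetition code {000, 111}, whose minimum distance is 3:

```python
# Checking the detection/correction bounds on a minimum-distance-3 code:
# detectable errors = d - 1, correctable errors = (d - 1) // 2.

from itertools import combinations

def hamming_distance(a: str, b: str) -> int:
    """Number of bit positions in which two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))

# 3-bit repetition code: two codewords, minimum distance d = 3.
codewords = ["000", "111"]

d = min(hamming_distance(a, b) for a, b in combinations(codewords, 2))
detectable = d - 1           # errors the code can always detect
correctable = (d - 1) // 2   # errors the code can always correct

print(d, detectable, correctable)  # 3 2 1
```

With d = 3, a two-bit error is detectable (the received word is not a codeword) but not correctable (it sits closer to the wrong codeword): detection's range is strictly wider.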
This is not an engineering convenience. It is a mathematical constraint. Any system — biological, computational, cognitive — that attempts to correct errors without first establishing a detection mechanism that is at least as broad as the correction mechanism will produce false corrections: "fixing" things that were not broken, while leaving actual errors untouched.
Software engineering learned this the expensive way
The history of software engineering is a decades-long lesson in the primacy of detection over correction.
In the early decades of the field, the dominant approach was debugging: wait for something to break, then find and fix it. The industry spent years learning that this approach fails at scale. By the time a bug manifests as a user-visible failure, it has often propagated through multiple system layers, corrupted data, and created secondary errors that mask the original cause. Correction after the fact is expensive, error-prone, and often incomplete.
The modern discipline of observability inverts this entirely. Instead of waiting for failures and then diagnosing them, observability-first engineering instruments systems with logging, metrics, tracing, and alerting from the beginning — before any error has occurred. The goal is not to prevent errors (L-0481 established that this is impossible). The goal is to detect them as early, specifically, and automatically as possible.
The data supports the investment. Organizations with mature observability practices detect issues 2.1 times faster and achieve a 69% improvement in Mean Time to Recovery (Forrester, 2023). The recovery improvement is not because their engineers are better at fixing things. It is because their detection systems surface the error earlier, with more context, and with more precise localization. Better detection produces better correction, even with the same correction capability.
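What "instrumenting from the beginning" looks like in miniature: a sketch of a decorator that emits a duration metric and an error event for every call, so a failure surfaces in telemetry before any user reports it. All names here are hypothetical illustrations, not any particular vendor's observability API.

```python
# Illustrative sketch: instrument outputs before any error has occurred.
# Every call emits a log line with timing; every failure emits an error
# event with the same context, giving detection a head start on diagnosis.

import logging
import time
from functools import wraps

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("observability")

def instrumented(fn):
    """Wrap a function so both success and failure are observable."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
            log.info("%s ok duration_ms=%.2f", fn.__name__,
                     (time.perf_counter() - start) * 1000)
            return result
        except Exception:
            log.error("%s failed duration_ms=%.2f", fn.__name__,
                      (time.perf_counter() - start) * 1000)
            raise
    return wrapper

@instrumented
def handle_request(x: int) -> int:
    return 100 // x  # raises ZeroDivisionError on x == 0
```

The point of the design is that the instrumentation is written before the first failure: when `handle_request(0)` eventually arrives, the error event already carries the name, timing, and context needed to localize it.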
The principle generalizes beyond software. Any domain where errors compound over time — medicine, manufacturing, finance, personal habit systems — benefits more from earlier detection than from faster correction.
The AI parallel: anomaly detection as a first-class discipline
Machine learning has an entire subdiscipline devoted to error detection, and it is one of the fastest-growing areas in the field.
Anomaly detection systems — autoencoders, isolation forests, LSTM-based time series monitors — are designed to do exactly one thing: detect when a system's outputs deviate from expected patterns. An autoencoder trained on normal data learns to reconstruct that data with low error. When it encounters an anomaly, reconstruction error spikes. The spike is the detection signal. What happens after detection — alerting, root cause analysis, automated remediation — is a separate pipeline that depends entirely on the detection signal arriving first.
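The reconstruction-error pattern can be illustrated without a neural network. The sketch below substitutes the simplest possible baseline model, mean plus or minus k standard deviations, for the autoencoder; it is the same mechanism in miniature: learn what "normal" looks like, then flag deviations from it.

```python
# Not an autoencoder, but the same shape of mechanism: model "normal"
# from error-free data, and treat any large deviation from that model
# as the detection signal.

from statistics import mean, stdev

def fit_baseline(normal_data):
    """Learn what 'normal' looks like from error-free observations."""
    return mean(normal_data), stdev(normal_data)

def is_anomaly(x, mu, sigma, k=3.0):
    """Detection signal: deviation beyond k sigmas from the baseline."""
    return abs(x - mu) > k * sigma

mu, sigma = fit_baseline([10.1, 9.8, 10.3, 10.0, 9.9, 10.2])
print(is_anomaly(10.1, mu, sigma), is_anomaly(42.0, mu, sigma))  # False True
```

Everything downstream, alerting, root cause analysis, remediation, keys off the boolean this function returns; without it, the rest of the pipeline never runs.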
Reinforcement learning makes the dependency even more explicit. An RL agent that cannot detect when its actions produced poor outcomes — because its reward signal is noisy, delayed, or absent — cannot update its policy. The agent continues executing the same flawed strategy, not because it lacks the capacity to learn a better one, but because the error signal never reaches the learning mechanism. Detection failure produces learning failure.
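A toy value-update sketch makes the dependency concrete (this is a standard incremental update, not any specific agent): when the reward signal never arrives, the update rule has nothing to act on, and the estimate never moves.

```python
# Minimal sketch: learning capacity is intact in both cases below;
# only the arrival of the error signal differs.

def update(value, reward, lr=0.1):
    """Standard incremental value update — the 'correction' mechanism."""
    return value + lr * (reward - value)

value = 0.5        # the agent's optimistic estimate of a bad action
true_reward = 0.0  # the action is actually worthless

# With detection: the reward signal reaches the learner, estimate falls.
with_signal = value
for _ in range(50):
    with_signal = update(with_signal, true_reward)

# Without detection: the reward is never observed, so no update happens.
without_signal = value

print(round(with_signal, 3), without_signal)  # 0.003 0.5
```

The second agent is not worse at learning; its correction machinery is identical. The error signal simply never reaches it, which is the lesson's point restated in four lines of arithmetic.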
The most sophisticated AI systems in production today — autonomous vehicles, medical diagnostic models, financial trading systems — invest more engineering resources in monitoring and anomaly detection than in the core models themselves. The industry has learned, through costly failures, that a model that works correctly 99% of the time but cannot detect the 1% failure mode is more dangerous than a model that works correctly 95% of the time with robust failure detection.
Metacognition: detection applied to your own thinking
The cognitive science of metacognition — thinking about thinking — is fundamentally a science of error detection applied inward.
Rabbitt and colleagues, beginning in the 1960s, established that humans can reliably detect and correct their own errors without requiring external feedback. But the capacity varies enormously between individuals, and it degrades predictably under specific conditions: cognitive load, time pressure, emotional arousal, and domain unfamiliarity all impair metacognitive monitoring (Yeung & Summerfield, 2012).
The practical implication is that your error detection system is not fixed. It is a skill that can be developed and a system that can be designed. Metacognitive calibration — the accuracy of your self-assessment — improves with structured practice. Students who routinely estimate their performance before receiving feedback and then compare their estimates to actual results develop sharper detection over time. The detection mechanism itself learns, but only if it receives feedback on its own accuracy.
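One way to give the detection mechanism that feedback is to score predictions against outcomes. The sketch below uses the Brier score, a standard calibration measure; the numbers are invented purely for illustration.

```python
# Calibration feedback in miniature: compare stated confidence to actual
# outcomes. Lower Brier score = better-calibrated self-assessment.

def brier_score(predictions):
    """predictions: list of (confidence that outcome is 1, actual 0/1)."""
    return sum((p - o) ** 2 for p, o in predictions) / len(predictions)

# The same four outcomes, assessed overconfidently vs. honestly.
overconfident = [(0.9, 1), (0.9, 0), (0.9, 0), (0.9, 1)]  # succeeded half the time
calibrated    = [(0.5, 1), (0.5, 0), (0.5, 0), (0.5, 1)]

print(brier_score(overconfident), brier_score(calibrated))
```

The overconfident assessor scores worse on identical outcomes; tracking that score over time is the feedback loop that lets the detection mechanism itself learn.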
This creates a recursive requirement that maps directly to the feedback loops you learned in Phase 24: your error detection system needs its own feedback loop. You need to detect not just errors in your output, but errors in your detection. Are you catching the errors that matter? Are you flagging false positives? Are your detection criteria calibrated to reality or to your assumptions?
The highest-performing individuals and organizations do not just detect errors. They monitor the performance of their detection systems. They detect detection failures.
Building your detection infrastructure
The practical protocol for strengthening error detection has four steps, and they must be executed in order.
Step 1: Define your error surface. Before you can detect errors, you must specify what counts as an error. This requires an explicit standard — a reference value against which outputs can be compared. If you do not have a standard, you do not have a detection mechanism. You have a vague sense of unease, which is not the same thing.
Step 2: Instrument your outputs. Create observation points where you can measure your actual outputs against your standards. For a writer, this might be a post-draft checklist. For a manager, this might be a weekly metrics review. For a decision-maker, this might be a prediction log that records what you expected and what actually happened. The key is that the observation must be specific, regular, and recorded.
Step 3: Separate detection from correction. This is the step most people skip. When you notice an error, the impulse is to fix it immediately. Resist this. Record the error first. Note what it is, where it occurred, and when you detected it. Build a detection log before you build a correction protocol. Detection done well produces the data that makes correction effective. Detection interrupted by premature correction produces neither good detection nor good correction.
Step 4: Audit your detection. Periodically review your detection log and ask: What errors did I catch? What errors did I miss that I later discovered through other means? Are there categories of error I am systematically blind to? This is metacognitive monitoring applied to your error detection system — the recursive loop that separates functional detection from the illusion of detection.
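The four steps above can be sketched as a minimal detection log. Every field and name here is hypothetical, chosen for illustration rather than prescribed by the protocol.

```python
# A hypothetical detection log implementing Steps 2-4: record errors
# without fixing them, then audit which detection channel catches them.

from collections import Counter
from dataclasses import dataclass

@dataclass
class ErrorRecord:
    category: str     # e.g. "slip" or "mistake", per Reason's taxonomy
    location: str     # where the error occurred
    detected_by: str  # "self", "review", "external" — Step 4 audits this

log: list[ErrorRecord] = []

def record(category: str, location: str, detected_by: str) -> None:
    """Step 3: record the error first; correction comes later."""
    log.append(ErrorRecord(category, location, detected_by))

def audit() -> Counter:
    """Step 4: which detection channel is actually catching errors?"""
    return Counter(r.detected_by for r in log)

record("slip", "draft section 2", "self")
record("mistake", "quarterly plan", "external")
record("mistake", "pricing model", "external")

print(audit())  # errors found externally outnumber self-detected ones
```

An audit that shows most errors arriving through external channels is itself a detection finding: the categories you catch yourself and the categories others catch for you map out your blind spots.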
Detection is the investment; correction is the return
Phase 25 is about error correction. But the lesson that makes all other lessons in this phase work is this one: correction is downstream of detection, always. Every correction mechanism — behavioral, computational, organizational — depends on a detection mechanism that is at least as broad, at least as fast, and at least as accurate as the corrections it enables.
Shannon proved this mathematically for information channels. Reason demonstrated it empirically for human error. The ACC implements it neurally in your brain. The software industry learned it through decades of production failures. Every domain converges on the same principle: invest in detection first.
The next lesson (L-0483) introduces a critical refinement: not all errors are the same kind. Execution errors, knowledge errors, and judgment errors require fundamentally different detection strategies and fundamentally different correction approaches. The classification you begin there builds directly on the detection infrastructure you build here. You cannot classify what you have not first detected.
Sources:
- Gehring, W. J., Goss, B., Coles, M. G. H., Meyer, D. E., & Donchin, E. (1993). "A neural system for error detection and compensation." Psychological Science, 4(6), 385-390.
- Falkenstein, M., Hohnsbein, J., Hoormann, J., & Blanke, L. (1991). "Effects of crossmodal divided attention on late ERP components. II. Error processing in choice reaction tasks." Electroencephalography and Clinical Neurophysiology, 78, 447-455.
- Reason, J. (1990). Human Error. Cambridge University Press.
- Shannon, C. E. (1948). "A Mathematical Theory of Communication." Bell System Technical Journal, 27, 379-423.
- Hamming, R. W. (1950). "Error Detecting and Error Correcting Codes." Bell System Technical Journal, 29(2), 147-160.
- Yeung, N., & Summerfield, C. (2012). "Metacognition in human decision-making: confidence and error monitoring." Philosophical Transactions of the Royal Society B, 367, 1310-1321.
- Swick, D., & Turken, A. U. (2002). "Dissociation between conflict detection and error monitoring in the human anterior cingulate cortex." Proceedings of the National Academy of Sciences, 99(25), 16354-16359.