When dashboards are all-green but something feels wrong, check for absent signals
When a system reports all-green status but something feels wrong, immediately check for missing log streams, absent metrics, or silent services rather than trusting the presence of positive signals alone.
Why This Is a Rule
The most dangerous system failures are silent: a service stops reporting rather than reporting errors, a log stream goes quiet rather than producing error entries, a metric flatlines rather than spiking. The dashboard shows all-green — not because everything is working, but because the component that would report failure has itself failed.
Your gut feeling that "something is wrong" despite all-green status is often your pattern recognition registering an absence you can't consciously identify. The monitoring looks normal, but the pattern of data doesn't feel normal, because something expected is missing: a log stream that should be active, a metric that should be fluctuating, a service that should be generating events.
Trusting positive signals alone is survivorship bias applied to monitoring: you're looking at the systems that report healthy and assuming they represent the whole system. The services that have crashed silently aren't visible in the dashboard at all.
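One way to make the invisible services visible is to compare what is reporting against an independent list of what should be reporting. The sketch below assumes you can obtain both sets somehow; the source names and the `find_silent_sources` helper are hypothetical, not part of any monitoring tool.

```python
def find_silent_sources(expected, reporting):
    """Return expected sources that are absent from the set currently reporting."""
    return sorted(set(expected) - set(reporting))


# Hypothetical inventory: what should be reporting, kept separately from the dashboard.
expected_sources = {"api-gateway", "billing-worker", "search-indexer"}

# Hypothetical snapshot: sources that actually sent data recently.
reporting_sources = {"api-gateway", "search-indexer"}

silent = find_silent_sources(expected_sources, reporting_sources)
print(silent)  # ['billing-worker']
```

The inventory has to come from somewhere other than the dashboard itself (a deploy manifest or service registry, for example). A dashboard built only from whatever is reporting cannot show what is missing.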
When This Fires
- System dashboards are green but page load times feel slow or user reports don't match metrics
- After a deployment when monitoring shows no errors but something feels different
- During an incident when the root cause hasn't been found but all monitored services look healthy
- Any time your intuition says "wrong" while your dashboards say "fine"
Common Failure Mode
Dismissing the gut feeling because the data contradicts it: "The dashboard says everything is green, so I must be imagining things." In monitoring, the absence of error signals is not evidence of health. A health check that returns 200 OK tells you the endpoint is reachable, not that the service is functioning correctly, processing data, or producing output.
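The gap between "reachable" and "functioning" can be illustrated with a minimal sketch. It assumes the service exposes some cumulative work counter alongside its health endpoint; the `Probe` type and the `items_processed` field are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Probe:
    status_code: int      # HTTP status returned by the health endpoint
    items_processed: int  # hypothetical cumulative work counter from the service


def is_reachable(probe):
    """Shallow check: the endpoint answered 200."""
    return probe.status_code == 200


def is_making_progress(earlier, later):
    """Deeper check: reachable at both probes AND the work counter advanced."""
    return (
        is_reachable(earlier)
        and is_reachable(later)
        and later.items_processed > earlier.items_processed
    )


# Two probes taken some minutes apart: both return 200, but no work was done.
earlier = Probe(status_code=200, items_processed=1500)
later = Probe(status_code=200, items_processed=1500)

print(is_reachable(later))                 # True
print(is_making_progress(earlier, later))  # False
```

A counter that does not advance can also mean there was simply no work to do, so a check like this is a prompt to look closer rather than proof of failure.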
The Protocol
When you feel something is wrong despite all-green status:
1. List the services, log streams, and metrics you expect to see active.
2. For each one, verify it is actually producing current data: not just that the last data point was green, but that data points are arriving at the expected frequency.
3. Check for flatlines: a metric that hasn't changed in hours might mean stability, or might mean the reporting pipeline is broken.
4. Trust absent signals more than present ones. The thing that stopped reporting is more likely to be the problem than the thing that's still green.
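The freshness and flatline checks in the protocol can be sketched as follows. This is an illustration under stated assumptions, not a prescribed implementation: the `Series` type, the threshold defaults (`stale_factor`, `flatline_min_points`), and the sample data are all hypothetical, and the current time is passed in explicitly so the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class Series:
    name: str
    expected_interval_s: float  # how often a data point should arrive
    points: list                # [(unix_timestamp, value), ...], oldest first


def check_series(series, now, stale_factor=3.0, flatline_min_points=5):
    """Return findings for one series: 'no data', 'stale', 'flatline'; empty if none."""
    if not series.points:
        return ["no data"]
    findings = []
    last_ts, _ = series.points[-1]
    # Stale: the newest point is older than a few expected intervals.
    if now - last_ts > stale_factor * series.expected_interval_s:
        findings.append("stale")
    # Flatline: the most recent N values are all identical.
    recent = [value for _, value in series.points[-flatline_min_points:]]
    if len(recent) >= flatline_min_points and len(set(recent)) == 1:
        findings.append("flatline")
    return findings


now = 1000.0
flat = Series("queue_depth", 60, [(now - 60 * i, 7) for i in range(5, 0, -1)])
quiet = Series("orders_log", 60, [(now - 900, 3), (now - 840, 5)])
lively = Series("cpu_percent", 60, [(now - 60 * i, i) for i in range(5, 0, -1)])
missing = Series("billing_events", 60, [])

for s in (flat, quiet, lively, missing):
    print(s.name, check_series(s, now))
```

A "flatline" finding is only a prompt to investigate, since an unchanged value can also mean genuine stability. "No data" and "stale" findings are the absent signals the protocol says to weigh most heavily.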