Question
Why does comparative monitoring fail?
Quick Answer
Comparing agents on a single metric and declaring a winner. One agent may score higher on throughput but lower on sustainability. Another may look worse this week but was operating under unusual conditions. The failure is premature convergence — collapsing a multi-dimensional comparison into a.
The most common reason comparative monitoring fails: Comparing agents on a single metric and declaring a winner. One agent may score higher on throughput but lower on sustainability. Another may look worse this week but was operating under unusual conditions. The failure is premature convergence — collapsing a multi-dimensional comparison into a single number and optimizing for it, which is precisely the trap Goodhart's law describes.
The fix: Pick two agents (habits, routines, or decision rules) that serve similar goals. Define 2-3 shared metrics. Track both for one week under comparable conditions. At the end, place the results side by side: which agent performed better on which metric? Did one dominate across the board, or did they trade advantages? Write a one-paragraph verdict explaining which you will keep, modify, or retire — and why.
The underlying principle is straightforward: Compare agents against each other and against baselines to identify relative performance.
Learn more in these lessons