Core Primitive
Keep a log of what you tried and what happened for future reference.
The experiment you already ran
You have tried this before. You have already tested some change to your routine, your diet, your work habits, your sleep schedule, your communication style — and it either worked, failed, or landed somewhere ambiguous. You remember the gist. You remember whether you liked how it felt. You might even remember a specific moment, one particularly good day or one frustrating setback, that stands in for the entire experience in your memory.
But you did not write it down. And because you did not write it down, what you "remember" is not what happened. It is a reconstruction — assembled after the fact by a brain that is remarkably skilled at creating coherent narratives and remarkably poor at preserving accurate data. The hypothesis you started with has been quietly revised in your memory to match the outcome you observed, so it feels like you "knew it all along." The days that fit your overall impression have been amplified; the days that contradicted it have faded. The ending — whatever happened last — has disproportionately colored your assessment of the whole experience. And any detail that did not fit the story your brain is telling has been silently discarded.
This is not a failure of character. It is a structural feature of human memory. And it means that an experiment without recorded results is, for all practical purposes, an experiment that never happened. You cannot build on what you cannot accurately recall. You cannot iterate on what you have already distorted. The primitive for this lesson is simple and urgent: keep a log of what you tried and what happened for future reference. Not because you might forget. Because you will forget — and worse, you will remember something that did not happen the way you think it did.
Why memory fails experiments
The problem is not that your memory is weak. The problem is that your memory is strong in exactly the wrong ways for preserving experimental data. Daniel Kahneman spent decades documenting the systematic distortions that separate the "experiencing self" — the one who lives through each moment of an experiment — from the "remembering self" — the one who later reconstructs what happened. The remembering self does not replay the experience like a video. It edits, compresses, and rewrites the experience according to rules that serve narrative coherence rather than factual accuracy.
Three of these rules are particularly destructive for experimental memory.
The first is hindsight bias. Once you know how an experiment turned out, your brain retroactively adjusts your memory of what you expected. Baruch Fischhoff's original research demonstrated that this is not occasional or mild — it is pervasive, automatic, and resistant to awareness. You cannot reliably remember what you predicted before you saw the outcome, because the outcome has already contaminated the memory of the prediction. Without a written hypothesis recorded before the experiment began, you have no trustworthy baseline for evaluating what you learned.
The second is the peak-end rule. Kahneman and colleagues showed that when people evaluate a past experience, they disproportionately weight the most intense moment and the final moment. A week-long experiment with four mediocre days, one excellent day, and a strong finish will be remembered as a success. The same experiment with four good days, one terrible day, and a weak finish will be remembered as a failure — even if the aggregate data tells the opposite story. Without a daily record, your assessment of "how the experiment went" is dominated by peaks and endings, not representative experience.
The third is confirmation bias in memory retrieval. When you try to recall details of a past experiment, you search in a direction determined by your current beliefs. If you currently want to believe that intermittent fasting works for you, your memory will preferentially surface the good days and suppress the bad ones. This is not deliberate dishonesty. It is the architecture of associative memory: retrieval is cued by your current state, and your current state biases which memories are accessible.
Together, these three distortions form a kind of anti-scientific memory system. They revise predictions to match outcomes, weight assessments toward dramatic moments rather than representative data, and filter recall through current beliefs. Scientists do not trust memory with this job. They record hypotheses before experiments begin, log data as it arrives, and separate observation from interpretation. The reason is not that scientists have worse memories. It is that they have recognized what memory actually does and built external systems to compensate. You need the same compensation for your personal experiments.
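To make the peak-end distortion concrete, the sketch below compares the plain average of a week's daily ratings with a peak-end summary. The ratings are invented for illustration, and scoring a week as the average of its most extreme day and its final day is a simplifying assumption, not a formula taken from the research.

```python
from statistics import mean

def peak_end_score(ratings, baseline=5):
    """Average of the most extreme day (furthest from baseline) and the last day.

    This is an illustrative simplification of the peak-end rule.
    """
    peak = max(ratings, key=lambda r: abs(r - baseline))
    return (peak + ratings[-1]) / 2

# Hypothetical daily ratings on a 1-10 scale.
week_a = [4, 4, 4, 4, 9, 7]  # four mediocre days, one excellent day, strong finish
week_b = [7, 7, 7, 7, 2, 4]  # four good days, one terrible day, weak finish

for name, week in (("A", week_a), ("B", week_b)):
    print(name, "mean:", round(mean(week), 2), "peak-end:", peak_end_score(week))
# A mean: 5.33 peak-end: 8.0
# B mean: 5.67 peak-end: 3.0
```

Week B has the higher daily average, yet its peak-end summary is far lower. Only a daily record lets you see the first number at all.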
What to record: the experiment journal format
The recording does not need to be elaborate. Elaborate systems get abandoned. It needs to be specific enough to be useful when you return to it months later, and structured enough that you can compare across experiments. The following six-field format captures the essential information for any behavioral experiment.
Hypothesis. State what you predicted would happen, written before you begin. This is the single most important field because it is the only defense against hindsight bias. Once you have a written prediction, your future self cannot retroactively claim they "knew it all along." "I predict that replacing my afternoon coffee with a ten-minute walk will maintain my energy level through 5 PM" is sufficient. The specificity gives you something concrete to evaluate rather than a vague impression.
Intervention. Document exactly what you did, with enough detail that you could replicate the experiment six months from now. Record not just the behavior but the context: when, for how long, what preceded it, what followed it, and any deviations from your plan. "I meditated" is too vague. "I sat in the blue chair at 7:15 AM, used the Waking Up daily meditation for approximately 10 minutes, with noise-canceling headphones, before checking email" is replicable. The contextual details you omit are often the ones that mattered most.
Observable data. Record what you actually observed, separating objective measurements from subjective impressions. Objective: "I completed four Pomodoro cycles instead of my usual two." Subjective: "I felt noticeably calmer during the first hour of work." Both types are valuable, but conflating them makes your records unreliable. Note at least one thing you did not expect — surprises are often the most informative data points, and they are the first things memory discards.
Outcome assessment. State whether your hypothesis was supported, partially supported, or not supported. Resist the temptation to rationalize. If the data is ambiguous, say so: "The walking seemed to help on three of five days, but confounding variables make clean attribution impossible. Verdict: tentatively supported, needs a cleaner replication." Honest ambiguity is far more useful than a false clean result, because it tells your future self exactly where the uncertainty lies.
Surprises and side effects. Document anything not part of your original hypothesis. The walk experiment was about afternoon focus, but you noticed you slept better on walking days. K. Anders Ericsson, in his research on deliberate practice, found that expert performers maintain detailed performance records precisely because patterns that drive improvement are often visible in the data long before they become visible to intuition. Your surprises field captures signals that your hypothesis did not predict but your future self will want to investigate.
Next steps. What will you try next? Every experiment should generate at least one follow-up: a replication with tighter controls, a variation that tests a related hypothesis, or a pivot that explores a surprising side effect. Writing the next step immediately, while the context is fresh, is far more effective than generating follow-up ideas weeks later from a cold start.
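The six fields above map naturally onto a small data structure. What follows is a minimal sketch, not a prescribed tool: the class name, field names, helper functions, and the JSON Lines file are all assumptions made for illustration. Observable data is split into two lists so that objective and subjective notes stay separate.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from datetime import date
from pathlib import Path

OUTCOMES = ("pending", "supported", "partially supported", "not supported")

@dataclass
class ExperimentRecord:
    hypothesis: str                      # written before the experiment begins
    intervention: str = ""               # what you did, in replicable detail
    objective_data: list[str] = field(default_factory=list)
    subjective_data: list[str] = field(default_factory=list)
    outcome: str = "pending"             # one of OUTCOMES
    surprises: list[str] = field(default_factory=list)
    next_steps: str = ""
    started: str = field(default_factory=lambda: date.today().isoformat())

    def __post_init__(self):
        # Checked at construction time only, not on later attribute assignment.
        if self.outcome not in OUTCOMES:
            raise ValueError(f"outcome must be one of {OUTCOMES}")

def append_record(record, path):
    """Append one record to the log as a single JSON line."""
    with Path(path).open("a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")

def load_records(path):
    """Read every record back from the log, oldest first."""
    with Path(path).open(encoding="utf-8") as f:
        return [ExperimentRecord(**json.loads(line)) for line in f if line.strip()]

# Usage: a throwaway temporary file stands in for your real journal.
log = Path(tempfile.mkdtemp()) / "experiments.jsonl"
rec = ExperimentRecord(
    hypothesis="Replacing my afternoon coffee with a ten-minute walk "
               "will maintain my energy level through 5 PM",
    intervention="Ten-minute walk in place of afternoon coffee",
)
rec.subjective_data.append("The walking seemed to help on three of five days")
rec.outcome = "partially supported"
rec.surprises.append("Slept better on walking days")
rec.next_steps = "Replicate with tighter controls"
append_record(rec, log)
```

Because the hypothesis is the only required field, the record can be created before the experiment starts and filled in as results arrive. An append-only file also makes it awkward to quietly revise an earlier entry.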
The minimum viable record
If six fields feel like too much overhead, start smaller. The best recording system is the one you actually use.
The minimum viable record is three sentences: what you tried, what happened, and what you will do next. "Tried replacing social media with reading for the first 30 minutes after waking. Energy was better on 4 of 5 mornings, but I missed the news-checking and felt anxious about it by midday. Next: try 15 minutes of reading followed by 15 minutes of curated news to test whether a smaller dose captures most of the benefit." That is forty-five seconds of writing and enough information to be useful months later.
James Pennebaker's research on expressive writing demonstrates that the act of translating experience into words — even briefly — changes how the brain processes and stores that experience. Writing about what happened forces cognitive processing that merely thinking about it does not. The record is not just a storage medium. It is a processing mechanism. The act of recording is itself part of learning from the experiment.
Start with three sentences. Graduate to six fields when the habit is stable. The critical threshold is not format sophistication. It is the difference between writing something and writing nothing.
How records compound into a personal science database
A single experiment record is useful. A collection of records across months and years is transformative. This is the compounding effect that separates people who are perpetually "trying things" from people who are building a genuine personal science.
After twelve months of consistent recording, you might have forty to fifty documented experiments. Individually, each tells you what happened once. Collectively, they reveal patterns of your own behavioral responsiveness. You might discover that morning experiments succeed at a higher rate than evening ones — suggesting your self-regulatory capacity is front-loaded. You might notice that experiments with social accountability consistently outperform solo ones, revealing something fundamental about your motivational architecture. You might find that all your successful experiments share an overlooked feature — perhaps environmental modification rather than willpower — which gives you a design principle for every future experiment.
These meta-patterns are invisible from inside any single experiment. They emerge only with enough data points to compare. This is the principle behind scientific meta-analysis applied to your own life: the real insights come from synthesis across many studies, not from any individual one.
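As a rough sketch of how such cross-experiment comparisons could be computed, the function below groups records by one feature and reports the share of supported hypotheses in each group. The records, feature names, and the `supported` flag are hypothetical, chosen only to mirror the morning-versus-evening and social-versus-solo examples.

```python
from collections import defaultdict

# Hypothetical experiment records, each tagged with comparable features.
records = [
    {"time_of_day": "morning", "accountability": "social", "supported": True},
    {"time_of_day": "morning", "accountability": "solo", "supported": True},
    {"time_of_day": "morning", "accountability": "solo", "supported": False},
    {"time_of_day": "evening", "accountability": "social", "supported": True},
    {"time_of_day": "evening", "accountability": "solo", "supported": False},
    {"time_of_day": "evening", "accountability": "solo", "supported": False},
]

def success_rate_by(records, feature):
    """Return {feature value: fraction of experiments whose hypothesis was supported}."""
    counts = defaultdict(lambda: [0, 0])  # value -> [supported, total]
    for r in records:
        tally = counts[r[feature]]
        tally[0] += int(r["supported"])
        tally[1] += 1
    return {value: s / n for value, (s, n) in counts.items()}

print(success_rate_by(records, "accountability"))
# {'social': 1.0, 'solo': 0.25}
```

With only a handful of records per group these rates are noisy, so treat a gap like this as a hypothesis for the next experiment, not a conclusion.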
Ikujiro Nonaka and Hirotaka Takeuchi described the critical transition from tacit knowledge — what you know but cannot articulate — to explicit knowledge that is externalized into communicable form. An unrecorded experiment is tacit: locked in your head, subject to every memory distortion described above, inaccessible to comparison. A recorded experiment is explicit: stable, searchable, and composable. The record converts lived experience into a knowledge asset that compounds over time rather than degrading.
Charles Darwin understood this intuitively. His experimental notebooks, meticulously maintained over decades, allowed him to compare observations made years apart and notice patterns that spanned long intervals. The notebooks were not merely a record of his science. They were the medium through which his science became possible.
Ericsson's research on expert performance confirms the same pattern at the individual level. Experts in domains from music to surgery maintain detailed records of their practice, including what went wrong. Novices practice and move on without documentation. The difference matters because improvement in complex domains requires identifying error patterns across many attempts — patterns invisible without records because each attempt is remembered only vaguely. The expert's practice log is not a burden on top of practice. It is the mechanism by which practice becomes deliberate rather than merely repetitive.
Recording creates an external checkpoint that your memory cannot retroactively edit. When you return to a journal entry from six months ago, you encounter your actual observations — including ones that contradict your current narrative. That discomfort means the record is working: preserving information your memory system would otherwise rewrite. Without records, you will repeat experiments you have already run, abandon experiments too early because distorted memory says they failed, and miss the cross-experiment patterns that would make future experiments dramatically more effective. With records, each experiment builds on every previous one.
The Third Brain
Your experiment journal becomes qualitatively different when you feed it into an AI system that can process patterns across entries you would never compare on your own.
Give an AI your last twenty experiment records and ask: "What patterns do you see across my successful experiments versus my failed ones?" It will scan dimensions you might not think to compare — time of day, day of week, whether the experiment added or subtracted a behavior, whether you ran it solo or with accountability, whether your hypothesis was specific or vague. The AI processes experiments dimensionally, as data points with comparable features, while you process them narratively, as stories. That complementary perspective surfaces correlations invisible to either mode alone.
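One way to prepare that request is to flatten your most recent records into a single block of text you can paste into whichever assistant you use. This is a sketch under assumptions: the function name and the dictionary keys are hypothetical, and no particular AI service or API is implied.

```python
QUESTION = ("What patterns do you see across my successful experiments "
            "versus my failed ones?")

def build_pattern_prompt(records, limit=20):
    """Format the most recent `limit` records as plain text, led by the question."""
    lines = [QUESTION, ""]
    for i, r in enumerate(records[-limit:], start=1):
        lines.append(f"Experiment {i}")
        for key in ("hypothesis", "intervention", "outcome", "surprises"):
            # Missing fields are marked explicitly, not silently skipped.
            lines.append(f"  {key}: {r.get(key, 'not recorded')}")
    return "\n".join(lines)

# Hypothetical record with only some fields filled in.
prompt = build_pattern_prompt([
    {"hypothesis": "A ten-minute walk maintains energy through 5 PM",
     "outcome": "partially supported"},
])
print(prompt)
```

Marking gaps as "not recorded" keeps the input honest: the AI sees what was written down at the time, with nothing filled in from memory afterwards.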
The AI also serves as real-time recording support. Dictate a quick observation about today's experiment, and let it format the entry into your six-field structure. Describe a surprising side effect, and let it flag the observation as a potential future hypothesis. Over time, the accumulated entries give the AI a picture of your experimental history built from what you wrote at the time, not from what you now remember. That picture is far less exposed to hindsight bias, to the pull of peaks and endings, and to selective retrieval of evidence that confirms your beliefs.
But recognize the boundary. The AI processes your records. It does not generate the observations. If you skip recording and try to reconstruct experiments from memory during a conversation, you are feeding distorted data into a pattern-recognition system. The AI amplifies whatever you give it. Give it real-time records, and it amplifies signal. Give it reconstructed memories, and it amplifies noise.
Building on what you recorded
The prerequisite lesson, "Experimental mindset reduces fear of failure," taught you to reframe every outcome as data. This lesson gives that data somewhere to live. Without a record, "data" is a nice metaphor. With a record, it is literal: an actual dataset of your behavioral experiments that becomes the foundation for everything that follows in this phase.
The next lesson, "Failed experiments are successful learning," builds directly on this foundation by examining what happens when your recorded results show that an experiment failed. With a detailed record in hand, you can analyze exactly what happened and why, turning failure from a discouraging dead end into a data-rich starting point for your next iteration. A failed experiment with a record is a map to your next breakthrough. A failed experiment without a record is just a bad memory that fades.
Write it down. Your future self is depending on it.