Why decisions get lost (and how to keep the ones that matter)
Decisions get lost for a boring, mechanical reason: they are made in the places work actually happens - a Slack thread, a pull request comment, a quick call - and none of those places was built to keep a reason. The team acts on the decision, the thread scrolls away, and the why is gone within a sprint. Nobody decided to lose it. It just had nowhere to live.
That is the whole problem in one sentence. The rest of this is the mechanism in slow motion, because understanding the steps is what tells you where to intervene.
The anatomy of a lost decision
Picture a backend engineer named Daniel. On a Tuesday afternoon, the team is choosing how to throttle a webhook fan-out that keeps overwhelming a downstream service. There are three options on the table. Daniel and two others hash it out in a #platform thread, land on a token-bucket limiter in the gateway, and Daniel writes the follow-up: "going with the gateway limiter, not per-consumer backpressure - backpressure deadlocked the queue under the replay test last month." A thumbs-up emoji. Work resumes. The ticket gets closed.
Six things just happened, and only one of them survived.
- A decision was made (throttle at the gateway with a token bucket).
- A reason was attached (per-consumer backpressure deadlocked the queue).
- An option was rejected (the backpressure approach).
- A constraint was referenced (last month's replay test).
- The decision was acted on (the code shipped).
- None of it was written anywhere durable.
The code survives, so the choice survives. But the choice is the least useful part. What a future engineer needs is the second, third, and fourth items on that list - the reason, the rejected path, and the constraint. Those live only in a thread that, within two weeks, no one can find and no one thinks to look for.
This is the shape people who study the problem keep describing. Engineering intent gets created outside any system of record, and the rationale ends up in chat and threads no one can locate six months later. The decision was never lost in a dramatic way. It was simply never kept.
Why the reason evaporates first
Here is the cruel part of the timeline. The choice and the reason decay at different rates.
The choice is encoded in the system. You can read the code and infer that someone chose a gateway limiter. But code cannot tell you that per-consumer backpressure was tried and rejected for deadlocking the queue. Code records what is, never what was considered and dropped. So the rejected option and its reason - the most expensive knowledge the team produced that Tuesday - are exactly the parts with no home.
A few months later, Daniel rolls off the team. A new engineer, Mara, is staring at the fan-out path and notices the gateway limiter. It looks crude. There is a cleaner pattern available: push backpressure down to each consumer so they self-regulate, fewer moving parts in the gateway. Mara writes a tidy refactor, ships it, and feels good.
Mara has just rebuilt the abandoned path. The replay test that deadlocked backpressure the first time will deadlock it again, except now it is in production and the person who ran that test is gone. The team will spend a week rediscovering a fact it already knew and paid for once. This is the most common shape of a lost decision: not a gap in knowledge, but a re-litigation of a decision the team already settled and then forgot it had settled.
"Just write an ADR" is not the fix you think it is
The standard answer is architecture decision records. Write the decision down, with context and consequences, in a versioned file. It is a good idea, and on the right decisions it genuinely works.
The trouble is that ADRs depend on discipline that real teams do not sustain. The failure pattern is well documented. Teams log trivial decisions and cosmic ones while skipping the load-bearing ones, the file count grows, and the signal drowns in noise. Worse, an ADR captures a decision at a single moment and then quietly drifts out of sync as the system changes. Once the log disagrees with reality, engineers stop trusting it. They check it less, then stop writing new ones, and the practice dies a quiet death.
The format is not the problem. The dependency is. ADRs assume someone remembers to stop, switch contexts, and write a durable record after the messy part is over and the next thing is already pulling at their attention. That bet loses to the next interruption almost every time. You cannot fix a capture problem with a tool that still relies on manual capture.
And the friction is not small. Asana's research found that the average knowledge worker spends around 60% of the day on "work about work" - searching for information, switching apps, chasing status - rather than on the work they were hired for. Asking that same engineer to add one more deliberate writing step at the worst possible moment is the wrong ask, of the wrong person, at the wrong time.
What it would take to actually keep a decision
Work backwards from the failure. A decision gets lost because the reason and the rejected options have no durable home, and because the only proposed home requires a human to stop and fill it in. So the fix has two parts.
Capture where the signal already lives. The decision was made in a thread; that is where it should be caught. Not by asking Daniel to re-type it into a wiki, but by reading the surfaces the team already uses - chat, tickets, pull requests - and stitching the scattered signal into one decision with its reason and date attached. This is what a team knowledge graph is for: it turns the thread, the ticket, and the PR into a typed object that says "we chose the gateway limiter, because per-consumer backpressure deadlocked the queue under the replay test, on this date" - and keeps the rejected option visible instead of letting it vanish.
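To make "typed object" concrete, here is a minimal sketch of what such a record might hold for Daniel's decision. Everything in it is an assumption made for illustration: the class names (Decision, RejectedOption, Source), the field names, the source locators, and the date are placeholders, not the schema of any particular product.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Source:
    """Pointer back to where part of the decision was said (hypothetical shape)."""
    kind: str  # "chat", "ticket", or "pull_request"
    ref: str   # placeholder locator; a real system would store a link


@dataclass(frozen=True)
class RejectedOption:
    """An option that was considered and dropped, with the reason it was dropped."""
    option: str
    reason: str


@dataclass(frozen=True)
class Decision:
    """One decision stitched together from several surfaces."""
    choice: str
    reason: str
    decided_on: date
    rejected: Tuple[RejectedOption, ...] = ()
    sources: Tuple[Source, ...] = ()

    def why(self) -> str:
        # Answers "why did we do it this way?"
        return f"{self.choice}, because {self.reason} ({self.decided_on.isoformat()})"

    def what_we_rejected(self) -> List[str]:
        # Answers "what did we reject?"
        return [f"{r.option}: {r.reason}" for r in self.rejected]


decision = Decision(
    choice="token-bucket limiter in the gateway",
    reason="per-consumer backpressure deadlocked the queue under the replay test",
    decided_on=date(2024, 3, 5),  # placeholder date; the article does not give one
    rejected=(
        RejectedOption(
            option="per-consumer backpressure",
            reason="deadlocked the queue under the replay test",
        ),
    ),
    sources=(
        Source("chat", "#platform thread"),            # placeholder locators
        Source("ticket", "throttling ticket"),
        Source("pull_request", "gateway limiter PR"),
    ),
)

print(decision.why())
print(decision.what_we_rejected())
```

The point of the shape is that the rejected option is a first-class field rather than a sentence buried in a thread, so the record can answer both "why" and "what did we reject" without anyone digging through chat.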
Record changes as changes, not overwrites. Decisions change. Backpressure might become viable later when the consumer pool is rebuilt. When that happens, the right move is not to delete the old record and write a new one, which throws away the reason the first decision existed. The right move is to keep the history as a before-to-after diff with the reason and the date. A changed decision is a change, not a conflict. The path from "why we believed X" to "why we moved to Y" is the single most valuable thing for the next person, and overwriting destroys it.
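One way to picture "a change, not an overwrite" is an append-only list of revisions, where the current decision is simply the latest entry and every earlier reason stays readable. The sketch below assumes hypothetical names (Revision, DecisionHistory) and placeholder dates, and it plays out the article's hypothetical case in which backpressure becomes viable after the consumer pool is rebuilt.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Revision:
    """One state of the decision, with the reason it was adopted."""
    choice: str
    reason: str
    changed_on: date


@dataclass
class DecisionHistory:
    """Append-only history for a single decision topic (illustrative only)."""
    topic: str
    revisions: List[Revision] = field(default_factory=list)

    def record(self, choice: str, reason: str, changed_on: date) -> None:
        # Append only: earlier revisions are never edited or removed.
        self.revisions.append(Revision(choice, reason, changed_on))

    @property
    def current(self) -> Revision:
        # The latest revision is the decision in force.
        return self.revisions[-1]

    def diffs(self) -> List[Tuple[Revision, Revision]]:
        # Pair each revision with the one that replaced it (before, after).
        return list(zip(self.revisions, self.revisions[1:]))


history = DecisionHistory("webhook fan-out throttling")
history.record(
    "gateway token-bucket limiter",
    "per-consumer backpressure deadlocked the queue under the replay test",
    date(2024, 3, 5),   # placeholder date
)
history.record(
    "per-consumer backpressure",
    "consumer pool was rebuilt",  # the hypothetical later change from the text
    date(2024, 9, 10),  # placeholder date
)

for before, after in history.diffs():
    print(f"{before.choice} -> {after.choice} on {after.changed_on}: {after.reason}")
    print(f"  originally chosen because: {before.reason}")
```

Because nothing is ever deleted, the reason the gateway limiter was chosen is still there after the team moves off it, which is exactly the part an overwrite would have thrown away.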
This is the core idea behind decisions that do not rot: the record is not a snapshot someone has to maintain, it is a living object the system keeps current, with sources you can trace back to the original thread. And it holds to one rule - Ody senses continuously, but it acts only when a human says so. When it spots a decision that looks changed, it nudges the owner to confirm or revise. A nudge is the ceiling. There are no silent overwrites, because a log a machine rewrote with no human in the loop is exactly how you get a record that is confidently, invisibly wrong.
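The "sense continuously, act only when a human says so" rule can be sketched as two separate code paths: one that only raises a nudge, and one that changes the record and requires the owner. This is an illustration of the rule as described above, not Ody's actual implementation or API; the names DecisionLog, Nudge, sense, and confirm, along with the identifiers in the usage lines, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Nudge:
    """A request for the owner to confirm or revise; it changes nothing by itself."""
    decision_id: str
    owner: str
    observed_choice: str
    evidence: str
    status: str = "pending"  # "pending", "confirmed", or "dismissed"


class DecisionLog:
    def __init__(self) -> None:
        self.current: Dict[str, str] = {}  # decision_id -> recorded choice
        self.nudges: List[Nudge] = []

    def sense(self, decision_id: str, owner: str,
              observed_choice: str, evidence: str) -> Optional[Nudge]:
        # Sensing never writes to self.current. At most it raises a nudge.
        if self.current.get(decision_id) == observed_choice:
            return None
        nudge = Nudge(decision_id, owner, observed_choice, evidence)
        self.nudges.append(nudge)
        return nudge

    def confirm(self, nudge: Nudge, approved_by: str) -> None:
        # The only path that changes the record, and it requires the owner.
        if approved_by != nudge.owner:
            raise PermissionError("only the decision owner can confirm a change")
        self.current[nudge.decision_id] = nudge.observed_choice
        nudge.status = "confirmed"

    def dismiss(self, nudge: Nudge) -> None:
        # The owner decided the record should stay as it is.
        nudge.status = "dismissed"


log = DecisionLog()
log.current["fanout-throttle"] = "gateway token-bucket limiter"

nudge = log.sense(
    "fanout-throttle",
    owner="owner@example.com",  # hypothetical owner identifier
    observed_choice="per-consumer backpressure",
    evidence="a pull request moves throttling into the consumers",
)

# The record is untouched until a human acts on the nudge.
before_confirm = log.current["fanout-throttle"]
print(before_confirm)

log.confirm(nudge, approved_by="owner@example.com")
print(log.current["fanout-throttle"])
```

In this sketch the owner could just as easily call dismiss, which is what should happen if the observed change is Mara's refactor and the replay test still stands.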
The test: can your team answer "why" in thirty seconds?
Here is a concrete way to check whether your decisions are getting lost. Pick a non-obvious choice in your codebase - a queue, a caching layer, a vendor, a schema call. Ask anyone on the team two questions, and give them thirty seconds to answer: Why did we do it this way? and What did we reject?
If the answer is "ask Daniel" and Daniel is the only one who knows, the decision is not kept. It is borrowed against one person's memory, and you will learn how much you leaned on them the Friday they give notice. If the answer is "I'm not sure, let me dig through Slack," the decision is already lost; you just have not paid for it yet.
The goal is not more documentation. It is fewer rebuilt mistakes. A decision is kept when its reason and its rejected options are callable by the next person - and increasingly by the next agent, since the coding assistant refactoring your fan-out path needs to know about that replay test exactly as much as Mara did. When the why is captured at the moment it is spoken and kept as it changes, nobody rebuilds the abandoned path, because the abandoned path is right there, with the reason it was abandoned still attached.
If you want to see what this looks like across a single week of one team's decisions, the week-in-one-team story walks through it. Or book a demo and bring the one decision your team keeps re-litigating.
Common questions
Why do engineering decisions get lost so often?
Most decisions are made in the places work happens - a Slack thread, a pull request comment, a hallway conversation - and those places were never built to store reasoning. The conclusion gets acted on, but the why is never written anywhere durable. Documentation that depends on someone remembering to write it after the fact loses to the next interruption almost every time.
Don't architecture decision records (ADRs) solve this?
ADRs help when teams maintain them, but most teams stop. An ADR captures a decision at a single moment and then drifts out of sync as the system changes. Once the log disagrees with reality, engineers stop trusting it, check it less, and eventually stop writing new ones. The format is fine; the problem is that it depends on continuous manual upkeep that real teams do not sustain.
What actually gets lost when a decision disappears?
A decision has three parts: the choice itself, the reason behind it, and the options you rejected. The choice usually survives in the code. The reason and the rejected options are what evaporate, and they are exactly what a future engineer needs. Without them, someone eventually rebuilds a path the team already tried and abandoned, with no memory that it failed.
How do you keep decisions from getting lost without more process?
Stop relying on people to record decisions by hand. Capture the signal where it already happens - chat, tickets, pull requests - and let a system stitch it into a decision with its reason and date attached. When the decision later changes, record the change as a before-to-after diff rather than overwriting it, so the history of why stays intact and callable by both people and AI agents.