The 80% problem maps directly to what we've seen in production. The gap isn't model capability — it's the compound cost of verification.
That last 20% isn't linear. Each incremental percentage point requires exponentially more human oversight, edge-case handling, and rollback infrastructure. The economics flip somewhere around 75-85%, depending on domain complexity.
Most teams underestimate this until they've built it twice.
Yeah I think what can be particularly insidious is that the first 70-80% feels so effortless that teams underestimate the verification infrastructure needed for the remainder. By the time they realize the exponential cost, they're already committed to the approach.
Building it twice is unfortunately common (I believe) - the first time to learn where the real costs hide.
"Committed to the approach" is the trap. We hit this exact point at Kult around 82% accuracy on shade matching. Leadership had already announced the feature. Engineering had invested months. The remaining 18% became a sunk cost negotiation rather than a design decision.
What finally worked: We stopped trying to close the gap with the same system. Built a hybrid where the AI handles the 80%, flags uncertainty, and routes edge cases to structured human review. The economics shifted overnight.
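To make "handles the 80%, flags uncertainty, routes edge cases" concrete, here is a minimal sketch of that routing shape in Python. The 0.85 threshold, the MatchResult fields, and the review queue are illustrative assumptions, not Kult's actual system:

```python
from dataclasses import dataclass
from typing import List, Optional

# Illustrative threshold - tune per domain; the real cutoff is a product decision.
CONFIDENCE_THRESHOLD = 0.85

@dataclass
class MatchResult:
    shade_id: str
    confidence: float  # model's self-reported confidence, 0.0-1.0

def route_match(result: MatchResult, review_queue: List[MatchResult]) -> Optional[str]:
    """Serve confident matches automatically; route uncertain ones to humans."""
    if result.confidence >= CONFIDENCE_THRESHOLD:
        return result.shade_id       # the ~80% the AI handles end to end
    review_queue.append(result)      # edge cases go to structured human review
    return None
```

The design point is that uncertainty becomes a first-class output that gets routed, rather than something the system papers over.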
Twice is mercy. Some teams build it three times before they realize the 100% target was never the right goal.
We should not take the opinion of Andrej too seriously. He is an AI researcher, and most AI researchers are generally at most passable software engineers, if not outright bad ones.
It is the same age-old problem of scientists' code.
Yeah, the distinction between "scientist code" and production engineering is fair (this is one reason I included the Claude Code team perspective too), but the patterns he describes - assumption propagation, abstraction bloat, sycophantic agreement - show up regardless of whether you're doing research or building enterprise systems.
Very good read; here is my perspective as well - https://open.substack.com/pub/amitabhsharan/p/the-one-question-that-matters-in
Thanks for sharing!
Thank you for this overview! Excellent read!
Thank you! I'm glad to hear it was helpful!
Great article, a lot of points really resonated with me.
Happy to hear it resonated, Harshal!
Wow! Extremely interesting to me! I love your essays around AI!
Happy if they are helpful in any way!
They are, indeed, undoubtedly!
Great article as always, Addy. We constantly see agentic systems optimize locally, but not structurally. Without deterministic guardrails, context decay turns “correct” diffs into long-term drift. This is why post-hoc review can’t keep up; governance has to run continuously at the point of change. That’s the problem we’ve built Mault to solve. (Mault.ai)
Yes! "optimize locally, not structurally" - that's a great way to frame it. Context decay is real, especially in longer agentic sessions where early decisions compound into architectural drift.
I think that continuous governance at the point of change probably makes sense.
Post-hoc review alone creates that 91% increase in review times we're seeing. Will check out Mault - interested to see your approach! :)
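For anyone wondering what "governance at the point of change" might look like mechanically, here is a minimal sketch: a gate that runs checks on every proposed change before it lands, instead of relying on post-hoc review. The commands and the scripts/check_architecture.py rule set are hypothetical illustrations, not Mault's product or API:

```python
import subprocess
import sys

# Guardrails that run on every proposed change, before it merges - not after the fact.
# The commands are illustrative; scripts/check_architecture.py is hypothetical.
CHECKS = [
    ["ruff", "check", "."],                        # style and lint drift
    ["pytest", "-q"],                              # behavioural regressions
    ["python", "scripts/check_architecture.py"],   # structural / architectural rules
]

def gate_change() -> int:
    """Reject the change at the point it is made if any guardrail fails."""
    for cmd in CHECKS:
        if subprocess.run(cmd).returncode != 0:
            print(f"Rejected at point of change: {' '.join(cmd)} failed")
            return 1
    return 0

if __name__ == "__main__":
    sys.exit(gate_change())
```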
Addy, I’d love to provide you with a free Mault Pro account and get your feedback. I’ll DM you with it shortly.
Thanks Addy, a great read.
Thanks for the kind words, Nick!
What would you tell someone who wants to learn to become a software developer starting today?
(I see three responses, heavily concentrated around (1) and (3):
1) Don't bother. The job as you know it won't exist in 2–3 years. It's like learning to ride a horse in 1906.
2) Still do it, but it's risky - like learning to ride horses in 1906. The occupation will massively shrink in quantity and compensation, but you may be one of the ones who make it.
3) The best time to learn coding/software engineering.)
Great question. I'd probably lean toward (2) with optimism skewed toward (3) for the right people.
The fundamentals matter more than ever: understanding systems, architecture, debugging, problem decomposition. AI doesn't eliminate the need for these - it amplifies them. If you learn to code today with AI as a learning accelerator (not a crutch), you can cover more ground faster than previous generations.
But the job is changing imo. If your goal is to write syntax all day, that's disappearing. If your goal is to solve problems with software, we need more people who can do that well - especially people who understand both the technical fundamentals AND how to orchestrate AI effectively.
The field won't disappear, but I do think it will transform (and we don't know exactly what shape that is going to take yet). Learning now means you grow up as a native in this new paradigm rather than having to unlearn old habits later.
Until it deletes everything (Replit XD)
One thing I think is overlooked in this discussion is the variety of ways that people read and understand code. I, for instance, have ALWAYS had trouble just looking at a wall of code and conceptualizing what's going on.
I have to step through it. I have to change a boolean value manually and watch the logs. That may mean I'm not a 10x developer, but if industry-wide adoption is the goal for AI, then they need lil' 1x engineers like me to feel like it's making their job easier.
And, frankly, it just isn't. If I can parallelize the work of conceptualizing the code and creating it, I find myself moving faster. The extent to which LLMs can assist me on that journey is the extent to which I have found them valuable or useful.
I've been thinking about the same problem a lot. I think the way most people approach these tools is skewed toward "replace your engineering team" rather than "enhance yours".
Product managers and engineering managers already think like that: how do I make myself clear, how do I delegate, how do I recognise a good fit - whereas engineers haven't developed this skill.
In a way, we are all becoming platform engineers. We own the platform; AI lives in it. How do we make it thrive?
Very interesting article, thank you! I understand the concept of running multiple AI agents ("orchestrator" role), but I wonder about the real long-term productivity gains, as multitasking (referred to as ‘micromanagement tax’) is known to be counterproductive.
The commitment trap you mention is real.
We hit this exact pattern with our shade matching system. By the time we realized the confidence scoring was masking failure cases, we'd already integrated it into three different product flows.
The pivot cost wasn't the code — it was recalibrating customer expectations. They'd gotten used to "AI always has an answer." Teaching them (and our CS team) that "I'm not sure, let me show you options" is actually better... that took longer than the rebuild.
The second build was faster but the trust rebuild was slower.
Agentic AI is the gateway to unlimited automation, which is a must if you want to save time and increase profits. Refer to Agentic: https://promptengineer-1.weebly.com/agentic.html
Also, Agentic AI Prompt Vault: https://promptengineer-1.weebly.com/agentic-ai-prompt-vault.html
Whatever it is that you’re afraid the agent is going to do, build a workflow that relentlessly tests for those things and rejects them as automatically as you can.
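As one concrete way to act on that advice, here is a minimal sketch that encodes two common fears - file deletions and edits to protected paths - as an automatic rejection check on an agent's branch. The PROTECTED paths and the main base branch are assumptions; swap in whatever you are actually afraid of:

```python
import subprocess

# Illustrative list of paths the agent must never touch - adapt to your own fears.
PROTECTED = ("migrations/", ".github/", "infra/")

def diff_is_acceptable(base: str = "main") -> bool:
    """Automatically reject an agent branch that deletes files or edits protected paths."""
    out = subprocess.run(
        ["git", "diff", "--name-status", base],
        capture_output=True, text=True, check=True,
    ).stdout
    for line in out.splitlines():
        status, path = line.split(maxsplit=1)
        if status.startswith("D"):          # the agent deleted a file
            return False
        if path.startswith(PROTECTED):      # the agent touched a protected area
            return False
    return True
```

Wire a check like this into CI so the rejection happens without a human in the loop.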