Andrej Karpathy: The AI Workflow Shift Explained
The man who named "vibe coding" just revealed what comes after it
Most engineers think AI made them faster. Karpathy thinks it made them irrelevant to their own workflow.
When one of the most respected AI researchers alive tells you he hasn’t typed a line of code since December, that’s worth 2 hours of your time. I spent them. Here are the 10 biggest takeaways.
1. December 2025 Was the Inflection Point. Most Engineers Missed It.
“In December is when it really just... something flipped where I kind of went from 80-20 of writing code myself versus just delegating to agents to like 20-80.”
Something snapped in December. The ratio flipped and kept moving toward full delegation. He now delegates nearly everything to agents. He told his parents. They didn’t grasp it. That gap is the real story.
Agents could complete entire tasks, not just lines. The shift from writing to delegating happened in one month.
If your default workflow hasn’t changed since late 2025, you’re behind by a full professional generation.
2. “Coding” Is the Wrong Word. The New Verb Is “Manifest.”
“Code’s not even the right verb anymore. But I have to express my will to my agents for 16 hours a day. Manifest.”
A precise description of what the work actually is now.
You’re stating intent:
▫️ Decomposing a goal into tasks
▫️ Assigning tasks to agents
▫️ Reviewing outputs at the macro level
▫️ Iterating on the instructions themselves
The skill being built right now is judgment: what to delegate, how to specify it, how to review it fast.
Intent specification and task decomposition are the new coding. Start training that muscle.
3. You’re Working With a PhD Student and a 10-Year-Old. Simultaneously.
“I simultaneously feel like I’m talking to an extremely brilliant PhD student who’s been a systems programmer their entire life, and a 10-year-old.”
Karpathy calls this “jaggedness.” Models are unevenly capable in ways humans simply aren’t.
A human expert at systems programming is probably decent in adjacent domains. Models solve a hard distributed systems problem and then fail at something obvious.
Reinforcement learning improves everything verifiable:
▫️ Did the code work?
▫️ Do the tests pass?
Softer signals like knowing when to stop and ask fall outside that loop entirely.
Build checkpoints around the jaggedness. Your job is to catch the 10-year-old moments before they compound.
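One way to build those checkpoints, sketched as a toy gate (the check names and output shape are hypothetical): accept agent output only when every verifiable signal passes, and route everything else to a human.

```python
# Toy checkpoint around model "jaggedness": accept an agent's output only
# when every verifiable check passes; anything else goes to human review.
def checkpoint(output, checks):
    failures = [name for name, check in checks if not check(output)]
    return ("accept", []) if not failures else ("human_review", failures)

# Hypothetical agent output with verifiable signals attached.
output = {"compiles": True, "tests_pass": False}
checks = [
    ("compiles", lambda o: o["compiles"]),
    ("tests_pass", lambda o: o["tests_pass"]),
]
decision, failed = checkpoint(output, checks)  # ("human_review", ["tests_pass"])
```

The verifiable signals go in the loop; the soft signals stay with you.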
4. AutoResearch Beat 20 Years of Karpathy’s Intuition Overnight.
“I’ve gotten to a certain point and thought it was fairly well tuned. And then I let AutoResearch go overnight and it came back with tunings I didn’t see.”
Karpathy tuned NanoGPT by hand. Two decades of experience. He thought it was well-optimized.
He was wrong.
AutoResearch found that the weight decay on his value embeddings was wrong and that his Adam betas were off. These parameters interact: fixing one means re-tuning the others.
He’s one of the best researchers alive. And no human should be doing hyperparameter search anymore.
The recipe:
▫️ Define an objective
▫️ Create a verifiable metric
▫️ Set boundaries on what the agent can change
▫️ Remove yourself and let it run
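The recipe above reduces to a loop you can sketch in a few lines. This is a minimal random search, not Karpathy’s AutoResearch; the objective is a stand-in with a known optimum, and all names are illustrative:

```python
import random

# Stand-in objective with a known optimum; in practice this would be a
# full training run returning validation loss.
def evaluate(weight_decay, beta1, beta2):
    return (weight_decay - 0.1) ** 2 + (beta1 - 0.9) ** 2 + (beta2 - 0.95) ** 2

# Boundaries on what the search is allowed to change.
BOUNDS = {
    "weight_decay": (0.0, 0.3),
    "beta1": (0.8, 0.99),
    "beta2": (0.9, 0.999),
}

def auto_search(trials=500, seed=0):
    """Define an objective, set boundaries, remove yourself, let it run."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(trials):
        params = {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}
        loss = evaluate(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss

best_params, best_loss = auto_search()
```

Swap in a real training run for `evaluate` and the structure is the same: the human defines the objective and the boundaries, the loop does the searching.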
Ask yourself: what objective haven’t you handed over yet?
5. Token Throughput Is the New GPU Utilization.
“I feel nervous when I have subscription left over. That just means I haven’t maximized my token throughput.”
In the GPU era, an idle GPU meant wasted capital. The instinct was to keep them saturated.
Karpathy applies the exact same logic to token throughput. Idle tokens mean you’re the bottleneck. You’re not delegating fast enough. You’re not running enough agents in parallel.
Finishing one agent task and then starting the next means working serially in a parallel world.
The operators who win are learning to parallelize their judgment, not just their code.
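Parallelizing delegation is mechanical once tasks are independent. A minimal sketch, with the agent call as a stand-in for a real agent API:

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Stand-in for delegating a task to an agent and collecting its result.
def run_agent_task(task):
    time.sleep(0.1)  # simulate the agent working
    return f"done: {task}"

tasks = ["refactor auth", "write tests", "update docs", "tune config"]

# Serial would take ~0.4s here; dispatching all four at once takes ~0.1s.
# The human reviews results as they land instead of babysitting each run.
with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
    results = list(pool.map(run_agent_task, tasks))
```

The hard part isn’t the dispatch; it’s making the tasks independent enough to dispatch.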
6. Every App in the App Store Probably Shouldn’t Exist.
“These apps for smart home devices, they shouldn’t even exist in a certain sense. Shouldn’t it just be APIs, and shouldn’t agents be using them directly?”
Karpathy controls his lights, HVAC, shades, pool, and security through a single WhatsApp conversation with an agent called Dobby. He used to use six separate apps.
The agent called multiple APIs in sequence. It reasoned about the result. It took compound actions across systems. No single app could do that.
The UI layer is shifting toward agents acting on behalf of the person.
▫️ Build APIs first, UI second
▫️ Document for agents, not for users
▫️ Assume your product will be consumed programmatically
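In an API-first design, the “app” collapses into functions an agent can call and compose. A toy sketch in that spirit (every device, tool name, and intent here is hypothetical):

```python
# Hypothetical device APIs -- in an agent-first product these are the
# real surface; any UI is a thin layer on top.
def set_lights(on: bool) -> dict:
    return {"device": "lights", "on": on}

def set_thermostat(celsius: float) -> dict:
    return {"device": "hvac", "target": celsius}

# Registry the agent can discover directly; no per-device app required.
TOOLS = {"set_lights": set_lights, "set_thermostat": set_thermostat}

def agent_plan(intent: str):
    """Toy planner mapping an intent to a sequence of API calls."""
    if intent == "movie night":
        return [("set_lights", {"on": False}),
                ("set_thermostat", {"celsius": 21.0})]
    return []

def execute(intent: str):
    # Compound action across systems -- something no single app does.
    return [TOOLS[name](**args) for name, args in agent_plan(intent)]

actions = execute("movie night")
```

A real agent would do the planning itself; the point is that the tool registry, not the app, is the product.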
7. Agent Personality Is a Product Decision.
“With Claude, I think they dialed the sycophancy fairly well. When Claude gives me praise, I do feel like I slightly deserve it.”
Karpathy compares two agents:
▫️ Claude feels like a teammate. It modulates enthusiasm based on how good the idea actually is.
▫️ Codex is dry. It doesn’t seem to care what you’re building.
When an agent calibrates its feedback well, you trust its signals. When every idea gets the same neutral response, you stop using the agent as a thinking partner.
Personality is a product decision. Calibrated feedback builds trust. Flat affect kills it.
8. Documentation Is Dead. Write for Agents Now.
“You shouldn’t write documentation for people anymore. You should have Markdown documents for agents instead of HTML documents for humans.”
If an agent understands your library, the agent can explain it to any human who needs it. Your job is to make sure the agent gets it.
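A sketch of what agent-facing docs might look like: terse Markdown that states the contract up front. The library, function, and constraint below are invented for illustration:

```markdown
# widgets-lib: agent notes

- Entry point: `widgets.render(data: list[dict]) -> str`
- Constraint: `data` must be non-empty; raises `ValueError` otherwise.
- Stable surface: only `widgets.render`; treat everything else as internal.
```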
The framework for where to spend your time:
▫️ What the agent cannot do: your job
▫️ What the agent can do adequately: delegate it
▫️ What the agent can do better than you: stop doing it entirely
Spending significant time on things an agent can handle means adding friction, not value.
9. Jevons Paradox Says AI Multiplies Software Jobs.
“Software was scarce. So if the barrier comes down, the demand for software actually goes up.”
The ATM analogy: banks introduced ATMs expecting to replace tellers. Instead, the cost of running a branch fell, more branches opened, and teller employment rose.
Code is now cheaper to produce. Latent demand gets unlocked that was always there but too expensive to serve:
▫️ Every company that couldn’t afford a custom internal tool now can
▫️ Every workflow that required a developer now doesn’t
▫️ More software gets built. More demand for people who can direct it.
Cheaper production expands the addressable market. The skills that remain valuable are the ones that direct the new volume.
10. Open Source Will Win the Way Linux Won. By Being Everywhere.
“Linux is an extremely successful project. It runs on the vast majority of computers. There is a need in the industry to have a common open platform that everyone feels safe using.”
Karpathy maps AI directly onto the OS ecosystem:
▫️ Closed frontier models (OpenAI, Anthropic) = Windows and macOS. At the capability edge.
▫️ Open source models = Linux. Six to eight months behind. Gap narrowing fast.
The industry needs both. Frontier models handle the hardest problems. Open source handles the vast majority of simpler use cases and runs locally. Open source is foundational infrastructure.
The Karpathy Playbook for the Agentic Era
Now the bottleneck is you.
For founders: In three years, an agent will consume your software before a human does. Build APIs. Write Markdown for machines. Design for programmatic access.
For investors: Cheaper production expands the market. Token throughput infrastructure, agent orchestration, verifiable evaluation frameworks deserve serious attention.
For engineers: Identify one manual task with a verifiable output. Hand it to an agent. Remove yourself from the loop. That’s the first step.
The bottleneck is you. Maximize token throughput.
Move in macro actions. Delegate entire functionalities, not individual functions.
Write for agents first. Everything else will follow.
Personality is a product decision. Calibrated feedback beats flat affect.
What agents can’t do is your job. Everything else is theirs.
The new verb is manifest.
Start now.

