Vishnu Vardhan Bangalore, India
January 2026

Grokking

Machine learning researchers noticed something weird. They'd train small models on simple tasks, like addition with a twist (modular arithmetic, say), and the model would memorize its training examples perfectly but do no better than guessing on examples it hadn't seen. Then for thousands of training steps, nothing. Flatline. It looks stuck, just repeating what it saw without actually getting it.

Then out of nowhere, boom, it shoots up to perfect performance. Suddenly it can generalize. They called it grokking, a word Robert Heinlein coined in Stranger in a Strange Land for deep, intuitive understanding. The obvious story is that it was memorizing, then something clicked, and finally it understood.

But when they actually looked inside, that story was wrong. It wasn't stuck at all. During all those steps where it looked like nothing was happening, it was quietly building structure, putting the pattern together, strengthening the right circuits, slowly moving toward the solution. The breakthrough wasn't sudden. The metric just couldn't see it. Accuracy only tells you right or wrong. It doesn't show confidence going from 47% to 99%. The model was learning the whole time, we just couldn't tell.
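To make the metric point concrete, here is a toy sketch in Python. It is not a reproduction of any grokking experiment. The names `metrics_for_confidence` and `simulate` and the straight-line confidence ramp are all made up for illustration, and it assumes a single two-way question where the model counts as "right" once its confidence in the correct answer passes 50%.

```python
import math


def metrics_for_confidence(p_correct):
    """Return (accuracy, cross_entropy_loss) for one two-way example.

    Hypothetical helper: accuracy is all-or-nothing, while the loss
    moves smoothly with the confidence placed on the right answer.
    """
    accuracy = 1.0 if p_correct > 0.5 else 0.0
    loss = -math.log(p_correct)
    return accuracy, loss


def simulate(steps=10, start=0.05, end=0.95):
    """Ramp confidence linearly from start to end (an assumed schedule)."""
    history = []
    for i in range(steps):
        p = start + (end - start) * i / (steps - 1)
        acc, loss = metrics_for_confidence(p)
        history.append((i, p, acc, loss))
    return history


if __name__ == "__main__":
    for step, p, acc, loss in simulate():
        print(f"step {step}: confidence={p:.2f} accuracy={acc:.0f} loss={loss:.3f}")
```

Confidence climbs by the same amount at every step, and the loss falls the whole way, but accuracy sits at zero until the confidence crosses one half and then jumps to one. If accuracy is the only thing you watch, steady progress looks like a flatline followed by a leap.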

And we do the exact same thing to ourselves.

You know that plateau. Three weeks into some course and nothing makes sense. You read the book, do the problems, watch the lecture twice. The symbols are just symbols. You can follow along when someone guides you, but the second you're on your own, it all falls apart. Feels like carrying water in a sieve.

Then one day, supposedly, it clicks. Eureka. We love that story.

But really, what happened? You didn't just fail for a month and then wake up understanding. You failed, then failed in a different way, then got one piece right while messing up another, then got the right answer but couldn't say why, then could explain why but only for that exact problem. The click is real. The "suddenly" is just how we rewrite history afterward.

That eureka moment isn't knowledge arriving. It's knowledge reorganizing.

Before the click, you've got scattered pieces. Facts, formulas, half-remembered examples. They're in there but not organized. You can pull them up if someone asks the right way, but you can't move between them. You can't derive one from another.

The click is when those fragments snap into a shape. Not more information, but structure for the information you already have. You stop reciting steps and start actually moving through the ideas. You can rederive what you forgot because you know where it lives.

That's why it feels like lightning but isn't. Your mental model becomes a tool for reasoning, not just remembering. The insight didn't arrive. It crystallized.

Before the click, learning feels like accumulation. You're stacking facts, hoping the pile turns into a building. Every new piece feels heavy.

After the click, learning feels like recognition. New facts get pulled into what you already have, like iron filings to a magnet. An expert skims a paper and knows right away where it fits, not because they read faster, but because they have a place for it. The structure has gravity.

That's why experts seem to learn effortlessly in their domain. They're not smarter. They've built the scaffolding. New knowledge just slots in.

And that's why the plateau feels so demoralizing. You're building scaffolding, but scaffolding doesn't feel like progress. It feels like nothing, until suddenly it can hold weight.

So next time you're in that plateau, rereading, redoing problems, feeling like nothing connects, remember what's actually happening. You're not stuck. You're building structure that your own metrics can't see yet.

The click will come. Not as a lightning bolt, but as the moment you finally notice what you'd already built.

Keep grokking.