AI Glossary

Grokking

A phenomenon in which a neural network's test performance improves sharply long after it has memorized the training data, suggesting that generalizing internal structure can emerge well past the point of apparent overfitting.

The Discovery

First reported in 2022 on small algorithmic tasks (such as modular arithmetic). Models would quickly memorize the training data (near-perfect train accuracy, chance-level test accuracy), then, after far more training steps, suddenly 'grok' the underlying rule, with test accuracy jumping to near 100%.
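The setup behind these experiments is simple to reproduce. The sketch below builds the modular-addition dataset and a random train/test split; the modulus, the 50% train fraction, and the seed are illustrative choices, not values from any specific paper. A model trained on `train_idx` would be evaluated on `test_idx` throughout training to watch for the delayed jump in test accuracy.

```python
import numpy as np

# Modular addition task: predict (a + b) mod p from the pair (a, b).
# This is the style of small algorithmic dataset on which grokking
# was first reported; p = 97 is an illustrative choice.
p = 97
pairs = np.array([(a, b) for a in range(p) for b in range(p)])
labels = (pairs[:, 0] + pairs[:, 1]) % p

# Grokking experiments train on a fraction of all p*p pairs and
# monitor accuracy on the held-out rest over many training steps.
rng = np.random.default_rng(0)          # fixed seed for a reproducible split
idx = rng.permutation(len(pairs))
n_train = int(0.5 * len(pairs))         # 50% train fraction (illustrative)
train_idx, test_idx = idx[:n_train], idx[n_train:]
```

During memorization, train accuracy saturates while test accuracy stays near chance (about 1/p here); grokking is the later, often abrupt, rise in test accuracy.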

Implications

Grokking challenges the assumption that generalization must come quickly or not at all. It suggests that models can develop structured internal representations even when they initially memorize, raising questions about when to stop training.


Last updated: March 5, 2026