Epoch
One complete pass through the entire training dataset during model training.
How Training Works
Training a neural network requires multiple passes through the data. Each pass is one epoch. Within each epoch, the data is split into batches, and the model updates its weights after each batch. A typical training run might involve 3 to 100+ epochs depending on the task.
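The epoch/batch loop above can be sketched in a few lines. This is a minimal illustration, not a real training setup: it fits a one-weight linear model with SGD, and the dataset, learning rate, and batch size are all made-up values chosen so the loop converges quickly.

```python
import random

# Toy dataset: pairs (x, y) with y = 3.0 * x, so the true weight is 3.0.
data = [(x, 3.0 * x) for x in range(1, 21)]

w = 0.0           # model weight (randomly initialized in practice)
lr = 0.001        # learning rate (illustrative)
batch_size = 5
num_epochs = 10   # one epoch = one complete pass over `data`

for epoch in range(num_epochs):
    random.shuffle(data)  # reshuffling each epoch is standard practice
    for i in range(0, len(data), batch_size):
        batch = data[i:i + batch_size]
        # Gradient of mean squared error w.r.t. w, averaged over the batch.
        grad = sum(2 * (w * x - y) * x for x, y in batch) / len(batch)
        w -= lr * grad    # weights update after each batch, not each epoch

print(round(w, 2))  # w should end up close to the true weight, 3.0
```

After ten epochs (ten full passes), the weight has converged; with far fewer epochs it would still be far from 3.0, which is the underfitting case discussed below.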
How Many Epochs?
Too few epochs: the model underfits (hasn't learned enough from the data). Too many epochs: the model overfits (memorizes the training data instead of generalizing). Early stopping monitors validation loss and halts training once it stops improving for a set number of epochs (the patience).
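Early stopping can be sketched as a counter over per-epoch validation losses. The function name, the `patience` hyperparameter, and the example loss values below are all illustrative assumptions, not a specific library's API.

```python
def early_stop_epoch(val_losses, patience=2):
    """Return the epoch index at which to halt training, or None.

    val_losses: validation loss recorded at the end of each epoch.
    patience:   how many epochs without improvement to tolerate (assumed value).
    """
    best = float("inf")
    bad_epochs = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss       # validation loss improved: reset the counter
            bad_epochs = 0
        else:
            bad_epochs += 1   # no improvement this epoch
            if bad_epochs >= patience:
                return epoch  # halt: no improvement for `patience` epochs
    return None               # never triggered; training ran to completion

# Validation loss falls, then rises: stopping fires two epochs past the minimum.
print(early_stop_epoch([0.9, 0.7, 0.6, 0.65, 0.7, 0.8]))  # -> 4
```

In practice, frameworks also restore the weights from the best epoch when stopping fires, so the rise in validation loss after the minimum is never kept.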
LLM Training
Large language models typically train for just 1-2 epochs over their massive datasets (trillions of tokens). Training for more epochs on the same data risks memorization and degraded generalization.