Catastrophic Forgetting
The tendency of neural networks to abruptly lose previously learned knowledge when trained on new data or tasks.
The Problem
When you fine-tune a neural network on a new task, the weight updates can overwrite the representations learned for earlier tasks. The model 'forgets' what it previously knew. This is a major challenge for continual/lifelong learning.
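The effect can be reproduced with a toy model. The sketch below (an illustrative example, not from any particular paper) trains a one-parameter linear model on task A, then continues training only on a conflicting task B, and measures how the task A loss degrades:

```python
import numpy as np

def mse(w, X, y):
    # Mean-squared error of the linear model y_hat = X @ w.
    return float(np.mean((X @ w - y) ** 2))

def grad_step(w, X, y, lr=0.1):
    # One gradient-descent step on the MSE loss.
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

# Task A: y = 2x.  Task B: y = -2x (a directly conflicting target).
X = np.linspace(-1, 1, 20).reshape(-1, 1)
yA, yB = 2 * X.ravel(), -2 * X.ravel()

w = np.zeros(1)
for _ in range(200):                # train on task A only
    w = grad_step(w, X, yA)
loss_A_before = mse(w, X, yA)       # near zero: task A is learned

for _ in range(200):                # then train on task B only
    w = grad_step(w, X, yB)
loss_A_after = mse(w, X, yA)        # task A performance collapses
```

Because the same weight must serve both tasks, the updates for task B drive it away from the task A solution: `loss_A_before` is essentially zero while `loss_A_after` is large. This is catastrophic forgetting in miniature.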
Mitigation Strategies
Elastic Weight Consolidation (EWC): adds a penalty for changing weights important to previous tasks.
Replay/Rehearsal: mixes old training data with new data.
Progressive Networks: adds new capacity for each task while freezing old parameters.
LoRA/Adapters: adds small trainable modules without modifying the base model.
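To make the EWC idea concrete, here is a minimal sketch on the same toy linear setup: the new-task gradient is combined with a quadratic penalty that anchors the weight to its task A optimum, weighted by a diagonal Fisher estimate. The penalty strength `lam` and the anchor `w_star` are illustrative choices, not values from the original paper.

```python
import numpy as np

X = np.linspace(-1, 1, 20).reshape(-1, 1)
yA, yB = 2 * X.ravel(), -2 * X.ravel()   # task A: y = 2x, task B: y = -2x

def mse(w, y):
    return float(np.mean((X @ w - y) ** 2))

# Assume task A training has converged to w_star = [2.0].
w_star = np.array([2.0])
# For a linear model with MSE loss, the diagonal Fisher is
# proportional to E[x^2] (importance of each weight to task A).
fisher = np.mean(X ** 2, axis=0)

lam = 10.0                               # EWC penalty strength (illustrative)
lr = 0.1
w = w_star.copy()
for _ in range(500):                     # train on task B with the EWC penalty
    grad_B = 2 * X.T @ (X @ w - yB) / len(yB)
    grad_penalty = 2 * lam * fisher * (w - w_star)
    w = w - lr * (grad_B + grad_penalty)
```

The penalty pulls the solution toward `w_star`, so the weight settles at a compromise (here exactly (2*lam - 2)/(lam + 1) ≈ 1.64 instead of -2) and the task A loss stays small, whereas unconstrained training on task B would destroy it.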
Relevance to LLMs
Catastrophic forgetting is a key concern when fine-tuning large language models. Parameter-efficient methods like LoRA help because they leave the base model's weights frozen and train only a small number of added parameters.
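A minimal numpy sketch of the LoRA idea, with a stand-in weight matrix rather than a real LLM layer: the frozen weight W is augmented by a low-rank product B @ A, and only A and B are trained. The shapes and rank here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen pretrained weight (stand-in for one LLM layer).
W = rng.standard_normal((8, 8))

r = 2                                    # low rank, r << 8
A = rng.standard_normal((r, 8)) * 0.01   # trainable down-projection
B = np.zeros((8, r))                     # trainable up-projection, zero-initialized

def forward(x):
    # Effective weight is W + B @ A; W itself is never updated.
    return x @ (W + B @ A).T

x = rng.standard_normal((4, 8))
base_out = x @ W.T
# Because B starts at zero, the adapted model initially matches
# the base model exactly; fine-tuning cannot have erased anything yet.
adapted_out = forward(x)
```

Only A.size + B.size = 32 parameters are trainable versus 64 in W, and since W is untouched, the original behavior can always be recovered by dropping the adapter.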