Noise (in ML)
Random or irrelevant variation in data that does not reflect the true underlying pattern. A model should learn to ignore noise rather than memorize it.
Sources
Measurement errors, label errors, irrelevant features, natural variability, and data collection artifacts. In images: sensor noise, compression artifacts. In text: typos, ambiguous labels, subjective annotations.
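Two of these sources, measurement error and label error, are easy to simulate, which helps when testing how sensitive a model is to noise. The following is a minimal sketch rather than a standard recipe: the helper names `add_feature_noise` and `flip_labels`, the Gaussian noise model, and the independent label-flip model are all illustrative assumptions.

```python
import random


def add_feature_noise(values, sigma, rng):
    """Add zero-mean Gaussian noise to each value (a stand-in for measurement error)."""
    return [v + rng.gauss(0.0, sigma) for v in values]


def flip_labels(labels, flip_prob, rng):
    """Flip each binary label (0 or 1) independently with probability flip_prob."""
    return [1 - y if rng.random() < flip_prob else y for y in labels]


rng = random.Random(0)  # fixed seed so repeated runs give the same noise

clean_x = [0.0, 1.0, 2.0, 3.0]
clean_y = [0, 0, 1, 1]

noisy_x = add_feature_noise(clean_x, sigma=0.1, rng=rng)
noisy_y = flip_labels(clean_y, flip_prob=0.25, rng=rng)
```

Real label noise is often not uniform (annotators tend to confuse particular classes), so a uniform flip rate should be read as a simplifying assumption.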
Handling Noise
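Regularization reduces overfitting to noise by penalizing model complexity. Robust loss functions downweight outliers, so a few corrupted examples have less influence on the fit. Data cleaning removes the most obvious noise, such as clearly mislabeled or corrupted examples. Ensemble methods average predictions across models, so errors that differ from model to model partly cancel. Diffusion models use noise in a different sense: noise is deliberately added to the data, and the model learns to remove it as a generative mechanism.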
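Two of these ideas can be illustrated in a few lines. The sketch below compares squared loss with the Huber loss (one common robust loss, chosen here as an example) and then shows averaging over several noisy predictions. The function names, the threshold `delta=1.0`, and the assumption that each model's error is independent Gaussian noise are illustrative choices, not part of the definition above.

```python
import random


def squared_loss(residual):
    """Standard squared loss: grows quadratically, so outliers dominate."""
    return 0.5 * residual ** 2


def huber_loss(residual, delta=1.0):
    """Huber loss: quadratic for |residual| <= delta, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)


def ensemble_mean(predictions):
    """Average the predictions of several models for one input."""
    return sum(predictions) / len(predictions)


def mse(estimates, target):
    """Mean squared error of a list of estimates against a single target."""
    return sum((e - target) ** 2 for e in estimates) / len(estimates)


# An outlier with residual 10 costs far less under Huber than under squared loss.
outlier_squared = squared_loss(10.0)  # 50.0
outlier_huber = huber_loss(10.0)      # 9.5

# Simulated ensemble: each "model" predicts the target plus independent noise.
rng = random.Random(0)
target = 2.0
n_models, n_trials = 25, 2000
single, averaged = [], []
for _ in range(n_trials):
    preds = [target + rng.gauss(0.0, 1.0) for _ in range(n_models)]
    single.append(preds[0])               # one model on its own
    averaged.append(ensemble_mean(preds))  # the ensemble average

single_mse = mse(single, target)
averaged_mse = mse(averaged, target)
```

With independent errors, averaging `n_models` predictions shrinks the error variance by roughly a factor of `n_models`. In practice the errors of ensemble members are correlated, so the reduction is smaller than this idealized simulation suggests.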