Normalization
Techniques that standardize data or neural network activations to a consistent scale, improving training stability and convergence speed.
Data Normalization
Min-max scaling: Scales features to [0, 1].
Standard scaling (z-score): Transforms features to mean 0, standard deviation 1.
Robust scaling: Uses the median and interquartile range (IQR), so it is resistant to outliers.
All three are applied to input features before training.
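A minimal NumPy sketch of the three scalers above; the function names are illustrative, and each operates per feature (per column), as these scalers are applied in practice.

```python
import numpy as np

def min_max_scale(x):
    # Scale each feature (column) to the range [0, 1].
    lo, hi = x.min(axis=0), x.max(axis=0)
    return (x - lo) / (hi - lo)

def standard_scale(x):
    # Z-score: subtract the per-feature mean, divide by the std.
    return (x - x.mean(axis=0)) / x.std(axis=0)

def robust_scale(x):
    # Center on the median and divide by the IQR (75th - 25th
    # percentile), so extreme values have little influence.
    med = np.median(x, axis=0)
    q1, q3 = np.percentile(x, [25, 75], axis=0)
    return (x - med) / (q3 - q1)

X = np.array([[1.0,  200.0],
              [2.0,  400.0],
              [3.0, 1000.0]])
print(min_max_scale(X))  # each column now spans [0, 1]
```

Note that min-max and standard scaling both shift with outliers (a single extreme value stretches the range or inflates the std), which is why robust scaling is preferred for heavy-tailed data.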
Activation Normalization
Batch Normalization: Normalizes each feature across the batch dimension.
Layer Normalization: Normalizes each sample across the feature dimension (used in transformers).
RMSNorm: A simplified layer norm that rescales by the root mean square without mean centering.
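The three activation norms differ mainly in which axis they reduce over and whether they center. A minimal NumPy sketch (learnable gain/bias parameters, usually called gamma and beta, are omitted for clarity):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature across the batch dimension (axis 0).
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)

def layer_norm(x, eps=1e-5):
    # Normalize each sample across its feature dimension (axis -1).
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def rms_norm(x, eps=1e-5):
    # RMSNorm: divide by the root mean square only; no centering.
    rms = np.sqrt((x ** 2).mean(axis=-1, keepdims=True) + eps)
    return x / rms
```

Because layer norm and RMSNorm reduce over features rather than the batch, they behave identically at batch size 1 and need no running statistics at inference, which is one reason transformers favor them over batch norm.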
Why It Helps
Prevents features with large values from dominating the loss.
Stabilizes gradient flow in deep networks.
Reduces sensitivity to learning rate and weight initialization.
Enables faster training with larger learning rates.