Model Calibration
How well a model's predicted probabilities match actual outcome frequencies.
Overview
Model calibration measures whether a model's confidence scores accurately reflect the true probability of being correct. When a well-calibrated model predicts with 80% confidence, it should be correct about 80% of the time on those predictions. Many models, especially deep neural networks, are overconfident: they assign high probabilities even to incorrect predictions.
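The gap between confidence and accuracy can be quantified directly. Below is a minimal NumPy sketch of Expected Calibration Error: predictions are grouped into confidence bins, and the absolute gap between mean confidence and accuracy in each bin is averaged, weighted by bin size. The function name, the 10-bin default, and the right-inclusive binning are illustrative choices, not a standard API.

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted average gap between mean confidence and accuracy per bin.

    confidences: predicted probability of the chosen class, in [0, 1].
    correct: 1 if the prediction was right, 0 otherwise.
    """
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Right-inclusive bins so a confidence of exactly 1.0
        # falls into the last bin.
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # weight by fraction of samples in bin
    return ece
```

A perfectly calibrated batch (e.g. ten predictions at 90% confidence, nine of them correct) yields an ECE of 0, while a model that is 100% confident and always wrong yields an ECE of 1.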
Key Details
Calibration is assessed using reliability diagrams (plotting predicted confidence against observed accuracy, bin by bin) and metrics like Expected Calibration Error (ECE). Calibration methods include temperature scaling (dividing logits by a learned temperature), Platt scaling (fitting a logistic regression on logits), and isotonic regression. Good calibration is critical for decision-making in healthcare, autonomous driving, and any application where confidence scores inform downstream actions.
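Temperature scaling is the simplest of these methods: a single scalar T is fit on held-out data, and all logits are divided by it before the softmax, softening (T > 1) or sharpening (T < 1) the predicted distribution without changing the predicted class. The sketch below, with assumed function names, fits T by grid search over the validation negative log-likelihood; real implementations typically use a gradient-based optimizer instead.

```python
import numpy as np

def softmax(logits):
    # Shift by the row max for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels):
    """Mean negative log-likelihood of the true labels."""
    probs = softmax(logits)
    return -np.log(probs[np.arange(len(labels)), labels] + 1e-12).mean()

def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 91)):
    """Pick the temperature T minimizing validation NLL on logits / T.

    The grid range and resolution are arbitrary illustrative choices.
    """
    return min(grid, key=lambda t: nll(logits / t, labels))
```

Because T = 1 is in the search grid, the fitted temperature can never make validation NLL worse; for an overconfident model it comes out greater than 1, spreading probability mass away from the top class.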