Data Drift
A change in the statistical properties of the data a model receives in production relative to the data it was trained on. Because the patterns the model learned no longer match what it sees, its performance can degrade over time.
Types of Drift
Covariate drift: The distribution of input features changes (e.g., new types of customers).
Concept drift: The relationship between inputs and outputs changes (e.g., what constitutes fraud evolves).
Label drift: The distribution of target labels changes.
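One common way to make these three definitions precise is in terms of probability distributions. The notation below is an assumption introduced for illustration: X denotes the input features, Y the target, and the subscripts "train" and "prod" mark the training and production distributions. The covariate case is written in its strict textbook form, where the input-output relationship is assumed unchanged.

```latex
\begin{align*}
\text{Covariate drift:} \quad & P_{\text{train}}(X) \neq P_{\text{prod}}(X),
    \quad P_{\text{train}}(Y \mid X) = P_{\text{prod}}(Y \mid X) \\
\text{Concept drift:} \quad & P_{\text{train}}(Y \mid X) \neq P_{\text{prod}}(Y \mid X) \\
\text{Label drift:} \quad & P_{\text{train}}(Y) \neq P_{\text{prod}}(Y)
\end{align*}
```

In practice the types often occur together: since P(Y) is obtained by averaging P(Y | X) over P(X), a change in the inputs alone can also move the label distribution.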
Detection
Comparing feature distributions between training and production data with statistical tests or divergence measures (e.g., the Kolmogorov-Smirnov (KS) test, the Population Stability Index (PSI)), monitoring the distribution of predictions, and tracking model performance metrics over time.
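As a rough illustration of the two measures named above, the following sketch computes a two-sample KS statistic and a PSI for a single numeric feature using only the Python standard library. The function names, the choice of ten equal-frequency bins, and the `eps` floor for empty bins are assumptions made for this example, not a standard API; a production setup would more likely rely on an established statistics library, which would also provide a p-value for the KS test.

```python
import bisect
import math
import random


def ks_statistic(reference, production):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref = sorted(reference)
    prod = sorted(production)
    largest_gap = 0.0
    for x in ref + prod:
        # Fraction of each sample that is <= x.
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_prod = bisect.bisect_right(prod, x) / len(prod)
        largest_gap = max(largest_gap, abs(cdf_ref - cdf_prod))
    return largest_gap


def psi(reference, production, n_bins=10, eps=1e-6):
    """Population Stability Index over equal-frequency bins of the reference."""
    ref = sorted(reference)
    # Interior bin edges taken at quantiles of the reference sample.
    edges = [ref[(i * len(ref)) // n_bins] for i in range(1, n_bins)]

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            counts[bisect.bisect_right(edges, v)] += 1
        # Floor at eps so empty bins do not cause log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    expected = bin_fractions(reference)
    actual = bin_fractions(production)
    return sum((a - e) * math.log(a / e) for e, a in zip(expected, actual))


# Synthetic data: one production sample like training, one with a shifted mean.
rng = random.Random(0)
train = [rng.gauss(0.0, 1.0) for _ in range(2000)]
same = [rng.gauss(0.0, 1.0) for _ in range(2000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(2000)]

print("no drift:", ks_statistic(train, same), psi(train, same))
print("drift:   ", ks_statistic(train, shifted), psi(train, shifted))
```

Both scores should come out close to zero for the undrifted sample and clearly larger for the shifted one. What counts as "large enough to act on" depends on the application and sample size, so any alert threshold should be treated as a tunable setting.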
Mitigation
Regular model retraining, online learning, maintaining monitoring dashboards, and building retraining pipelines that trigger automatically when drift is detected.
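The automatic trigger mentioned above can be sketched as a small decision function. Everything here is hypothetical and named for illustration only: `DriftPolicy`, `drifted_features`, `maybe_retrain`, the threshold value, and the `retrain` callback stand in for whatever drift scores and pipeline entry point a real system would use.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DriftPolicy:
    """Hypothetical policy: when do per-feature drift scores justify retraining?"""
    threshold: float               # a score above this marks a feature as drifted
    min_drifted_features: int = 1  # how many drifted features trigger retraining


def drifted_features(scores: Dict[str, float], policy: DriftPolicy) -> List[str]:
    """Return the names of features whose drift score exceeds the threshold."""
    return sorted(name for name, score in scores.items() if score > policy.threshold)


def maybe_retrain(
    scores: Dict[str, float],
    policy: DriftPolicy,
    retrain: Callable[[List[str]], None],
) -> bool:
    """Call `retrain` with the drifted feature names if the policy is met."""
    flagged = drifted_features(scores, policy)
    if len(flagged) >= policy.min_drifted_features:
        retrain(flagged)
        return True
    return False


# Example run with made-up drift scores (e.g., PSI values per feature).
policy = DriftPolicy(threshold=0.25)
triggered_for: List[List[str]] = []
scores = {"age": 0.03, "income": 0.41, "region": 0.08}

fired = maybe_retrain(scores, policy, retrain=triggered_for.append)
print(fired, triggered_for)
```

Keeping the policy separate from the retraining callback lets the threshold be tuned, or the trigger be replaced by a scheduled retrain, without touching the pipeline itself.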