Machine Unlearning
Techniques for removing the influence of specific training data from a trained model.
Overview
Machine unlearning is the process of removing the influence of specific data points or data subjects from a trained machine learning model, so that the resulting model behaves as if that data had never been included in training. It is motivated by privacy regulations such as the GDPR's right to erasure (the "right to be forgotten") and by the need to remove toxic, copyrighted, or erroneous training data.
Key Details
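Retraining from scratch on the remaining data is the exact baseline, but it is computationally prohibitive for large models. Efficient methods fall into two broad groups. Exact methods such as SISA (Sharded, Isolated, Sliced, and Aggregated training) partition the training data so that a deletion request requires retraining only the affected constituent model rather than the whole ensemble. Approximate methods, including influence function-based approaches and gradient-based methods, instead update the trained parameters to approximately reverse the effect of the specific data. Verifying that unlearning was successful is itself a challenging open problem.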
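
The sharding idea behind SISA can be illustrated with a minimal sketch. It is not a reference implementation: the class name ShardedEnsemble, the helper functions, the round-robin shard assignment, and the toy nearest-centroid model are all assumptions chosen for brevity, and the slicing and checkpointing part of SISA is omitted. The point it shows is that deleting one sample triggers retraining of only the shard that held it.

```python
from collections import Counter


def train_centroids(points):
    """Train a toy nearest-centroid model: label -> mean feature vector."""
    sums, counts = {}, Counter()
    for x, y in points:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] += 1
    return {y: tuple(s / counts[y] for s in acc) for y, acc in sums.items()}


def predict_one(model, x):
    """Return the label whose centroid is closest to x (squared distance)."""
    return min(model, key=lambda y: sum((a - b) ** 2 for a, b in zip(model[y], x)))


class ShardedEnsemble:
    """Hypothetical SISA-style ensemble: one isolated model per data shard."""

    def __init__(self, data, num_shards):
        # Assumption: sample ids are list indices, assigned to shards round-robin.
        self.shards = [dict() for _ in range(num_shards)]
        for idx, sample in enumerate(data):
            self.shards[idx % num_shards][idx] = sample
        # Each model sees only its own shard (the "isolated" part).
        self.models = [train_centroids(s.values()) for s in self.shards]

    def unlearn(self, idx):
        """Delete one sample and retrain only the shard that contained it."""
        k = idx % len(self.shards)
        del self.shards[k][idx]
        self.models[k] = train_centroids(self.shards[k].values())

    def predict(self, x):
        """Aggregate the per-shard predictions by majority vote."""
        votes = Counter(predict_one(m, x) for m in self.models if m)
        return votes.most_common(1)[0][0]


data = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((1.0, 1.0), "b"),
    ((0.9, 1.1), "b"), ((0.2, 0.1), "a"), ((1.1, 0.9), "b"),
]
ensemble = ShardedEnsemble(data, num_shards=2)
ensemble.unlearn(2)  # retrains shard 0 only; shard 1's model is untouched
```

Because each constituent model is trained only on its own shard, the retrained shard model is identical to one trained from scratch without the deleted sample, which is what makes this style of unlearning exact. The cost is that every constituent model sees less data, which can reduce accuracy as the number of shards grows.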