AI Glossary

Machine Unlearning

Techniques for removing the influence of specific training data from a trained model.

Overview

Machine unlearning is the process of removing the influence of specific data points or data subjects from a trained machine learning model, so that the model behaves as if that data had never been included in training. It is motivated by privacy regulations such as the GDPR's right to erasure (the "right to be forgotten") and by the need to remove toxic, copyrighted, or erroneous training data.

Key Details

The naive approach, retraining from scratch without the data, gives exact unlearning but is computationally prohibitive for large models. More efficient methods include SISA (Sharded, Isolated, Sliced, and Aggregated training), which limits retraining to the shard that contained the removed data; influence-function-based approaches, which estimate a data point's effect on the learned parameters and subtract it; and gradient-based methods that approximately reverse the effect of specific data. Verifying that unlearning succeeded is itself a challenging open problem.
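
The sharding idea behind SISA can be illustrated with a minimal Python sketch. It is a toy under stated assumptions, not the published algorithm: the names ShardedEnsemble, train_centroids, and predict_one are hypothetical, each shard's model is a simple nearest-centroid classifier, shard assignment is just id modulo the shard count, and the slicing (checkpointing) part of SISA is omitted. The point it shows is that removing one example only requires retraining the single shard that held it.

```python
from collections import Counter


def train_centroids(points):
    """Train a nearest-centroid model: the mean feature vector per label."""
    sums, counts = {}, Counter()
    for features, label in points:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_one(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))
    return min(model, key=lambda label: squared_distance(model[label]))


class ShardedEnsemble:
    """Toy SISA-style ensemble: one isolated model per shard, majority vote."""

    def __init__(self, dataset, num_shards):
        # dataset maps an integer example id to a (features, label) pair.
        self.num_shards = num_shards
        self.shards = [dict() for _ in range(num_shards)]
        for example_id, point in dataset.items():
            self.shards[self.shard_of(example_id)][example_id] = point
        self.models = [train_centroids(s.values()) for s in self.shards]
        self.retrain_count = 0

    def shard_of(self, example_id):
        # Assumption for this sketch: deterministic assignment by id.
        return example_id % self.num_shards

    def unlearn(self, example_id):
        """Drop one example and retrain only the shard that contained it."""
        k = self.shard_of(example_id)
        del self.shards[k][example_id]
        self.models[k] = train_centroids(self.shards[k].values())
        self.retrain_count += 1

    def predict(self, features):
        # Aggregate by majority vote, skipping shards left with no data.
        votes = Counter(predict_one(m, features) for m in self.models if m)
        return votes.most_common(1)[0][0]


if __name__ == "__main__":
    data = {
        0: ([0.0, 0.0], "a"), 1: ([1.0, 1.0], "b"), 2: ([0.2, 0.1], "a"),
        3: ([0.9, 1.2], "b"), 4: ([0.1, 0.3], "a"), 5: ([1.1, 0.8], "b"),
    }
    ensemble = ShardedEnsemble(data, num_shards=3)
    ensemble.unlearn(4)  # retrains one shard; the other two are untouched
    print(ensemble.predict([0.0, 0.1]))
```

Because each shard's model depends only on that shard's data, the ensemble after unlearn matches one trained from scratch on the remaining data with the same shard assignment. That exactness is what distinguishes this family from the approximate influence-function and gradient-based methods, at the cost of training several smaller models instead of one.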

Related Concepts

differential privacy • AI safety • fine-tuning