AI Glossary

Model Collapse

Performance degradation when AI models are trained on data generated by previous AI models.

Overview

Model collapse is a phenomenon where AI models progressively lose quality and diversity when trained on data generated by previous generations of AI models, rather than on original human-created data. Each generation amplifies errors and reduces the diversity of the data distribution, eventually producing models that generate only a narrow, degraded subset of possible outputs.
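The recursive dynamic can be illustrated with a toy simulation (a hypothetical sketch, not a method from any specific paper): each "model" here is just a Gaussian fit to a finite sample drawn from the previous generation's model, standing in for training on AI-generated data. Finite-sample estimation noise compounds across generations, and the fitted distribution's spread tends to drift toward zero, mirroring the loss of diversity described above.

```python
import random
import statistics

def collapse_sim(generations=500, n_samples=20, seed=0):
    """Toy model-collapse simulation.

    Generation 0 is the 'human data' distribution N(0, 1). Each later
    generation draws a finite sample from the previous generation's
    fitted Gaussian and refits mean/std to that sample, i.e. it trains
    only on synthetic data. Returns the std at each generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0        # generation-0 "human data" distribution
    history = [sigma]
    for _ in range(generations):
        # Draw synthetic training data from the current model.
        data = [rng.gauss(mu, sigma) for _ in range(n_samples)]
        # Refit the next-generation model on that synthetic data only.
        mu = statistics.fmean(data)
        sigma = statistics.stdev(data)
        history.append(sigma)
    return history

spreads = collapse_sim()
print(f"gen 0 std: {spreads[0]:.3f}, gen 500 std: {spreads[-1]:.3f}")
```

In this sketch the spread typically shrinks by orders of magnitude over a few hundred generations: the variance estimate performs a multiplicative random walk whose log has negative drift, so diversity is lost even though each individual refit is unbiased.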

Key Details

This is an increasing concern as AI-generated content floods the internet and becomes a larger proportion of training data for future models. Research shows that even small amounts of recursive AI training data can cause significant quality degradation. Mitigations include curating high-quality human data, watermarking AI content for filtering, and careful data mixing strategies. Model collapse represents a systemic risk as the ratio of AI-generated to human-generated content increases online.
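The filtering and data-mixing mitigations above can be sketched as a simple corpus-assembly step. This is a hypothetical illustration: the `Document` type, its `ai_generated` provenance flag (e.g. set by a watermark detector), and the `build_training_mix` helper are all invented here, not from any real pipeline.

```python
import random
from dataclasses import dataclass

@dataclass
class Document:
    text: str
    ai_generated: bool  # assumed provenance flag, e.g. from watermark detection

def build_training_mix(corpus, human_fraction=0.9, size=1000, seed=0):
    """Hypothetical data-mixing step: split the corpus on a provenance
    flag, then cap the share of AI-generated text in the final mix."""
    rng = random.Random(seed)
    human = [d for d in corpus if not d.ai_generated]
    synthetic = [d for d in corpus if d.ai_generated]
    # Fill the mix with human data first, topping up with synthetic
    # data only to the allowed budget.
    n_human = min(len(human), round(size * human_fraction))
    n_synthetic = min(len(synthetic), size - n_human)
    return rng.sample(human, n_human) + rng.sample(synthetic, n_synthetic)
```

For example, with `human_fraction=0.9` and `size=100`, the resulting mix contains at most 10 AI-generated documents, keeping the recursive fraction small regardless of how much synthetic text is present in the raw corpus.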

Related Concepts

model collapse, synthetic data, training data


Last updated: March 5, 2026