Survey of 65+ papers on model collapse. Key finding from Dohmatob et al. (ICLR 2025): even 0.1% synthetic contamination in training data causes measurable degradation.
No major dataset (FineWeb, RedPajama, C4) currently filters for AI-generated content.
How about complete crap data? We know there are people generating rubbish specifically to feed to "AI". Can they generate enough to cause problems?
Both angles are real but they play out differently.
On the deliberate side: Nightshade showed you can poison image models with a few hundred modified samples. Backdoor attacks on LLMs (sleeper agents, trojan triggers) are an active research area, and the attack surface is huge because most training pipelines just scrape the open web. So yes, someone generating garbage on purpose can cause targeted damage, especially if they understand how the data gets collected.
But the scarier part is that nobody needs to try. The accidental contamination is already happening. Models train on web data, produce outputs that end up on the web, next generation trains on that. Dohmatob et al. showed 0.1% synthetic contamination is enough to cause measurable degradation. Right now no major dataset (FineWeb, RedPajama, C4) filters for AI-generated content.
What makes this harder to think about: data quality and model performance don't always follow "garbage in, garbage out." I wrote about a related paradox where Qwen2.5-Math trained with deliberately wrong reward signals still improved almost as much as with correct ones: https://ai.gopubby.com/false-rewards-make-ai-smarter-paradox...
Models are simultaneously fragile to recursive contamination and weirdly resilient to corrupted training signals. The picture is messier than either side suggests.
Survey of 65+ papers on model collapse. Key finding from Dohmatob et al. (ICLR 2025): even 0.1% synthetic contamination in training data causes measurable degradation.
No major dataset (FineWeb, RedPajama, C4) currently filters for AI-generated content.
How about complete crap data? We know there are people generating rubbish specifically to feed to "AI". Can they generate enough to cause problems?
Both angles are real but they play out differently. On the deliberate side: Nightshade showed you can poison image models with a few hundred modified samples. Backdoor attacks on LLMs (sleeper agents, trojan triggers) are an active research area, and the attack surface is huge because most training pipelines just scrape the open web. So yes, someone generating garbage on purpose can cause targeted damage, especially if they understand how the data gets collected.
But the scarier part is that nobody needs to try. The accidental contamination is already happening. Models train on web data, produce outputs that end up on the web, next generation trains on that. Dohmatob et al. showed 0.1% synthetic contamination is enough to cause measurable degradation. Right now no major dataset (FineWeb, RedPajama, C4) filters for AI-generated content.
What makes this harder to think about: data quality and model performance don't always follow "garbage in, garbage out." I wrote about a related paradox where Qwen2.5-Math trained with deliberately wrong reward signals still improved almost as much as with correct ones: https://ai.gopubby.com/false-rewards-make-ai-smarter-paradox...
Models are simultaneously fragile to recursive contamination and weirdly resilient to corrupted training signals. The picture is messier than either side suggests.