Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad

> Our results reveal that all tested models struggled significantly, achieving less than 5% on average.