Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad

2 days ago | mauriziocalo | 1 comment

> Our results reveal that all tested models struggled significantly, achieving less than 5% on average

2 days ago | galaxyLogic