MATH-500

MATH-500 (OpenAI subset)

A 500-question subset of the MATH benchmark, popularised by OpenAI's o-series releases. It is widely reported as the standard 'MATH' number on modern leaderboards.

Math | Text | Accuracy | Max 100.0% | Released May 2023

Results: 6
Models scored: 6
Top: DeepSeek-R1 (97.3%)
Median: 92.6%

Best results

Top primary scores; one row per model.

Frontier over time

Each dot is one model result; the line traces the running best score.
[Chart: best score over time. Vertical axis: score, 0 to 100. Horizontal axis: evaluation date, Dec 2024 to Apr 2025.]
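As a rough illustration of how a "running best" line like this can be derived, the sketch below sorts the six rows listed under "All results" by evaluation date and keeps a cumulative maximum. The tuple layout and the `running_best` helper are assumptions made for this example, not the site's actual implementation.

```python
from datetime import date
from itertools import accumulate

# (eval date, model, score in %) copied from the "All results" table.
results = [
    (date(2025, 1, 21), "DeepSeek-R1", 97.3),
    (date(2025, 2, 24), "Claude Sonnet 3.7 (Thinking)", 96.2),
    (date(2025, 4, 5), "Llama 4 Behemoth", 95.0),
    (date(2024, 12, 26), "DeepSeek V3", 90.2),
    (date(2025, 2, 24), "Claude Sonnet 3.7", 82.2),
    (date(2025, 4, 30), "Nova Premier", 82.0),
]


def running_best(rows):
    """Return (date, best score so far) pairs in chronological order.

    Hypothetical helper: one point per result, as in the chart caption.
    """
    ordered = sorted(rows, key=lambda r: r[0])  # oldest result first
    scores = [score for _, _, score in ordered]
    best_so_far = accumulate(scores, max)  # cumulative maximum
    return [(d, b) for (d, _, _), b in zip(ordered, best_so_far)]


frontier = running_best(results)
for d, best in frontier:
    print(d.isoformat(), best)
```

With these rows the line starts at DeepSeek V3's score in December 2024 and rises once, when DeepSeek-R1 is added in January 2025; later results fall below the frontier and leave it flat.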

All results

Showing one canonical row per model.
#  | Model                        | Score | Conditions | Eval date   | Source        | Flags
1  | DeepSeek-R1                  | 97.3% | CoT        | 21 Jan 2025 | Paper         | Primary
2  | Claude Sonnet 3.7 (Thinking) | 96.2% |            | 24 Feb 2025 | Self-reported | Primary
3  | Llama 4 Behemoth             | 95.0% |            | 05 Apr 2025 | Self-reported | Primary
4  | DeepSeek V3                  | 90.2% |            | 26 Dec 2024 | Paper         | Primary
5  | Claude Sonnet 3.7            | 82.2% |            | 24 Feb 2025 | Self-reported | Primary
6  | Nova Premier                 | 82.0% |            | 30 Apr 2025 | Self-reported | Primary
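
The Top and Median figures in the page summary can be reproduced from these six rows. The following is a minimal sketch, assuming the scores are held in a plain dictionary (the variable names are illustrative only); with an even number of models, the median is the mean of the two middle scores, here 95.0 and 90.2.

```python
from statistics import median

# Scores in percent, copied from the table above.
scores = {
    "DeepSeek-R1": 97.3,
    "Claude Sonnet 3.7 (Thinking)": 96.2,
    "Llama 4 Behemoth": 95.0,
    "DeepSeek V3": 90.2,
    "Claude Sonnet 3.7": 82.2,
    "Nova Premier": 82.0,
}

# Model with the highest score.
top_model = max(scores, key=scores.get)

# For six values, median() averages the third and fourth highest.
median_score = median(scores.values())

print(f"Top: {top_model} ({scores[top_model]}%)")
print(f"Median: {median_score:.1f}%")
```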