TAAFT

MATH-500

MATH-500 (OpenAI subset)

A 500-question subset of the MATH benchmark, popularised by OpenAI's o-series releases. It is widely reported as the standard 'MATH' number on modern leaderboards.

Category: Math | Modality: Text | Metric: accuracy | Max: 100.0% | Released: May 2023
Results: 6
Models scored: 6
Top score: 97.3% (DeepSeek-R1)
Median: 92.6%
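The top score and median can be re-derived from the six primary scores listed under "All results". The following is a minimal sketch, assuming the median is the ordinary statistical median of one score per model; the `scores` dictionary and the `summarize` helper are illustrative names, not part of the site.

```python
from statistics import median

# Primary scores (accuracy, %) copied from the "All results" table.
scores = {
    "DeepSeek-R1": 97.3,
    "Claude Sonnet 3.7 (Thinking)": 96.2,
    "Llama 4 Behemoth": 95.0,
    "DeepSeek V3": 90.2,
    "Claude Sonnet 3.7": 82.2,
    "Nova Premier": 82.0,
}


def summarize(scores_by_model):
    """Return (top model, top score, median score rounded to 1 decimal)."""
    top_model = max(scores_by_model, key=scores_by_model.get)
    # With an even number of models, median() averages the two middle scores.
    mid = round(median(scores_by_model.values()), 1)
    return top_model, scores_by_model[top_model], mid


top_model, top_score, median_score = summarize(scores)
print(f"Top: {top_model} ({top_score}%), median: {median_score}%")
```

With six models the median is the mean of the third and fourth scores (95.0 and 90.2), which is consistent with the 92.6% shown above.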

Best results

Top primary scores; one row per model.

Frontier over time

Each dot is one model result; the line traces the running best score.
[Chart: best score over time. Y-axis: accuracy, 0 to 100%. X-axis: Dec 2024 to Apr 2025.]
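The running best line can be reproduced from the evaluation dates and scores in the results table. This is a rough sketch, assuming the frontier is a cumulative maximum over results sorted by evaluation date; the `results` list and `running_best` function are hypothetical names used only for illustration.

```python
from datetime import date
from itertools import accumulate

# (model, eval date, accuracy %) copied from the "All results" table.
results = [
    ("DeepSeek-R1", date(2025, 1, 21), 97.3),
    ("Claude Sonnet 3.7 (Thinking)", date(2025, 2, 24), 96.2),
    ("Llama 4 Behemoth", date(2025, 4, 5), 95.0),
    ("DeepSeek V3", date(2024, 12, 26), 90.2),
    ("Claude Sonnet 3.7", date(2025, 2, 24), 82.2),
    ("Nova Premier", date(2025, 4, 30), 82.0),
]


def running_best(rows):
    """Return (eval date, best score so far) pairs in chronological order."""
    ordered = sorted(rows, key=lambda row: row[1])  # stable sort by date
    best_so_far = accumulate((score for _, _, score in ordered), max)
    return [(row[1], best) for row, best in zip(ordered, best_so_far)]


frontier = running_best(results)
for when, best in frontier:
    print(when.isoformat(), best)
```

Under this assumption the line starts at 90.2% (DeepSeek V3, Dec 26, 2024) and steps up to 97.3% when DeepSeek-R1 appears on Jan 21, 2025; no later result in the table exceeds it.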

All results

Showing one canonical row per model.
#  Model                         Score  Conditions  Eval date     Source         Flags
1  DeepSeek-R1                   97.3%  CoT         Jan 21, 2025  paper          primary
2  Claude Sonnet 3.7 (Thinking)  96.2%  -           Feb 24, 2025  self-reported  primary
3  Llama 4 Behemoth              95.0%  -           Apr 5, 2025   self-reported  primary
4  DeepSeek V3                   90.2%  -           Dec 26, 2024  paper          primary
5  Claude Sonnet 3.7             82.2%  -           Feb 24, 2025  self-reported  primary
6  Nova Premier                  82.0%  -           Apr 30, 2025  self-reported  primary