MATH-500

MATH-500 (OpenAI subset)

A 500-question subset of the MATH benchmark, popularised by OpenAI's o-series releases. It is widely reported as the standard 'MATH' score on modern leaderboards.

Math | Text | Accuracy | Max 100.0% | Released May 2023

Results

Models scored: 6
Top score: 97.3% (DeepSeek-R1)
Median score: 92.6%

Best results

Top primary scores; one row per model.

[Chart: each dot is one model result; the line traces the running best score.]

Showing one canonical row per model.

#	Model	Score	Conditions	Eval date	Source	Flags
1	DeepSeek-R1	97.3%	CoT	21 Jan 2025	Paper	Primary
2	Claude Sonnet 3.7 (Thinking)	96.2%	—	24 Feb 2025	Self-reported	Primary
3	Llama 4 Behemoth	95.0%	—	05 Apr 2025	Self-reported	Primary
4	DeepSeek V3	90.2%	—	26 Dec 2024	Paper	Primary
5	Claude Sonnet 3.7	82.2%	—	24 Feb 2025	Self-reported	Primary
6	Nova Premier	82.0%	—	30 Apr 2025	Self-reported	Primary
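
The summary figures can be reproduced from the six rows in the table. The following is a minimal sketch, not the leaderboard's own code: the names `results` and `running_best` are hypothetical, and ordering the running-best trace by eval date is an assumption about how the chart line is drawn.

```python
from datetime import date
from statistics import median

# (model, score in %, eval date), copied from the table above.
results = [
    ("DeepSeek-R1", 97.3, date(2025, 1, 21)),
    ("Claude Sonnet 3.7 (Thinking)", 96.2, date(2025, 2, 24)),
    ("Llama 4 Behemoth", 95.0, date(2025, 4, 5)),
    ("DeepSeek V3", 90.2, date(2024, 12, 26)),
    ("Claude Sonnet 3.7", 82.2, date(2025, 2, 24)),
    ("Nova Premier", 82.0, date(2025, 4, 30)),
]

scores = [score for _, score, _ in results]

# Top entry: the row with the highest score.
top_model, top_score, _ = max(results, key=lambda row: row[1])

# With an even number of rows, the median is the mean of the two
# middle scores: (90.2 + 95.0) / 2.
median_score = round(median(scores), 1)


def running_best(rows):
    """Return (eval date, best score so far) pairs in date order.

    Assumption: the chart's line follows results sorted by eval date.
    """
    best = float("-inf")
    trace = []
    for _, score, day in sorted(rows, key=lambda row: row[2]):
        best = max(best, score)
        trace.append((day, best))
    return trace


if __name__ == "__main__":
    print(f"Models scored: {len(results)}")
    print(f"Top: {top_model} ({top_score}%)")
    print(f"Median: {median_score}%")
    for day, best in running_best(results):
        print(day.isoformat(), best)
```

Under these assumptions the running best starts at 90.2% (DeepSeek V3, 26 Dec 2024) and stays at 97.3% from DeepSeek-R1's result on 21 Jan 2025 onward, since no later row exceeds it.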