AIME 2024

American Invitational Mathematics Examination 2024

The 30 problems from the 2024 AIME I and AIME II. It was the standard high-school competition math eval until AIME 2025 superseded it as the primary signal.

Math · Text · Accuracy · Max 100.0% · Released Feb 2024
Results: 19
Models scored: 18
Top: 93.4% (o4 mini)
Median: 69.6%

Best results

Top primary scores; one row per model.
1 · o4 mini · 93.4%
2 · o3 · 91.6%
8 · o1 · 74.3%
10 · Kimi K2 · 69.6%

Frontier over time

Each dot is one model result; the line traces the running best score.
[Chart: Best score over time. Vertical axis: score, 0 to 100%. Horizontal axis: evaluation date, Oct 2024 to Apr 2026.]
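The running best score that the frontier line traces can be sketched in a few lines of Python. This is an illustration only: the rows are a subset of the primary results listed on this page, and the function name `running_best` is hypothetical, not something the site exposes.

```python
from datetime import date

# (model, score in %, eval date): a subset of the primary rows on this page
results = [
    ("Claude Haiku 3.5", 5.30, date(2024, 10, 22)),
    ("DeepSeek V3", 39.2, date(2024, 12, 26)),
    ("DeepSeek-R1", 79.8, date(2025, 1, 21)),
    ("Grok 3", 52.2, date(2025, 2, 19)),
    ("o4 mini", 93.4, date(2025, 4, 16)),
    ("Kimi K2", 69.6, date(2025, 7, 11)),
]


def running_best(rows):
    """Return (eval date, best score so far) after each result, in date order."""
    frontier = []
    best = float("-inf")
    for _model, score, when in sorted(rows, key=lambda r: r[2]):
        best = max(best, score)  # the frontier never decreases
        frontier.append((when, best))
    return frontier


frontier = running_best(results)
for when, best in frontier:
    print(when.isoformat(), best)
```

On this subset the frontier steps up at DeepSeek V3, DeepSeek-R1, and o4 mini, and stays flat for results that do not beat the best score so far.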

All results

Showing all configurations, including non-primary alternates.
# | Model | Score | Conditions | Eval date | Source | Flags
1 | o4 mini | 93.4% | | 16 Apr 2025 | Self-reported | Primary
2 | o3 | 91.6% | | 16 Apr 2025 | Self-reported | Primary
3 | Qwen3 235B A22B | 85.7% | | 28 Apr 2025 | Self-reported | Primary
4 | Phi 4 reasoning plus | 81.3% | CoT | 08 Jul 2025 | Self-reported | Primary
5 | Qwen3-30B-A3B | 80.4% | | 28 Apr 2025 | Paper | Primary
6 | Qwen3 30B A3B | 80.4% | | 28 Apr 2025 | Self-reported | Primary
7 | DeepSeek-R1 | 79.8% | CoT | 21 Jan 2025 | Paper | Primary
8 | o1 | 74.3% | | 16 Apr 2025 | Self-reported | Primary
9 | Magistral Medium | 73.6% | CoT | 10 Jun 2025 | Self-reported | Primary
10 | Kimi K2 | 69.6% | | 11 Jul 2025 | Self-reported | Primary, Verified
11 | Claude Sonnet 3.7 (Thinking) | 61.3% | | 24 Feb 2025 | Self-reported | Primary
12 | Nemotron 3 Super | 53.3% | pass@32 | 03 Apr 2026 | Self-reported | Primary
13 | Grok 3 | 52.2% | | 19 Feb 2025 | Self-reported | Primary
14 | Grok 3 | 52.2% | | 19 Feb 2025 | Self-reported | Primary
15 | GPT 4.1 | 48.1% | | 14 Apr 2025 | Self-reported | Primary
16 | Grok 3 mini | 39.7% | | 19 Feb 2025 | Self-reported | Primary
17 | DeepSeek V3 | 39.2% | | 26 Dec 2024 | Paper | Primary
18 | Claude Sonnet 3.7 | 23.3% | | 24 Feb 2025 | Self-reported | Primary
19 | Claude Haiku 3.5 | 5.30% | 0-shot · CoT | 22 Oct 2024 | Self-reported | Primary
20 | Claude Haiku 3 | 0.80% | 0-shot · CoT · standard | 22 Oct 2024 | Self-reported |
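
The summary figures shown near the top of the page (19 results, top score 93.4%, median 69.6%) are consistent with the rows flagged Primary in the table above. The sketch below recomputes them. It assumes the summary covers primary rows only, which matches the displayed values, although the page does not state how it derives them.

```python
from statistics import median

# Scores (%) of the 19 rows flagged Primary in the table above.
# Assumption: the page's summary excludes the non-primary Claude Haiku 3 row.
primary_scores = [
    93.4, 91.6, 85.7, 81.3, 80.4, 80.4, 79.8, 74.3, 73.6, 69.6,
    61.3, 53.3, 52.2, 52.2, 48.1, 39.7, 39.2, 23.3, 5.30,
]

summary = {
    "results": len(primary_scores),
    "top": max(primary_scores),
    "median": median(primary_scores),  # middle (10th) value of 19 sorted scores
}
print(summary)
```

With an odd number of scores the median is simply the middle value once sorted, here the tenth of nineteen, which is Kimi K2's 69.6%.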