Humanity's Last Exam
A 2,500-question exam crowdsourced from subject-matter experts across hundreds of disciplines, designed to remain unsaturated by frontier models for as long as possible.
Best results
Frontier over time
All results
| # | Model | Score | Conditions | Eval date | Source | Flags |
|---|---|---|---|---|---|---|
| 1 | Kimi K2.6 | 54.0% | CoT | 20 Apr 2026 | Self-reported | Primary |
| 2 | Claude Opus 4.7 | 46.9% | — | 16 Apr 2026 | Self-reported | Primary |
| 3 | Grok 4 Heavy | 44.4% | CoT · Pass@1 | 09 Jul 2025 | Self-reported | Primary |
| 4 | Gemini 3.1 Pro | 44.4% | CoT | 19 Feb 2026 | Self-reported | Primary |
| 5 | Muse Spark | 42.8% | CoT | 08 Apr 2026 | Self-reported | Primary |
| 6 | GPT 5.5 | 41.4% | CoT | 23 Apr 2026 | Self-reported | Primary |
| 7 | Gemini 3 Deep Think | 41.0% | CoT | 12 Feb 2026 | Self-reported | Primary |
| 8 | DeepSeek 3.2 | 40.8% | Pass@1 | — | Paper | Primary Verified |
| 9 | Gemini 3 Pro | 37.5% | CoT | 18 Nov 2025 | Self-reported | Primary |
| 10 | Gemini 3 Flash (Thinking) | 33.7% | — | 17 Dec 2025 | Self-reported | Primary |
| 11 | Claude Sonnet 4.6 | 33.2% | — | 17 Feb 2026 | Self-reported | Primary |
| 12 | GLM-5.1 | 31.0% | — | 07 Apr 2026 | Self-reported | Primary |
| 13 | DeepSeek 3.2 Speciale | 30.6% | Pass@1 | 01 Dec 2025 | Paper | Primary Verified |
| 14 | Grok 4 | 25.4% | — | 09 Jul 2025 | Self-reported | Primary |
| 15 | GPT 5 (Thinking) | 24.8% | — | 07 Aug 2025 | Self-reported | Primary |
| 16 | DeepSeek V3.1 Terminus | 21.7% | — | 22 Sep 2025 | Self-reported | Primary |
| 17 | Gemini 2.5 Pro (Thinking) | 21.6% | — | 17 Dec 2025 | Self-reported | Primary |
| 18 | o3 | 20.3% | — | 16 Apr 2025 | Self-reported | Primary |
| 19 | Gemini 2.5 Pro | 18.8% | CoT | 25 Mar 2025 | Self-reported | Primary |
| 20 | GLM 4.6 | 17.2% | — | 30 Sep 2025 | Self-reported | Primary |
| 21 | Gemini 2.5 Flash (Thinking) | 11.0% | — | 17 Dec 2025 | Self-reported | Primary |
| 22 | o1 | 8.12% | — | 16 Apr 2025 | Self-reported | Primary |
| 23 | GPT 5 | 6.30% | — | 07 Aug 2025 | Self-reported | Primary |
| 24 | Gemini 2.5 Flash-Lite | 5.10% | — | 26 Sep 2025 | Self-reported | Primary |
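
One way to read these rows as a frontier over time is to sort them by eval date and keep only the entries that set a new best score. The Python sketch below does that for the dated rows of the table. The names `ROWS`, `MONTHS`, `parse_date`, and `frontier` are illustrative, not part of the leaderboard. It assumes a strict improvement is needed to move the frontier, so a later tie does not count, and it leaves out the one row without an eval date. It also ignores the Conditions column, so scores reported under different conditions are compared directly.

```python
from datetime import date

# (model, score in %, eval date) copied from the dated rows of the table above.
ROWS = [
    ("Kimi K2.6", 54.0, "20 Apr 2026"),
    ("Claude Opus 4.7", 46.9, "16 Apr 2026"),
    ("Grok 4 Heavy", 44.4, "09 Jul 2025"),
    ("Gemini 3.1 Pro", 44.4, "19 Feb 2026"),
    ("Muse Spark", 42.8, "08 Apr 2026"),
    ("GPT 5.5", 41.4, "23 Apr 2026"),
    ("Gemini 3 Deep Think", 41.0, "12 Feb 2026"),
    ("Gemini 3 Pro", 37.5, "18 Nov 2025"),
    ("Gemini 3 Flash (Thinking)", 33.7, "17 Dec 2025"),
    ("Claude Sonnet 4.6", 33.2, "17 Feb 2026"),
    ("GLM-5.1", 31.0, "07 Apr 2026"),
    ("DeepSeek 3.2 Speciale", 30.6, "01 Dec 2025"),
    ("Grok 4", 25.4, "09 Jul 2025"),
    ("GPT 5 (Thinking)", 24.8, "07 Aug 2025"),
    ("DeepSeek V3.1 Terminus", 21.7, "22 Sep 2025"),
    ("Gemini 2.5 Pro (Thinking)", 21.6, "17 Dec 2025"),
    ("o3", 20.3, "16 Apr 2025"),
    ("Gemini 2.5 Pro", 18.8, "25 Mar 2025"),
    ("GLM 4.6", 17.2, "30 Sep 2025"),
    ("Gemini 2.5 Flash (Thinking)", 11.0, "17 Dec 2025"),
    ("o1", 8.12, "16 Apr 2025"),
    ("GPT 5", 6.30, "07 Aug 2025"),
    ("Gemini 2.5 Flash-Lite", 5.10, "26 Sep 2025"),
]

# Explicit month map so parsing does not depend on the system locale.
MONTHS = {m: i + 1 for i, m in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split())}


def parse_date(text):
    """Parse a 'DD Mon YYYY' string such as '09 Jul 2025'."""
    day, mon, year = text.split()
    return date(int(year), MONTHS[mon], int(day))


def frontier(rows):
    """Return (iso_date, model, score) for each row that sets a new best."""
    parsed = [(parse_date(d), score, model) for model, score, d in rows]
    # Chronological order; on the same day, consider the higher score first.
    parsed.sort(key=lambda r: (r[0], -r[1]))
    best = float("-inf")
    result = []
    for when, score, model in parsed:
        if score > best:  # strict improvement only (assumption)
            best = score
            result.append((when.isoformat(), model, score))
    return result


if __name__ == "__main__":
    for when, model, score in frontier(ROWS):
        print(f"{when}  {model:<16} {score:.1f}%")
```

On these rows the frontier passes through Gemini 2.5 Pro, o3, Grok 4 Heavy, Claude Opus 4.7, and Kimi K2.6, in that order.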
