MMMU

Massive Multi-discipline Multimodal Understanding

11.5k college-level questions across 30 subjects requiring image + text reasoning (charts, diagrams, medical scans, music notation, …).

Multimodal · Accuracy · Max 100.0% · Released Nov 2023


Results

Models scored

Top: GPT 5.1 (84.2%)

Median: 74.4%
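The top and median figures can be reproduced from the scores listed under "All results". The following is a minimal sketch, assuming the summary is computed over one score per distinct model; the page does not state how it derives these numbers, and the variable names are illustrative only.

```python
from statistics import median

# Scores (%) copied from the "All results" table, each model counted once.
# Assumption: the page's summary uses the same set of rows.
scores = [
    84.2, 84.2, 82.9, 82.9, 81.6, 80.7, 77.8, 77.6, 76.9, 76.1,
    75.0, 75.0, 74.4, 74.4, 73.9, 73.4, 73.2, 73.2, 72.9, 71.8,
    69.4, 69.4, 69.1, 64.0, 52.0,
]

top_score = max(scores)        # highest reported accuracy
median_score = median(scores)  # middle value of the sorted scores

print(f"Top: {top_score}%  Median: {median_score}%")
```

With an odd number of scores, `statistics.median` returns the middle element unchanged, so the result matches a value that appears in the table.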

Best results

Top primary scores; one row per model.

Model	Score
GPT 5.1	84.2%
GPT 5 (Thinking)	84.2%
o3	82.9%
Qwen 3.6 27B	82.9%
o4 mini	81.6%
Claude Opus 4.5	80.7%
Claude Sonnet 4.5	77.8%
o1	77.6%
Qwen 3.5 122B A10B	76.9%
Llama 4 Behemoth	76.1%

Frontier over time

Each dot is one model result; the line traces the running best score.
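The running best score described above is a cumulative maximum over results ordered by evaluation date. The sketch below illustrates the idea on a subset of dated rows from the "All results" table; the function name `running_best` and the tuple layout are hypothetical, and the page's actual plotting code is not shown in the source.

```python
from datetime import date
from itertools import accumulate

# (model, eval date, score %) for a subset of rows from the "All results" table.
results = [
    ("Grok 3", date(2025, 2, 19), 73.2),
    ("Pixtral 12B", date(2024, 10, 10), 52.0),
    ("GPT 5.1", date(2025, 11, 13), 84.2),
    ("Pixtral Large", date(2024, 11, 18), 64.0),
    ("o3", date(2025, 4, 16), 82.9),
    ("Seed 1.5", date(2025, 1, 22), 73.9),
    ("Llama 4 Behemoth", date(2025, 4, 5), 76.1),
    ("GPT 5 (Thinking)", date(2025, 8, 7), 84.2),
    ("Claude Sonnet 3.7 (Thinking)", date(2025, 2, 24), 75.0),
]


def running_best(rows):
    """Return (eval_date, best_score_so_far) pairs in chronological order."""
    ordered = sorted(rows, key=lambda row: row[1])
    # accumulate with max yields the cumulative maximum of the scores.
    bests = accumulate((score for _, _, score in ordered), max)
    return [(eval_date, best) for (_, eval_date, _), best in zip(ordered, bests)]


frontier = running_best(results)
for eval_date, best in frontier:
    print(eval_date.isoformat(), best)
```

A result that scores below the current best (Grok 3 at 73.2% after Seed 1.5 at 73.9%) still appears as a dot, but the line stays flat at that point.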

All results

Showing one canonical row per model.

#	Model	Score	Conditions	Eval date	Source	Flags
1	GPT 5.1	84.2%	0-shot · CoT	13 Nov 2025	Self-reported	Primary
2	GPT 5 (Thinking)	84.2%	—	07 Aug 2025	Self-reported	Primary
3	o3	82.9%	—	16 Apr 2025	Self-reported	Primary
4	Qwen 3.6 27B	82.9%	0-shot · standard	—	Self-reported	Primary
5	o4 mini	81.6%	—	16 Apr 2025	Self-reported	Primary
6	Claude Opus 4.5	80.7%	—	24 Nov 2025	Self-reported	Primary
7	Claude Sonnet 4.5	77.8%	—	29 Sep 2025	Self-reported	Primary
8	o1	77.6%	—	16 Apr 2025	Self-reported	Primary
9	Qwen 3.5 122B A10B	76.9%	—	24 Apr 2026	Third-party	Primary Verified
10	Llama 4 Behemoth	76.1%	—	05 Apr 2025	Self-reported	Primary
11	GPT 4.1	75.0%	—	14 Apr 2025	Self-reported	Primary
12	Claude Sonnet 3.7 (Thinking)	75.0%	—	24 Feb 2025	Self-reported	Primary
13	Claude Sonnet 4	74.4%	—	22 May 2025	Self-reported	Primary
14	GPT 5	74.4%	—	07 Aug 2025	Self-reported	Primary
15	Seed 1.5	73.9%	—	22 Jan 2025	Self-reported	Primary
16	Llama 4 Maverick	73.4%	—	05 Apr 2025	Self-reported	Primary
17	Claude Haiku 4.5	73.2%	—	15 Oct 2025	Self-reported	Primary
18	Grok 3	73.2%	—	19 Feb 2025	Self-reported	Primary
19	Gemini 2.5 Flash-Lite	72.9%	—	26 Sep 2025	Self-reported	Primary
20	Claude Sonnet 3.7	71.8%	—	24 Feb 2025	Self-reported	Primary
21	Llama 4 Scout	69.4%	—	05 Apr 2025	Self-reported	Primary
22	Grok 3 mini	69.4%	—	19 Feb 2025	Self-reported	Primary
23	GPT-4o	69.1%	—	16 Apr 2025	Self-reported	Primary
24	Pixtral Large	64.0%	CoT	18 Nov 2024	Self-reported	Primary
25	Pixtral 12B	52.0%	CoT	10 Oct 2024	Self-reported	Primary
