DROP

Discrete Reasoning Over Paragraphs

Reading-comprehension benchmark requiring discrete operations (addition, counting, sorting) over passages. Mostly saturated by frontier models.

Reasoning · Text · F1 · Max 100.0% · Released Mar 2019 · Saturated · Possibly contaminated
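For readers unfamiliar with the F1 metric listed above, here is a minimal sketch of a token-overlap F1 between a predicted answer and a reference answer. It is a simplified illustration, not the official DROP evaluator, which as far as we know also applies answer normalization and special handling for numbers and multi-span answers. The function name `token_f1` and the whitespace tokenization are assumptions made for this example.

```python
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Bag-of-tokens F1 between a predicted and a reference answer.

    Simplified: lowercases and splits on whitespace only. This is an
    illustrative assumption, not the official DROP scoring script.
    """
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under this sketch, an exact match scores 1.0, and a prediction that contains the reference plus extra tokens is penalized through lower precision.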

Results: 8
Models scored: 8
Top: Seed 1.5 (93.0%)
Median: 83.3%

Best results

Top primary scores; one row per model.

1. Seed 1.5: 93.0%
2. Command A: 91.1%
3. Nova Pro: 85.4%
4. GPT-4o: 83.4%
5. Claude Opus 3: 83.1%
6. Nova Lite: 80.2%
7. Nova Micro: 79.3%
8. Gemma 2: 52.0%

Frontier over time

Each dot is one model result; the line traces the running best score.
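The running best score described in the caption can be sketched as follows, using the scores and eval dates listed in the results table on this page. The name `running_best` is hypothetical, and how the page's own chart orders results that share an eval date is not stated; this sketch keeps them in table order.

```python
from datetime import date

# (model, score in %, eval date) copied from the results table on this page.
RESULTS = [
    ("Seed 1.5", 93.0, date(2025, 1, 22)),
    ("Command A", 91.1, date(2025, 4, 7)),
    ("Nova Pro", 85.4, date(2024, 12, 3)),
    ("GPT-4o", 83.4, date(2025, 4, 16)),
    ("Claude Opus 3", 83.1, date(2024, 10, 22)),
    ("Nova Lite", 80.2, date(2024, 12, 3)),
    ("Nova Micro", 79.3, date(2024, 12, 3)),
    ("Gemma 2", 52.0, date(2025, 2, 25)),
]


def running_best(results):
    """Return (eval_date, best_score_so_far) for each result in date order."""
    frontier = []
    best = float("-inf")
    # sorted() is stable, so results with the same date keep table order.
    for _, score, when in sorted(results, key=lambda r: r[2]):
        best = max(best, score)
        frontier.append((when, best))
    return frontier


for when, best in running_best(RESULTS):
    print(when.isoformat(), f"{best:.1f}%")
```

With these rows the frontier starts at 83.1% (22 Oct 2024), rises to 85.4% on 03 Dec 2024, and reaches 93.0% on 22 Jan 2025; no later result listed here exceeds it.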

All results

Showing one canonical row per model.

#	Model	Score	Conditions	Eval date	Source	Flags
1	Seed 1.5	93.0%	—	22 Jan 2025	Self-reported	Primary
2	Command A	91.1%	—	07 Apr 2025	Self-reported	Primary
3	Nova Pro	85.4%	6-shot · CoT	03 Dec 2024	Self-reported	Primary
4	GPT-4o	83.4%	—	16 Apr 2025	Self-reported	Primary
5	Claude Opus 3	83.1%	3-shot · CoT	22 Oct 2024	Self-reported	Primary
6	Nova Lite	80.2%	6-shot · CoT	03 Dec 2024	Self-reported	Primary
7	Nova Micro	79.3%	6-shot · CoT	03 Dec 2024	Self-reported	Primary
8	Gemma 2	52.0%	3-shot	25 Feb 2025	Self-reported	Primary
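The summary figures shown near the top of the page (models scored, top model, median) can be reproduced from the table. The sketch below assumes the median is the ordinary statistical median of the eight primary scores; the page does not state how it is computed.

```python
from statistics import median

# Primary scores in %, copied from the table above.
SCORES = {
    "Seed 1.5": 93.0,
    "Command A": 91.1,
    "Nova Pro": 85.4,
    "GPT-4o": 83.4,
    "Claude Opus 3": 83.1,
    "Nova Lite": 80.2,
    "Nova Micro": 79.3,
    "Gemma 2": 52.0,
}

top_model = max(SCORES, key=SCORES.get)
# With an even number of scores, median() averages the two middle values.
median_score = median(SCORES.values())

print(f"Models scored: {len(SCORES)}")
print(f"Top: {top_model} ({SCORES[top_model]:.1f}%)")
print(f"Median: {median_score:.2f}%")
```

The two middle scores are 83.4% and 83.1%, so the median is 83.25%, consistent with the 83.3% shown on the page after rounding to one decimal place.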
