IFEval

Instruction-Following Eval

Verifiable instruction-following: ~25 instruction types whose compliance can be checked deterministically (e.g. word counts, formats).
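To make "checked deterministically" concrete, here is a minimal sketch of what such checks could look like. It is not the official IFEval implementation: the function names (`has_at_least_n_words`, `is_all_lowercase`, `is_valid_json`, `compliance_rate`) and the three instruction types chosen are illustrative assumptions.

```python
import json
from typing import Callable, Sequence


def has_at_least_n_words(response: str, n: int) -> bool:
    """Hypothetical check for an instruction like 'answer in at least n words'."""
    return len(response.split()) >= n


def is_all_lowercase(response: str) -> bool:
    """Hypothetical check: the response contains no uppercase letters."""
    return response == response.lower()


def is_valid_json(response: str) -> bool:
    """Hypothetical check for a format instruction: the whole response parses as JSON."""
    try:
        json.loads(response)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


def compliance_rate(response: str, checks: Sequence[Callable[[str], bool]]) -> float:
    """Fraction of the given checks that the response satisfies."""
    results = [check(response) for check in checks]
    return sum(results) / len(results)


# Example: one response scored against two instructions.
rate = compliance_rate("hello world", [is_all_lowercase, is_valid_json])
print(rate)
```

Because each check is a pure function of the response text, the same response always receives the same score, with no human or model judge involved.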

Language · Text · Accuracy · Max 100.0% · Released Nov 2023

Results: 12

Models scored: 12

Top: 93.2%, Claude Sonnet 3.7 (Thinking)

Median: 89.6%

Best results

Top primary scores; one row per model.

1	Claude Sonnet 3.7 (Thinking)	93.2%
2	Nova Pro	92.1%
3	Llama 3.3	92.1%
4	Command A	90.9%
5	Claude Sonnet 3.7	90.8%
6	Nova Lite	89.7%
7	Seed 1.5	89.5%
8	Nova Micro	87.2%
9	GPT 4.1	87.0%
10	Mistral Small 3	82.9%

Frontier over time

Each dot is one model result; the line traces the running best score.
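As a rough illustration of how such a frontier line can be derived, the sketch below sorts a handful of (eval date, model, score) rows taken from the results on this page and computes the running maximum. The `running_best` helper and the choice of rows are assumptions made for the example, not the site's actual plotting code.

```python
from datetime import date
from itertools import accumulate

# A subset of the rows listed on this page: (eval date, model, score in %).
results = [
    (date(2025, 2, 24), "Claude Sonnet 3.7 (Thinking)", 93.2),
    (date(2024, 12, 3), "Nova Pro", 92.1),
    (date(2024, 12, 6), "Llama 3.3", 92.1),
    (date(2025, 4, 7), "Command A", 90.9),
    (date(2025, 1, 22), "Seed 1.5", 89.5),
    (date(2025, 4, 14), "GPT 4.1", 87.0),
    (date(2025, 1, 30), "Mistral Small 3", 82.9),
]


def running_best(rows):
    """Return (eval date, best score so far) pairs in chronological order."""
    ordered = sorted(rows, key=lambda row: row[0])
    best_so_far = accumulate((row[2] for row in ordered), max)
    return [(row[0], best) for row, best in zip(ordered, best_so_far)]


frontier = running_best(results)
for day, best in frontier:
    print(day.isoformat(), best)
```

In this subset the frontier stays at 92.1% from December 2024 until the 93.2% result dated 24 Feb 2025, then stays flat because no later row exceeds it.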

All results

Showing one canonical row per model.

#	Model	Score	Conditions	Eval date	Source	Flags
1	Claude Sonnet 3.7 (Thinking)	93.2%	—	24 Feb 2025	Self-reported	Primary
2	Nova Pro	92.1%	0-shot	03 Dec 2024	Self-reported	Primary
3	Llama 3.3	92.1%	—	06 Dec 2024	Self-reported	Primary
4	Command A	90.9%	—	07 Apr 2025	Self-reported	Primary
5	Claude Sonnet 3.7	90.8%	—	24 Feb 2025	Self-reported	Primary
6	Nova Lite	89.7%	0-shot	03 Dec 2024	Self-reported	Primary
7	Seed 1.5	89.5%	0-shot · CoT	22 Jan 2025	Self-reported	Primary
8	Nova Micro	87.2%	0-shot	03 Dec 2024	Self-reported	Primary
9	GPT 4.1	87.0%	—	14 Apr 2025	Self-reported	Primary
10	Mistral Small 3	82.9%	—	30 Jan 2025	Self-reported	Primary
11	Llama 3.2	77.4%	—	25 Sep 2024	Self-reported	Primary
12	Qwen 3.5 27B	76.5%	—	24 Feb 2026	Third-party	Primary Verified
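The summary figures for this benchmark (top score and median) can be reproduced from the twelve rows above. The short check below is a sketch: the `scores` dictionary is simply the table retyped by hand, and the variable names are illustrative.

```python
from statistics import median

# Primary scores (%) copied from the table above, one entry per model.
scores = {
    "Claude Sonnet 3.7 (Thinking)": 93.2,
    "Nova Pro": 92.1,
    "Llama 3.3": 92.1,
    "Command A": 90.9,
    "Claude Sonnet 3.7": 90.8,
    "Nova Lite": 89.7,
    "Seed 1.5": 89.5,
    "Nova Micro": 87.2,
    "GPT 4.1": 87.0,
    "Mistral Small 3": 82.9,
    "Llama 3.2": 77.4,
    "Qwen 3.5 27B": 76.5,
}

top_model = max(scores, key=scores.get)

# With an even number of rows, the median is the mean of the two middle
# scores (89.7 and 89.5); round to one decimal to match the page's precision.
median_score = round(median(scores.values()), 1)

print(f"{len(scores)} models, top: {top_model} ({scores[top_model]}%), median: {median_score}%")
```

With twelve models the median falls between the sixth and seventh rows, which is why it does not match any single model's score.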
