SWE-bench Multimodal

A variant of SWE-bench in which issues include screenshots, diagrams, and other visual context. It tests multimodal software-engineering ability.

Tags: Coding, Multimodal; Metric: Accuracy; Max: 100.0%; Released: Oct 2024

Results

Models scored: 1

Top score: 70.2% (Deepseek 3.2)

Median score: 70.2%
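As a minimal sketch of how these summary figures relate to the results table, the snippet below computes the model count, top score, and median from a mapping of model names to primary scores. The function name `summarize` and the dictionary layout are assumptions for illustration, not the site's actual implementation. With a single scored model, the median necessarily equals the top score.

```python
from statistics import median

# Primary scores keyed by model name, copied from the results table on this page.
scores = {"Deepseek 3.2": 70.2}


def summarize(scores):
    """Return (models scored, top model, top score, median score).

    Hypothetical helper: the page does not document how it computes these.
    """
    top_model = max(scores, key=scores.get)  # model with the highest score
    return len(scores), top_model, scores[top_model], median(scores.values())


if __name__ == "__main__":
    count, top_model, top_score, med = summarize(scores)
    print(f"Models scored: {count}")
    print(f"Top: {top_model} ({top_score}%)")
    print(f"Median: {med}%")
```

Once more models are listed, the median and the top score would normally diverge.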

Best results

Top primary scores; one row per model.

Deepseek 3.2: 70.2%

Frontier over time

Each dot is one model result; the line traces the running best score.

Not enough data to plot a trend yet.
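The "running best score" line described above can be sketched as a cumulative maximum over results sorted by evaluation date. This is an illustrative assumption about how such a frontier is typically computed; the function name `frontier` and the `(date, score)` tuple layout are hypothetical.

```python
from datetime import date
from itertools import accumulate


def frontier(results):
    """Given (eval_date, score) pairs, return (eval_date, running best) pairs.

    Hypothetical helper: sorts by date, then takes a cumulative maximum.
    """
    ordered = sorted(results)  # chronological order (ties broken by score)
    dates = [d for d, _ in ordered]
    best = accumulate((s for _, s in ordered), max)  # running maximum
    return list(zip(dates, best))


if __name__ == "__main__":
    # The only result currently listed on this page.
    print(frontier([(date(2025, 12, 1), 70.2)]))
```

With only one result the frontier is a single point, which is consistent with there not being enough data to draw a trend.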

All results

Showing one canonical row per model.

#	Model	Score	Conditions	Eval date	Source	Flags
1	Deepseek 3.2	70.2%	—	01 Dec 2025	Paper	Primary Verified
