TAAFT

Mercury 2

Mercury 2 generates text via parallel refinement rather than sequential token-by-token decoding, targeting high throughput in production loops such as agent chains, retrieval pipelines, and real-time interaction. Headline features include a 128K context window, tunable reasoning, native tool use, and structured JSON output aligned to a supplied schema.
Released: February 24, 2026

Overview

Mercury 2 is Inception Labs’ diffusion-based reasoning LLM designed for real-time, low-latency use, with tunable reasoning, long context, native tool use, and schema-aligned JSON output.

About Inception Labs

Inception’s breakthrough diffusion-based approach to language generation enables the world’s fastest, most efficient AI models with best-in-class quality.

Location: Palo Alto, California, US

Tools using Mercury 2

  • Inception Chat
    The fastest commercial-grade diffusion LLM
    Inception Chat — v2
Adds explicit “reasoning” positioning, described as the fastest reasoning language model in the lineup. Introduces tunable reasoning as a first-class feature for trading speed against depth per request, and adds native tool use plus schema-aligned JSON output as headline production features. Sharpens the “production loops” focus (agents, RAG, extraction) as the core value proposition rather than raw generation speed, and offers OpenAI-API compatibility so it can slot into existing stacks without rewrites.
Last updated: February 24, 2026
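The OpenAI-API compatibility and schema-aligned JSON output described above can be sketched as a chat-completions request body. This is a minimal illustration, not a confirmed integration: the model id `mercury-2` and the `lookup_company` tool are assumptions, and the `response_format`/`tools` fields follow the widely used OpenAI chat-completions shape that an OpenAI-compatible endpoint would accept.

```python
import json

def build_request(user_prompt: str) -> dict:
    """Build an OpenAI-compatible chat-completions payload that asks for
    output conforming to a JSON Schema (structured extraction use case)
    and declares a callable tool (native tool use)."""
    return {
        "model": "mercury-2",  # assumed model id, for illustration only
        "messages": [
            {"role": "user", "content": user_prompt},
        ],
        # Schema-aligned JSON output: the response must validate
        # against this schema.
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "contact_record",
                "schema": {
                    "type": "object",
                    "properties": {
                        "name": {"type": "string"},
                        "email": {"type": "string"},
                    },
                    "required": ["name", "email"],
                },
            },
        },
        # Native tool use: a hypothetical function the model may invoke.
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "lookup_company",
                    "description": "Look up a company by its web domain.",
                    "parameters": {
                        "type": "object",
                        "properties": {"domain": {"type": "string"}},
                        "required": ["domain"],
                    },
                },
            }
        ],
    }

payload = build_request("Extract the contact from: 'Reach Ada at ada@example.com.'")
print(json.dumps(payload)[:40])  # serializes cleanly for an HTTP POST
```

In a real call, the same dict would be posted to the provider’s chat-completions endpoint through any OpenAI-compatible client, so existing stacks need only a base-URL and model-name change.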