Mercury 2
Overview
Mercury 2 is Inception Labs’ diffusion-based reasoning LLM designed for real-time latency, with tunable reasoning, long context, native tool use, and schema-aligned JSON output.
About Inception Labs
Inception’s breakthrough diffusion-based approach to language generation enables the world’s fastest, most efficient AI models with best-in-class quality.
Tools using Mercury 2
- Inception Chat

What’s new in v2
- Explicit “reasoning” positioning: described as the fastest reasoning language model in their lineup.
- Tunable reasoning as a first-class feature, for trading speed vs. depth per request.
- Native tool use plus schema-aligned JSON output as headline production features.
- A sharper focus on production loops (agents, RAG, extraction) as the core value proposition, rather than just raw fast generation.
- OpenAI-API compatibility, so it can slot into existing stacks without rewrites.
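Since Mercury 2 advertises OpenAI-API compatibility, a request can be shaped like a standard Chat Completions payload combining the headline features above. This is a hedged sketch only: the model name `mercury-2`, the base URL, the `reasoning_effort` knob, and the `search_docs` tool are illustrative assumptions, not documented names.

```python
import json

BASE_URL = "https://api.inceptionlabs.ai/v1"  # hypothetical endpoint, for illustration


def build_request(question: str) -> dict:
    """Build an OpenAI-style chat payload with a tool and a JSON schema."""
    return {
        "model": "mercury-2",        # assumed model identifier
        "reasoning_effort": "low",   # assumed name for the tunable-reasoning knob
        "messages": [{"role": "user", "content": question}],
        # Native tool use: declare a function the model may choose to call.
        "tools": [{
            "type": "function",
            "function": {
                "name": "search_docs",  # hypothetical tool
                "parameters": {
                    "type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"],
                },
            },
        }],
        # Schema-aligned JSON output: constrain the reply to a JSON schema.
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "answer",
                "schema": {
                    "type": "object",
                    "properties": {"answer": {"type": "string"}},
                    "required": ["answer"],
                },
            },
        },
    }


payload = build_request("Summarize the latest release notes.")
print(json.dumps(payload, indent=2))
```

If the endpoint is OpenAI-compatible as claimed, this payload could be sent with any OpenAI SDK or a plain HTTP POST to `{BASE_URL}/chat/completions`, with no rewrite of existing client code.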
