Cosmos Reason

NVIDIA Cosmos Reason is a vision-language model (VLM) built to give robots and vision AI agents physical common sense and long chain-of-thought planning. It reasons over multimodal inputs (video, images, and text), understands fundamentals such as space, time, and causal physics, and outputs step-by-step decisions for embodied tasks, including robot planning, autonomous-driving perception, data curation and annotation, and video-analytics agents. First shown in the 2025 GTC wave of “Cosmos world models” announcements, it is released as an open, customizable 7B model under the NVIDIA Open Model License, with deployable endpoints via NVIDIA NIM and weights on Hugging Face and GitHub.
The Cosmos-Reason1 paper details a four-stage training pipeline (vision pretraining, general supervised fine-tuning (SFT), Physical-AI SFT, and Physical-AI reinforcement learning (RL)) and reports gains in embodied reasoning; the research also describes larger 8B and 56B variants alongside the public 7B release. NVIDIA further claims state-of-the-art results on physical-reasoning leaderboards and an average score of 65.7 across key robotics and autonomous-vehicle (AV) benchmarks.

Overview

Cosmos Reason is NVIDIA’s open, customizable 7B-parameter reasoning VLM for physical AI and robotics. It learns “common sense” about space, time, and physics, plans actions from video and images, and was trained with supervised fine-tuning (SFT) followed by reinforcement learning (RL). It is available under the NVIDIA Open Model License on GitHub and Hugging Face, and as an NVIDIA NIM.

About NVIDIA

Industry: Computer Hardware Manufacturing

Company Size: 36000

Location: Santa Clara, California, US

Website: nvidia.com

Last updated: February 26, 2026
