Cosmos Reason
Overview
Cosmos Reason is NVIDIA’s open, customizable 7B-parameter reasoning vision language model (VLM) for physical AI and robotics. It learns common-sense knowledge of space, time, and physics, plans actions from video and image input, and was trained with supervised fine-tuning (SFT) and reinforcement learning (RL). It is available under the NVIDIA Open Model License on GitHub and Hugging Face, and as an NVIDIA NIM.
Description
The Cosmos-Reason1 paper details a four-stage training pipeline (vision pretraining, general SFT, Physical-AI SFT, and Physical-AI RL) and reports gains in embodied reasoning. The research also describes larger 8B and 56B variants alongside the public 7B release. NVIDIA further claims state-of-the-art results on physical-reasoning leaderboards and an average score of 65.7 across key robotics and autonomous-vehicle (AV) benchmarks.
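As a reading aid, the stage ordering described above can be written down as a small Python structure. This is only a sketch: the `Stage` class, its fields, the `PIPELINE` tuple, and the `stages_after` helper are hypothetical names introduced for illustration and are not part of any NVIDIA code or API. Only the stage names and their order come from the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One training stage (hypothetical structure, for illustration only)."""
    name: str
    method: str        # "pretraining", "SFT", or "RL"
    physical_ai: bool  # True when the stage is one of the Physical-AI stages


# Stage names and order as listed in the description; fields are assumptions.
PIPELINE = (
    Stage("vision pretraining", "pretraining", False),
    Stage("general SFT", "SFT", False),
    Stage("Physical-AI SFT", "SFT", True),
    Stage("Physical-AI RL", "RL", True),
)


def stages_after(name: str) -> list[str]:
    """Return the names of the stages that follow `name` in the pipeline."""
    names = [stage.name for stage in PIPELINE]
    return names[names.index(name) + 1:]
```

For instance, `stages_after("general SFT")` lists the two Physical-AI stages, which reflects that the domain-specific SFT and RL steps build on the general-purpose ones.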
