Qwen-RobotWorld
Overview
Qwen-RobotWorld is a language-conditioned video world model for embodied AI. Given a natural language instruction and the current observation, it predicts physically grounded future visual trajectories. It supports robotic manipulation, autonomous driving, indoor navigation, and human-to-robot transfer. The model is built on a 60-layer double-stream diffusion transformer coupled with Qwen2.5-VL semantics.
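To make the "double-stream" term concrete, here is a toy sketch of one such block. It assumes the term carries its usual meaning from MMDiT-style diffusion transformers: text tokens and video tokens keep separate projection weights but attend jointly over the concatenated sequence. Every name here (`StreamWeights`, `double_stream_block`, the dimensions) is hypothetical and not taken from the Qwen-RobotWorld release. Layer norm, MLPs, multi-head splitting, and timestep conditioning are left out for brevity.

```python
import math
import random


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over a single list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def rand_matrix(rows, cols, rng):
    """Small random matrix standing in for learned weights."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class StreamWeights:
    """Hypothetical per-modality projections: query, key, value, output."""

    def __init__(self, dim, rng):
        self.wq = rand_matrix(dim, dim, rng)
        self.wk = rand_matrix(dim, dim, rng)
        self.wv = rand_matrix(dim, dim, rng)
        self.wo = rand_matrix(dim, dim, rng)


def double_stream_block(text, video, w_text, w_video):
    """One simplified block: separate weights per stream, joint attention."""
    dim = len(text[0])
    n_text = len(text)

    # Each stream projects its own tokens with its own weights;
    # `+` concatenates the two token lists into one sequence.
    q = matmul(text, w_text.wq) + matmul(video, w_video.wq)
    k = matmul(text, w_text.wk) + matmul(video, w_video.wk)
    v = matmul(text, w_text.wv) + matmul(video, w_video.wv)

    # Joint scaled dot-product attention over text and video tokens together.
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    attn = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    mixed = matmul(attn, v)

    # Split back into streams, apply per-stream output projection, add residual.
    text_upd = matmul(mixed[:n_text], w_text.wo)
    video_upd = matmul(mixed[n_text:], w_video.wo)
    text_out = [[a + b for a, b in zip(r, u)] for r, u in zip(text, text_upd)]
    video_out = [[a + b for a, b in zip(r, u)] for r, u in zip(video, video_upd)]
    return text_out, video_out


rng = random.Random(0)
dim = 4
text_tokens = rand_matrix(3, dim, rng)   # stand-in for instruction embeddings
video_tokens = rand_matrix(5, dim, rng)  # stand-in for noisy video latents
w_text, w_video = StreamWeights(dim, rng), StreamWeights(dim, rng)
text_out, video_out = double_stream_block(text_tokens, video_tokens, w_text, w_video)
```

Each stream keeps its own token count and width, so blocks of this shape can be stacked; a 60-layer model would repeat the pattern 60 times with separate weights per layer.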
About Alibaba
Chinese e-commerce and cloud leader behind Taobao, Tmall, and Alipay.
Tools using Qwen-RobotWorld
No tools found for this model yet.
