NVIDIA has launched NVIDIA Cosmos 3, an open world foundation model for physical AI built on a mixture-of-transformers architecture that combines vision reasoning, world generation, and action prediction in a single system, according to NVIDIA.
NVIDIA describes Cosmos 3 as a fully open omnimodel that can natively understand and generate text, images, video, ambient sound and actions with physical accuracy. The company says this can cut physical AI training and evaluation cycles from months to days.
NVIDIA also launched the NVIDIA Cosmos Coalition, a global collaboration between world model builders and AI developers.
“The big bang of physical AI is just around the corner thanks to breakthroughs in multimodal reasoning language, vision and world models,” says Jensen Huang, founder and CEO of NVIDIA. “The Cosmos 3 family of open, frontier omnimodels gives developers a generational leap in ability to build robots, autonomous vehicles and vision AI that perceive, reason, plan and act in the physical world.”
A New Architecture for Physical AI
Cosmos 3 addresses a challenge in physical AI: enabling robots, autonomous vehicles (AVs) or vision agents to generalize in the real world with limited training data and fragmented simulation stacks.
The model’s mixture-of-transformers architecture pairs a reasoning transformer with an expert generation transformer, enabling Cosmos 3 to understand object interactions, motion and spatial-temporal relationships before generating video and action trajectories.
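As a rough illustration of the "reason first, then generate" pattern described above, here is a minimal Python sketch. It is not the Cosmos 3 API and contains no transformers: simple 1-D kinematics stand in for both stages, and every name (SceneObject, ScenePlan, reason, generate) is a hypothetical chosen for this example. It only shows how a generation step can be conditioned on the output of a separate reasoning step.

```python
from dataclasses import dataclass


@dataclass
class SceneObject:
    name: str
    position: float  # 1-D position in metres (toy setup, an assumption)
    velocity: float  # metres per second


@dataclass
class ScenePlan:
    """Output of the toy reasoning stage: which objects meet, and when."""
    collisions: list  # list of (name_a, name_b, time_s), earliest first


def reason(objects, horizon_s):
    """Toy reasoning stage: predict pairwise meetings within the horizon."""
    collisions = []
    for i, a in enumerate(objects):
        for b in objects[i + 1:]:
            closing_speed = a.velocity - b.velocity
            if closing_speed == 0:
                continue  # the gap never changes, so the pair never meets
            t = (b.position - a.position) / closing_speed
            if 0 <= t <= horizon_s:
                collisions.append((a.name, b.name, t))
    return ScenePlan(collisions=sorted(collisions, key=lambda c: c[2]))


def generate(objects, plan, horizon_s, dt):
    """Toy generation stage: roll out positions, conditioned on the plan.

    Objects stop moving at the earliest meeting time the plan predicts.
    """
    stop_time = {}
    for name_a, name_b, t in plan.collisions:
        stop_time.setdefault(name_a, t)
        stop_time.setdefault(name_b, t)

    steps = int(round(horizon_s / dt))
    trajectory = []
    for k in range(steps + 1):
        t = k * dt
        frame = {}
        for obj in objects:
            t_eff = min(t, stop_time.get(obj.name, t))
            frame[obj.name] = obj.position + obj.velocity * t_eff
        trajectory.append(frame)
    return trajectory


# Example: a robot moving toward a stationary box.
scene = [SceneObject("robot", 0.0, 1.0), SceneObject("box", 2.0, 0.0)]
plan = reason(scene, horizon_s=4.0)
trajectory = generate(scene, plan, horizon_s=4.0, dt=1.0)
```

In this toy, the generation stage never works out for itself when the robot reaches the box. It relies entirely on the plan, which is the point of splitting the two roles.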
Developers can use Cosmos 3 as:
Cosmos Coalition and Open World Model Development
The Cosmos Coalition is a global collaboration between world model builders, AI developers and physical AI leaders to advance open world models across industries. Founding coalition members include Agile Robots, Black Forest Labs, Generalist, LTX, Runway and Skild AI.
Developers Build on Cosmos
The Cosmos platform powers NVIDIA’s physical AI stack to accelerate training and evaluation workflows across industries. The platform now includes new datasets for robotics, physics, human motion, autonomous driving, warehouse safety and spatial reasoning, as well as new physical AI agent skills for neural scene reconstruction, defect-image generation and video augmentation.
Availability
Cosmos 3 Super and Cosmos 3 Nano are available now, with Cosmos 3 Edge coming soon for real-time inference. Developers can try Cosmos 3 on build.nvidia.com, download open models from Hugging Face, customize models and generate synthetic data with Hugging Face Diffusers and resources on GitHub, and deploy the models as NVIDIA NIM microservices.
Sources: Press materials received from the company and additional information gleaned from the company’s website.


DE's editors contribute news and new product announcements to Digital Engineering.