NVIDIA Debuts Cosmos 3 Foundation Model for Physical AI

Cosmos 3 is a fully open omnimodel with native vision reasoning and multimodal generation across text, image, video, ambient sound and action.

Source: NVIDIA

NVIDIA launches the NVIDIA Cosmos Coalition with AI labs and robotics leaders — including Agile Robots, Black Forest Labs, Generalist, LTX, Runway and Skild AI — to advance open world models. Image courtesy: NVIDIA

By DE Editors

June 9, 2026

NVIDIA has launched NVIDIA Cosmos 3, an open world foundation model for physical AI built on a mixture-of-transformers architecture that combines vision reasoning, world generation, and action prediction in a single system, according to NVIDIA.

Cosmos 3 is a fully open omnimodel that can natively understand and generate text, images, video, ambient sound and actions with physics accuracy. According to NVIDIA, this reduces physical AI training and evaluation cycles from months to days.

NVIDIA also launched the NVIDIA Cosmos Coalition, a global collaboration between world model builders and AI developers.

“The big bang of physical AI is just around the corner thanks to breakthroughs in multimodal reasoning language, vision and world models,” says Jensen Huang, founder and CEO of NVIDIA. “The Cosmos 3 family of open, frontier omnimodels gives developers a generational leap in ability to build robots, autonomous vehicles and vision AI that perceive, reason, plan and act in the physical world.”

A New Architecture for Physical AI
Cosmos 3 addresses a challenge in physical AI: enabling robots, autonomous vehicles (AVs) or vision agents to generalize in the real world with limited training data and fragmented simulation stacks.

The model’s mixture-of-transformers architecture pairs a reasoning transformer with an expert generation transformer, enabling Cosmos 3 to understand object interactions, motion and spatial-temporal relationships before generating video and action trajectories.

Developers can use Cosmos 3 as:

A vision language model that understands and reasons across modalities.
A world model or video foundation model that simulates physical environments and predicts future world states for training and evaluation.
The backbone for world action models that help train robots to perform specific tasks.

Cosmos Coalition and Open World Model Development
The Cosmos Coalition is a global collaboration between world model builders, AI developers and physical AI leaders to advance open world models across industries. Founding coalition members include Agile Robots, Black Forest Labs, Generalist, LTX, Runway and Skild AI.

Developers Build on Cosmos
The Cosmos platform powers NVIDIA’s physical AI stack to accelerate training and evaluation workflows across industries. The platform now includes new datasets for robotics, physics, human motion, autonomous driving, warehouse safety and spatial reasoning, as well as new physical AI agent skills for neural scene reconstruction, defect-image generation and video augmentation.

Availability
Cosmos 3 Super and Cosmos 3 Nano are available now, with Cosmos 3 Edge coming soon for real-time inference. Developers can try Cosmos 3 on build.nvidia.com, download open models from Hugging Face, customize models and generate synthetic data with Hugging Face Diffusers and resources on GitHub, and deploy the models as NVIDIA NIM microservices.

Sources: Press materials received from the company and additional information gleaned from the company’s website.