AI/ML News

NVIDIA Cosmos – the secret weapon behind AI robotics

NVIDIA has unveiled NVIDIA Cosmos, a platform designed to accelerate the development of physical AI – the artificial intelligence behind robots, autonomous vehicles (AVs), and other real-world automated systems. By combining state-of-the-art world foundation models (WFMs), advanced video processing tools, and an AI-driven data pipeline, Cosmos is intended to help developers create, train, and optimize AI models more efficiently.

Developing physical AI has traditionally required massive amounts of real-world data, making it a costly and time-intensive process. NVIDIA Cosmos aims to change that by offering physics-based synthetic data generation, allowing developers to create photorealistic 3D environments that mimic real-world conditions. These simulated environments help train AI models without relying entirely on expensive, manually collected data.

NVIDIA describes world foundation models as fundamental to the next wave of AI, much as large language models (LLMs) were to natural language processing. WFMs use a combination of text, images, video, and sensor data to simulate real-world interactions, making them essential for robotics and autonomous systems that need to navigate complex environments.

Cosmos includes a range of advanced AI tools tailored for the development of robotics and AVs:

  • Synthetic Data Generation – Using Cosmos, developers can create high-fidelity, physics-aware video simulations of industrial and driving environments, reducing dependence on real-world data collection.
  • Video Search and Understanding – AI-powered search capabilities allow users to quickly locate specific training scenarios, such as hazardous road conditions or crowded warehouse environments.
  • Predictive Intelligence and “Multiverse” Simulation – Cosmos can simulate multiple potential outcomes of a real-world scenario, helping AI models predict the best course of action.
  • Advanced Data Processing – NVIDIA’s NeMo Curator accelerates the processing of massive video datasets, making AI training more efficient.
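
The "multiverse" idea from the list above can be illustrated with a toy sketch: sample several possible futures for each candidate action, score them, and pick the action with the best average outcome. This is not the Cosmos API. The functions simulate_outcome and choose_action, the braking scenario, and the scoring rule are all hypothetical, chosen only to show the pattern.

```python
import random
from statistics import mean

def simulate_outcome(distance_m, speed_mps, action, rng):
    """Hypothetical toy world model: score one sampled future.

    A vehicle approaches an obstacle; `action` is a braking
    deceleration in m/s^2. Random friction stands in for
    uncertainty about road conditions.
    """
    friction = rng.uniform(0.6, 1.0)      # sampled road condition
    decel = action * friction             # effective deceleration
    if decel <= 0:
        return -100.0                     # vehicle never stops: collision
    stopping_distance = speed_mps ** 2 / (2 * decel)
    if stopping_distance > distance_m:
        return -100.0                     # collision
    return -action                        # harsher braking is less comfortable

def choose_action(distance_m, speed_mps, actions, n_rollouts=200, seed=0):
    """Average many sampled futures per action and return the best action."""
    rng = random.Random(seed)
    scores = {
        a: mean(simulate_outcome(distance_m, speed_mps, a, rng)
                for _ in range(n_rollouts))
        for a in actions
    }
    best = max(scores, key=scores.get)
    return best, scores

# Obstacle 30 m ahead, vehicle at 15 m/s, three candidate braking levels.
best, scores = choose_action(30.0, 15.0, [2.0, 4.0, 8.0])
print(best, scores)
```

In this toy setup the gentlest braking always ends in a collision and the hardest braking always stops in time, so the averaged rollouts favor the hardest braking. A real world model would replace simulate_outcome with generated video or sensor futures.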

Cosmos also introduces a visual tokenizer that, according to NVIDIA, can compress and process video data up to 12 times faster than existing tokenizers, making it easier to convert video recordings into usable training data.
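
To make "tokenizing" video more concrete, here is a toy sketch of the general idea: cut a frame into small patches and replace each patch with the index of its closest entry in a codebook, so many pixels become a few discrete tokens. The Cosmos tokenizer is a learned neural model and is not implemented this way. The fixed codebook, the 2x2 patch size, and all function names below are assumptions made for illustration.

```python
def patches(frame, size):
    """Split a 2D grayscale frame (list of rows) into flattened size x size patches."""
    h, w = len(frame), len(frame[0])
    out = []
    for r in range(0, h - size + 1, size):
        for c in range(0, w - size + 1, size):
            out.append([frame[r + i][c + j]
                        for i in range(size) for j in range(size)])
    return out

def nearest_code(patch, codebook):
    """Index of the codebook vector closest to the patch (squared Euclidean distance)."""
    def dist(code):
        return sum((p - q) ** 2 for p, q in zip(patch, code))
    return min(range(len(codebook)), key=lambda k: dist(codebook[k]))

def tokenize(frame, codebook, size=2):
    """Map each patch of the frame to a discrete token (a codebook index)."""
    return [nearest_code(p, codebook) for p in patches(frame, size)]

# Hypothetical 3-entry codebook for 2x2 patches: dark, mid-gray, bright.
CODEBOOK = [[0] * 4, [128] * 4, [255] * 4]

frame = [
    [0,   5,   250, 255],
    [3,   0,   255, 251],
    [120, 130, 10,  0],
    [125, 131, 2,   8],
]
tokens = tokenize(frame, CODEBOOK)
print(tokens)  # [0, 2, 1, 0]
```

Here 16 pixel values are reduced to 4 tokens. Learned tokenizers follow the same pixels-to-tokens pattern but train the encoder and codebook on data and also compress along the time axis.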

Several leading robotics and automotive companies have already begun integrating Cosmos into their AI workflows. Among them are XPENG, Agility Robotics, Figure AI, Waabi, Wayve, and Uber, which are using Cosmos in the development of next-generation AVs and humanoid robots. For example, Waabi, a company focused on AI-driven autonomous driving, is using Cosmos for data curation and AV simulation, while Uber is working with NVIDIA to advance autonomous mobility solutions.

As AI-generated content becomes more widespread, NVIDIA has built Cosmos with strong ethical safeguards. The platform includes guardrails to prevent the generation of harmful or misleading content, along with invisible watermarks to identify AI-generated videos. Cosmos aligns with global AI safety initiatives, including the White House’s voluntary AI commitments.

NVIDIA Cosmos is now available under an open model license on Hugging Face and the NVIDIA NGC catalog. With physical AI poised to transform industries from manufacturing to transportation, NVIDIA Cosmos marks a significant step toward making AI-driven robotics more scalable, efficient, and widely available.

Learn more in the article "Cosmos World Foundation Model Platform for Physical AI", available on arXiv.