Last December, in its TMT (technology, media, telecommunications) Predictions for 2026, the analyst Deloitte wrote, “the global cumulative installed capacity of industrial robots could reach 5.5 million by 2026… We could see an inflection point by 2030, with annual new robot shipments doubling from current levels to reach 1 million a year, driven by the following growth catalysts: (1) labor shortages in specialized industrial applications in developed countries and (2) exponential advancements in computing power and the emergence of specialized foundational AI models.”
Another analyst, Goldman Sachs, predicted, “The global market for humanoid robots may reach a market size of at least U.S. $6 billion in 10-15 years, filling 4% of the U.S. manufacturing labor shortage gap by 2030 (estimate) and 2% of global elderly care demand by 2035” (“Humanoid Robot: The AI Accelerant,” January 8, 2024).
At technology conferences, whenever Boston Dynamics’ quadruped Spot and the humanoid Atlas make cameo appearances, they attract selfie takers and curious onlookers.
At CES 2026, Atlas shared the stage with Zachary Jackowski, general manager of Atlas. “We’ve always kept a close watch on when the missing pieces of technology would fall into place to make [humanoid robots] truly commercially viable. The rapid advancement of AI in the past few years were the technologies we needed, and that moment is finally here,” says Jackowski.
Boston Dynamics announced in a blog post it would “begin production of the new Atlas robots at its Boston headquarters immediately. All Atlas deployments are already fully committed for 2026, with fleets scheduled to ship to Hyundai’s Robotics Metaplant Application Center (RMAC) and Google DeepMind in the coming months. The company plans to add additional customers in early 2027.”
In this article, we take a look at how the robots see the world around them, what it takes to train them for deployment, and how they might change the manufacturing landscape.
Spot is a four-legged robot that moves like a dog. Atlas’ shape and movements closely resemble those of a human. Early versions of Atlas stood around 4’11”. The latest incarnations have grown to 6’2”. Adopting these forms allows the robots to operate in a physical world that has evolved over time to accommodate animals and humans, so they can easily fit in, figuratively and literally.
Matt Malchano, vice president of Software, Boston Dynamics, explains how Spot and Atlas “see.” In simple terms, they are equipped with sensors. The onboard computers and the connected neural networks help decode the sensor data, construct a 3D view of the surroundings, and facilitate appropriate actions.
“Spot, for example, has five sets of stereo cameras, giving it a 360 [degree] view. Atlas also has the same kind of vision. We’re also looking at adding other sensors like LiDAR sensors and Time of Flight sensors to provide the robots with additional information,” Malchano says. “Spot is applicable in asset management, particularly in monitoring your facility. Likewise, Atlas’ primary use will likely be in manufacturing.”
Spot and Atlas use the sensor data to estimate distances and proportions, which form the basis for determining how to navigate to target objects and execute the required tasks. The sensors also give the robots insights not easily available to the naked human eye. They can, for example, perceive and record thermal and acoustic data.
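To make the idea of estimating distance from stereo cameras concrete, here is a minimal sketch of the textbook relation for a rectified stereo pair: depth equals focal length times baseline divided by disparity. It is not Boston Dynamics' implementation, and the function name and the camera numbers are hypothetical values chosen only for illustration.

```python
def depth_from_disparity(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Return the distance in meters to a point seen by a rectified stereo pair.

    focal_length_px: camera focal length, in pixels
    baseline_m: distance between the two camera centers, in meters
    disparity_px: horizontal pixel shift of the point between left and right images
    """
    if disparity_px <= 0:
        # Zero disparity corresponds to a point infinitely far away.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical numbers, not the parameters of any real robot camera.
distance_m = depth_from_disparity(focal_length_px=700.0, baseline_m=0.1, disparity_px=35.0)
print(f"{distance_m:.2f} m")
```

The smaller the disparity, the farther the object, which is why stereo depth estimates tend to get noisier at long range.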
“Most customers buy more than one robot and position them in different areas of the facility. The robots can perceive fixed equipment that are beginning to degrade or break down, monitor that equipment, and perform daily patrols. They can give you a thermal or acoustic history of a certain motor or generator, so you can detect the changes over time,” says Malchano.
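As a rough illustration of detecting changes over time from a thermal history, the sketch below compares the most recent readings of a motor against its earlier baseline and flags a large upward shift. The function name, the window size, the threshold, and the temperature values are all assumptions made for this example; the article does not describe how Spot's software actually analyzes its recordings.

```python
from statistics import mean, stdev


def has_drifted(history, recent_count=3, threshold_sigmas=3.0):
    """Return True if the latest readings sit well above the earlier baseline.

    history: chronological list of readings (e.g., one temperature per patrol)
    recent_count: how many of the latest readings form the "recent" window
    threshold_sigmas: how many baseline standard deviations count as a drift
    """
    if len(history) < recent_count + 2:
        raise ValueError("need at least two baseline readings plus the recent window")
    baseline = history[:-recent_count]
    recent = history[-recent_count:]
    spread = stdev(baseline)
    if spread == 0:
        # A perfectly flat baseline: any increase counts as a change.
        return mean(recent) > mean(baseline)
    return (mean(recent) - mean(baseline)) / spread > threshold_sigmas


# Hypothetical motor temperatures in degrees Celsius, one per daily patrol.
motor_temps_c = [61.0, 60.5, 61.2, 60.8, 61.1, 60.9, 66.0, 67.5, 69.0]
print(has_drifted(motor_temps_c))
```

A production system would likely account for ambient temperature and load, but the principle is the same: a recorded history turns a single reading into a trend.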
Just as a new pet needs time to get used to your apartment or learn a new trick, robots also require a training period. “Customers usually drive the robot through a facility with a joystick so it can learn the facility,” says Malchano. “We are building state-of-the-art teleoperation capabilities for Atlas so the robot could replicate and learn from a demonstrator wearing a VR [virtual reality] headset, to show it how to stand and what to do. We’d love to get to the point where we could show the robot how to do something a few times, and it could extrapolate the rest autonomously.”
This method bypasses the need to meticulously program the robot to perform a certain task, along with all the possible permutations. With the rise of AI models that can reason—a controversial concept that will likely be debated—robotic training is expected to become easier.
Spot has proven to be operable alongside humans. In 2022, Spot made headlines in the New York Times (“See Spot Save: Robot Dogs Join the New York Fire Department,” March 17, 2022). Shaun Ray, government sales manager for Spot, Boston Dynamics, points out, “[Spot is] simple enough to use that a 5-year-old could drive it up a flight of stairs. Because Spot can get where it needs to go so much faster and so much easier than other robots, it really reduces response time for public safety agencies.”
“For Atlas, due to its size and strength, we have aggressive R&D [research and development] programs to build safety features into it. Right now, we don’t recommend Atlas to be deployed near humans, but we believe it’s important for it to be able to work side by side with humans,” says Malchano.
Malchano believes those who have worked with computer-aided design (CAD) and finite element analysis (FEA) programs will find robotic training much easier. “The interface we’ve created to work with Spot and the UI [user interface] we intend to have on Atlas will feel very natural to them. At the end of the day, the robot needs geometric information to figure out a path or plan through a space, to know where to stop and look,” he says.
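To show why geometric information matters for planning a path through a space, here is a minimal sketch that finds a shortest route across a small occupancy grid using breadth-first search. This is a generic textbook technique, not the planner used by Spot or Atlas, and the grid, the function name, and the start and goal cells are hypothetical.

```python
from collections import deque


def shortest_path(grid, start, goal):
    """Breadth-first search on a 4-connected grid.

    grid: list of rows, where 0 is free floor and 1 is an obstacle
    start, goal: (row, col) tuples
    Returns the list of cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    came_from = {start: None}  # maps each visited cell to its predecessor
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the predecessor chain back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr][nc] == 0 and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None


# Hypothetical floor plan: a wall with a single doorway in the middle row.
floor_plan = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
route = shortest_path(floor_plan, start=(0, 0), goal=(2, 0))
print(route)
```

Without the map of walls and openings, the robot has nothing to search over, which is the sense in which path planning depends on geometry.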
During his keynote at CES 2026, NVIDIA CEO Jensen Huang discussed the emergence of reasoning AI, which promises to cut down the time and workload required to train robots. “We no longer have to train an AI model to know everything on day one, just as we don’t have to know everything on day one; we should be able to, in every circumstance, reason about how to solve that problem,” he said.
Amit Goel, head of robotics & edge AI ecosystem, NVIDIA, points out most robots will be pretrained with basic navigation skills and ready to perform basic tasks. “Things like moving through space, avoiding obstacles, and picking up objects—the robots can learn these from internet scale data,” he says. He referred to the publicly available text, video, and demonstration data showing how such basic tasks are performed.
NVIDIA GR00T, an AI platform for humanoid robot development, promises to let robots learn by observing tasks performed by a human trainer wearing augmented reality gear. Image courtesy of NVIDIA.
NVIDIA Cosmos, a platform of open world-foundation models—more specifically Cosmos Reason—gives robots a basic understanding of the laws of physics, such as walls and barriers that cannot be passed through.
When AI was merely a system of perception, it could recognize an object as a box, calculate its location from XYZ coordinates, and estimate its size. “It turns out that’s not enough when you have to interact with the world,” says Goel. “The robot needs to understand if it’s operating in a pharmaceutical environment, manufacturing plant, a logistics warehouse, or someone’s kitchen. It needs to know if the content is fragile to select the appropriate approach and force to avoid crushing it. If the robot has a reasoning AI model, when it encounters a scenario not covered in the training data set, it can still take safe, coherent actions by thinking about the context.”
At CES, NVIDIA announced Alpamayo, a family of open-source AI models and tools for autonomous vehicle (AV) development. According to the announcement, “The Alpamayo family introduces chain-of-thought, reasoning-based vision language action (VLA) models that bring human-like thinking to AV decision making.”
Much of the robot’s reasoning logic, if you will, is an outcome of the large vision language models (VLMs) it is trained on. NVIDIA’s latest VLA model, GR00T N1.6, is built on NVIDIA Cosmos VLMs trained on millions of hours of data, Goel explains. “But converting this generalist model to a domain-specific expert requires some supervised fine-tuning on domain specific data,” he says.
That means the training data needs some annotation to explicitly instruct the AI model, and by extension the robot it powers, what to do in certain situations. “At the moment, annotation is done with AI that generates annotations with human experts in the loop,” says Goel.
For domain-specific robotic training, data from instruction manuals, digital twin simulations, overhead camera footage of the environment, and human experts executing similar tasks can be used, Goel points out.
“Another important source of data is teleoperation data from the robot itself, collected by the robot’s sensors while operated remotely by a human,” he adds. To make the models resilient to variations, reinforcement learning (learning by practice) is often used.
“Advancements in simulation, digital twins and motion control are everywhere in the robotics world,” says Tim Culverhouse, editorial director of Robotics 24/7. “Organizations are pushing boundaries in the digital world to transform these teaching techniques for robots in the real world. We’re seeing robots ‘learn’ new things every day from these innovations, and it’s expanding the capabilities of these autonomous platforms across industries.”
Siemens provides robot developers and integrators with industrial AI applications for robots, ready for integration with Siemens’ industrial automation platform. “Our AI vision software uses 3D cameras with RGB and depth channels as input,” Christopher Schütte, senior business development and product portfolio manager, Siemens Factory Automation, explains. “Analyzing the RGB channel with our artificial neural networks is especially valuable for solving difficult problems, such as tight packing densities, heavily overlapping items, and understanding object dimensions and orientation. We have evaluated Time-of-Flight, LiDAR, and stereo vision technologies and have achieved high reliability with 3D cameras that use structured-light technology.”
Boston Dynamics’ quadruped robot Spot has been shown to operate alongside humans in public safety functions. Image courtesy of Boston Dynamics.
Siemens’ factory automation ambition is to shape the future of automation with robotics as an integral part of it. As Schütte sees it, computer vision and machine learning are currently demonstrating their potential to address challenges that rule-based automation cannot solve.
“If your goal today is productivity and throughput, use-case-specific, trained Physical AI approaches still outperform ‘one-size-fits-all’ generalized, multimodal robot foundation models in terms of reliability and speed,” Schütte notes.
For training robotics AI vision products, Schütte emphasizes that “high-quality real-world training data is mission critical.” A core challenge in this field is the lack of suitable data. Although large language models such as GPT can leverage vast amounts of internet data, access to robot-cell training data is highly fragmented and restricted. This creates a dilemma, he points out.
“Generating training data in virtual environments is fast and cost-effective, but a significant gap between simulation and the real world—the Sim-to-Real gap—still exists,” says Schütte. “One promising approach is the use of immersive, physics-based 3D environments such as NVIDIA Omniverse.”
Siemens has a partnership with NVIDIA, dating back to 2022, when Siemens decided to integrate NVIDIA Omniverse-powered visualization features and digital twin functions into the Siemens Xcelerator portfolio. At CES 2026, Siemens CEO Roland Busch and NVIDIA CEO Huang appeared together on stage—a gesture that marks “a significant expansion of their strategic partnership to bring artificial intelligence into the real world,” according to the press release.
“Together, we are building the Industrial AI operating system—redefining how the physical world is designed, built, and run—to scale AI and create real-world impact,” says Busch.
The ultimate goal for manufacturing, Schütte says, is “intent-based automation, where you can just describe the task to the robot in natural language, and the AI agent will figure out how to solve the problem and perform the task.” But he adds, “We’re not there yet.”

Kenneth Wong is Digital Engineering's resident blogger and senior editor. Email him at [email protected] or share your thoughts or suggestions at digitaleng.news/facebook.