03.04.2026
15:02
News

New AI system lets robots understand and act on human commands in real time

Robots can now turn plain language into real-world actions using a new framework that connects AI models with control software.

Researchers from Huawei Noah’s Ark Lab, Technical University of Darmstadt, and ETH Zurich have developed a system that links large language models with the Robot Operating System, or ROS, enabling machines to understand instructions and execute them in physical environments.

The framework allows robots to process written commands and convert them into step-by-step actions.

This could help machines operate more effectively in homes, workplaces, and public spaces where human instructions vary widely.

“Autonomous robots capable of turning natural-language instructions into reliable physical actions remain a central challenge in artificial intelligence,” wrote Christopher E. Mower and his colleagues.

“We show that connecting a large language model agent to the ROS enables a versatile framework for embodied intelligence, and we release the complete implementation as freely available open-source code.”

The system works by breaking down instructions into smaller executable steps. For instance, a command like “pick up the green block and place it on the black shelf” is translated into a sequence of actions that a robot can carry out using ROS.
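A decomposition like that can be pictured as a mapping from one command string to an ordered list of atomic skills. The sketch below is purely illustrative: the skill names (`move_to`, `grasp`, `release`) and the hard-coded mapping are assumptions, not the framework's actual API; in the real system the language model produces the plan.

```python
# Hypothetical sketch: decomposing a language command into atomic robot skills.
# Skill names (move_to, grasp, release) are illustrative, not the framework's API.

def plan_from_command(command: str) -> list[tuple[str, str]]:
    """Map a pick-and-place instruction to an ordered skill sequence.

    In the real framework an LLM produces this plan; here the mapping
    is hard-coded for one command to show the target structure.
    """
    if "green block" in command and "black shelf" in command:
        return [
            ("move_to", "green_block"),   # approach the object
            ("grasp", "green_block"),     # close the gripper
            ("move_to", "black_shelf"),   # carry it to the goal
            ("release", "green_block"),   # open the gripper
        ]
    raise ValueError("no plan available for this command")

plan = plan_from_command("pick up the green block and place it on the black shelf")
for skill, target in plan:
    print(f"{skill}({target})")
```

Each tuple in the plan would then be dispatched to ROS as a concrete motion or gripper action.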

Turning words into actions

The framework combines the reasoning ability of large language models with ROS, a widely used open-source platform for robot control.

This integration allows robots to interpret instructions and decide how to act without requiring manual programming for every task.

“The agent automatically translates large language model outputs into robot actions, supports interchangeable execution modes (inline code or behavior trees), learns new atomic skills via imitation, and continually refines them through automated optimization and reflection from human or environmental feedback,” the authors wrote.

The system supports two execution approaches. In one, the model generates small snippets of code that directly control the robot.
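The inline-code mode can be sketched as follows. Everything here is an assumption for illustration: the whitelisted skill functions and the generated snippet stand in for whatever interface the framework actually exposes to the model.

```python
# Hypothetical sketch of the "inline code" execution mode: the agent runs a
# short, model-generated snippet against a small whitelisted robot API.
# Both the API and the snippet are illustrative, not the framework's interface.

log: list[str] = []

def move_to(target: str) -> None:
    log.append(f"move_to:{target}")   # stand-in for a ROS motion command

def grasp(target: str) -> None:
    log.append(f"grasp:{target}")     # stand-in for a ROS gripper command

# In the real system this string would come from the language model.
generated_snippet = "move_to('green_block')\ngrasp('green_block')"

# Execute the snippet with only the whitelisted skills in scope.
exec(generated_snippet, {"move_to": move_to, "grasp": grasp})
print(log)
```

Restricting the execution namespace to a few known skills is one common way to keep model-generated code from touching anything outside the robot's intended control surface.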

In the other, it builds structured decision paths known as behavior trees, which help robots adapt if a step fails.
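The fallback behavior a tree provides can be shown with a minimal sketch. The node types below mirror common behavior-tree conventions (a sequence succeeds only if all children succeed; a fallback tries children until one succeeds), but the code is an assumption for illustration, not the framework's implementation.

```python
# Hypothetical behavior-tree sketch: a fallback node recovers when a step
# fails. Node semantics follow common behavior-tree conventions; the skills
# and failure outcomes are illustrative.

from typing import Callable

def sequence(*children: Callable[[], bool]) -> Callable[[], bool]:
    # Succeeds only if every child succeeds, in order.
    return lambda: all(child() for child in children)

def fallback(*children: Callable[[], bool]) -> Callable[[], bool]:
    # Tries children left to right; succeeds on the first success.
    return lambda: any(child() for child in children)

attempts: list[str] = []

def action(name: str, ok: bool) -> Callable[[], bool]:
    def run() -> bool:
        attempts.append(name)  # record which skills were tried
        return ok
    return run

# Grasping from the top fails, so the tree falls back to a side grasp
# and the overall task still succeeds.
tree = sequence(
    action("move_to_block", True),
    fallback(action("grasp_top", False), action("grasp_side", True)),
)
result = tree()
print(result)  # True
```

Because recovery lives in the tree structure itself, a failed step does not require the model to replan the whole task.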

This dual approach improves flexibility, allowing robots to handle both simple and complex tasks while adjusting to changing conditions.

Real-world task testing

Researchers tested the framework across multiple robotic systems performing real-world tasks. The results showed that robots could reliably interpret instructions and complete assigned actions across different scenarios.

“Extensive experiments validate the framework, showcasing robustness, scalability and versatility in diverse scenarios and embodiments, including long-horizon tasks, tabletop rearrangements, dynamic task optimization and remote supervisory control,” the authors wrote.

“Moreover, all the results presented in this work were achieved by utilizing open-source pretrained large language models.”

The system also enables robots to learn from feedback and refine their actions over time, improving performance without extensive reprogramming.
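That refine-from-feedback loop can be sketched in miniature. The skill, its tunable parameter, and the success signal below are all invented for illustration; the paper's actual optimization and reflection machinery is far richer.

```python
# Hypothetical sketch of refinement from feedback: a skill parameter is
# nudged after each failed attempt until the environment reports success.
# The skill, parameter, and feedback signal are all illustrative.

def attempt_grasp(grip_force: float) -> bool:
    # Stand-in for environment feedback: the grasp only holds
    # once the force crosses a (hidden) threshold.
    return grip_force >= 0.6

def refine_skill(initial_force: float, step: float = 0.1, max_tries: int = 10) -> float:
    force = initial_force
    for _ in range(max_tries):
        if attempt_grasp(force):
            return force          # feedback says the skill now works
        force += step             # adjust in response to the failure
    raise RuntimeError("skill did not converge")

tuned = refine_skill(0.3)
print(round(tuned, 1))
```

The point is that the adjustment happens in the loop, driven by outcomes, rather than by a human editing the skill's code.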

By linking language understanding with physical execution, the framework could accelerate the deployment of robots in dynamic environments where adaptability is critical.