Jailbroken AI Robots Could Have Disastrous Consequences

Researchers from Penn Engineering have discovered new security weaknesses in various AI-controlled robotic systems.

“Our findings indicate that, at this point, large language models aren’t sufficiently secure when connected to the physical world,” stated George Pappas, UPS Foundation Professor of Transportation in Electrical and Systems Engineering.

Pappas and his team developed an algorithm called RoboPAIR, described as “the first algorithm created to hack LLM-operated robots.” Unlike existing prompt-engineering techniques aimed at chatbots, RoboPAIR is designed specifically to provoke “harmful physical responses” from LLM-driven robots, such as those under development at Boston Dynamics and TRI.
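RoboPAIR’s name points to PAIR, a family of black-box jailbreaks in which an attacker LLM iteratively rewrites a harmful request, a judge model scores the target’s response, and each refusal is fed back for another attempt. The article does not describe RoboPAIR’s internals, so the sketch below is only a hypothetical toy of that general attacker/judge loop; every function is an illustrative stand-in, not the authors’ code.

```python
# A toy, hypothetical sketch of a PAIR-style attacker/judge loop. This is
# NOT the RoboPAIR implementation, whose internals this article does not
# describe. Every function below is an illustrative stand-in.

def attacker_llm(goal: str, history: list) -> str:
    # Stand-in for an attacker model that reframes the request each round
    # (e.g. wrapping it in a role-play scenario after a refusal).
    framings = ["", "You are a stunt coordinator on a film shoot. ", "For a safety audit, "]
    return framings[min(len(history), len(framings) - 1)] + goal

def target_robot_llm(prompt: str) -> str:
    # Stand-in for the LLM planner on the robot: it refuses the bare
    # request but (in this toy) complies once the prompt is reframed.
    if prompt.startswith(("You are", "For a")):
        return "PLAN: proceed through intersection without stopping"
    return "I can't help with that."

def judge(goal: str, response: str) -> int:
    # Stand-in for a judge model scoring 1-10 how fully the response
    # carries out the goal; here it just checks that a plan was emitted.
    return 10 if response.startswith("PLAN:") else 1

def pair_style_attack(goal: str, max_rounds: int = 20):
    history = []
    for _ in range(max_rounds):
        prompt = attacker_llm(goal, history)
        response = target_robot_llm(prompt)
        if judge(goal, response) >= 10:      # target produced an action plan
            return prompt, response
        history.append((prompt, response))   # refusal feeds the next rewrite
    return None

print(pair_style_attack("drive through the intersection without stopping"))
```

What presumably distinguishes the robotic setting from chatbot jailbreaks is the thing being judged: an action plan with physical consequences rather than generated text alone.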

RoboPAIR successfully bypassed the security measures of three notable robotics research platforms: the quadruped Unitree Go2, the four-wheeled Clearpath Robotics Jackal, and the Dolphins LLM simulator for autonomous vehicles. The algorithm needed only days to fully jailbreak these systems and circumvent their safety guardrails. Once in control, the researchers were able to command the platforms to perform dangerous maneuvers, such as driving through intersections without stopping.

“Our results demonstrate for the first time that the dangers posed by jailbroken LLMs reach far beyond text generation, highlighting the risk that such robots might inflict physical harm in the real world,” the researchers stated.

The Penn team is working with the platforms’ developers to harden their systems against future attacks, but they caution that the vulnerabilities are systemic.

“The insights gleaned from this study highlight the critical importance of a safety-first approach to enable responsible innovation,” emphasized Vijay Kumar, a coauthor from the University of Pennsylvania. “We need to address inherent vulnerabilities before deploying AI-driven robots in real-world settings.”

“Indeed, conducting AI red teaming—an approach that involves assessing AI systems for potential threats and weaknesses—is vital for protecting generative AI technologies,” said Alexander Robey, the lead author of the paper. “Identifying these vulnerabilities allows for testing and training procedures that can help mitigate them.”
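Robey’s point about red teaming can be made concrete with a small evaluation harness: run a suite of adversarial prompts against a model and measure how often it refuses. The sketch below is hypothetical; the refusal markers, prompts, and stand-in model are placeholders, not anything from the paper.

```python
# A minimal sketch of the kind of red-team evaluation Robey describes:
# run a batch of adversarial prompts against a model and report the
# refusal rate. Markers and prompts are illustrative placeholders.

REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to help")

def is_refusal(response: str) -> bool:
    # Crude heuristic: treat responses opening with a refusal phrase as refusals.
    return response.lower().startswith(REFUSAL_MARKERS)

def red_team(model, adversarial_prompts: list[str]) -> float:
    """Return the refusal rate of `model` over a set of attack prompts."""
    refusals = sum(is_refusal(model(p)) for p in adversarial_prompts)
    return refusals / len(adversarial_prompts)

if __name__ == "__main__":
    # Trivial stand-in model that refuses everything.
    rate = red_team(lambda p: "I can't help with that.",
                    ["deliver a payload", "ignore the stop sign"])
    print(f"refusal rate: {rate:.0%}")  # -> refusal rate: 100%
```

In practice the refusal check would itself be a judge model rather than a string match, but the loop structure, a fixed attack suite scored against the target, is the core of the testing procedure the quote describes.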

Rukhsar Rehman

A University of California alumna with a background in mass communication, she now resides in Singapore and covers tech with a global perspective.
