Johns Hopkins APL Trains AI to Adapt Through Video Games

A researcher can use Meta Arcade to control the sizes, speeds and colors of everything in the game, or even create new ones. The ease of modifying games allows researchers to focus on a specific capability of an algorithm and measure how well an AI agent can handle those changes.

Credit: Johns Hopkins APL/Kai Stone, Tom Wach

The more artificial intelligence agents are deployed in operational scenarios, the more reliably and quickly they will have to navigate unpredictable environments. Researchers at the Johns Hopkins Applied Physics Laboratory (APL) in Laurel, Maryland, have created Meta Arcade, a suite of arcade games that can be configured and used as training tasks for artificial intelligence systems. Initially developed under a Defense Advanced Research Projects Agency (DARPA) program, Meta Arcade trains AI agents to quickly adapt to new and changing scenarios.

Arcade Games for Critical AI Research

The games in Meta Arcade are modeled on classics like Pong and Breakout, common benchmarks in the deep reinforcement learning (DRL) community, whose experts focus on ways to improve how AI systems train and learn. Unlike a typical game, where settings and features are fixed, a researcher can use Meta Arcade to control the sizes, speeds and colors of game entities, or even create new games. The ease of modifying games through Meta Arcade allows researchers to focus on an algorithm's specific capability and measure how well an AI agent can handle changes.
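The idea of changing one game parameter and measuring how an agent copes can be sketched in a few lines of Python. This is a toy stand-in, not Meta Arcade's actual interface: the `GameConfig` fields, the falling-ball game and the fixed chase-the-ball policy are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GameConfig:
    # Hypothetical parameters; not Meta Arcade's real configuration schema.
    field_width: float = 10.0
    field_height: float = 10.0
    paddle_width: float = 2.0
    paddle_speed: float = 1.0  # horizontal distance the paddle can move per step
    ball_speed: float = 1.0    # vertical distance the ball falls per step


def play_episode(cfg, rng):
    """Return True if a simple chase-the-ball policy catches one falling ball."""
    ball_x = rng.uniform(0.0, cfg.field_width)
    paddle_x = cfg.field_width / 2
    height = cfg.field_height
    while height > 0:
        # Fixed policy: step toward the ball, limited by the paddle speed.
        step = max(-cfg.paddle_speed, min(cfg.paddle_speed, ball_x - paddle_x))
        paddle_x += step
        height -= cfg.ball_speed
    # The ball is caught if it lands within the paddle's width.
    return abs(ball_x - paddle_x) <= cfg.paddle_width / 2


def catch_rate(cfg, episodes=1000, seed=0):
    """Fraction of episodes in which the policy catches the ball."""
    rng = random.Random(seed)
    return sum(play_episode(cfg, rng) for _ in range(episodes)) / episodes


base = GameConfig()
for speed in (1.0, 4.0, 8.0):
    # Change exactly one setting and re-measure the same policy.
    variant = replace(base, ball_speed=speed)
    print(f"ball_speed={speed}: catch rate {catch_rate(variant):.2f}")
```

Because only `ball_speed` differs between runs, any drop in catch rate can be attributed to that single change, which is the kind of controlled comparison the paragraph above describes.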

The core team behind Meta Arcade includes DRL researcher Ted Staley, AI engineer Chace Ashcraft and researcher Ben Stoler, all from APL's Research and Exploratory Development Department (REDD). The tool is publicly available through an online software development platform, and the team hopes it sparks conversation about other potential tools the DRL community is currently missing. Meta Arcade was also shared at the Conference and Workshop on Neural Information Processing Systems.

"We needed to develop a tool like Meta Arcade to study and advance our AI research," said Bart Paulhamus, chief of APL's Intelligent Systems Center (ISC), which supported the development of Meta Arcade. "By releasing it to the public, APL is accelerating the development of trusted AI for our nation's most critical challenges. Now, AI researchers can focus their time on AI research, not tool development."

Pushing the State of the Art

When it comes to DRL training, an agent is given the freedom to play a game repeatedly, making and learning from its own decisions. Each time the agent makes a decision, it is given a signal that describes how successful it was. Those signals allow the agent to learn through trial and error: Strategies that seem to produce positive signals are reinforced, and behaviors that lead to bad outcomes are used less and less, explained Staley.
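The trial-and-error loop Staley describes can be illustrated with a much simpler learner than a deep network. The sketch below is a tabular, epsilon-greedy agent on a two-choice toy problem, not deep reinforcement learning and not anything taken from Meta Arcade; the function name `train_bandit`, the success probabilities and the exploration rate are all assumed for the example.

```python
import random


def train_bandit(success_probs, episodes=5000, epsilon=0.1, seed=0):
    """Learn which action pays off most often, purely by trial and error."""
    rng = random.Random(seed)
    values = [0.0] * len(success_probs)  # running estimate of each action's reward
    counts = [0] * len(success_probs)    # how many times each action was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            # Explore: occasionally try a random action.
            action = rng.randrange(len(success_probs))
        else:
            # Exploit: pick the action that currently looks best.
            action = max(range(len(values)), key=values.__getitem__)
        # The reward is the signal describing how successful the decision was.
        reward = 1.0 if rng.random() < success_probs[action] else 0.0
        counts[action] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        values[action] += (reward - values[action]) / counts[action]
    return values, counts


# Action 1 succeeds far more often than action 0 (assumed probabilities).
values, counts = train_bandit([0.2, 0.8])
print("estimated values:", [round(v, 2) for v in values])
print("times chosen:", counts)
```

Over many plays the action that tends to produce positive signals is chosen more and more often, while the weaker action is tried mainly during occasional exploration.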

But how well can the agent solve circular mazes? Or mazes displayed in a different color? Additionally, what tools and methods can be used to train an agent that solves mazes in general? Should it be shown many colors and then different maze types? Or maybe the other way around?

"Those are difficult questions to answer because the expertise of the agent is entirely measured against the training problem itself," Staley said. "What my colleagues and I realized when studying these topics is that we rarely have the tools to properly ask these research questions. That's what prompted Meta Arcade."

The tool's name reflects its objective: Meta Arcade not only allows researchers to train AI agents through gaming but also prompts researchers to evaluate the games themselves. By creating new gaming environments through Meta Arcade, researchers can create problems and therefore benchmarks to evaluate algorithm performance, Ashcraft explained. This enables DRL researchers to create rich problem sets and compare one algorithm's problem-solving capabilities to those of another.

"The value in creating new environments and setting new benchmarks," Ashcraft said, "is that it helps us push the state of the art."

The "Fruit Fly" for Lifelong Learning Research

Genetic research on fruit flies set the path for research on more complex organisms, and AI techniques developed for chess playing were foundational to solving problems like data mining and molecular dynamics, explained Mike Wolmetz, who manages APL's Human and Machine Intelligence program.

So, similar to how computer chess was once called the "fruit fly of AI," Wolmetz said Meta Arcade is the fruit fly for lifelong machine-learning research: a critical mechanism through which more complex problems can be solved.

"Meta Arcade is helping the Lab solve problems related to agent adaptability, including maritime overhead imagery recognition and missile defense in unpredictable contexts," he said.

Meta Arcade was developed with support from an APL team that includes Wolmetz as well as DARPA project manager and technical lead Gautam Vallabha, robotics software engineer Kapil Katyal, electrical engineer Chris Ratto and AI researcher Cash Costello.

Meta Arcade Applied

In work funded by the Office of Naval Research (ONR), APL researchers are using Meta Arcade to study strategies for producing agents steeled for perception and task changes.

Jared Markowitz, an AI researcher in REDD who leads the ONR-funded project, said that insights gained from the arcade's testing environments are being used to produce more versatile maritime platform defense agents capable of handling different fleet geometries, threat types and countermeasures. "Meta Arcade is also helping to refine algorithms that can classify overhead ocean imagery collected under variable viewing conditions," he noted.

Tamim Sookoor, a computer scientist in APL's Asymmetric Operations Sector, and former staff member Christina Selby applied Meta Arcade while leading a project for Johns Hopkins University. The project, Runtime Assurance of Distributed Intelligent Control Systems, sought to understand and predict how DRL models will fail in a given scenario. Meta Arcade enabled the team to observe and quantify a DRL model's uncertainty with respect to specific changes in the environment, like game background color and ball speed.

As APL's sponsors look to deploy AI agents in unpredictable real-world environments, the Lab's DRL community will continue to develop intelligent agents with the ability to quickly and reliably adapt their strategies to changing conditions in the field, according to Ratto, who leads the ISC's Artificial Intelligence Group.

"Meta Arcade will challenge the larger AI research community to develop better tools that improve AI robustness and strengthen trust in an AI agent's decision-making," he said.