
Johns Hopkins APL's Cyclone Aims to Improve Collaborative Human-Machine Decision-Making

As human-machine teaming becomes prevalent, collaborative tasks between people and computers have grown more consequential. The elevated stakes associated with performing a medical diagnosis or operating missiles in battle call for improved teamwork and decision-making between human and machine. Artificial intelligence (AI) researchers at the Johns Hopkins Applied Physics Laboratory (APL) in Laurel, Maryland, developed an AI agent that aims to address this need.

Making AI More Human

Nick Kantack, an AI researcher and software developer at APL, created one such AI agent, called Cyclone, and trained it to play the cooperative card game Hanabi through a unique learning process. Kantack moved away from the standard self-play approach to training AI, in which agents play with copies of themselves, and instead had Cyclone learn by playing with virtual copies of human players.

"I adopted this strategy: If Cyclone can play like a human, it'll probably play well with humans," Kantack said. "In doing so, I hoped to improve collaboration between human and machine teammates in Hanabi."

Hanabi players work together toward a single goal: organizing a random draw of numbered, colored cards into five separate lines of five cards each, grouped by color and in ascending order. The catch is that players cannot look at their own cards and must glean information about their hand from limited clues given by other players.
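The rule above can be sketched in a few lines of Python. This is an illustration of the game's objective only, not APL's code; the function and variable names are hypothetical.

```python
# Minimal sketch of the Hanabi objective: build five piles, one per color,
# each ascending 1 through 5. A card is playable only if it is the next
# rank needed on its color's pile. (Illustrative only; names are hypothetical.)

COLORS = ["red", "yellow", "green", "blue", "white"]

def is_playable(card, fireworks):
    """card is a (color, rank) pair; fireworks maps color -> highest rank already played (0 if none)."""
    color, rank = card
    return fireworks[color] == rank - 1

fireworks = {c: 0 for c in COLORS}          # empty table: no cards played yet
print(is_playable(("red", 1), fireworks))   # a 1 starts its color's pile -> True
print(is_playable(("red", 2), fireworks))   # a 2 needs the 1 first -> False
```

Because no player can see their own hand, deciding whether a card passes this check is exactly where the clue-giving and teamwork come in.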

The game is an effective platform for studying how to improve cooperation between AI systems and human operators in the field. For that reason, the Laboratory's Intelligent Systems Center (ISC) challenged staff members to develop AI agents that played well with humans. The ISC serves as APL's focal point for research and development in AI, robotics and neuroscience, as the Laboratory seeks to fundamentally advance the use of intelligent systems for the nation's critical challenges.

Four AI agents were developed by different APL teams as part of the ISC challenge, and those agents played 40 games of Hanabi with a human team member, an arrangement known as hybrid-play. Notably, Cyclone's average score was higher than the score achieved by human-only teams and higher than that of its nearest hybrid-play competitor, earning it the challenge win.

"Games and competitions are a great way to push the boundaries of AI research. APL, and the ISC in particular, focuses on creating competitions where the underlying AI breakthroughs could apply to real-world applications for our sponsors," explained ISC Chief Bart Paulhamus. "We were attracted to the Hanabi competition because human-AI collaboration is a major area of interest across most, if not all, of APL's mission areas."

Modeling and Optimizing Human Decisions

Cyclone is directed to pay attention to what Kantack calls human-preferred factors: elements that humans weigh when deciding whether to play or discard a card. For instance, when Kantack plays Hanabi, he prioritizes gathering information tokens, which allow a player to give clues to other team members. Kantack listed all his preferred factors and assigned each a numerical value quantifying its importance to him. He then fed that list and those values to Cyclone, which used Kantack's self-identified factors and values to create a virtual copy of Kantack as a Hanabi player.
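A factor-and-weight scheme like the one described can be sketched as a weighted scoring model: each candidate move gets a score that is the weighted sum of its factor values, and the highest-scoring move wins. This is a minimal illustration under that assumption; the factor names, weights and Cyclone's actual internals are hypothetical, not published details.

```python
# Sketch of a "human-preferred factor" model: score each candidate move as a
# weighted sum of hand-picked factors, then choose the highest-scoring move.
# Factor names and weights are illustrative, not Cyclone's actual parameters.

def score_move(move_factors, weights):
    """move_factors and weights both map factor name -> numeric value."""
    return sum(weights[name] * value for name, value in move_factors.items())

def choose_move(candidate_moves, weights):
    """candidate_moves maps move name -> that move's factor values."""
    return max(candidate_moves, key=lambda m: score_move(candidate_moves[m], weights))

# A player who prizes information tokens might weight them heavily:
weights = {"gains_info_token": 0.8, "risk_of_misplay": -1.0, "advances_pile": 0.6}

candidates = {
    "discard_oldest": {"gains_info_token": 1.0, "risk_of_misplay": 0.0, "advances_pile": 0.0},
    "play_newest":    {"gains_info_token": 0.0, "risk_of_misplay": 0.7, "advances_pile": 1.0},
}
print(choose_move(candidates, weights))   # -> discard_oldest (0.8 beats -0.1)
```

Quantifying each factor's importance, as Kantack did, amounts to choosing the entries of `weights`.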

Kantack then directed Cyclone to analyze a database comprising 500 of his Hanabi moves. After analyzing this database, Cyclone created another virtual copy of Kantack. To create this second, more accurate model of Kantack's playing style, Cyclone had to adjust Kantack's self-reported values.
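Adjusting self-reported values against a move log can be thought of as fitting the weights so the model agrees with the human's actual choices as often as possible. The sketch below uses a simple greedy coordinate search over a weighted-sum move model; this is an assumed illustration of the idea, not APL's actual fitting procedure, and all names are hypothetical.

```python
# Fit factor weights to a log of (candidate_moves, move_the_human_chose) pairs
# by nudging each weight and keeping changes that raise agreement with the log.
# Illustrative only; not Cyclone's actual training procedure.

def model_move(candidates, weights):
    """Weighted-sum move choice: candidates maps move -> factor values."""
    return max(candidates, key=lambda m: sum(weights[f] * v for f, v in candidates[m].items()))

def agreement(weights, game_log):
    """Fraction of logged positions where the model picks the human's actual move."""
    hits = sum(1 for candidates, human_move in game_log
               if model_move(candidates, weights) == human_move)
    return hits / len(game_log)

def refit(weights, game_log, step=0.1, rounds=20):
    """Greedy coordinate search: nudge each weight up or down whenever that raises agreement."""
    best = dict(weights)
    for _ in range(rounds):
        for name in list(best):
            for delta in (step, -step):
                trial = dict(best, **{name: best[name] + delta})
                if agreement(trial, game_log) > agreement(best, game_log):
                    best = trial
    return best
```

A fit like this can land on weights quite different from the ones the player reported, which is the "shocking result" Kantack describes next.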

"This was a shocking result, because it meant that Cyclone was gaining insight into my play style that I didn't even have," Kantack said.

Once Kantack saw that Cyclone was able to model his decision-making with 70% accuracy, he directed the agent to create more copies. "Cyclone played just over 500,000 games with copies of my virtual self, exploring play styles that led to higher scores," he said.
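Exploring play styles for higher scores can be sketched as hill climbing over the model's weights, where each candidate weight vector is evaluated by its average score in simulated games. The search loop below is a generic illustration of that idea, with the game simulator left as a caller-supplied function; nothing here is Cyclone's published method, and all names are hypothetical.

```python
# Hill-climbing sketch: perturb one weight at a time and keep any change that
# raises the average score reported by a game simulator. In the real setting,
# avg_score would run many Hanabi games with the virtual-human teammates;
# here it is any callable mapping a weight dict to a number.
import random

def explore_play_styles(weights, avg_score, trials=500, step=0.1, seed=0):
    rng = random.Random(seed)
    best = dict(weights)
    best_score = avg_score(best)
    for _ in range(trials):
        name = rng.choice(list(best))
        trial = dict(best, **{name: best[name] + rng.uniform(-step, step)})
        trial_score = avg_score(trial)
        if trial_score > best_score:       # keep only improvements
            best, best_score = trial, trial_score
    return best
```

Because the loop only accepts improvements, the returned weights never score worse than the starting ones, no matter how noisy the simulator is.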

Instructive AI

After promising results from the ISC challenge, Kantack sought to further Cyclone's human-machine teaming potential. He partnered with Nina Cohen, a machine learning engineer and data scientist who serves as co-principal investigator alongside Kantack on a project funded by APL's Research and Exploratory Development Mission Area.

"I partnered with Nick because Cyclone has potential to break new ground in the space of explainable AI," Cohen said. "By providing instruction, instead of a suggestion or correction, it can become the ultimate teammate to a human."

Cohen is the primary developer of the user interface the team is employing to test Cyclone in hybrid-play. Other members include Nathan Bos, Timothy Endres, James Everett and Corey Lowman. In the team's experiments, what primarily separated instruction from suggestion or correction was the ability to discuss human reasoning rather than just human action.

"For the instruction group, Cyclone was trying to get inside its human teammate's head and then tell the human how to think better about Hanabi moves. When giving suggestions or corrections, Cyclone just told the human what to do," Kantack explained.

For example, a suggestion read, "The AI recommends that you play your right-most card," whereas an instruction read, "The AI thinks the team would be more successful if you paid closer attention to opportunities to discard."

By the end of the experiments, the team found that although all groups showed some improvement (suggesting humans tend to improve their decision-making over time), the instruction group improved its decision-making significantly more than all others, even though test subjects requested this type of feedback least often.

Kantack and Cohen also expanded Cyclone's modeling abilities by feeding it the moves of 10,000 Hanabi players, which were pulled from online Hanabi game logs. Cyclone was able to identify 12 events that could impact the final score of the game and then predict how a single move might affect the likelihood of those 12 turning points. This expanded database not only enabled the team to categorize play styles but also significantly improved Cyclone's ability to play well with humans.
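One simple way to report how a move affects such turning points is to compare each event's estimated likelihood before and after the move and rank the events by the size of the shift. The sketch below illustrates only that reporting step; the article does not list the 12 actual events, so the event names and probabilities here are made up for illustration.

```python
# Rank turning-point events by how much a candidate move shifts their
# estimated likelihood. before/after map event name -> probability.
# Event names and numbers are illustrative, not the 12 events Cyclone found.

def turning_point_impact(before, after):
    shifts = ((event, after[event] - before[event]) for event in before)
    return sorted(shifts, key=lambda item: abs(item[1]), reverse=True)

before = {"strikeout": 0.20, "perfect_score": 0.05, "early_discard_of_a_5": 0.10}
after  = {"strikeout": 0.35, "perfect_score": 0.04, "early_discard_of_a_5": 0.10}
print(turning_point_impact(before, after)[0])   # this move mainly raises strikeout risk
```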

The team presented its findings at the Society of Photo-Optical Instrumentation Engineers (SPIE) conference in April, noting that tailored, strategy-based recommendations are what allow humans to alter and improve their decision-making. The team's findings also confirmed that when humans are told what to do after making a decision, they merely observe that decision, which typically doesn't lead to more strategic decision-making.

Speaking Like a Human

The overall trend toward improved decision-making can also be attributed to Cyclone's plain-English recommendations.

Cyclone's recommendations are captured as simple, intuitive insights. For instance, instead of saying, "Value discards by 0.65 points," Cyclone says, "Value discards a little more."
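Turning a numeric weight adjustment into a phrase like this is essentially bucketing the magnitude into verbal categories. The thresholds and wording below are illustrative assumptions, not Cyclone's actual mapping:

```python
# Translate a numeric weight adjustment into a plain-English nudge.
# Thresholds and phrasing are illustrative, not Cyclone's actual mapping.

def verbalize(factor, delta):
    direction = "more" if delta > 0 else "less"
    size = abs(delta)
    if size < 0.25:
        amount = "slightly"
    elif size < 0.75:
        amount = "a little"
    else:
        amount = "a lot"
    return f"Value {factor} {amount} {direction}."

print(verbalize("discards", 0.65))   # -> "Value discards a little more."
```

The design trade-off is deliberate: the buckets discard precision that the human could not act on anyway.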

"This kind of feedback allows the human to process Cyclone's suggestions and strategy more quickly," Cohen said.

"While this might feel like we're being less precise, evidence from our experiments suggests that this approach is more effective at improving the skills of the human players," Kantack added.

Furthering Human Subject Research

This summer, the team will expand its human subject research, testing Cyclone's ability to model, analyze and improve the Hanabi play style of interns participating in the Lab's CIRCUIT (Cohort-based Integrated Research Community for Undergraduate Innovation and Trailblazing) program. Looking ahead, the team hopes that with further funding and development, it can position Cyclone to meet needs across the nation's most critical missions.

"We want to refine instructive AI for high-stakes missions," Kantack said. "But the idea is to always have the instructive AI agent and human working together so that the team makes fewer errors than either individual."

"Over the last decade, we've seen a steady flow of breakthroughs in AI research (in game playing, computer vision, autonomy, human language technologies, scientific discovery and beyond), all isolated from the users, analysts and operators," said Mike Wolmetz, APL's program manager for Human and Machine Intelligence. "Only through human-centered AI research like this can we realize the full potential of human and machine intelligence."