How Eidos-Montréal created Grid Sensors to improve observations for training agents

Within Eidos Labs, several projects use machine learning. The Automated Game Testing project tackles the problem of testing the functionality of expansive AAA games by modeling player behavior with agents that learn their behavior through reinforcement learning (RL). In this blog post, we’ll describe how the team at Eidos Labs created the Grid Sensor within the Unity Machine Learning Agents Toolkit (ML-Agents) to better represent the game for machine learning, improving training times and ultimately leading to less expensive models.

Eidos-Montréal

Founded in 2007, Eidos-Montréal set out to rejuvenate classic Eidos Interactive franchises, starting with Deus Ex: Human Revolution. In 2009, the studio was acquired by Square Enix. Fast-forward to 2020: Eidos-Montréal now has about 500 employees working on both games and research projects. The studio recently announced the opening of Eidos-Sherbrooke, a regional chapter that houses Eidos Labs, a cutting-edge team dedicated to driving the technological innovation of Eidos-Montréal.

Automated Game Testing using reinforcement learning

There have been many achievements in applying reinforcement learning to create AI systems that play games at human to superhuman levels, such as StarCraft, Dota 2, and the Atari 2600 suite. However, one of the largest challenges for game developers is the compute time and cost required to train these models. Exacerbating this challenge is the sometimes impromptu nature of AAA development: developers often add features or update textures and animations, rapidly and dramatically changing a game in its early phases, when testing is most needed. Within the Eidos Labs team, finding a middle ground between model expressiveness and training speed, while remaining independent of ever-changing game visuals, is one of the core goals of the Automated Game Testing project.

To drive innovation and progress, Eidos Labs partnered with Matsuko, a deep tech company focused on AI and 3D, in the development of the Automated Game Testing project, which led to the creation of the Grid Sensor. The team also leveraged the Unity ML-Agents Toolkit for prototyping. The core team consisted of:

Defining observations in RL and the Unity ML-Agents Toolkit

The ability of an agent to observe its environment is a key concept in reinforcement learning. After an agent takes an action based on its policy (which defines how the agent should behave at any given time), it observes the new state of the environment and receives a reward signal indicating whether that action moved it closer to its goal. Although rewards and actions are solid levers for improving an RL policy, the representation of observations can also significantly affect the agent’s behavior, especially since game engines can expose richer and more varied observations than real-world sensors can offer.
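
To make this loop concrete, here is a minimal sketch of an agent written against ML-Agents’ C# API. The PatrolAgent class, its goal field, and the reward values are illustrative assumptions; only the Agent base class and its overridable methods come from the toolkit:

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Actuators;
using Unity.MLAgents.Sensors;
using UnityEngine;

// A minimal agent: each step it reports observations, receives an action
// chosen by its policy, and accumulates reward based on the resulting state.
public class PatrolAgent : Agent
{
    [SerializeField] Transform goal; // hypothetical target the agent seeks

    public override void CollectObservations(VectorSensor sensor)
    {
        // Observations: the agent's position and the offset to the goal.
        sensor.AddObservation(transform.localPosition);
        sensor.AddObservation(goal.localPosition - transform.localPosition);
    }

    public override void OnActionReceived(ActionBuffers actions)
    {
        // The policy outputs two continuous values, interpreted as movement.
        var move = new Vector3(actions.ContinuousActions[0], 0f,
                               actions.ContinuousActions[1]);
        transform.localPosition += move * Time.deltaTime;

        // Reward shaping: small penalty per step, bonus for reaching the goal.
        AddReward(-0.001f);
        if (Vector3.Distance(transform.localPosition, goal.localPosition) < 1f)
        {
            AddReward(1f);
            EndEpisode();
        }
    }
}
```

During training, the toolkit repeatedly calls CollectObservations to gather the agent’s view of the world, passes the policy’s chosen action to OnActionReceived, and uses the accumulated reward to update the policy.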

In ML-Agents, sensors are the main mechanism for representing observations, both while training and while executing models. In addition to a general sensor interface, ML-Agents provides two built-in sensor types for generating the observations used to train an RL model. The first is the raycast, which lets an agent observe and collect data about a GameObject along a line of sight. The developer can send not only the distance from the agent to the GameObject but also a reference to it, allowing the agent to look up other data points such as the GameObject’s health or whether it is friend or foe.
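
As an illustrative sketch of this pattern (not the toolkit’s own implementation), an agent can cast rays by hand inside CollectObservations and encode both the normalized hit distance and attributes looked up through the reference to the hit GameObject. The Health component and the “Enemy” tag below are hypothetical:

```csharp
using Unity.MLAgents;
using Unity.MLAgents.Sensors;
using UnityEngine;

// Hypothetical component exposing a unit's health for observation lookups.
public class Health : MonoBehaviour
{
    public float Current = 100f;
    public float Max = 100f;
}

// Builds raycast-style observations by hand: each ray reports the hit
// distance plus extra data read from the GameObject it hit.
public class RaycastObserverAgent : Agent
{
    const float RayLength = 20f;
    static readonly Vector3[] Directions =
        { Vector3.forward, Vector3.left, Vector3.right };

    public override void CollectObservations(VectorSensor sensor)
    {
        foreach (var dir in Directions)
        {
            var worldDir = transform.TransformDirection(dir);
            if (Physics.Raycast(transform.position, worldDir,
                                out RaycastHit hit, RayLength))
            {
                sensor.AddObservation(hit.distance / RayLength); // normalized distance
                // Follow the reference to the hit GameObject for richer data.
                var health = hit.collider.GetComponent<Health>();
                sensor.AddObservation(health != null ? health.Current / health.Max : 0f);
                sensor.AddObservation(hit.collider.CompareTag("Enemy") ? 1f : 0f);
            }
            else
            {
                sensor.AddObservation(1f); // nothing hit within range
                sensor.AddObservation(0f);
                sensor.AddObservation(0f);
            }
        }
    }
}
```

In practice, ML-Agents ships a RayPerceptionSensorComponent3D that implements this pattern declaratively: you list the tags to detect and the number of rays, and the component encodes the results for you.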

The second type is the camera sensor, which provides the agent with a grayscale or RGB image of the scene rendered from a camera’s point of view, typically processed by a convolutional neural network (CNN). Visual observations are expressive but costly: they are expensive to render and to train on, and they are sensitive to the texture and animation changes that are common during AAA development.
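
As a sketch of how a visual observation could be wired up from code, assuming the CameraSensorComponent property names found in recent ML-Agents releases (the resolution values are illustrative):

```csharp
using Unity.MLAgents.Sensors;
using UnityEngine;

// Attaches a camera-based visual observation to an agent at startup.
public class VisualObservationSetup : MonoBehaviour
{
    [SerializeField] Camera agentCamera; // camera rendering the agent's view

    void Awake()
    {
        var sensor = gameObject.AddComponent<CameraSensorComponent>();
        sensor.Camera = agentCamera;
        sensor.Width = 84;  // a small input keeps the CNN cheap to train
        sensor.Height = 84;
        sensor.Grayscale = false;
    }
}
```

In most projects the component is simply added and configured in the Unity Inspector rather than from code; the effect is the same.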
