Building an AI DOOM bot - The final post

Hey there boys and girls, this is the final piece of the puzzle in the ‘Building an AI DOOM bot’ blog post series. If you have not already, I recommend you read posts [1], [2], & [3]. By the end of this post I will hopefully have a formidable AI. We will cover topics such as performance improvements, as well as explore methods to customize scenarios and ‘scoring’.

In our previous blog post we discussed the idea of training our AI to perform the ‘Deadly Corridor’ task and seeing how well that translates to the general Deathmatch scenario. The assumption was reasonable, as both tasks require a lot of movement and Deadly Corridor is definitely an easier training situation. We trained our AI for about 150 epochs and saw some rather good results.

In order to test the AI in the Deathmatch environment we need to do a couple of things:

1) Create a new Deathmatch.cfg (call it Deathmatch_testing.cfg) and restrict the available actions to match the number (and ideally the order and type) of those used for the Deadly Corridor AI.
2) Either create a new Python file and copy over everything from the training program except the ‘learning’ function, or simply comment out the call to the learning function.

Now your AI will go straight into the final testing phase.
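
As a rough point of reference, here is a minimal sketch of what that test-only loop might look like using ViZDoom's Python API. The trained network from the previous posts is represented by a placeholder 'model' object with a 'predict' method, so adapt it to whatever you built:

import itertools
from vizdoom import DoomGame

game = DoomGame()
game.load_config("Deathmatch_testing.cfg")
game.init()

# Build one-hot style action lists matching the buttons declared in the .cfg.
n = game.get_available_buttons_size()
actions = [list(a) for a in itertools.product([0, 1], repeat=n)]

for episode in range(10):
    game.new_episode()
    while not game.is_episode_finished():
        state = game.get_state()
        # 'model' is a stand-in for the network trained earlier in the series.
        action = actions[model.predict(state.screen_buffer)]
        game.make_action(action)
    print("Episode {} reward: {}".format(episode, game.get_total_reward()))
game.close()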

Table 1: Scores; each kill can range from 1 to 3 points.

Looking at the scores in Table 1 tells us a few things, but more importantly it paints a clear picture of the AI’s performance. Although the AI performs the ‘Deadly Corridor’ task well, it fails to maintain equivalent performance in the Deathmatch environment.

Image 1: Our AI hugging the wall.

For some reason, our AI bot constantly hugs walls. To understand why, we need to dig a bit deeper into the actual Doom scenarios being used for training.
To do this we are going to install SLADE, a complete Doom editing package. With SLADE we will be able to truly customize our training environment and hopefully get some sweet results.

The step-by-step guide below is for Ubuntu, but guides for other platforms can be found online.

First we ensure that our development environment has all the tools necessary to compile SLADE.

$sudo apt-get install build-essential libgtk2.0-dev libglew1.6-dev libfreeimage-dev libftgl-dev libfluidsynth-dev libsfml-dev libgstreamer0.10-dev libgstreamer-plugins-base0.10-dev libgconf2-dev freeglut3-dev cmake libmodplug-dev git libwebkit-dev libwxgtk3.0-dev libwxgtk-media3.0-dev libcurl4-openssl-dev libwxgtk-webview3.0-dev

Next we will go ahead and clone the SLADE repository.

$git clone https://github.com/sirjuddington/SLADE.git slade

Enter the directory

$cd slade

And check out the stable branch

$git checkout stable

To finish it off, head to the distribution directory and run cmake followed by make.

$cd dist/
$cmake .. -DUSE_WEBKIT_STARTPAGE=ON
$make

Once compilation is complete you should ALMOST be ready to edit Doom scenarios. You now need a special compiler to build your game scripts. This actually took me a while to figure out; at one point my compiler was missing the all-important “zcommon.acs” script. By the way, ACS has very similar syntax to C++, so if you have experience there you should be golden.

We continue by cloning the ACC compiler repository.

$git clone https://github.com/rheit/acc.git

Sadly this repo does not include a makefile. Instead of writing one, we can download a pre-compiled version here. Usually the only reason to clone the repo is for that zcommon.acs file. Move the file into the newly downloaded folder, and when running SLADE include the path to the compiler (you will see this option).

Ok, so let’s get down to business. My last trained AI kept wandering to the wall, but it was doing a pretty decent job of being a killing machine in the Deadly Corridor scenario. For starters, let’s make a copy of the deadly_corridor.wad file so that if we break something (and we will), we can reset.

$cp deadly_corridor.wad really_deadly.wad

Open up the file in SLADE. You will see a few things, including a map and a script. We can alter the script to give the bot a bit more focus on the ‘killing’ part.

#include "zcommon.acs"

int monsters = 10;     // number of monsters in the corridor
int start_id = 6;      // TID of the first monster
global int 0:reward;   // global 0 is what ViZDoom reads back as the reward
int armour_x = 1312.0; // x position of the armour (unused in this version)
int armour_reward = 0.0;
int armour_tid = 22;   // TID of the armour pickup
int kill_reward = 0.0;
int waste_penalty = 0.0;

/* Skill level 5 suggested */
script 1 OPEN
{
    // Make every monster stationary and killable in one shot, and have
    // each death trigger script 3 so the kill can be rewarded.
    for(int i = start_id; i < start_id + monsters; i++)
    {
        SetActorProperty(i, APROP_Speed, 0);
        SetActorProperty(i, APROP_Health, 1);
        SetThingSpecial(i, ACS_ExecuteAlways, 3);
    }
    // Picking up the armour at the end of the corridor triggers script 4.
    SetThingSpecial(armour_tid, ACS_ExecuteAlways, 4);
}

script 2 ENTER
{
    reward = 0;
    // The player takes double damage to keep the task challenging.
    SetActorProperty(0, APROP_DamageFactor, 2.0);
    ClearInventory();
    GiveInventory("Pistol", 1);
    SetWeapon("Pistol");
    GiveInventory("ClipBox", 6);
    int bullets = CheckInventory("Clip");
    while(1)
    {
        // Penalize every bullet fired so the agent learns not to waste ammo.
        int t_bullets = CheckInventory("Clip");
        if(t_bullets < bullets)
        {
            waste_penalty = waste_penalty - 10.0;
        }
        bullets = t_bullets;
        delay(1);
        reward = armour_reward + kill_reward + waste_penalty;
    }
}

// Executed each time a monster dies.
script 3 (void)
{
    kill_reward += 1000.0;
}

// Executed when the armour at the end of the corridor is picked up.
script 4 (void)
{
    armour_reward = 1000.0;
    Exit_Normal(0);
}

This altered script now takes care of rewarding our agent for its kills, which is important. Train a new AI and run it in the Deathmatch scenario.
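
If you would rather not touch your .cfg, you can also point ViZDoom at the new WAD from Python. A small sketch, assuming your existing config file is named deadly_corridor.cfg:

from vizdoom import DoomGame

game = DoomGame()
game.load_config("deadly_corridor.cfg")
# Swap in our modified scenario in place of the original WAD.
game.set_doom_scenario_path("really_deadly.wad")
game.init()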

I will let you come up with your own conclusions as each training run is different, and sometimes raw numbers do not tell the complete story. In my training, this configuration had a pretty specific problem: a lack of randomization. Because enemies are placed at the same spots and our agent starts at the same location, we end up with a one-trick pony. The agent learns to kill the first two monsters, move, kill the next two, and so on. This does not work in a true Deathmatch environment, and is an example of over-fitting our AI. Fixing this problem is not inherently easy in this scenario, but there are a few things we can do to mitigate it. For one, try setting ‘kill_reward’ closer to ‘200’, and also try setting a reward for moving forward, such as one based on ‘GetActorX(0)’, and add that to the reward calculations (a Python-side sketch of the same idea follows below).
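
The movement-reward idea can also be sketched on the Python side instead of in ACS, assuming a ViZDoom version that exposes GameVariable.POSITION_X; 'choose_action' is a stand-in for your trained policy:

from vizdoom import DoomGame, GameVariable

game = DoomGame()
game.load_config("deadly_corridor.cfg")
game.init()

game.new_episode()
total = 0.0
last_x = game.get_game_variable(GameVariable.POSITION_X)
while not game.is_episode_finished():
    env_reward = game.make_action(choose_action(game.get_state()))
    # Small bonus proportional to forward progress along the corridor's x axis.
    x = game.get_game_variable(GameVariable.POSITION_X)
    total += env_reward + 0.1 * (x - last_x)
    last_x = x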

A better alternative is to work on another scenario. I spent a long time trying to get Deadly Corridor to work well: one, because the AI can clearly see the enemy, and two, because it was simple enough to show legitimate learning. I feel that to make it work I would need to create a much more complicated scenario with a similar closed-in structure. The biggest issue with the scenario is the lack of randomization, which will always result in an agent designed to deal with one specific situation. Much like how people can learn to beat Mario levels on autopilot because they know what’s going to happen, the same goes for our AI.

Let’s look at the ‘Defend the Center’ scenario, where there's lots of open space and plenty of enemies. Some issues with this scenario are that the player always spawns at the same point and enemies always spawn at the same locations. We can fix that.

#include "zcommon.acs"

global int 0:reward;

int kill_reward = 0.0;
int waste_penalty = 0.0;
int move_reward = 0.0;
int newId = 10;   // first TID handed out to spawned monsters
int r = 735.0;    // unused in this version of the script
int a = 350.0;    // half-width of the square the monsters and player spawn in
str monsters[2] = {"MarineChainsaw", "Demon"};

function int SpawnTarget(int monster, int spawn_angle)
{
    // Give the new monster a fresh, unique TID.
    int spawnedId = newId++;
    // Random position inside the central square of the map.
    int x = random(-a, a);
    int y = random(-a, a);

    int look_angle = (spawn_angle + 128) % 256;
    str m = monsters[monster];
    Spawn("TeleportFog", x, y, 0.0);
    if(Spawn(m, x, y, 0.0, spawnedId, look_angle) < 1)
        return 0;

    spawn_angle = random(0.0, 1.0);
    // One-shot kills; dying triggers script 3 to reward and respawn.
    SetActorProperty(spawnedId, APROP_Health, 1);
    SetThingSpecial(spawnedId, ACS_ExecuteAlways, 3, 0, monster, spawnedId, spawn_angle);
    return 1;
}

script 1 OPEN
{
    reward = 0;
    // Spawn the initial wave of monsters at spaced-out angles.
    SpawnTarget(0, 0.0);
    SpawnTarget(1, 0.2);
    SpawnTarget(0, 0.4);
    SpawnTarget(1, 0.6);
    SpawnTarget(0, 0.8);
}

script 2 ENTER
{
    // Move the player to a random location and facing.
    SetActorPosition(0, random(-a, a), random(-a, a), 0, 0);
    SetActorAngle(0, random(0, 1.0));
    ClearInventory();
    GiveInventory("Pistol", 1);
    SetWeapon("Pistol");
    GiveInventory("ClipBox", 7);
    int bullets = CheckInventory("Clip");
    int x = GetActorX(0);
    int y = GetActorY(0);
    while(1)
    {
        // Penalize every bullet fired so the agent learns not to waste ammo.
        int t_bullets = CheckInventory("Clip");
        if(t_bullets < bullets)
        {
            waste_penalty = waste_penalty - 10.0;
        }
        bullets = t_bullets;
        if(bullets == 0)
        {
            arm();
        }
        // Reward the agent for moving, penalize it for standing still.
        int new_x = GetActorX(0);
        int new_y = GetActorY(0);
        if((x != new_x) || (y != new_y))
        {
            move_reward = move_reward + 10;
        }
        else
        {
            move_reward = move_reward - 10;
        }
        x = new_x;
        y = new_y;
        delay(1);
        reward = kill_reward + waste_penalty + move_reward;
    }
}

// Executed when a monster dies: remove the corpse, grant the kill
// reward, and respawn a monster of the same class after a short delay.
script 3 (int monster_class, int id)
{
    Thing_Remove(id);
    kill_reward = kill_reward + 105.0;
    delay(90);
    int spawn_angle = random(0, 1.0);
    while(SpawnTarget(monster_class, spawn_angle) < 1)
    {
        delay(1);
    }
}

// Refill the player's ammo when it runs dry.
function int arm(void)
{
    GiveInventory("ClipBox", 6);
    SetWeapon("Pistol");
    return 0;
}

This is an altered script of the Defend the Center scenario (also remember to give your .cfg all the actions you want your agent to have access to). This script does a few things. First, it randomizes where the monsters spawn, which adds a bit of variation, and second, it randomizes where the player spawns. It also keeps track of kills and rewards accordingly. A fairly simple script.
This should work, as it produces a fairly ‘general’ environment and rewards killing. It is not ideal, and I do not foresee anything coming out of it that could rival Intel’s Act agent, but it’s a start.
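
On the .cfg point: the available actions can equally be set from Python if you prefer. A minimal sketch, assuming the stock defend_the_center.cfg name; the exact button list is up to you and must match what your agent was built for:

from vizdoom import Button, DoomGame

game = DoomGame()
game.load_config("defend_the_center.cfg")
# The agent only ever sees the buttons listed here.
game.set_available_buttons([
    Button.ATTACK,
    Button.TURN_LEFT,
    Button.TURN_RIGHT,
    Button.MOVE_FORWARD,
    Button.MOVE_BACKWARD,
])
game.init()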

My best advice would be to write some scripts and functions to get the scenario to work for you. Also feel free to try making a new map, like a maze; it might actually provide you with some sweet results.

Result-wise it’s not particularly bad, and the numbers tell us at least one thing: there is indeed some focus on killing enemies. Behavior-wise, the AI seems to have a strategy where it camps in a corner and ‘sprays’ for kills. This is likely a side effect of the training scenario being an open space.

Conclusion:

I hope this series has given you a starting point from which to explore your own curiosity on the subject. In my case it has been a rather interesting trip, and I can see how improvements in this field will eventually lead to smarter and more autonomous AIs (and robots). I will likely continue working on the subject and will release more posts when I make breakthroughs.

My personal goal while working on this series was learning more than actually building a deadly AI, and that goal has been met. Following some of the methods detailed in this post should yield a nice AI agent, but there is also a lot of work on your end to optimize it. Experiment-wise, my next step with ViZDoom will likely be to improve on the formula, deviate from ‘teaching’, and see how far I can get.


Written by
Abel Castilla 08 May 2017

Classically trained physicist and programmer with a passion for everything AI (also my favorite movie).
