AI and multi tasking (part 3) – The Proc System

Way back when I started work on GOAL! (or Kick Off 3 as it was then) I realised that I needed something a little more robust than a flat state machine.

Picture a football player a moment. If you want to simulate a football player, it is quite logical to think of states. In fact if you have been following my train of thought in the previous post on this subject, you will know that thinking in states is actually something programmers do all the time, because each line of code is a state:

StandInLineUp();
WaitUntilAnthemsFinished();
if(isTeamCaptain()) {

WalkToCentreCircle();

WaitForRefereeCoinToss();

.....
}

You can also show the above as a state diagram, when suddenly people start thinking of it as a state machine, instead of just lines of code. But there’s no difference, really.

: Code as a state diagram

The reason for this kind of split thinking has to do with the need to multitask, i.e. do many things at once. The above code, if written as straight C++ code, has high level concepts such as “Wait”. It is logically impossible for a piece of traditional C/C++ code to “Wait” without devoting the entire program to that single task. If it is running in its own thread, one of many processes, then it can ‘wait’, whatever that actually means in practice (loop? Stopped process waiting for a message?). However we do not usually use operating system threading, or multiple processor threading for this kind of simple stuff. It’s not portable. It’s usually pre-emptive (meaning that you cannot control when the CPU switches between threads, such a switch can happen at any time and that’s usually not desirable in game code). It can be costly if you have thousands of threads.

So… what we do as programmers is not encode the above state graph as straight code; instead we create a state machine that is effectively a virtual machine running inside our program that takes each of the boxes (aside from the decision represented by the diamond) and labels them as a state, represented by some code number.

: Labeled states

We then have a variable associated with the player, called something like “state”, which holds the value of the current state, so that the player ‘knows’ what he is doing at all times. According to the state, the appropriate piece of code is executed. In this example, let’s start with state ‘1’. “Stand up in line” sets the direction and speed of the player to move, one game loop at a time, to the correct position in the line up, for instance. When the player gets to the right position, it then changes state to ‘2’.

“Wait until anthems finished” could simply be a test to see if the music of the anthems are complete. This could be a thread somewhere else… maybe when the anthems are finished, this thread communicates that by setting a flag. The player would be looking at the flag every game loop until it shows that the anthems are finished. When this happens, it then checks to see if the player is the captain or not, and if yes sets the state to ‘3’. And so on.

It’s a far cry from simple code. It’s messy, in fact it’s a bit like coding in assembler, only worse… as you shall see shortly.

Well with the original football games, I actually did program them in assembler, and in Kick Off, Kick Off 2 and Player Manger I used precisely this kind of system. There were 23 threads (22 players, and one game control thread) and each of these had a flat state machine. There were around 55 states for players and 45 states for the control thread, with different variations for player controlled players and the goalkeeper. The state variable was used as a lookup into a table of pointers to subroutines. Pause states were done by a general pause function that could be applied at any time and would return the player back to what he was doing when complete. This last point is interesting because it is a kind of ‘hardwired’ nesting of states: a crude hierarchy. Pausing was in effect a subtask, although this really did not make the system a HFSM (Hierarchical Finite State Machine). It was more of a hack addition for a general state applied everywhere.

All of these states transitioned between each other in much the same way as a goto statement can transition among lines of code. Funny that, isn’t it? When you type goto in C or C++, you hear this kind of booming voice overhead, accompanied with a crash of thunder “Thou Shalt Not Use The Goto!”. Yet a lot of programmers don’t think twice about the fact that when you make your typical state machine, say, using switch statements and jump between states, then you are in fact using a goto within your virtual state machine!

Have you heard of spaghetti code? Well I have a different anti-pattern, which is actually the same one really: spaghetti states!

But it gets worse: without the ability to nest states (i.e. have a sub state) then you are not only coding by use of your own goto statements, but you are coding without the aid of the humble subroutine. Yes, that’s right: imagine for a moment trying to code without functions. Ludicrous to attempt it? Well… that is in effect what you are doing when you work with a flat state machine, because it is like working with a microprocessor that has no stack. Think about that, and chew on it for a while….

Hierarchical Finite State Machines (or having a stack)

A long time ago, the creators of computers figured out that just being able to execute a stream of instructions and jump around, while useful, was a bit limiting because there was no way to create any form of modularity. If you had some code and wanted it usable by other bits of code, this was tricky to achieve, because although you could jump to it, there would need to be a way of returning to the calling code.

: Subroutines: the first important abstraction of programming

Now you could solve this by storing the return address somewhere, of course. If you look at the above example, you will see a very basic behavior. The entity goes to a series of waypoints. After reaching it waypoint it pauses for a certain length of time, then scratches its nose, then pauses again and then moves to the next waypoint. Clearly breaking down the code into three subroutines (Pause, Scratch Nose, Goto Next Waypoint) is a great way to manage complexity (which is the main purpose of abstraction). To achieve this, the code to call, let’s say ‘Pause’ could be stored some place where the Pause routine should return to when completed. In this way we get a useful modularity; we only need to write one Pause routine, and we can call it anywhere and then continue with the next instruction following that call.

“OK, OK, Dino” (say’s a group of programmers with a quizzical look) “We know all this”. Yes, quite. But I really want to hammer home an important point here. This revisiting of the most basic of programming principles is relevant because it seems to often get completely forgotten when it comes to implementing AI, to the extent that we have scripting languages at one end and encoding of the entire ‘code base’ in a data structures on the other (for example, in Behavior Trees). Let’s be clear why we jump through hoops: we are creating a virtual machine, to provide the same functionality that we take for granted when we write simple, straight forward code. When we create a finite state machine in C++ (or other language), if it is a flat state machine then make no mistake about it: we are creating a virtual execution environment (similar to what you find in a microprocessor) that does not have a stack, and cannot easily support sub-states (which are in fact subroutines).

The stack is a wonderful invention of course. The problem of where to store return addresses is non trivial. If we only have the memory to remember a single return address, then we cannot call a subroutine that calls another subroutine. That requires remembering two addresses. So this is why the invented stacks: so that you could call subroutines that themselves call subroutines. The stack is implemented often in hardware on microprocessors for the main purpose of remembering return addresses.

It is also important to note that if we encode the AI algorithms in a data driven virtual machine, then all we are doing is creating a virtual machine using complex data structures that replaces compiled code as we normally know it. Does that really make sense? I mean what happened to the idea of writing code in a language? Does it make sense to ditch that concept and replace it with some magnificent, complex system that basically does everything that a compiler does, except it stores the virtual ‘code’ in a tree? The only really valid direct advantage is the ability to modify the code while the code is running. That of course is valuable, but for most projects it is not *that* relevant. Most development still takes place with the good old compile and run approach.

What I am saying, in a nutshell, is:

Scripting languages; implemented state machines; data trees (e.g. behavior trees) and so on are solutions that replace simply writing the code in C++.
We do this mostly because C++ does not support multi tasking.
We engineer a virtual machine that allows us to multitask in C++.
We then implement much of the behavior at the virtual machine state level, instead of directly in C++.

Now I am not saying that this is wrong. What I am saying is that we should be clear about why we do it. However having said that, I do feel there are some major disadvantages to typically observed solutions to this problem:

If you use a scripting language, you now have to work in two codebases and two languages instead of one
If you use data trees, you have to develop a lot of tools just to be able to visualize what is going on, or use a domain specific language to populate the data tree.
It adds a lot of complexity to your project
It can have major performance costs (I do wonder sometimes about programmers who obsess about virtual function calls, but are quite happy to have the gameplay code running in Lua, Lisp or Python)
You need to inevitably duplicate some code between the two languages
Debugging can often be difficult if you do not spend time developing debugging tools
If you do this without having an architecture to manage it, the code will quickly become a spaghetti.

What I want to do is simply write everything in C++… or to get as close to this as possible… and still have my codebase manageable. This is what my PROC system aims to do.

Coroutines: The Starting Point of PROC

Around 1992, I invented for myself a concept actually previously invented in the 1960’s called coroutines. I had no idea of this of course. The internet as we know it today did not exist, and it was actually quite hard to get information on programming other than with paper books. Research was difficult, and so many of us pioneers (you can forgive me labeling myself as such I hope) had to invent things for ourselves.

The assembler I used was that which came with the AZTEC C compiler. This was a macro assembler; this meant it had a macro capability similar to that of the C pre-processor. I used this to solve a problem. It was really bugging me that if I had a sequence of states that need to be executed one after the other (for example the loop in the previous diagram) I would have to encode this with a unique state number for each stage. It made for a lot of clumsy code.

: Part of a flat state machine for handling animations

In the diagram above, you can see an example of what I mean. I am showing three states of the flat state machine for the AI of a game entity which are all concerned with playing an animation. It’s a fictitious example to demonstrate a point. Note that there are two levels of state here: the state of the CPU (it’s a flowchart and in effect where you are pointing with your finger as you trace it is where the program counter is). The second level of state is the state variable. To start playing the animation, some other code in the game will set that state to 5 to commence playing the animation.

Let’s work through it. State 5 is all well and good, but we must make sure to store the parameter (the ID of the animation to be played) somewhere. The next time that the AI tick is called, this will be forgotten otherwise. So that’s what happens first. Then we have to set the state to 6, because we do not want to do 5 again (it would corrupt the animation ID).

We then check to see if a non interruptible animation is playing. If yes, then we just return (i.e. that is the end of the AI for this entity for this game tick).

When we have the all-clear, we remember the current animation (if one is playing). We ask the animation system to start playing the new animation, and then set the state to 7, because we do not want to call 6 again (it would invalidate the save of the current animation and re-trigger the animation from the beginning again).

If the animation is not finished, we return… so we are waiting until the animation is completed. When it is completed, we restore the previous animation, and then go on to do something else (not indicated in the diagram).

Simple, eh? Well, not really. It’s a bit of a mess. If we could, we would want to to write this code something like this:

AI {
State: PlayAnimation(_anim) {
  anim = _anim;
  WaitUntilInterruptable();
  previousAnim = getCurrentAnim();
  StartAnim(anim);
  WaitUntilAnimFinished();
  retsoreAnim(previousAnim);
}

Let’s compare that with C code for the above flowchart:

AI() {
  switch(state) {
    case 5:
      anim = _anim;   // where are these stored?
      state = 6;
    case 6:
      if(isAnimInterruptable()) {
        previousAnim = getCurrentAnim();  // where is this stored?
        startAnim(anim);
        state = 7;
       } else {
        return;
       }
     case 7:
       if(isAnimFinished()) {
         restoreAnim(previousAnim);
       } else {
          return;
       }
      // etc
   }
}

This is quite a simple example, but as you can see it’s much messier and harder to work with in C++ than our ‘ideal’ language solution. Even in assembler, where I could make jump tables rather than use switch statements, it was not much better than the C solution. The particular thing that is really painful is how simple transitions between sequential states require a named state and code to set the that state after every stage. What a pain!

Then suddenly, I came up with an idea, and something that cannot be readily done in C++ (actually there are ways by using setjmp or inline assembler… but… they only take you so far as we will see later). I created a macro called “FromNowOn”. This macro simply took the program counter address of the first instruction following it, and stored that as a jump target. Instead of the switch, the address of the code to execute for the current state was stored in memory, so as soon as you executed FromNowOn, only the code following this macro would be called on the next tick. It is in effect a virtual program counter implementation. I found out a few years ago that this idea was known as a coroutine.

To conclude this part of the series, here is a version of the above in pseudo C++ that implements a FromNowOn instruction in a macro. It is definitely an improvement, as we no longer have to create labelled states for every stage of a sequence!

AI() {
  switch(state) {
    case 5:
      anim = _anim;   // where are these stored?
      FromNowOn;
      if(isAnimInterruptable()) {
        previousAnim = getCurrentAnim();  // where is this stored?
        startAnim(anim);
       } else {
        return;
       }
       FromNowOn;
       if(isAnimFinished()) {
         restoreAnim(previousAnim);
       } else {
          return;
       }
      // etc
   }
}

This entry was posted on April 23, 2011 at 1:05 am and is filed under Uncategorized. You can follow any responses to this entry through the RSS 2.0 feed. You can leave a response, or trackback from your own site.

4 Responses to “AI and multi tasking (part 3) – The Proc System”

gaz Says:
April 27, 2011 at 1:45 pm | Reply
Do you have this actually working? It’s not clear how or where you’d store the various stacks for the coroutines.

I have a system similar to this. It’s 99% platform independent (bar one small function) but I’m not clear how you’d do this with a macro.

Cheers 🙂
- Dino Dini Says:
  April 27, 2011 at 3:31 pm | Reply
  Yes, it works, and I’ll be revealing more of it in further articles. Cheers!
bart knuiman Says:
April 28, 2011 at 12:49 am | Reply
I feel the need to comment on a couple of things here. Though it is kinda hard to quote here, since its such a huge text and there is no quoting functionality..

Anyway ill give it a try.

Dino wrote:
“then all we are doing is creating a virtual machine using complex data structures that replaces compiled code as we normally know it. Does that really make sense?”

Yes, a lot. In your 30 years of experience (or more) u should know by now that if u work on a big project, compile time becomes a major issue. Im currently working on a game and use 4 computers with a distributed build mechanism to solve this problem a bit. However the linking time (which isnt distributed) is still a pain.

This introduction text is to support my next argument which states that ‘especially’ in AI programming u have to change a couple of lines (set the wait time to 2 seconds, or 3 seconds) play a cinematic or make him snatch his nose etc etc. This involves a lot of changing small pieces of code and testing. Each time u have to recompile and link (= pain). It is not just a pain, it is just not possible, it requires way too much time. It is unbearable (not to mention how we can get frustrated which will drain the motivation = production result slowdown).

I am not completely speaking from zero experience here. I did an internship and I was responsible for adding new features for the AI. All this was handled through an editor which saved its behaviour trees to XML files and then loaded them back in the game. Much like u just explained. This is also my introduction to the next argument.

Dino wrote:
“If you use data trees, you have to develop a lot of tools just to be able to visualize what is going on, or use a domain specific language to populate the data tree.
It adds a lot of complexity to your project”

That is completely true. But you forget that this is only true for programmers (considering a game is made by a team not just only programmers). We cannot expect from game designers, who actually make the AI behaviour, not the programmers; that they are going to make all these behaviours in C++. Therefore it is true that programming wise the project complexity increases but for game designers it makes designing the AI lots easier.

I cannot imagine that in a big company the programmers are responsible for the AI behaviour itself that would be a waste of skill. The programmers are hired to build to tools to assist game designers in their AI design.

I have read somewhere a while back a quote from you claiming that your PROC system can be used for ANY AI work.

This is not true. For reasons i proposed earlier.
In fact i think the proc’s system use is limited (im not talking about its functionality).
The PROC system may be useful if you do not plan to involve other people in the creation of your project (artists , game designers). It may also be useful if the project size is not likely to be very big. Well this is in line with my former arguments about compile time and that we cannot expect artists and game designers to design AI in C++.

So for instance, for your new football manager game it might be a good system since you are the only one working on it (or at least for the major part) and the compile time problem is probably not situated. But the claim u did earlier: ‘it can be used for any AI work’ is certainly not true.
- Dino Dini Says:
  April 28, 2011 at 12:07 pm | Reply
  Thanks for your comments and questions. I should point out to other readers that Bart is a student at IGAD. First of all, I am well aware of compile time issues and make it a major point of discussion in the GT2 course as you should know. My solution is the use of appropriate abstractions and management of header files. It is a very high priority to me to make sure that the develop cycle is as short as possible. You are also correct in that strategies are required to avoid having to repeatedly recompile and run the game, especially for tuning. In the GT3 course, I show how it is possible to create a database driven from a text based file with a domain specific language where parameters for tuning are modifiable and update automatically on-the-fly without having to restart the game. This can of course be extended to also encode various behaviors and make them data driven. I say this to highlight that I am keenly aware with my 30 years experience of the issues you raise, and I implement solutions as a matter of course on any project I work on.
  
  On the subject of using game designers to create AI behavior, this might be appropriate for some projects, but it certainly is not true of all. In fact I use a different model where programmers actually encode behaviors and designers describe what they want. Programming is more than just knowing how to speak C++ or whatever language you use. It is about constructing algorithms and managing data, and I have also found that to get rich behaviors it is better to have a designer and programmer work together than just give a designer a tool. But this is the result of my experience in the kinds of games I create.
  
  My statement that the PROC system can be used for any AI work is true, because I am referring to functionality. All methods have advantages and disadvantages. For many projects, the PROC system is appropriate, but it depends on th nature of the game, the nature of the team, the size of the team and so on. What I have found is that the PROC system allows the creation of very complex behaviors with a small team. For example, do not underestimate the complexity of the AI in a soccer game.
  
  The PROC system scales very well for large projects, because it turns AI programming into modular code. It in fact has all the advantages of other solutions, but with one disadvantage only: this disadvantage is that it is a compile time solution for AI. You can do all AI with it, but you might not want to for your particular project. However, I have found that using this system and combining it with property tress works really well for the kinds of projects I often find myself working on.
  
  In the end it is all about having a good set of solutions available for the domain of the problem you are facing. I have tried various approaches over th last 30 years, and have found my way of creating games with very complex behaviors. One should not underestimate the degree of complexity of the AI of a team based 3D game implementing all the game rules.
  
  And of course I have mentioned other solutions. But, here in my blog, I present an alternative with a simple principle: writing the code and the AI directly in C++. For many projects, I maintain this is the best way, if it can be done.
  
  Thanks for your interest.

Dino Dini's Blog