Tuesday, April 5, 2011

pre-alpha-2 – 1 – Yummy tasks

Right now I am at a moment where I am waiting both on people and on the results of some events so I can't really work on new features. Until these issues are resolved, I need to find something to fill up my time. So I thought this is going to be a perfect moment for pre-alpha-2. I'll just dive into it, without a great announcement announcing an even greater announcement where I announce that I have started working on the new pre-alpha.

Pre-alpha is stretching a little the meaning of the word. It is more a refactoring, polishing and straight Q&A phase this time. I am going over everything, checking if my designs were sound, rewriting when necessary and fixing bugs. I am not adding new features and rewritten code will do the same thing the old one did, only better. There is one exception though: I am making sure that all coordinates are stored and processed as 3D coordinates. Making the game use all the axes is going to be very difficult, and having everything use the extra Z axis even at this stage should make things easier.

Wow, this makes for a short post. I know! I'll fill the space by going into technical detail on the functionality of the scheduler, the part I'm polishing right now. How fun! I will even do something that I never did before: post some snippets of code.  This is the perfect point for people who are not interested in the technical details to stop reading.

Scheduling is the meat of such a game and thus it tends to be the most important component. And the most complicated one. Add path finding to the mix, and is very easy to create an unmanageable monster. To make things work, I have the scheduler which is a very complicated beast and I feed it several list of tasks, represented by points and several task filters. Every single task from tree cutting to item hauling across multiple stockpiles to making dwarves go to sleep when they are out of energy and continue their tasks once they have woken up is resolved by using these task lists and task filters.

Let's see how the task filter looks for the digging operation (excuse the formating but Blogger keeps ignoring my formating so I gave up)


class WallDigFilter: public TaskFilter {
public:
WallDigFilter() {
smart = true;
}
virtual void OnFail(Point3D aP) const {
World::Cell[aP.z][aP.y][aP.x].SetSelected(false);
}
virtual void OnSuccess(Dwarf& aD, Point3D aP) const {
aD.AddTask(Task::dtWallDig, aP, EngineParams::WallDigTime);
aD.Energy -= EngineParams::WallDigEnergy;
}
virtual bool IsValidSource(Point3D aP) const {
IItem& item = World::Cell[aP.z][aP.y][aP.x];
return (item.Is(Item::itWall) && item.GetData() <= 2) || item.Is(Item::itRamp);
}
};


Never-mind the structure of TaskFilter. Filters are created to build upon a fixed set of rules that were enforced by other components and they don't enforce these rules, so I can keep them short and readable. 

First we have the constructor, which only sets "smart" to true. A* is great algorithm, but since it needs to be very fast it has quite a few data structures that need to be set up. Over very short distances, this set up is a very costly operation. When turning on smart path finding, over short distances I uses a very fast but incredibly inflexible algorithm that only reads data and does not update any data structure except for the path if it has found one. Without it, several tasks that tend to benefit from this algorithm would be impossible to do as fast as they are done in my stress videos. Just setting smart to false, the default, would make digging reduce the framerate 2-3 times. Other actions do not benefit from smart path finding, and thus A* is always used.

Then we have the OnFail method. The scheduler takes the list of tasks associated to what it is trying to do, and goes over them using a heuristic algorithm to determine if it should try to find a path to it. This algorithm is very fast, but not very accurate. Several passes are done with different algorithms, and in the end we have a subset of the initial list of tasks that are possible candidates for scheduling. Unreachable cells are ignored until the shape of the reachable area changes so they can be taken into consideration again. This is why I can select half a mountain or the content of a closed off cave for a given task and there is no impact on the performance or framerate. But even so, once in a while we get to cell that should have been filtered out and deemed invalid, thus removed from all the list permanently, but due to a set of complicated conditions it is not. Such cells would remain in the system, reducing performance. Over time, once hundreds or thousands of such cells have cumulated over the duration of several game sessions, totaling probably days of game play in real time, the game may even become unplayable. To avoid this, such cells are culled and OnFail is the fail safe method that does some clean up. Digging is a self sufficient operation (except for the dwarf) and clean up is easy. Just set the selected property of the cell to false. Selected does not mean what you may think. These are not the cells you select by dragging the mouse. These are the cells that are highlighted in green meaning that you have given a task that includes them. So you select a region of the wall, give the order to dig and the region becomes highlighted in green. As tiles become reachable, more and more tiles will be dug out, while unreachable ones just sit there without consuming CPU power through scheduling. But when a truly unreachable tile is found, OnFail is called and the tile will loose its green color. You can select is again if you wish, but I'm hopping that seeing the tile change to green and back to its original color will make it clear that your intentions are futile and you should select a different cell/task.

Then we have the OnSuccess method. This is called when a task has been green-lit for execution and a path has been found to it. The dwarf and the point are passed as parameters. The implementation is very simple. We add a new task to the dwarf, giving the task type, the point where the task will be executed and the duration of the task. And we subtract an appropriate amount of energy. All other information is already in place.

Finally, we have the IsValidSource method, which again filters the list of tasks and acts as a failsafe. This method is used in the beginning of the process. It is called when selecting an area, so the GUI can count all cells that can be used for a task. When you bring up your dig window, you will see an option to "dig N walls".  Then it is used when adding the task to the appropriate lists. Then, it is used during the fast heuristic method of determining reachability. And for path finding. And finally as a fail safe. Normally, nothing unexpected happens, everything works according to plan and all algorithms are perfect. Perfect like myself. But if the impossible happens, you can wind up with an error or oversight. To give an example, maybe something caused a tree to be added to the next digging tasks. This will probably result in corrupted data or if you are lucky a crash. IsValidSource is used here again to test the validity at a few key locations and it will pick up on the presence of the tree were it should not and eliminate it silently from the list. The implementation is very easy. I check if the item is a wall and if its data is less or equal to two. This data represent the smoothness of the wall. And I check if the item is ramp.

You may think that calling this method so many times is slow or that having slightly paranoid methods that check just to make sure is not a good idea. But when having thousand of interconnected multi-phase tasks executed by hundreds of dwarves, such methods prove to be life savers. And the methods have short bodies that do little. The scheduler would not be as fast as it is right now if it did a lot in a single sitting. It is designed to take a few small decisions, but repeat the process in an incremental fashion. And the filter classes can easily be made to use template mechanism instead of virtual methods if I get desperate and need to inch out every single drop of performance, but I will not come to that.

Now let us see how we expand upon this class to add the wall smoothing operation:

class WallSmoothFilter: public WallDigFilter {
public:
void OnSuccess(Dwarf& aD, Point3D aP) const {
aD.AddTask(Task::dtWallSmooth, aP, EngineParams::WallSmoothTime);
aD.Energy -= EngineParams::WallSmoothEnergy;
}
virtual bool IsValidSource(Point3D aP) const {
IItem& item = World::Cell[aP.z][aP.y][aP.x];
return item.Is(Item::itWall) && item.GetData() == 0 && !item.IsSoil();
}
};

Very similar. OnSuccess does the same thing but with different numeric values, and the check in IsValidSource comes down to being a wall with data set to zero that is not a soil, because you can't smooth soil or already smoothed walls.

I wanted to show a third example about interconnected tasks, like hauling to/from stockpiles, but that is harder to wrap ones mind around and the post is getting long.

So basically this is it. I have one or more classes for each logical task that all feature very short and clear implementations, that build upon previous result both at a code level and as logical implications. If A implies B, that and filter called during the B phase will not check the conditions from A. This method has proven very robust. Adding a new task is very easy and this is the reason I managed to add beds, energy and sleeping very easily in a short period of time. The what I consider elegant implementation for filter classes is nicely contrasted by the scheduling algorithm, which is one scary monster with a huge number of lines of code able to do pretty much anything based on points and filters. There are many thing that I did not talk about. Filters have methods to test if a dwarf can execute a task (has the labor enabled, has the right tools once inventory system is fully implemented) and many more. The strenght of the system is in keeping it simple. You only implement the minimal amount of methods to achieve your goals.


2 comments: