Wednesday, February 29, 2012

80 – Bumpy ride

I uploaded Snapshot 8. This time it is a real snapshot. The point of the snapshot system was to get versions out there. Release early, release often. But taking 1-3 days before a release to polish up a weekly snapshot kind of defeats the purpose of it all. So I took the best version I had at the moment, did 30 minutes of testing and labeled it Snapshot 8. I'm hoping I will do this for all future snapshots. Once in a while a super stable version will be released. Then it is OK to super polish it.

With the freed up time I gained from not extensively polishing the last snapshot I added tree cutting, plant harvesting, food stockpiles and sacks. And maybe a few questionable interface choices :).

The rest of the time I worked on the new model for the world. The short blocks were cute, but I realized I want more than that. In the end I did not go with either the voxel/DF model or the multi-heightmap model. I am using something in between.

The world can now have any shape. You still interact with it using a grid based system, so a few mechanics might turn out to be a little bit awkward, at first at least. I am also constraining the world to have a fairly large flat-ish surface where you can act. This is level zero. Level zero is special because it is tightly linked to the levels above it. These levels cannot be represented in isolation. Level 0 and the landscape above it constitute the "overlevel". This is the first type of level. The second type is the underground levels. These can be represented in isolation, and you can just place two on top of each other and they will look good.

Free-form shape worlds also mean ramps. Ramps are a huge pain in the butt and I hated implementing them from day one. How do you solve this? By making everything a ramp!

Since the world is free-form, I ended up with a system that takes the height of the land in a small rectangular area, interprets it and makes some judgement. We wind up with the following classes of mini height-maps:
  • 0: here elevation is very mild. You can freely walk on top of it and you can place items on it.
  • 1: you can still walk on it, but elevation is irregular enough that you can't place items. There are not many such cases. This is mostly a transition block, so you'll often ignore it, and if you really need to place an item there you'll flatten it first.
  • 2: here the elevation change is so abrupt that normally you won't be walking up or down there. This seems a little bit strange because clearly a dwarf could walk on that zone. But dwarves are all about changing the environment and traversal optimization. They don't want to climb even a medium incline with a backpack and while carrying stuff, so they would rather flatten it first. But the area is not high enough to allow you to dig a cave in it. This is another transition tile and the most awkward of them all. This is somewhat compensated by the fact that you'll only need to dig out a few such tiles before you get to the good stuff:
  • 3: very high blocks! Here you can dig caves. You can have an elevation level at great height and under that a lower elevation level defining the cave.
This is like in real life, if you would dig a cave through a hill. You start off on a relatively straight piece of land. For the first few meters you dig, the highest point you have to remove dirt from is probably a lot shorter than you. As you continue, the dirt levels around you rise and rise, until you can't dig from the highest point to level zero, but instead start digging caves. In real life everything would cave in upon you! But this is dwarf world! Get me Keanu Reeves!
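The four classes above could be derived from the height samples of a tile roughly like this. This is only a minimal sketch: the names (`ClassifyTile`, `Thresholds`) and the threshold values are made up for illustration, and the rule that a tile counts as class 3 once its lowest sample is above a cave height is an assumption, not necessarily what the engine does.

```cpp
#include <algorithm>
#include <vector>

// Tile classes from the list above: 0 = flat, 1 = walkable only,
// 2 = too steep to walk, 3 = high enough to dig caves into.
enum TileClass { kFlat = 0, kWalkable = 1, kSteep = 2, kCaveHigh = 3 };

// Hypothetical thresholds, in arbitrary height units.
struct Thresholds {
    float flatRange;   // max height spread that still allows placing items
    float walkRange;   // max height spread a dwarf will still walk over
    float caveHeight;  // min height above level zero needed for a cave
};

// Classifies one tile from the height samples of its small rectangular
// area. Heights are measured relative to level zero.
TileClass ClassifyTile(const std::vector<float>& heights, const Thresholds& t) {
    if (heights.empty()) return kFlat;
    const auto mm = std::minmax_element(heights.begin(), heights.end());
    const float lo = *mm.first;
    const float hi = *mm.second;
    if (lo >= t.caveHeight) return kCaveHigh;  // the whole patch is tall (assumed rule)
    const float spread = hi - lo;
    if (spread <= t.flatRange) return kFlat;
    if (spread <= t.walkRange) return kWalkable;
    return kSteep;
}
```

With thresholds such as `{0.1f, 0.5f, 2.0f}`, a patch whose samples differ by 0.3 units lands in class 1, and one that differs by 1.5 units lands in class 2.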

I implemented all this except for the last part, cave digging:




Surprisingly almost nothing got broken by this change. Quite a few systems needed minor adjustments. Items are no longer placed on high elevation tiles. I could place them inclined, but this is very dangerous. At the slightest provocation I'll start implementing physics and then I'll never finish the game! The problem is that this way hills look too barren. Some plants will still be placed there in the future and I probably need to add clutter/detail objects here and there. Trees needed to be adjusted to be placed at the correct height and so did dwarves for walking purposes.

And don't get me started on how bland the elevation looks! I still have no clue how I could possibly shade that. Oh lighting, you so whack!

I found plenty of advanced solutions for terrain. I need something simple in the first stages. So please don't link me to that awesome pixel shader 3 terrain implementation that does height maps with GPU and what not. I need "baby's first terrain" shading solution. And things like "take the normal, multiply it with the direction of the sun and recite a poem" won't help yet, because I am not generating normals for the terrain. So much to learn!!!
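For reference, the "take the normal, multiply it with the direction of the sun" recipe is small once the terrain is a grid of heights. The sketch below is an assumption-heavy illustration, not engine code: `HeightGrid`, `NormalAt` and `Brightness` are hypothetical names, y is taken as up, and normals are estimated with central differences between neighbouring height samples.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

struct Vec3 { float x, y, z; };

Vec3 Normalize(Vec3 v) {
    const float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return len > 0.0f ? Vec3{v.x / len, v.y / len, v.z / len}
                      : Vec3{0.0f, 1.0f, 0.0f};
}

// Height grid stored row by row; cellSize is the spacing between samples.
struct HeightGrid {
    int width, depth;
    float cellSize;
    std::vector<float> h;
    float At(int x, int z) const {
        // Clamp at the borders (slopes are underestimated there).
        x = std::max(0, std::min(x, width - 1));
        z = std::max(0, std::min(z, depth - 1));
        return h[z * width + x];
    }
};

// Central differences: the slope along x and z gives the surface normal.
Vec3 NormalAt(const HeightGrid& g, int x, int z) {
    const float dx = g.At(x + 1, z) - g.At(x - 1, z);
    const float dz = g.At(x, z + 1) - g.At(x, z - 1);
    return Normalize(Vec3{-dx, 2.0f * g.cellSize, -dz});
}

// Lambert term: brightness in [ambient, 1] for a unit vector pointing
// from the surface toward the sun.
float Brightness(Vec3 n, Vec3 toSun, float ambient) {
    const float d = n.x * toSun.x + n.y * toSun.y + n.z * toSun.z;
    return ambient + (1.0f - ambient) * std::max(0.0f, d);
}
```

The resulting brightness could be written into vertex colors, so slopes facing away from the sun come out darker without any shader work.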

But otherwise pretty good. It took me a few hours to implement this.

Next step is to make caves work again. And get Snapshot 9 ready with this new terrain, while hopefully not losing or diminishing a single feature.

Sunday, February 26, 2012

79 – So natural

I managed to create a good release candidate so Snapshot 8 will be uploaded later today. It is still buggy as hell and I might be able to iron out a few bugs here and there, but most of the bugs are not something you discover in 5 minutes. You really need to be looking for them.

Other than the general fine tuning and performance adjustments (I am comfortable again having vertical sync on; when your map is super busy it is still not 100% smooth to scroll, but getting there) this is the first version that theoretically supports multi-threading. I say theoretically because I don't like the feel of multi-threading in the game. Something is off when I use it. Can't really describe what, just a feeling. And it crashes. A short Google search led me to believe that Irrlicht is not thread safe, so that is probably the cause of it. So Snapshot 8 will still use a single thread, but if I can fix the crashes by Snapshot 9 then we'll get a first MT release. I am also starting to suspect that Irrlicht is partially responsible for the monstrous difficulty of creating an engine that supports so many items, and that my total n00bishness with creating 3D engines prevented me from realizing this fact. I am not 100% sure yet, going by what I found on Google with other people having problems, but if this turns out to be true I am going to be extraordinarily angry. It took me about 3 months to get item population working and I am just about to add another extreme optimization because I am starting to hit a wall again.

This is the last snapshot for February. Like I said, I am at a big crossroads with the design of the game and I am going to dedicate March to testing and evaluating multiple paths before I choose which one to pursue.

The first model is the Dwarf Fortress model. This has nothing to do with the content of the game, it is just the world structure model. What do I mean by this? Let's take a simple example: Minecraft. Minecraft uses a model where the entire world (except for falling water and lava) is made out of cubes. There are a few things that are not cubes, but they still live under the constraints of the cube. Minecraft also uses the voxel model, which means that cube/non cube boundaries are efficiently represented, there are no constraints on cube diversity and at any given position in a potentially "infinite" world a cube can be placed. Well, if a full voxel model is used. Minecraft may or may not have additional rules and deviations from this model, but that is not important here.

The Dwarf Fortress model can again be considered a voxel model. You have a 3D matrix of walls, floors and objects to explore and interact with. There are no limits on what you can place where, but you are constrained to interacting with a single level at a time.

And this is very awkward, especially in 3D. Think about the situation where you are interacting with some items on level 51. You are standing on the walls that are on level 50. But in a few positions in level 50 there are no walls. There is an empty space. You would like to put something in that empty space. You can see that that space is empty from level 51, but before you can interact with it you need to switch to level 50. This system is clunky. No matter how good I make the interface, a new player will need a lot of explanation and a fair amount of practice to get the feel of it, and even then is going to need a lot of time to get used to it and not be bothered by the system. In 3D and first person mode things are even worse. You are sitting exactly at the edge between the empty space on level 50 and the position right next to it on level 51 where your character is. You can very easily see the empty space and realize what it is. But your mouse is helplessly stuck in the plane of level 51. You need to change levels. But what happens if you change levels without first moving or jumping down into the empty space? Your new position after the level change would put you right in the middle of the wall, so the engine will have to compensate for this and put you somewhere else. And again the system will confuse you, because there is a big chance that you'll need a second or two to realize where you are. And after you change back to 51 you probably won't be in the place you started from.

So the system is clunky. It has its good sides too, like how it allows for deep gameplay and world interaction. Every single alternative model I thought up reduces this depth. So the real challenge is to either create an alternative model that maintains the depth or one that reduces the depth in an imperceptible manner.

So as a first try for March I won't be using a voxel model, but a multiple (for starters 1) height-map model. A height map defines the surface of an object. In our case we'll have the surface of the world, the surface of an underground cave, the surface of a stream, etc. I'll start with one height-map and see how it goes and, more importantly, how it feels. The multiple height-map model is very similar to the voxel one, with the main difference that two height maps can't touch or intersect. Theoretically they can, but implementing that is hard and not suited to a first taste of the new model. There is also a difference in perception. Because of the way each model works, in Dwarf Fortress you have a tendency to expand horizontally. In a height-map model you have a tendency to expand vertically. You may wind up with the exact same map, just how you got there may be different.

My third model is one specially designed for the mages. PEW PEW! UNLIMITED POWAAAAAAAAAAAAAAAAAAAAA!

The key in getting these models out there is to not reinvent the wheel. I need to implement them without changing the engine in non-minor ways, so the models can coexist. If I add tree cutting (the next thing on the agenda) to one, it should instantly work in all models.

I added some support for model switching to the engine. By changing only two lines of code and recompiling, we go from our standard model that I've demonstrated in the past to a more natural model illustrated in this video:



As you can see, there is no manual level switching. You just walk around like you would do in a first person shooter, with your cursor detecting the surface it is on and allowing you to interact. If a cave entrance was placed somewhere, you would just walk inside. Pretty natural, am I right? Ignore the jerky walking-up-ledges animation and how with smoother elevation the borders don't seem to be contributing visually to the scene anymore. In top down view you would probably still use manual level switching, but only to control the position of the cut-off plane. Without a cut-off plane (the world slice) you often wouldn't be able to see what is going on. But still, if your dwarf is on level 51 and you want him to interact with the item on level 50 that is visible, you would place the cursor over the item, the engine would figure out that there is a difference of level and position the cursor appropriately in 3D space, and if a path between your dwarf and the item exists the dwarf would be able to interact with it, without a single level switch operation.
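For a straight top down view, "figuring out the difference of level" can be as simple as scanning one column of the map downward from the cut-off plane. The sketch below assumes a hypothetical `World` with a per-level solidity grid; it is not the engine's data structure, and a tilted or first person camera would need a proper ray march instead of a vertical scan.

```cpp
#include <vector>

// Hypothetical world: solid[level][y * width + x] is true where a cell is filled.
struct World {
    int width, height, levels;
    std::vector<std::vector<bool> > solid;
    bool IsSolid(int x, int y, int level) const {
        return solid[level][y * width + x];
    }
};

// Returns the level of the topmost solid cell in column (x, y) at or below
// the cut-off plane, i.e. the surface the cursor should rest on.
// Returns -1 if the column is empty all the way down.
int TopSolidLevel(const World& w, int x, int y, int cutOff) {
    for (int level = cutOff; level >= 0; --level) {
        if (w.IsSolid(x, y, level)) return level;
    }
    return -1;
}
```

With the cut-off plane at level 51 and a hole in level 50, the scan skips the empty cell and returns whatever lies below it, so the cursor lands on the visible item without a level switch.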

This is just an early prototype, but please tell me what you think!

Friday, February 24, 2012

78 – What What (In the stockpile)

You may have noticed in my last video that when dwarves dig through stone there are no usable boulders left on the floor like in the past. I'm not sure about boulders, but some stone, probably bricks, should be left as a byproduct.

Next on the list is reimplementing tree cutting, so I also need multiple logs piled up somewhere. So logs + bricks equals stockpiles.

Logs are boring so I'll start with bricks. But I don't have pictures, so I'll show you a video first and then describe what you saw.


The first thing I demonstrate is brick stockpiles. They can be manipulated directly in creation mode. Simply adding a brick somewhere changes that spot into a brick stockpile. Removing all the bricks destroys the stockpile.

I was just going to add support for stockpiles with multiple brick types, but then I realized that makes no sense. What if you need 3 native copper bricks, but they are under 300 bricks you don't care about? What is a dwarf supposed to do? Sort through 300 bricks? How? Using a temporary stockpile? What is this? The Towers of Hanoi problem? How is a dwarf even supposed to remember where the bricks are if they are not visible? So no multi-material stockpiles! Maybe for food. But not for bricks and logs.

So why is the maximum number of bricks in that stockpile so low? Why only 47? The short answer is that I want bricks to be put down in a logical and aesthetically pleasing manner, so all positions must be created individually and I got bored after 47 bricks. The long answer is that in the future I will have to optimize this, making sure that hidden bricks take no resources. So there is no use working a lot on positioning now, since I'll have to redo that when I optimize it. I may even give it the "voxel" treatment, making sure that optimal face rendering is used.

The individual positions at which bricks are placed form a pattern. Currently only one pattern exists, but the next step is to add patterns for everything that can be reasonably created from bricks: walls, houses, bridges, towers, balconies, castles, dungeons, etc.
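To make the rules so far concrete, here is a minimal sketch of a single-material stockpile that fills a hand-authored pattern slot by slot. The `Slot`, `Pattern` and `Stockpile` names are hypothetical and this is not how the engine stores things; it only illustrates the "one material per pile, capacity equals pattern size, empty pile goes away" behavior described above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Slot { float x, y, z; };  // hand-authored position of one brick

// A pattern is an ordered list of slots; bricks fill it front to back.
typedef std::vector<Slot> Pattern;

class Stockpile {
public:
    // The pattern must outlive the stockpile.
    explicit Stockpile(const Pattern& pattern) : pattern_(&pattern), count_(0) {}

    // Adds one brick. Fails if the pile is full or the material does not
    // match what is already stored (no multi-material piles).
    bool Add(const std::string& material) {
        if (count_ == pattern_->size()) return false;
        if (count_ > 0 && material != material_) return false;
        material_ = material;
        ++count_;
        return true;
    }

    // Removes the most recently placed brick; returns false if empty.
    bool Remove() {
        if (count_ == 0) return false;
        --count_;
        return true;
    }

    bool Empty() const { return count_ == 0; }  // the caller destroys empty piles
    std::size_t Count() const { return count_; }
    const Slot& SlotOf(std::size_t i) const { return (*pattern_)[i]; }

private:
    const Pattern* pattern_;
    std::string material_;
    std::size_t count_;
};
```

Because bricks always occupy slots `0 .. Count() - 1`, a renderer only needs the count and the pattern to know where each brick sits, and swapping in a wall or tower pattern would not change the stockpile logic.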

The final goal is to create a city of Elder Scrolls complexity, one brick and log at a time.

The final result of a pattern probably won't look that good. I will have to take that result, take its general shape, remodel it traditionally and break it into pieces to create a new pattern.

And as always, lighting sucks on those bricks. If this were a AAA title you would have pre-calculated lighting and baked in ambient occlusion for each pattern. But you are stuck with me, so have some very shiny bricks! You are welcome!

The UI only allows adding or removing a single item at a time, but in the future this will be more flexible. I also textured the bricks, quite poorly.

Next in the video we have log stockpiles. I am using the high quality logs created by BrewStew which I textured myself, again quite poorly. The problem is that the logs are so distinctive that you immediately notice that there is only one log mesh. Maybe I have to go with a simpler cylindrical log to make it less apparent that there is a finite (1 right now) number of meshes. Maybe some random scaling and rotation in the future.

Optimizing logs is harder than bricks because they do not create patterns that fully cover underlying logs. So I might have to model each pattern separately so I can hide some extra faces. Anyway, I am not wasting time optimizing things right now. Add features today, optimize on a full moon during a lycan attack.

So in conclusion things are shaping up. Creation mode has most of the basic tools implemented. Normal mode is still lacking, but one of the next updates should add tree cutting now that log stockpiles work again. Adding food and harvesting back is trivial, but I am not clear on the design for the logistics: what do you do with the harvested plant? If you have no storage do you just leave it on the ground? Then I need to model harvested plants. Do I use the solution from the old 2D engine with magical sacks that appear out of nowhere? Do I create a mixed solution where if you have no storage you leave it on the ground, and if you have available sacks you use them? Or maybe I don't use sacks and instead use barrels?

Anyway, things are on feature freeze right now and Snapshot 8 will be uploaded on Monday. I will do my best, but expect to see a few bugs.

After Snapshot 8 I want to do a quick experiment. My plans and ambitions for this game are leading me in a different direction from the DF one. In March I will try to take the game in a slightly different direction. Have no fear, I'll be using the same engine and mechanics. Dwarves will do the same stuff and new actions that work both in the DF model and my model will be added as scheduled, so if the March experiment fails the game overall will have taken a few steps forward.

But the DF model can be quite confusing. You have a multi level map where you interact with it one level at a time. You need to change levels a lot. You don't see what is happening one level higher. And if you do, it can obstruct your view of the current level. Implementation wise the map eats a lot of RAM and there are extremely rigid data structure requirements to do path finding in such a map.

In March I'll try a model that is not level centric, but "view" centric. Your default view is to see the entire surface of the map. Elevation is smoother and dwarves walk all around the place. The surface of the view is deformable and a lot of actions change the shape of the map. Maps are larger. I'm thinking of eventually giving you a 3 square kilometer playground and see how that goes.

This will be just an experiment to see which perspective I like better. If the new model works out I have great plans for immigrants. I said before that the number of active dwarves will be a lot lower than in DF. Dwarves will be few and very valuable. One dying will be a huge disaster with profound implications on your productivity. If you have 200 dwarves, one dying is negligible. If you have 15 dwarves, one dying is more important. Maybe the one who died is the only one you had with a given skill, so now not only did your general productivity fall, you are no longer able to do a task you previously could. I am also thinking about making dwarves have only a limited number of skills they can perform. In DF all dwarves are novice at all skills. I'm thinking of giving them about 6 skills, with a starting dwarf being fairly skilled at 1-2 of them. They can't perform skills they are not trained for before they receive tutoring. And you will be able to buy skill books if a trainer is not available, but it will take weeks for a dwarf to learn a new skill from a book.

So what do you do with a low number of dwarves but a big map? Immigrants! Wait, what? Didn't I say that the number of dwarves will be low? Well there are going to be two kinds of immigrants: squad mates and civilians. Squad mates will be like your normal dwarves and under your command, while civilians will mind their own business, sometimes helping you with tasks, sometimes just trying to survive, eating your food and blundering about during combat like Jar Jar in the final battle from Episode One.

So you'll have a large living city full of civilians that go by their daily routine and don't take direct commands and a small number of dwarves in your squad that you control. Civilians will sometimes be able to perform tasks like hauling and digging, sometimes for free, sometimes with payment. Things your highly skilled dwarves are too good for. Because they are better than you!

Thursday, February 23, 2012

77 – Wait... didn't I do this before?

Do you remember my first video ever?


I recreated the video with the 3D engine:


There is not much to say about digging that I did not say before. Creation mode is now split from normal mode and you can switch between the two modes. Sleeping has been disabled for this video. I think that support for half height walls should be added to the 3D engine too. They are not as vital as in the isometric one, but still. I tweaked the interface a little and added progress bars to tasks.

Something must be done to unify the interface. What works for creation mode does not work that well for normal mode.

But the scheduler works great. Adding back all the actions is just a matter of having all the right meshes and going over the task filters to make sure that they still do the correct thing.

I'll add back tasks one at a time in preparation for a real demo that is basically the 2D engine at its high point converted to 3D. And I'm hoping the next snapshot will be public on Monday.

Thursday, February 16, 2012

Scheduler is back!!!

Proper introduction some other time. Today only short video. No more work today.



I added a temporary tool for direct movement commands. Since direct movement was not present before, I added a new kind of task for it. Here is the source code:

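// Task filter for the temporary direct movement tool.
class DirectMoveFilter: public ClearingTaskFilter {
public:
    DirectMoveFilter() {
        smart = true;
        single = true;
    }
    // Every dwarf is accepted for this task.
    virtual bool TestDwarf(const Dwarf& aD) const {
        return true;
    }
    // A position is valid as long as its cell is walkable.
    virtual bool IsValidSource(Point3D aP) const {
        return DH::Cell(aP).IsWalkable();
    }
    // Nothing extra happens once the task succeeds.
    virtual void OnSuccess(Dwarf& aD, Point3D aP) const {
    }
};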

Time to learn and teach again

I am on a mini break right now. I'm testing and optimizing the engine to get it ready for heavy duty lifting so I'm not adding explicit new features for about a week. I optimized it down to 14 / 2.4 ms (two key values in milliseconds, the lower the better). Found an experimental way to make non-animated trees have zero impact on scrolling speed.

I also continued my experiments with making the engine do stuff it was not designed for. I'll talk about that soon. I am also preparing to bring back scheduling. Once the main scheduler is back, all tasks should be re-enabled one by one in a fairly fast succession. The target is to get the game at comparable levels to its 2D version by (the end of) March.

I am preparing to bring back the LL3DLGLD series, this time with the subject of shaders and more didactic than ever!

I also did some research in getting the game to look more modern. Increasing the poly count is out of the question, so I thought I should investigate some effects. Here is bump mapping:


The bump map was produced by grayscaling the texture and converting it to a normal map without the actual geometry of the object involved, so it is far from ideal, but the result is not that bad. Here is parallax mapping:


I like the bump mapping result better, especially since parallax mapping is darker, but parallax mapping gives good results in a close-up view:


The only problem is that using these effects makes everything look dark. I was showing you the light side of the object, but here is the dark side:


Unacceptable for an outside scene. After a lot of investigation I found the cause for this: Irrlicht is limited to two lights when using these effects. And I can't figure out a way to light an outside scene with only two lights.

Before I continue, let me show you another bump mapping sample, this time with a normal map created based on geometry:


Not perfect, the little circular bumps on top being particularly bad. Anyway, I only spent 5 minutes creating this map and it works as a proof of concept.
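For the earlier texture-only variant (grayscale the texture, then turn it into a normal map), the conversion can be sketched as below. This is a generic illustration and not the tool used for the screenshots: the function names are made up, the Rec. 601 luma weights are one common grayscale choice, and the sign of the green channel is an assumption that differs between conventions.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

struct Rgb { std::uint8_t r, g, b; };

// Rec. 601 luma scaled to [0, 1]; used as a stand-in for surface height.
float Luma(Rgb c) {
    return (0.299f * c.r + 0.587f * c.g + 0.114f * c.b) / 255.0f;
}

// Maps a component from [-1, 1] to a byte in [0, 255].
std::uint8_t Encode(float v) {
    return static_cast<std::uint8_t>(std::lround((v * 0.5f + 0.5f) * 255.0f));
}

// Builds a tangent-space normal map from a w x h texture. Neighbours wrap
// around, which suits tiling textures. `strength` exaggerates the bumps.
std::vector<Rgb> NormalMapFromTexture(const std::vector<Rgb>& tex,
                                      int w, int h, float strength) {
    std::vector<Rgb> out(tex.size());
    for (int y = 0; y < h; ++y) {
        for (int x = 0; x < w; ++x) {
            const float l = Luma(tex[y * w + (x + w - 1) % w]);
            const float r = Luma(tex[y * w + (x + 1) % w]);
            const float u = Luma(tex[((y + h - 1) % h) * w + x]);
            const float d = Luma(tex[((y + 1) % h) * w + x]);
            const float nx = (l - r) * strength;
            const float ny = (u - d) * strength;  // sign convention is an assumption
            const float len = std::sqrt(nx * nx + ny * ny + 1.0f);
            out[y * w + x] = Rgb{Encode(nx / len), Encode(ny / len), Encode(1.0f / len)};
        }
    }
    return out;
}
```

A uniformly colored texture comes out as the familiar flat normal color (128, 128, 255). Since brightness is only a guess at height, painted-on shadows and highlights turn into fake bumps, which is why a map built from geometry tends to behave better.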

I also discovered this effect:


This effect accentuates the actual polygon structure of the object, but unlike flat shading, the margins of polygons are rounded and the shading is made to be uniform on each individual face. Not very useful, but cute!

So what can we do about the two lights limitation? I think the only solution is to take some third party shaders and use them. I found some but they don't seem to work for me. Probably because I don't know a thing about shaders.

I probably won't be able to maximize the potential of the engine without using shaders eventually, so I guess the path of least resistance is to learn and master shaders.

So if you are interested in learning shaders, keep an eye out for the continuation of the series!

Tuesday, February 14, 2012

Snapshot 6 Changelog (and it's private)

The new streamer is working surprisingly well. I just put it to the test under some interesting test scenarios which I am going to describe soon in another post and it has proven both stable and advanced enough to handle stuff it was not designed for.

Still, it is too early to release it to the public, and since this ate up most of my time I still haven't bugfixed Snapshot 5 100%, so Snapshot 6 will be kept private. It is feature complete, but buggy and I don't want to knowingly release buggy software if it can be avoided.

Here is a screenshot with it:


This screenshot was done on a computer with a cheap integrated Intel GPU without any dedicated RAM. The new engine has higher requirements but it can just about manage to chug along on such low hardware. So the upgrade in requirements is minor.

Actually, I would like to do an experiment soon. Model only 2-4 objects at the standards of (the end of) 2012, with a polygon count perfectly suited for the complexity of the object and 2048x2048 textures. Try the engine in first person mode and see how it looks. And how it performs. I'll try with an 8800 GT and a 540M GPU.

Here is the changelog for Snapshot 6 (I won't go into detail because I've talked about a lot of this stuff in previous posts):
  • Uses the second generation 3D engine
  • Graphics overhaul. The current level and the one below are rendered with high quality blending to create the illusion of a less cube based terrain. Most meshes have been fine tuned for better visuals and better LOD behavior.
  • Up to 60 levels can be rendered.
  • Streaming biased LOD switcher implemented. The system uses some simple heuristics to try and prioritize changes that have the potential to create a better image, while changes that don't have such impact or only free up resources are pushed to the back. Item building almost two times faster.
  • Crafted items have quality ranging from poor to epic (when it makes sense).
  • Cleaned up and bugfixed the item description panel.
  • Quick navigation keys: press "]" to go to the highest level of the current map, thus rendering the entire map. Press "[" to go to the lowest level of the map, thus rendering only that level and the one below.
  • Trees are back. They are still too high poly and represent a massive performance bottleneck. Animations had to be turned off. You can toggle the rendering of tree canopies on or off with the "T" key. Turning them off offers a performance boost and can even help you navigate the map more easily because canopies are quite large.
  • Implemented wall smoothing.
  • Fixed horrible screen tearing in full screen mode with vertical synchronization turned off.
  • Smoothed out jerky movement when scrolling or zooming. Actually, this affects everything, from time passage to animations, but scrolling was where it was the most visible. Currently bugged if vertical synchronization is on.
  • When in first person mode and the game window is not active the mouse cursor is no longer grabbed and continuously repositioned at the center of the screen. Furthermore, while the window is not active the game is paused and the game will display the last render it had without consuming any resources except for RAM.
  • Better terrain generation: the world is made out of layers of random thickness. A layer is predominantly a single kind of material but has random chunks of other materials inside. The system still isn't as sophisticated as the one present in the 2D version.
  • New containers: armor stand and weapon rack.
  • New furniture: wooden chair, table and bed.
  • Improved item interaction consistency. Placing blocks on top of items used to work, but placing items on top of blocks didn't. Now all items and blocks are considered to be in the same category, so you can replace anything with anything without having to delete something from a given location.
  • Zooming now works with numpad +/- too.
  • Added mouse wheel scrolling support to the engine. Scroll with the mouse wheel to zoom, and hold shift and scroll to tilt the camera in top down mode. In first person mode you can control the camera FOV with mouse wheel scrolling.
  • Introduced a new super safe API for common tasks. This new API is very slow but it checks for most inconsistencies and reports them. I started using this API for most interaction tasks that are not performance critical. This is a developer's tool meant to keep the number of bugs low and as the engine matures this new API will be slowly phased out.
  • The "X" key now also deletes all walls in selection, not just items.
  • When starting a new map the engine will try to position the camera upon an interesting location with a clearing rather than randomly in the middle of nowhere. Alternatively it is capable of placing you on the highest point of the map. But most of the time it will just put you in a corner with a clearing.
  • A few extra options added to the launcher to control mip mapping and filtering.
  • Experimental: The game can now be compiled and executes correctly without having to modify the Irrlicht package.
  • BUGFIX: When replacing a block with a block of the same type but a different texture, sometimes visually the update would only trigger after a secondary block has been replaced.
  • BUGFIX: Placing new blocks would not always update the height map.
  • BUGFIX: Items would sometimes be created over the upper border of the map if the map was too high.
  • BUGFIX: Placing items at world geometry seams could cause visual issues and even a crash.
  • BUGFIX: When changing levels in first person mode the camera no longer behaves in a disorientating way.
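The layered terrain generation mentioned in the changelog can be illustrated for a single vertical column as below. This is a simplified sketch with hypothetical names and parameters, not the generator used in the snapshot: real chunks would be 3D blobs, while here each cell just has a small chance of holding a foreign material.

```cpp
#include <random>
#include <vector>

// Builds one vertical column of material ids, bottom to top.
// Each layer gets a random thickness in [minThick, maxThick] (minThick >= 1)
// and one dominant material; every cell has an `impurityChance` probability
// of holding a random material instead, a crude stand-in for "chunks".
std::vector<int> GenerateColumn(int levels, int materialCount,
                                int minThick, int maxThick,
                                double impurityChance, std::mt19937& rng) {
    std::vector<int> column;
    if (levels <= 0 || materialCount <= 0) return column;

    std::uniform_int_distribution<int> thickness(minThick, maxThick);
    std::uniform_int_distribution<int> material(0, materialCount - 1);
    std::bernoulli_distribution impure(impurityChance);

    column.reserve(levels);
    while (static_cast<int>(column.size()) < levels) {
        const int dominant = material(rng);
        const int layer = thickness(rng);
        for (int i = 0; i < layer && static_cast<int>(column.size()) < levels; ++i) {
            column.push_back(impure(rng) ? material(rng) : dominant);
        }
    }
    return column;
}
```

Calling it with 60 levels, a handful of materials and a thickness range like 2 to 6 gives a banded column; sharing the layer boundaries across neighbouring columns would turn the bands into proper world-wide layers.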

Monday, February 13, 2012

Second generation 3D engine done!

It's about time! For future generations I have planned some light multithreading, at least some gray blobs behind objects to simulate shadows and indoor lighting. 

But let's talk about how the second generation engine came to be and what it can do.

Like I said in the previous post, I am raising the quality of graphics just a little. So I set up a heavy stress test and started tweaking the objects. The problem was that I was getting heavy snagging: while scrolling the map, the scroll animation would freeze for just a fraction of a second, but enough to be jerky and disturbing.

There are many reasons for this: it was a stress test so item density was high. I tested on a map without elevation and elevation uses less resources than items, so the load was higher. But there are two main reasons for this. The first is that I fixed scroll animation to be smooth. In the past it was never smooth, not even with an empty map. So with jerky scrolling you did not notice the extra snags. The second reason is that the new LOD system was designed with the first person camera in mind. I said back then that for top down the new one is a little bit slower and I did not get around to doing something about this. Ironically, the primary camera for world interaction that also has the best performance has the worst LOD implementation. For first person camera, the snags are not an issue.

So I ignored this issue for starters and went through the entire item list optimizing LOD levels to obtain the best possible visuals. This meant going to a LOD switch border, taking a step forward, one back, from different angles and observing the change. Then adjusting items: adding a face here, a loop cut there, removing some faces, changing shading, scaling, moving, etc. It took me about six hours to do so for the 14 items I have right now. So I don't want to touch Blender again for at least a week.

So now I have a good looking set of items with all LOD levels 95% optimized. I went for visuals, not performance, so GPU requirement is a little bit higher, as planned.

This is my reference point. This is how graphics will look and how many resources they will eat. So I must make sure that for these levels the engine works fine and is snag free.

In order to remove the load on the CPU I implemented a streaming LOD switcher. Streaming means that the system determines the work load and has a speed at which it sets off to execute that work load in the background. This way even huge tasks can be executed, they only take longer. The speed must be set in such a way that it is a good compromise between CPU time and the resulting pop-in, because streaming also means extra pop-in. With streaming you can walk for a while, then stop, and a few seconds later the engine is still working, optimizing what you see.
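To make the idea concrete, here is a minimal sketch of a streaming switcher in Python. This is not the engine's code: the class name, the "switches per frame" speed and the plain queue are all assumptions chosen for illustration. The real thing rebuilds meshes where this sketch only records the new LOD.

```python
from collections import deque


class StreamingLodSwitcher:
    """Hypothetical sketch: spreads LOD switches over many frames."""

    def __init__(self, switches_per_frame):
        self.switches_per_frame = switches_per_frame  # the streaming "speed"
        self.pending = deque()   # (item_id, target_lod) waiting to be applied
        self.current_lod = {}    # item_id -> LOD currently displayed

    def request(self, item_id, target_lod):
        # Only queue work for items that are not already at the wanted LOD.
        if self.current_lod.get(item_id) != target_lod:
            self.pending.append((item_id, target_lod))

    def update(self):
        """Call once per frame; applies at most switches_per_frame switches."""
        done = 0
        while self.pending and done < self.switches_per_frame:
            item_id, lod = self.pending.popleft()
            self.current_lod[item_id] = lod  # a real engine would rebuild the mesh here
            done += 1
        return done


switcher = StreamingLodSwitcher(switches_per_frame=2)
for item in range(5):
    switcher.request(item, target_lod=1)

frames = 0
while switcher.pending:
    switcher.update()
    frames += 1
print(frames)  # number of frames needed to drain the queue
```

A lower speed costs less CPU per frame but the queue takes more frames to drain, which is exactly the extra pop-in.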

I did two fun experiments that were visually interesting for me. The first was creating a map but not pre-populating it, so in the first second it was completely empty. With a streaming speed set to low it was fun to see the world be built little by little for a couple of minutes. The second experiment was the opposite. I created the world and pre-populated it with maximum LOD levels. It ate up almost 1 GiB of video memory. Then I watched as the LOD switcher reduced detail for distant objects little by little, eventually reducing the memory consumption to 150 MiB.

I fine tuned the values for what I expect my target hardware to be and got this result:


To make things more interesting I set the game to ultra and set antialiasing to 32x CSAA. This is why performance is so low. With 4x antialiasing and without FRAPS performance is at normal levels.

What can't be seen in the video is what I did next. I replaced the simple streaming LOD switcher with a biased streaming LOD switcher. This means that it uses heuristics to try and prioritize updates that would lead to a better quality render.
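A rough sketch of what "biased" could mean, in Python. The heuristic below (big LOD mismatches close to the camera go first) is purely an assumption for illustration, as are the function names and the sample items. The actual heuristics are not described here.

```python
import heapq


def priority(distance, lod_gap):
    # Assumed heuristic: a large LOD mismatch near the camera is the most visible.
    return lod_gap / (1.0 + distance)


def biased_order(requests):
    """requests: iterable of (item_id, distance_to_camera, lod_gap)."""
    # heapq is a min-heap, so the priority is negated to pop the best first.
    heap = [(-priority(d, gap), item_id) for item_id, d, gap in requests]
    heapq.heapify(heap)
    order = []
    while heap:
        _, item_id = heapq.heappop(heap)
        order.append(item_id)
    return order


requests = [("far_tree", 90.0, 1), ("near_barrel", 5.0, 2), ("mid_rack", 30.0, 2)]
order = biased_order(requests)
print(order)
```

The per-frame speed limit stays the same as in plain streaming. Only the order in which the pending switches are served changes.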

Combining all these techniques I get a really spectacular engine for first person navigation. There is nothing I can think of that would improve it right now, except for adding free form height transitions.

In top down mode things are not that great. Streaming greatly reduced snags, but they are still present. Making them shorter and less frequent actually made things worse, because you are not expecting a snag.

So the next step was to do some heavy profiling to determine what causes the snag. For these tests I created very small fixed size maps with a uniform distribution of barrels to eliminate all randomness. I scaled things so the streaming LOD switcher would interpret this as the worst case scenario with the highest work load.

To my surprise the snag seemed present and of similar length even on such small maps. This meant that map size was not an issue, and the snag was constant per work unit. So you could determine the approximate snag duration by multiplying the work load with the cost of the work unit.

There were two important measurements to determine. I won't go into detail of what these represent because I can't explain it without fully explaining the streaming LOD switcher with formulas and all. Suffice it to say that for each LOD switch for the maximum work load I will give two numbers in milliseconds. Both numbers need to be as low as possible, but if the second is not that low it is not a big deal.

For the game compiled in release mode I got an average of 29 / 4 ms, and for the debug mode 88 / 12 ms. Having a 30 ms delay when scrolling when you are rendering a lot of frames per second seems like a good candidate for the snags. Even if you have just 100 FPS, a frame takes 10 ms to render. So while pressing a directional key we have 10 ms, scroll, 10 ms, scroll, 10 ms, 30 ms, scroll, 10 ms, scroll, 10 ms, scroll. You see what the problem is? Once in a while a frame takes 4 times as long to render and this makes scrolling jumpy. And the higher your framerate, the worse it gets. I did a test and I confirmed that making this delay shorter directly affects the smoothness of the scroll.
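The arithmetic is easy to check with a few lines of Python. The helper names and the "every third frame" pattern are made up for the example; the 10 ms and 30 ms values are the rounded figures from above.

```python
def frame_times(base_ms, snag_ms, frames, snag_every):
    """Frame durations when every snag_every-th frame also pays for a LOD switch."""
    return [base_ms + (snag_ms if (i + 1) % snag_every == 0 else 0)
            for i in range(frames)]


def worst_ratio(times):
    # How much longer the slowest frame is compared to the fastest one.
    return max(times) / min(times)


at_100_fps = frame_times(base_ms=10, snag_ms=30, frames=6, snag_every=3)
at_500_fps = frame_times(base_ms=2, snag_ms=30, frames=6, snag_every=3)

print(at_100_fps, worst_ratio(at_100_fps))
print(at_500_fps, worst_ratio(at_500_fps))
```

At 100 FPS the slow frame is 4 times longer than a normal one. At 500 FPS the same 30 ms delay makes it 16 times longer, which is why a higher framerate makes the snag more noticeable.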

The first thing I needed to do is disable hardware buffers and see if the results remain consistent. I heavily use hardware buffers and if they are the cause for the length of the update operation, there is no way to fix the snag. Not without going into the Irrlicht code at least. Keeping my fingers crossed...

Disabling hardware buffers causes a 6-7 times drop in FPS, but the operations seem to take as much as before. Maybe times are sometimes lower by 1-2 ms. So this is great news. It seems that hardware buffers don't cause this, so it can potentially be fixed!

Then I change over to shallow building. Shallow building is very fast but can only be used under very limited circumstances. Shallow building changes the 29 ms to around 10 ms. This is not that good. It is not possible to get better results than shallow building for any build operation. To further improve upon that we need assembly and/or multithreading. But from 29 to 10 there is a big difference, and even bigger from 88 to 10 (it is still 10 under debug mode), so let's see if we can't improve upon this a little. These measurements were taken under worst case scenario, so a sizable but insufficient improvement here could translate into what is needed for normal scenarios.

The first step is to get rid of some vestigial shadow support calculations. Shadows probably won't make it into the engine for quite a while so no use having them around. This brings zero gain, but it is cleaner.

The second step is to try and optimize bounding box calculations. First we fully disable all such calculations to see what results we can get: 22 / 3 ms. Not bad! That's 7-8 ms spent just calculating bounding boxes. Of course, without bounding boxes frustum culling is dead and performance goes down while rendering is no longer correct. So let's optimize! After a 30 line optimization that pre-computes the bounding boxes far more efficiently using vector math we get 22 / 3 ms. To make sure that everything is correct I keep both methods on for a while and add an assert to see if they produce different results while playing around on a full map. The new calculation gives the same results 100% of the time. I could replace them with a calculation that is faster but less precise, making the GPU work harder to render, but I don't think this will help the performance of the LOD switcher. For low to medium item densities this change is already enough to reduce snagging noticeably.
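The "keep both methods and assert" trick looks roughly like this in Python. This is only an illustration under a strong assumption: items are only translated on the grid, so a box computed once per mesh can simply be offset by the item position. All function names and the sample barrel are hypothetical.

```python
def slow_bounds(vertices, position):
    """Reference method: walk every vertex of the placed item."""
    px, py, pz = position
    world = [(x + px, y + py, z + pz) for x, y, z in vertices]
    lo = tuple(min(v[i] for v in world) for i in range(3))
    hi = tuple(max(v[i] for v in world) for i in range(3))
    return lo, hi


def local_bounds(vertices):
    """Computed once per mesh, not once per placed item."""
    lo = tuple(min(v[i] for v in vertices) for i in range(3))
    hi = tuple(max(v[i] for v in vertices) for i in range(3))
    return lo, hi


def fast_bounds(local_box, position):
    """Pre-computed box shifted by the item position: six additions per item."""
    lo, hi = local_box
    return (tuple(a + p for a, p in zip(lo, position)),
            tuple(b + p for b, p in zip(hi, position)))


barrel = [(-0.5, 0.0, -0.5), (0.5, 0.0, -0.5), (0.5, 1.0, 0.5), (-0.5, 1.0, 0.5)]
barrel_box = local_bounds(barrel)

# Run both methods side by side and fail loudly if they ever disagree.
for position in [(0, 0, 0), (3, 0, 7), (12, 2, 5)]:
    assert fast_bounds(barrel_box, position) == slow_bounds(barrel, position)
```

Once the assert has stayed quiet for long enough on a full map, the slow method can be deleted.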

Next we go for the colorizing code! After doing this we get a minor improvement. Sometimes we get a -2 / -0.5, but sometimes we don't. Such small durations are hard to measure with accuracy. There is still one if per vertex that I would like to see gone. Luckily, I find a loophole: I make a change that slows down level change a little but the if is gone. I am now getting a consistent 18 / 2.75 ms. This optimization can be applied to shallow building too so that has been sped up a little bit.
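One generic way to trade a per-vertex if for extra work on level change is a lookup table that gets rebuilt whenever the level changes. The Python sketch below shows that general technique only. It is an assumption, not a description of the actual loophole, and every name and colour value in it is invented.

```python
def colorize_branching(vertex_levels, current_level, lit, dimmed):
    # Original style: one `if` per vertex.
    out = []
    for level in vertex_levels:
        if level == current_level:
            out.append(lit)
        else:
            out.append(dimmed)
    return out


def build_palette(level_count, current_level, lit, dimmed):
    # Extra work paid once per level change: one colour per level.
    return [lit if level == current_level else dimmed
            for level in range(level_count)]


def colorize_lookup(vertex_levels, palette):
    # Per vertex there is no comparison left, only a table lookup.
    return [palette[level] for level in vertex_levels]


vertex_levels = [0, 1, 1, 2, 0, 2]
lit, dimmed = (255, 255, 255), (96, 96, 96)

palette = build_palette(level_count=3, current_level=1, lit=lit, dimmed=dimmed)
assert colorize_lookup(vertex_levels, palette) == colorize_branching(
    vertex_levels, 1, lit, dimmed)
```

The decision moves out of the hot loop and into the level change, which happens far less often than a vertex update.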

The obvious and simple stuff is out of the way and we are sitting on 18 / 2.75, down from 29 / 4. Now comes the hard part.

Using extremely low level pointer arithmetic I squeeze every possible CPU cycle out of index updating and get down to 16 / 2.5. Other than this I have only one more idea, an idea that uses about 12 MiB of static cache memory, but I will resort to that only if I am desperate.

For vertices I try something with memcpy and pointers, but this does not give any benefits, so I try the same tactic with pointers like I did for indices and... I think it is 15.

So we'll stay at 15 / 2.5 for now. This is as much as I can optimize for now, but already this gives great results. When testing on a normal map in top down mode, snagging is completely gone without the massive performance hogs: the trees. With trees I'm 95% sure it is gone. Sometimes I get the impression that there might have been a minor snag. First person never had any snagging issues and now it is probably even better.

This change also generally affects anything that creates or destroys items so it is a global performance gain. Item creation is now almost two times as fast!

The only bug I can see is that streaming and vertical synchronization do not mix well. So for now it is recommended that you turn that off. Vertical synchronization adds frame delay which compounds with my delay and things are jerky again.

So the streaming LOD switcher and this massive item creation optimization are the stars of the second generation 3D engine. If you have a weak CPU you should instantly notice the change. If you have a strong CPU you won't notice anything except for a smoother scroll that is caused by a bug I fixed some time ago. I think I wrote about it. Not sure.

This second generation 3D engine will serve me well, but because of the massive changes a bug or two might have slipped in. I need a few days to play test and make sure that everything works as expected. 

So I can't really release Snapshot 6 yet. If I deem the engine stable I might make it on time, but if not I'll post the changelog for it and get started on Snapshot 7.

Snapshot 6 also marks the finalization of creation mode stage 1, so I might do a release only with this followed by a few bugfix releases, but my mind already wanders towards the reintroduction of dwarves. And a very special HD fork.

I won't work on the engine any more right now except for bugfixing, but one day when I feel ready for it, I'll try and create the third generation engine that should do all the tasks with a little help from multithreading.

Friday, February 10, 2012

Performance discussion & graphical updates

If you don't have the patience to read through this long post just skip to the end: there are two screenshots with the way the game is looking right now.

This is one of three posts in which I talk about some specific theme. Today about performance, some other time about the design of the game systems and key differences from DF and finally I'll detail all the playable races with great (anatomic) detail.

So I was working on improving terrain quality and was trying a new technique that would make it prettier than ever. Getting the color blending right was near impossible, but I was getting good results. The caveat was that this method was two times slower, used the CPU too much and not enough GPU and used three times as much memory. This was the first alarm signal that I must greatly mind the performance implications of what I implement because this is not a normal game.

The second alarm signal came when I put together everything that I have ever implemented (except for shadows): fully editable 300x300x100 maps with level changing, rendering up to 60 levels, great item diversity and density. This was the result:


Yikes! Sure, most of the new items and the trees don't have low LOD variants, but still, the performance is way too low. Ten million triangles? Almost 500 MiB of mesh memory? A very heavy round of optimization ensued after seeing this and you will see the results at the end of the post.

So why is performance such an issue?

First of all, this is a DF inspired game. Just creating a game with half as much depth as DF will be heavy on the CPU. This is the first challenge.

Creating a game with the same depth, but with a big scope, scale and frequency of events as in DF is the second challenge. This one is very hard, almost impossible. No one man in the history of gaming has managed this yet. You may be thinking: but what about Toady? Here's the thing: he is very close, but the game is not out of alpha yet so you can't really tell. Are the anecdotal jokes about needing a super computer to run DF under a given load that go around in geek circles caused by the game still being in alpha, or is this an issue that can't really be resolved? Could multi-threading help? There are a lot of questions and right now DF might be 99% there, but the way to 100% is still very long. Ah, and don't forget about the bugs.

The third challenge is the 3D. Creating an engine that can render everything and provide an interactive environment is not that hard. Creating one that does these things with a big item density, no billboarding and unlimited view distance is very hard. Creating one that responds to these interactions when they happen multiple times a second, while still being responsive and while still running pathfinding, scheduling, time compression, adjusting the needs/mood of dwarves and doing world simulation in background is a huge challenge.

So in order to have good performance there are a few things I must do and a few that you must do.

I need to be very careful with my ambitions. The above mentioned pretty terrain renderer was too much. I replaced it with a new one that is light on resources. This one uses color blending and some visual illusions to give a less rectangular and Lego-like aspect to the terrain:


Sure, the predominant theme is still rectangles, but now they seem to be varied in shape. Here is a close up where the black spot seems smaller than the yellow one, even though they are exactly the same size:


Once in a while this system produces a rather ugly transition, but this is not a problem I'll worry about. I also removed blending from selections, making them perfectly rectangular and with clear edges. This reinforces the illusion that "this is a free form shaped world where dwarves select perfectly rectangular shapes because this is how they think and plan their tasks" rather than "this is a world made out of rectangles":


The next thing I must do is have a very clear idea of a target performance on the target hardware and make sure not to sacrifice visual quality in order to achieve higher and unneeded performance. My engine can handle 90000 items and more. But a normal map won't have such an item density. Rather than scaling my engine to handle absurdly high item densities well and have extremely good and unneeded performance under normal densities, I'll scale it so it has very good performance under normal densities and runs acceptably with absurd densities.

The performance will also be based on the main camera: the top down one (and later the isometric one). Top down is a lot faster than first person because it has a narrower field of view and frustum culling works extremely well with it. Furthermore, in top down you can't really turn on very extreme rendering ranges because you won't be able to see what you need to if you do.

The first person camera is a bonus. A very useful bonus! In first person mode you'll see farther but have lower performance and you may wish to increase the rendering range because it benefits you, unlike in top down mode.

So for hardware compatible with the minimal setup (which I have not precisely determined yet), my target is to have a smooth 60+ FPS in top down mode. In first person mode I'll accept lower performance, that is 40+ FPS. This is the bare minimum. If you have the required hardware and you don't have this performance I did not do a good job with the engine because I did not reach my target.
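Expressed as frame time, these targets become budgets in milliseconds. A tiny Python sketch (the function and dictionary names are just for this example):

```python
def frame_budget_ms(target_fps):
    """Maximum time one frame may take to sustain target_fps."""
    return 1000.0 / target_fps


budgets = {
    "top down": frame_budget_ms(60),      # about 16.7 ms per frame
    "first person": frame_budget_ms(40),  # 25 ms per frame
}
for camera, budget in budgets.items():
    print(f"{camera}: {budget:.1f} ms")
```

Anything the engine does during a frame, including background work, has to fit into that budget on the minimal hardware.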

If you are curious, on my laptop I am getting 200-400 FPS in top down mode and 80-150 FPS in first person mode. These measurements are for fullscreen.

Also, in the future I'll make sure to cater for people with thousand dollar video cards too. This means better quality models, postprocessing effects and a very GPU intensive but pretty LOD switching profile.

So taking these into consideration I perfected the way of mixing smooth and flat shading in the same model to obtain the best look possible. Analyzing the item density requirements I am now comfortable with using meshes with up to 120% polygon count. I don't want to sacrifice a minimum level of visual quality for some unrealistic performance expectations. Normal games won't have 90000 items at once. So I made sure to obtain the best looking barrels and also managed to mostly eliminate the seam:


If you look carefully you can still see it, but I am fine with it. Annoyingly enough, from a distance it is more visible than from close up.

I did not manage to get antialiasing to work yet with Irrlicht. But the engine is capable of it. Using the NVidia Control Panel I forced the game to use antialiasing. Here is a set of barrels as rendered normally:


You can notice the aliasing effect at the edges of the barrels. And now with anti-aliasing:


This looks a lot better! There is a noticeable performance hit, but my laptop that routinely runs the game at 150+ FPS under normal circumstances does not care. I'll try to add options for direct antialiasing, but for now using the NVidia Control Panel will have to do.

I remodeled the mesh for the dawnsider:


I am not 100% happy with it, but it is better than the previous one. Here is an alternative dawnsider when bearing fruit.


I adjusted the weapons rack. It has the same concept, but now the polygon count is in line with the rest of the objects:


The armor stand was very hard to see from top down view so I made it thicker. Anyway, these are dwarves, not elves! They need thick and bulky armor stands. Maybe I'll increase the width even more:


I also modeled a very basic bed. It is just placeholder and has no sheets on top, but here it is:


That's about it for what I modeled, but then Bryan sent this to me:


A workshop! A very high quality and detailed workshop! Thank you Bryan! The patch that finally brings workshops to Dwarves & Holes has long since been heralded as the Messianic patch. If you have an interactive world with raw resources and workshops to process them, you are just about done! Am I right? Am I right? Guys? Am I right? Hello?

In all seriousness, workshops will be a major milestone, and while they are not planned until version 0.3.5-0.4, starting to think and plan a little the modelling part for it even now can't hurt. This workshop is too high polygon for starters. While you will never have a huge amount of workshops at once, this workshop has at least ten times the polygon count it should have. I'll reduce the polygon count and send a modified model back to Bryan when I have time. Another concern is the height of it. The little legs are probably not a good idea and I'll adjust it to suit the height of a block and relative size of a dwarf. That is if I still allow for single Z level workshops. Another concern is that I might add levels to workshops. When starting out, you create a low level and less productive workshop. Later, when you have time and resources, you will upgrade it. I'm thinking improvised, normal and finally masterwork workshops. The way this is modeled, with all the detail, it does not look like something a dwarf threw together in a few hours. This looks like the highest level workshop possible with all the detail. And a final concern is that while I want workshops to have a fixed size, all workshops will come with their built in variable sized rectangular stockpile area. This is a major change and departure from the DF workshop model and I will detail it some other time.

Short version: the model looks great, hits all the right ideas with the detail, but I need to remove the feet, scale it on the Z axis and reduce polygon count.

So putting things together, after a healthy dose of optimization, here is how a first person fly by render of the game looks with a 300x300x100 map, with the engine trying to render up to 60 levels:


As you can see polygon count is not so high and memory consumption is lower. While item density has been adjusted a little (there were too many trees before) this scene is largely equivalent to the previous one that ran poorly and had the white trees. The trees are too dark because I did this on a very bright monitor and did not notice the dark colors. An easy fix. Most of the new items have no low LOD meshes yet so the polygon count is a lot higher than it should be. This includes the trees. I need to figure out a way to do low poly distant trees that do not look like crap.

In the next screenshot we have a more zoomed in view of the same map and thus performance is a lot better:


These two screenshots do not contain the adjusted dawnsider, weapons rack and armor stand, the bed and workshops are not present, but you can still make out most of the meshes.

So what do you think? I say with a little better terrain shading that takes into account the position of the sun this engine has the potential to look quite good! I'll create some screenshots with shadows too, but shadows are something that I did not manage yet to get working.

Thursday, February 9, 2012

Screens of the day 23 - Some new model tests

I've got my desktop back! It was repaired a few days ago but we were hit by an uncharacteristically snowy winter. We did not have such a winter in a lot of years and I had little inclination to spend time on the streets, even if that meant recovering my PC. But anyway it is back and works apparently well. I missed it! I am going to give it a name now! The desktop has been repaired and has two extra coolers so now it is loud as hell and greatly affects the temperature in the room. Having used my laptop for development during this period I can tell that my laptop is an awesome piece of hardware with sufficient performance. Well, it is a 600€ laptop. Still, using a 22 inch monitor on my desktop is a lot more comfortable than the 15 inch laptop screen. Just need some earmuffs now...

When my desktop broke it killed two 650 GB (not GiB) hard disks. There is a worldwide hard disk crisis (which I feel is partially artificial because of retailers panicking and buying more hard disks than the real demand) so I won't buy new disks right now because they are 2-4 times more expensive than before the crisis. So I am stuck with a 320 GB hard disk on my desktop. I have the habit of creating a partition for my Windows and a separate partition for everything else so I can format and reinstall Windows at will without having to back up data. But this time I made a single 200 GiB partition for everything. What am I going to do with the rest of the 90+ GiB? Install a few versions of Linux and see if I can get the Linux version working. Using virtual machines is not a good idea at first because my game requires a lot of GPU power and hardware support under Linux is hardware support under Linux. Sigh...

Anyway, Snapshot 5 is buggy as hell. I fixed most of the bugs but there are still a few big ones left. And I've been working on some fairly interesting things. One evening after fixing some bugs I started playing around with the game and without noticing 30 minutes had passed! Everything is starting to come together and I think I'm only 1-2 snapshots away from a version that puts together everything you have ever seen in one of my videos or posts regarding the 3D engine. And about 3-4 snapshots away from finishing stage 1 creation mode. This version will be officially designated as version 0.2.5, while a stage 2 creation mode will be version 0.3.

Most of the new features and tweaks are only half way done so I can't really talk about them yet. I already have a 14 point changelog. While there are a few very interesting changes there, one of the more minor changes overshadows, I feel, even the huge ones: scrolling and zooming have been smoothed out. No more jerky camera movement. And using fullscreen without vertical synchronization has significantly less screen tearing. Before, screen tearing was so bad that turning off vertical synchronization in fullscreen mode was a very bad choice. Probably no one could play like that without rage quitting and probably getting a headache. Still, the amount of changes is comprehensive enough that I probably won't finish and test everything by Monday, so it may turn out that Snapshot 6 will be delayed/kept private. I don't want another super buggy Snapshot 5.

So what can I talk about? Hmmm...

Here, have some new model concepts:


The above thing is a weapons rack. It is a little bit high poly, but I generally like the shape. It is meant to be made out of wood, maybe with the little things that stick out made out of metal. For now I am not planning to add stone weapon racks, but if I ever do they need to look different.

This model combines smooth shading with flat shading. I have no idea if my engine likes this and the item will turn out looking OK. This render uses shadows so in the engine it will look slightly worse because of the lack of shadows. I really need to find a way to bake the shadows into the textures and see if this looks good. This will not work with dynamic lights, but for now I have no such things. Actually, I have absolutely no idea how all the items I am going to show today look in game because I haven't had the opportunity to check them out yet. The items are modeled quite amateurishly and are not optimized.


Above we have an armor stand. Low poly but it looks good. I need to add a little rounding to the bottom of the central pillar because it looks too flat but overall I am happy with it. Made out of wood.


Above we have a wooden chair. Low poly again and this model really lends itself to the "diversity" approach, that is making a lot of alternative models. The proportions are a little bit off, but generally it is good.

And finally we have the wooden table:


I wanted it short and stumpy but I don't like the result. The wooden models were designed to have a different shape from the stone ones. No effort has been put in yet to make sure that items look good and are instantly recognizable from a top down view, because this is how you will spend most of your time interacting with the world.

Still no luck with finding a modeler. I really need someone with good artistic vision to make sure that all items are consistent and look like they were part of the same universe. If said artist can also create more unique aesthetics so that everything not only looks consistent but also has personality, that would be just spectacular. Meanwhile you'll have to do with my models.

In my next screen post I'll show you the items in game.