Tuesday, February 19, 2013

Screens of the day 29 - These are features actually

I started creating the standalone library that will be the base of the engine. Right now engine code and game code are all in the same project. Before the split can occur, I need to clean up both and then make sure that the library only has generic functionality.

So while cleaning up the code, I got an idea: what if I investigate the range of texture weights that a terrain chunk has? If a chunk can be made to only sample 1/2/3 textures instead of the 4 that it currently does, I'm sure I can greatly improve the performance. But the numbers did not cooperate: most chunks seem to use 3 textures. I was hoping for more chunks to have 1/2. This distribution depends a lot on the properties of the terrain. Large smooth hills will probably have a more desirable distribution.
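The kind of survey described here can be sketched in a few lines. This is only an illustration in Python (the engine itself is C#), assuming each chunk stores one tuple of four blend weights per vertex; `textures_used`, `usage_histogram`, the threshold and the sample data are all made up for the example.

```python
from collections import Counter

def textures_used(chunk_weights, threshold=0.01):
    """Count texture layers whose blend weight exceeds threshold anywhere in the chunk.

    chunk_weights: list of per-vertex weight tuples, e.g. (w0, w1, w2, w3).
    """
    layer_count = len(chunk_weights[0])
    used = [False] * layer_count
    for weights in chunk_weights:
        for layer, weight in enumerate(weights):
            if weight > threshold:
                used[layer] = True
    return sum(used)

def usage_histogram(chunks, threshold=0.01):
    """Map 'number of textures sampled' to 'number of chunks'."""
    return Counter(textures_used(chunk, threshold) for chunk in chunks)

# Made-up data: three chunks with two vertices each.
chunks = [
    [(1.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0)],  # uses 1 texture
    [(0.5, 0.5, 0.0, 0.0), (0.2, 0.3, 0.5, 0.0)],  # uses 3 textures
    [(0.6, 0.0, 0.4, 0.0), (0.7, 0.2, 0.1, 0.0)],  # uses 3 textures
]
histogram = usage_histogram(chunks)
```

A histogram dominated by the "3" bucket is the situation described above, where a specialized 1 or 2 texture shader path would rarely get used.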

But since my terrain has its current structure, I decided to move this feature to the backlog. It will be a worthwhile optimization and might allow terrain that uses any number of textures, not just a maximum of 4 (with the caveat that a single chunk can have a maximum of 4 textures), but the benefits I would get right now are not worth it.

But since I started going over the terrain and shaders, I decided to try out spherical harmonics for terrain. Now, I am still not an expert in this domain and would greatly appreciate any input, but let's see my results.

So here is a terrain with basic shading and texturing:


Texturing looks good, but there is no fine surface detail. We traditionally fix this with normal mapping:


This adds a fair amount of detail to surfaces that have such an angle that they receive enough light. But if you take a look at the area under the crosshair, which is not under direct light, there is zero detail added by normal mapping. So here is where adding spherical harmonics might help, giving this result:


Better? I think so. Not sure if this is the way it should look. The effect is also quite subtle. Which is not ideal, since it is a fairly expensive one. It only works with pixel shader 3 or higher. Surprisingly, while the effect is pretty expensive, it is not significantly more expensive than normal normal mapping. Normal. This running joke is getting so old! Anyway, one could optimize it. Experimentally I noticed that for distant terrain, while the difference between no normal mapping and normal mapping is big, at least shading wise, the difference between normal mapping and spherical harmonics is not noticeable, except that the terrain gets just a little bit darker. So the engine could decide to skip this effect for distant terrain.
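The "skip it for distant terrain" idea could look roughly like the sketch below. It is Python for illustration only, and it assumes three quality tiers (basic shading, normal mapping, spherical harmonics) plus a made-up cutoff distance; none of these names or numbers come from the engine.

```python
# Assumed tiers: LOW = basic shading, MEDIUM = normal mapping,
# HIGH = normal mapping plus spherical harmonics.
LOW, MEDIUM, HIGH = 0, 1, 2

def chunk_quality(distance, user_setting, sh_cutoff=200.0):
    """Pick the shading tier for one terrain chunk.

    Beyond sh_cutoff (a hypothetical distance in world units) the spherical
    harmonics term is dropped, since it is barely visible there. Normal
    mapping is kept at every distance because its effect stays noticeable.
    """
    if user_setting == HIGH and distance > sh_cutoff:
        return MEDIUM
    return user_setting

near_high = chunk_quality(50.0, HIGH)
far_high = chunk_quality(500.0, HIGH)
far_low = chunk_quality(500.0, LOW)
```

Since distant terrain gets slightly darker with spherical harmonics on, a hard switch like this might show a faint brightness seam at the cutoff, so a real implementation would probably need to compensate for that.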

Here are three more samples, in the same order:




Hopefully I did not mess up the upload order :)! Here the effect is harder to see because the whole surface is facing directly away from the sun. To better see the effect, download the pictures and use a program that allows fast swapping between them.

And a third sample:




OK, so what do you think? Does it look good? Is it worth it? Anyway, the engine now supports 3 quality settings for terrain. If your PC is beefy enough, there is no reason not to turn on the spherical harmonics based rendering (which in the options menu will be called "high"), since I don't think it looks uglier.

Here is another screenshot, this time with antialiasing for a little bit of eye candy:


XNA supports antialiasing out of the box and my engine also supported it pretty much from day one, but this feature was lost once I switched over from normal rendering to rendering to offscreen targets. Antialiasing is back and more customizable than ever. Previously you could only choose if you wanted it or not, and XNA did its best to try and follow that suggestion. Now you can choose the number of samples.

Also, you may notice that the terrain is slightly more vibrant. I started messing around with the parameters and brought terrain lighting more in line with object lighting, creating a better illusion that everything is illuminated by the same sun.

And I continued experimenting. Not all of them turned out OK:


I'll continue to clean up the code and prepare for finalizing terrain code.

Friday, February 15, 2013

95 – Boulder (no more troubles), part 3

Eureka! Finally success! Boulder support is done and I actually have a fairly decent generic static clutter system for the map.

I write posts in two ways: one has a plan and a script, the other is a journal styled affair. This is what I used for these posts related to boulders. I just developed and took (mental) notes of interesting progress or observations, together with screenshots at key moments. After a while I gathered these notes and wrote a post about the progress in chronological order. Only since last post (posts sometimes have a delay, so when I say today it was probably yesterday when the events happened) I kind of forgot what most of the screenshots were for.

So today I'll be presenting the abridged version.

I did another step of optimization towards reducing RAM and improving the terrain pool manager.

I also implemented instanced physics. BEPUPhysics actually has support for this, but on top of XNA and BEPU I have my own engine and structures that couldn't handle instancing. Updating this required a minor rewrite of data structures and since I was kind of in a hurry, the rewrite comes with a caveat: you can't use instancing with composite meshes! Luckily boulders are simple one-piece meshes. In the future I'll improve the system to handle any meshes.

Putting the two together we get this:


We have the normal 4 square kilometer map with the old view distance and 10061 objects, yet the memory use is only 112/107/469! Fantastic progress! To think that a few days ago I couldn't fit a few hundred boulders! Grass is still offline because I didn't get to refactor it for streaming/low memory use. And I hate my current implementation, so one day I'll rewrite it.

The only problem is the performance. I am getting only around 26 FPS. Physics time is high and also a ton of objects are actually rendered.

Time to solve both!

I start implementing a very simple spatial division scheme based on the chunking of the terrain. Using this, both frustum culling and distance checks are simplified. This was more complicated than it seems now writing about it, but:


Framerate is up to 40 and the number of rendered objects is way down. I only render a fixed distance in front of the camera, but this distance is variable for different kinds of entities: terrain has a larger distance, boulders a smaller one, grass even shorter.
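A chunk-based partition with per-kind view distances can be sketched as below. This is a simplified Python illustration, not the engine's code: the chunk size, the distances and the `ChunkGrid` name are assumptions, and it only does a radial distance check (a frustum test per chunk would sit in the same loop).

```python
import math
from collections import defaultdict

CHUNK_SIZE = 64.0  # assumed chunk edge length in world units

# Assumed view distance per entity kind; only the ordering comes from the post.
VIEW_DISTANCE = {"terrain": 1000.0, "boulder": 400.0, "grass": 150.0}

class ChunkGrid:
    """Buckets entities by the terrain chunk they stand on."""

    def __init__(self):
        self.cells = defaultdict(list)  # (cx, cz) -> list of (kind, x, z)

    def add(self, kind, x, z):
        key = (int(x // CHUNK_SIZE), int(z // CHUNK_SIZE))
        self.cells[key].append((kind, x, z))

    def visible(self, cam_x, cam_z):
        """Return entities within their kind's view distance of the camera."""
        reach = int(math.ceil(max(VIEW_DISTANCE.values()) / CHUNK_SIZE))
        ccx, ccz = int(cam_x // CHUNK_SIZE), int(cam_z // CHUNK_SIZE)
        result = []
        # Only chunks near the camera are visited, not the whole map.
        for cx in range(ccx - reach, ccx + reach + 1):
            for cz in range(ccz - reach, ccz + reach + 1):
                for kind, x, z in self.cells.get((cx, cz), ()):
                    if math.hypot(x - cam_x, z - cam_z) <= VIEW_DISTANCE[kind]:
                        result.append((kind, x, z))
        return result

grid = ChunkGrid()
grid.add("boulder", 100.0, 0.0)
grid.add("boulder", 500.0, 0.0)  # beyond the boulder distance
grid.add("grass", 100.0, 0.0)
grid.add("grass", 200.0, 0.0)    # beyond the grass distance
visible = grid.visible(0.0, 0.0)
```

The point of the grid is that objects in far-away chunks are never even looked at, which is what keeps the per-frame cost independent of the total object count.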

So now that rendering and spatial partitioning are fixed, let's do something about the physics time. For this we enable the new view distance and also increase the number of boulders to cover the map corner to corner:


I still need to tweak the view distance, but it is improved, thus consuming more resources. RAM use is up to 154/119/608, but we do have 17119 physics enabled objects, 590 being currently rendered in that shot.

To fix physics, I do two things. The boulders are not very detailed, but still a little bit high for physics. I want accurate physics, but this is probably too much. Take a look at the complexity and small terraces that the boulder has:



I implement collision meshes, making it so that a mesh can have a collision mesh that is not identical to its graphical representation. Because instanced physics supports only simple meshes, the limitation is in place for collision meshes too. I try and simplify the boulder mesh, while still keeping the general shape. I don't do a great job, but I'm an engine programmer, not a modeler:


Collision meshes are anyway optional, so when a very precise shape is needed, just don't use them. Or hire a modeler! Still, this shape is good enough for physics interaction:


The next step is to group up all those boulders into some larger entities, similar to the spatial partitioning scheme I use, thus greatly reducing the search space. At first this works horribly, performing poorly and eating up all the RAM.

So I decide to take a break.

I add support for placing meshes in such a way on terrain that they have a natural orientation taking into consideration the slope.
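One common way to get a slope-aware orientation is to estimate the terrain normal under the object and align the object's up axis with it. The sketch below is an assumption about how this could be done, not the engine's method; it is Python for illustration, and the `height` callback, the step size and the two sample terrains are hypothetical.

```python
import math

def terrain_normal(height, x, z, step=1.0):
    """Approximate the unit surface normal at (x, z) using central differences.

    height: function (x, z) -> terrain height, with Y as the up axis.
    """
    dhdx = (height(x + step, z) - height(x - step, z)) / (2.0 * step)
    dhdz = (height(x, z + step) - height(x, z - step)) / (2.0 * step)
    nx, ny, nz = -dhdx, 1.0, -dhdz
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    return (nx / length, ny / length, nz / length)

def tilt_from_up(normal):
    """Angle in degrees between world up (0, 1, 0) and the given unit normal."""
    return math.degrees(math.acos(max(-1.0, min(1.0, normal[1]))))

def flat(x, z):
    return 5.0   # level ground

def ramp(x, z):
    return x     # 45 degree slope rising along +X

flat_tilt = tilt_from_up(terrain_normal(flat, 10.0, 10.0))
ramp_tilt = tilt_from_up(terrain_normal(ramp, 10.0, 10.0))
```

The resulting normal would then be turned into a rotation for the mesh, usually combined with a random spin around that normal so that neighboring objects do not all face the same way.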

I start to investigate an ancient and very serious bug in the terrain access methods. I call it a day before finding the bug.

But the next day, with fresh forces, I fix the bug, implement physics grouping and get the girl!

Here is a screenshot:


We have 40 FPS, with negligible physics time, almost 1400 of the 17000+ objects rendered and memory consumption up to norm! This on an Intel HD integrated GPU. BEPUphysics is a beast!

Just as planned! I'll be taking a small break, but next time we add a few new clutter objects, hopefully trees, columns and another boulder type.

Wednesday, February 13, 2013

94 – Boulder troubles, part 2

So how do we fix the memory issue? First, we need something to measure. I will be using three values: the first is the amount of memory that the GC from C# believes is in use, the second is the amount of private memory C# thinks it uses and the third is the amount of total memory Windows Task Manager thinks the application is using. These values will be represented as A/B/C.

So first we start testing with a map where we only load up physics, without creating terrain geometry:


See? There is nothing visible, but a barrel is still rolling around. Even better, let's not load up even physics to get a memory consumption of 22/89/75. This is the baseline.

After loading physics we get 22/91/80. Physics data should take up at least 16 MiB, so clearly these memory consumption values are not too accurate. I'll ignore this fact for now.

But when creating visual debug information for physics, even when not visible, we get a whopping 326/1017/1247:


So debug physics information should be created only on special executions of the program meant for debugging, and even then only for a subset of the world.

So let's make some calculations: 60 * 16384 = 983040 (0.94 MiB). This is how much a chunk of world should eat up. Using 256 chunks we would get 240 MiB. That is too much and anyway the GPU can't render so many chunks at good speed. Rendering around 64 would give us a use of 60 MiB, so a total of around 140?
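The arithmetic above, spelled out as a small Python check. The reading of the two factors (60 bytes per vertex, 16384 = 128 * 128 vertices per chunk) is an assumption, as the post only gives the numbers.

```python
BYTES_PER_VERTEX = 60        # assumed meaning of the "60"
VERTICES_PER_CHUNK = 16384   # assumed meaning of the "16384" (128 * 128)
MIB = 1024 * 1024

chunk_bytes = BYTES_PER_VERTEX * VERTICES_PER_CHUNK  # 983040 bytes
chunk_mib = chunk_bytes / MIB                        # 0.9375, rounded to 0.94 in the text

def budget_mib(chunk_count):
    """Estimated memory in MiB for the given number of chunks."""
    return chunk_count * chunk_mib

all_chunks = budget_mib(256)      # 240.0
rendered_chunks = budget_mib(64)  # 60.0
```

The "around 140" total then presumably comes from adding those 60 MiB to the roughly 80 MiB that Task Manager reported after loading physics.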

Well this is what I get for one single chunk: 308/766/552! Pretty bad. So for now I'll disable all vertex buffer creation and I'll send the data each frame to the GPU. I'm hoping that I'll replace this system with a smarter one that dynamically allocates and frees buffers on an as-needed basis while keeping memory use under a certain threshold, but for now I'll just remove all vertex buffers and eat the framerate drop.

And we get 308/508/550. OK, time to go bugfixing. After a while for a single loaded chunk I get 70/249/237.

Then I do the same for index buffers and get the final values of 70/147/142 for one loaded chunk.

This seems like good progress, but it is not. A day has passed, I'm loading up the game again and I am getting completely different values and they are larger! A lot larger!

Back to step zero for today. I start doing a far more thorough profiling and cleanup. The new baseline is 4/70/58 for nothing loaded. And I do mean nothing. I have no idea what is eating up all this RAM. After some investigation, the game seems to start at 19 MiB (Windows provided value) then it goes up to 26 after device initialization, then it goes up to 43 before starting to load the terrain. A lot, but fair enough. And finally once things are up and running, we get around 56 MiB use. So let's say that this is our new baseline: 4/64/56.

Now, I don't want to see a growth larger than 16 MiB once I load up physics. That would make it somewhat consistent with yesterday's values and also makes sense. Alas, no such luck: today we get weird values of 70/130/124.

I start doing some ridiculously detailed memory profiling and also clean up all the code. After watching the memory allocator more intensely than watching a person you suspect is about to steal your friend's chicken, I get back to 20/82/73. This is in line with yesterday's starting values and this time I am fairly sure that nothing gets allocated that is not needed.

This is only for loading physics, not actually activating it. After activation the values remain roughly the same. So we are back to where we left off yesterday, only this time we have cleaner and more controlled code doing the dirty work.

Now I need a pooling resource allocator that is 100% in charge of giving me fixed size memory chunks for my level geometry and also taking care of them when not needed. The implementation almost guarantees that the memory goes only up as you walk around, but never above what the view distance requires. When less memory is needed than the max, the "free" parts are not actually freed, just pooled for later use.
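The core of such a pooling allocator is small. Below is a minimal Python sketch of the idea (the real one is C# and manages level geometry); the `ChunkPool` name, the buffer size and the use of `bytearray` are placeholders chosen for the example.

```python
class ChunkPool:
    """Hands out fixed-size buffers and recycles released ones."""

    def __init__(self, buffer_size):
        self.buffer_size = buffer_size
        self.free = []       # released buffers waiting for reuse
        self.allocated = 0   # total number of buffers ever created

    def acquire(self):
        """Reuse a pooled buffer if one exists, otherwise create a new one."""
        if self.free:
            return self.free.pop()
        self.allocated += 1
        return bytearray(self.buffer_size)

    def release(self, buffer):
        """Return a buffer to the pool; nothing is actually freed."""
        self.free.append(buffer)

pool = ChunkPool(1024)
first = pool.acquire()
second = pool.acquire()
pool.release(first)      # chunk went out of view
third = pool.acquire()   # new chunk came into view, reuses the old buffer
```

Because released buffers are reused before new ones are created, `allocated` only grows until it reaches the largest number of chunks ever needed at once, which matches the "goes only up, but never above what the view distance requires" behavior.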

So using this allocator I got this result:


Finally, an improvement! After walking around for a few minutes the memory stabilized at around 119/95/321:



And with textures we get 142/95/590:


Now this is the first version only and there are still some improvements to be made until I can get the optimal values. I also tried to improve the view distance. Still needs some tweaks.

And add a skybox. And I failed miserably. I guess a few hours of memory work turn you into an idiot who can't implement a simple skybox.

Tuesday, February 12, 2013

93 – Boulder troubles, part 1

There are tons of great looking games out there and things like Farcry 3 really push the envelope when it comes to lush and "realistic" landscapes. Of course, I am not aiming as high. But I still need to make the world more dense, so I'm thinking more grass and a few clutter objects could greatly improve the look of the land.

So today I am trying something new, an almost live journal of my first attempt to add a generic clutter system, that for its first iteration will support only one boulder type. This will be less edited than usual and ultimately tell the story of a first failure in chronological order of the events.

So the first step is to find a free boulder model on the net, import it into Blender, export it into an XNA format, add a hardcoded line to the game's code informing it about the new model, add the exported file to the import project in Visual Studio, uncomment a line that enables model reprocessing thus making the game load all meshes in the slow loading XNA format and resave them in my own custom very fast format, then run the game only to have it crash because of a failsafe assert I added. The new model has no tangent information, so I need to close Visual Studio, edit the project file in Notepad and rerun the game, to get this result:


I know, I have pipeline issues! The next step is to adjust the size of the boulder, making it fit better with the landscape style:


But why is it black? I have this problem every time I add a new mesh and it eventually goes away on its own, or it goes away if I edit the new mesh's material related properties. I am getting tired of this, so I start debugging and actually find a bug in my material assignment code. With this issue fixed, all new meshes should look like this:


I worked hard on getting good ambient lighting into the engine, and the effect is clearly seen on untextured objects. Here is the object from behind and the lack of a normal texture is clearly visible:


So I reuse an existing material and get this result:


The stone is very poorly UV mapped and the scale is also wrong. I do not fix the UV mapping, but I do try and fix the scale:


After some further tweaking:


So now that the material is all set up, here is again the lighting with normal maps for the backside of the boulder:


One boulder won't do me any good, I need a lot of randomly placed boulders. Dropping 400 boulders from a very low height makes them land in a very orderly manner creating a very artificial look. Dropping them from a greater distance improves the orderly look a little, but some boulders are rolling around a lot after creation:


Now let's try to add 6400 boulders. This kind of breaks physics, with thousands of boulders trying to settle while the game map is still loading up. Loading and settling now takes minutes and makes the game unusable during that period:


So live boulders are out for now. I have multiple types of physics objects and I try a less CPU intensive and static kind of object, that once placed can't be moved by normal physics interactions and this loads up nice and fast:


But why is it untextured? Because I am running out of RAM. My static object implementation eats up tons of RAM. For now I'll disable grass, another huge RAM hog:


This does not help a lot, because RAM wise I can't go much over 1600 stones. This has nothing to do with rendering. Even if I cull most of these stones, the RAM use is still there. Here is a shot with 1600 regularly placed stones without grass:


This concludes part one, because I need to find a solution that drastically reduces memory consumption. I'll be going over the entire code base and in part two we'll see if I managed to improve the situation and thoughts on possible solutions.

I'm finishing my landscape, even if I have to use the CryEngine for it!

Tuesday, February 5, 2013

Goals for 2013

Hello there everyone! Long time no see! I am back and ready to announce my goals for 2013, together with a short recap of what I've been up to since last time I wrote.

Let's break the mold by starting with the goals, and not with the recap:
  • Open source the physics-enabled terrain code. I said in the past that I would use open sourcing as a means of preventing code loss in case something happens with my projects or my commitment to them and I was not lying. I still want to do this, only it does not mean having a public repository where you see every single commit that I make. Instead I want to release a single very well polished version that is pretty much identical to the code I use, only having all the unnecessary stuff cut out.
  • Release a simpler one hour game/full featured demo as a preview of a game that I can finish in 1-2 months. Probably in 2D. This project, while at least an order of magnitude less ambitious than the last one, is still quite the complex beast that I won't finish in 2013. 2015 at the earliest at the rate I am progressing. So in the meantime I need something to "pay the bills". While I won't necessarily produce something destined to be sold, I'll approach it as a full retail product, making sure it is finished 100%. This "real" product will get its own post, but I'm thinking of creating a new blog for it.
  • Find a replacement for XNA. Sigh. I just read the news these days. Microsoft is retiring XNA support in 2014. XNA had an uncertain future for quite some time now, but I was hoping that they would announce a new version with Windows 8 features, not the end of support for XNA 4.
  • Return the donations. Another task that I just didn't get to. I still need to cross-reference the blog, my mail and PayPal to determine who and when donated, but it is going to happen hopefully in February. I will also bump down the donation margin for return, from 10 euros to 5 euros.
In December, as usual, I did not work. Still, I managed to get some progress with grass. Since I want grass to be everywhere but you are limited on how much you can render, I wanted a solution to fade distant grass out. Grass buffers are filled and then I don't want to touch them, so fading should be done in shaders, like the animation is. The first step was to compute a displacement value based on distance from player. I applied it universally to the grass Y position, getting this:



Then I needed a way to ground the grass on its initial location (ignoring the displacement). This was not as straightforward because the vertex shader has no idea which vertex is up and which is down:


Then I inverted the displacement, making it start from zero and making the grass negative height in the distance. Here you can see without textures that grass fades out in the distance:


And here is the scene with textures:


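The distance fade described above can be sketched on the CPU like this. It is a Python illustration of the math only, not shader code, and it assumes each grass vertex carries a flag saying whether it is a tip or a root vertex (for example derived from its texture coordinate); the post does not say how the engine tells them apart, and the fade distances are made up.

```python
def fade_amount(distance, fade_start, fade_end):
    """0.0 up to fade_start, 1.0 at fade_end, and growing past 1.0 beyond it."""
    t = (distance - fade_start) / (fade_end - fade_start)
    return max(0.0, t)

def faded_vertex_y(base_y, blade_height, is_top, distance,
                   fade_start=80.0, fade_end=120.0):
    """Y position of a grass vertex after the distance fade.

    Root vertices stay on the ground. Tip vertices are pulled down with
    distance, so the blade shrinks to zero height at fade_end and gets a
    negative height (ends up below the ground) further away.
    """
    if not is_top:
        return base_y
    return base_y + blade_height * (1.0 - fade_amount(distance, fade_start, fade_end))

near_tip = faded_vertex_y(10.0, 2.0, True, 50.0)    # full height
mid_tip = faded_vertex_y(10.0, 2.0, True, 100.0)    # half height
edge_tip = faded_vertex_y(10.0, 2.0, True, 120.0)   # flat on the ground
far_tip = faded_vertex_y(10.0, 2.0, True, 160.0)    # below the ground
far_root = faded_vertex_y(10.0, 2.0, False, 160.0)  # roots never move
```

Since only the vertex position changes, the grass buffers themselves stay untouched, which is the constraint mentioned earlier.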
OK, grass density needs to be increased. The problem is that with a 2k map, terrain + grass eats up 1.6 GiB of RAM.

So I implemented single-threaded streaming. The game now starts almost instantly, loading up only physics. Then one by one it pulls in the chunks and data it needs. Grass is not streamed, but still, memory is down to 750 MiB, so a massive improvement. Once I finish grass streaming and maybe add multi-threaded streaming, I'll be very close to the point where I can open source the terrain.
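Single-threaded streaming usually boils down to a queue that is drained a little every frame. The following Python sketch shows that pattern under assumptions: the `ChunkStreamer` name, the nearest-first ordering and the one-chunk-per-frame budget are all illustrative, not taken from the engine.

```python
import heapq

class ChunkStreamer:
    """Loads requested chunks a few at a time on the main thread."""

    def __init__(self, load_chunk, per_frame=1):
        self.load_chunk = load_chunk  # callback that does the actual loading
        self.per_frame = per_frame    # how many chunks may load per frame
        self.pending = []             # heap of (distance, chunk_id)
        self.queued = set()           # chunk ids currently in the heap
        self.loaded = {}              # chunk_id -> loaded data

    def request(self, chunk_id, distance):
        """Ask for a chunk; duplicates and already loaded chunks are ignored."""
        if chunk_id in self.loaded or chunk_id in self.queued:
            return
        self.queued.add(chunk_id)
        heapq.heappush(self.pending, (distance, chunk_id))

    def update(self):
        """Call once per frame; loads at most per_frame chunks, nearest first."""
        for _ in range(self.per_frame):
            if not self.pending:
                break
            _, chunk_id = heapq.heappop(self.pending)
            self.queued.discard(chunk_id)
            self.loaded[chunk_id] = self.load_chunk(chunk_id)

load_order = []

def fake_load(chunk_id):
    # Stand-in for reading chunk data from disk.
    load_order.append(chunk_id)
    return "data for %d,%d" % chunk_id

streamer = ChunkStreamer(fake_load)
streamer.request((0, 0), 10.0)
streamer.request((1, 0), 5.0)
streamer.request((0, 0), 10.0)  # duplicate, ignored
streamer.update()               # frame 1 loads the nearest chunk
after_first_frame = list(load_order)
streamer.update()               # frame 2 loads the other one
streamer.update()               # frame 3 has nothing left to do
```

Capping the work per frame is what lets the game start almost instantly: the cost of loading is spread over many frames instead of being paid up front.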

I also enhanced the shaders to support fog for all objects (except for grass, but grass already fades out before being affected by fog).

And I spent about two weeks studying animation. I learned about bones and vertex weights and even rigged a few simple models. Still, I don't know enough about bones to make my shaders start using them. I don't fully understand the hierarchical transformation system yet and what calculations must be done from parent to child bone. I found some pretty good animation libraries for XNA, but I don't know what to say right now since XNA is dead.

Friday, November 16, 2012

92 – Almost 4 months, feature freeze

I uploaded a video with a preview of post processing effects, including the ink border used in cell shaded graphics:



Actually, cell shading is related to the way you shade items, the way illumination works, but most people wrongly consider cell shading to be related to black ink borders, so who am I to correct them. Anyway, in this video the borders were far too artifact prone.

I also released the third video from the "Modular city" series, again amply annotated:



Changelog:
  • Implemented a new tabbed interface for item selection. The first rudimentary version of it can be seen in the video.
  • Implemented item creation feedback. Now for most items, if you hold down the mouse and move it around, you will get a color coded preview of what item will be added where. Red means insertion is not possible, green/yellow means it is okay.
  • Enhanced the pipeline to better process assets and also support the special data set needed for shattering. This shatter data is saved, but not yet loaded from disk. Next time. Until then the data is calculated at every run.
  • Tried to fix the cell shading artifacts. I added another model from another game and this one behaved a lot better. Comparing the two, I managed to normalize some data and get consistent output from any input source, so now artifacts are consistent for all meshes. I also tried to fix the borders directly, and I managed to get a compromise version working, which has very weak borders, but almost zero artifacts. I like the effect, but I'm not sure it is worth it, especially since it requires another pass over geometry.
  • A ton of small bug fixes.

And with this I decided to do a code freeze. No new major feature will be added (but a few smaller ones might be needed) before the code base is stable and feature wise everything that kind of works or is at least hinted at will work perfectly. This means also trees and grass which have been disabled for quite a while. Need to refactor and finish that code. Trying to get a RC0 finished.

After a few RC's, I'm hoping to get the pre-alphas going again.

Also after the first RC, I'll do a retrospective and engine feature list since on the 26th four months of development on the new project will have passed. Generally I am happy with the progress. The engine is shaping up. Considering that I started from scratch under a new programming language, new technology and this time I have physics, I think I came a long way from my first real screenshot of the new project:


Thursday, November 8, 2012

91 – I hate shadows

Last week I've been hard at work and managed to put out 3 videos! Now it's time to write about them, because YouTube videos generate just an insignificant amount of blog views. New posts and referrals from actually popular blogs and sites are where the meat is at!

So I started a new video series, "Modular City". I experimented with procedural, now I'm trying modular. Can't tell yet which one will work out better. Not without a consultant at least. This will be a long running series that started with me putting down a single building foundation and it will end when the engine is capable of creating a city of roughly the size and complexity of Balmora from Morrowind (including persistence). So pretty long. Here are the first 2 videos from the series, generously annotated, so please make sure you watch them under a video player that supports annotations:





During and after the creation of these two videos I did a ton of polishing of the engine and adding new features that I realized I needed, so I did not have time to create the third video in the series as I hoped. So I quickly added a few effects and kinetic responses to everything and put out this video:



The next video in the series will be about placing exterior decorations.

There are literally tons (I know what literally means, this is a joke) of changes and improvements to the engine and I am not going to list them all as I originally wanted. To tell you the truth, I forgot at least part of them. I should keep a changelog updated live, not try to remember code I wrote a week ago. So I am going to list only the major new features.

Material system
While adding a bunch of new items I realized that keeping the material information coupled with the mesh information was not a good idea. So I created standalone material representations that are loaded asynchronously from disk when needed. Now if I wish to assign the barrel material to a stool, this is all I need to do (in pseudocode): stoolMesh.Material = barrelMesh.Material. Materials react to live updates, so changing a material changes the way every single item that uses the material is rendered without having to inform the item of the change. This made the texture streaming algorithms significantly simpler. It is now only a few lines of code.
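The reason a material change reaches every item without notifying it is simply that meshes hold a reference to a shared material object rather than a copy. A tiny Python illustration of that idea follows; the class layout and the texture file names are invented for the example.

```python
class Material:
    """Standalone material, shared by any number of meshes."""

    def __init__(self, name, diffuse_texture):
        self.name = name
        self.diffuse_texture = diffuse_texture

class Mesh:
    def __init__(self, name, material):
        self.name = name
        self.material = material  # a reference, not a copy

barrel_material = Material("barrel", "barrel_diffuse.png")
barrel = Mesh("barrel", barrel_material)
stool = Mesh("stool", Material("stool", "stool_diffuse.png"))

# The one-liner from the text: the stool now shares the barrel's material.
stool.material = barrel.material

# Editing the material once affects every mesh that references it.
barrel_material.diffuse_texture = "barrel_worn.png"
stool_texture = stool.material.diffuse_texture
```

This sharing is presumably also why texture streaming got simpler: the streaming code only has to track materials, not every mesh that uses them.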

Active entities
I created a system where entities can be given quite complex behaviors and the system transparently manages the activity of these entities. Each such behavior has its own class. I really like the design! This is what I used for the third video. I created a general timed entity. This entity has a duration, and once it expires, it becomes inactive and triggers some events. From this I created two entities: one that translates a model and once it reaches its destination it optionally triggers a camera shake and it deactivates so the CPU no longer computes translations, and another that shatters an input model in place and once it finishes it will remove itself from the game-world completely so there are no leftover shattered object pieces under the geometry.

Now theoretically I can create an active entity for anything I can imagine. The best part is that even if you have an incredibly CPU intensive active entity, if you just render it without triggering its action, it has the same performance as a simple model render, so you can have any number of active entities that are currently inactive.
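A timed entity with a translating subclass, as described above, might be structured like the sketch below. It is Python for illustration and only a guess at the shape of the design; `TimedEntity`, `TranslateEntity`, `trigger` and the event callback are hypothetical names.

```python
class TimedEntity:
    """Base behavior: active for a fixed duration, then fires an event."""

    def __init__(self, duration, on_expired=None):
        self.duration = duration      # seconds, must be greater than zero
        self.elapsed = 0.0
        self.active = False
        self.on_expired = on_expired  # optional callback taking the entity

    def trigger(self):
        self.elapsed = 0.0
        self.active = True

    def update(self, dt):
        if not self.active:
            return  # inactive entities cost nothing beyond this check
        self.elapsed = min(self.elapsed + dt, self.duration)
        self.step(self.elapsed / self.duration)
        if self.elapsed >= self.duration:
            self.active = False
            if self.on_expired:
                self.on_expired(self)

    def step(self, progress):
        """Overridden by subclasses; progress runs from 0.0 to 1.0."""

class TranslateEntity(TimedEntity):
    """Moves a position from start to end over the duration."""

    def __init__(self, start, end, duration, on_expired=None):
        super().__init__(duration, on_expired)
        self.start, self.end = start, end
        self.position = start

    def step(self, progress):
        self.position = tuple(
            s + (e - s) * progress for s, e in zip(self.start, self.end))

events = []
mover = TranslateEntity((0.0, 0.0, 0.0), (10.0, 0.0, 0.0), 2.0,
                        on_expired=lambda entity: events.append("camera shake"))
mover.update(1.0)                # not triggered yet, so nothing happens
idle_position = mover.position
mover.trigger()
mover.update(1.0)                # halfway there
mid_position = mover.position
mover.update(1.5)                # reaches the end, deactivates, fires the event
```

The early return in `update` is the part that makes a large number of dormant entities cheap.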

Polish and performance
The engine is starting to become pretty polished and there is an alarmingly low number of hacks. I am starting to find the code base pleasing.

I also improved rendering performance, both in general and especially for terrain. Terrain is currently the biggest GPU power eater, so a quick fix was to make it be drawn last. In active scenes where a lot of objects are obstructing the terrain, this gives a ton of extra FPS.

I also optimized the low quality terrain renderer created for weaker machines and it is now two times as fast. If you have a scene only with terrain and you turn on the LQ renderer, your framerate will pretty much double.



But not everything I did was a success. I noticed some lighting oddities:





This looked to me like the normals were not exported properly, so I added a test cube and this is what I got:


And this is what I was expecting:


It took me quite a while to track down what was causing this. Eventually I narrowed it down to Blender. It is not an export bug as I first thought, but Blender interprets its vertex information differently based on the format you export to, and the format I am exporting to right now plus the more complex lighting scheme causes these issues. Marking edges as sharp and applying a modifier in Blender fixes this. Or exporting as *.obj, which works exactly as expected. Speaking of which, the engine can now import *.obj meshes, a thing that XNA does not support out of the box.

But the main thing that caused problems was shadows. Again! I had not one, not two, not three but four failed attempts at creating shadow mapping, and one semi successful one. And not in this order.

I based my implementation on a Microsoft SSM sample and a Deferred CSM implementation. The SSM one is so ugly that it is unusable, but I started with this and had 3 failed attempts at adding it to my game. Each time I failed I started from scratch, with a new approach. But no matter what I tried, I couldn't get one that works from all angles. As long as the light was behind you, it worked. As soon as you turned toward the light, the entire scene got covered in darkness. The MS SSM example does not have this problem and the code was the same, so I couldn't figure out what was wrong.

Then I tried experimenting with the Deferred CSM sample. This was written for XNA 3 so I had to convert it, fixing a ton of compilation issues, but I got it to work. It had a reasonable shadow radius, pretty good quality and was a solid implementation. There were just two problems:
  • Not even cascades can make shadow mapping look good. It looks like ass. At a glance it is fine, but as soon as you stop and check out the fine detail it makes your inner graphics nazzi pop a vein.
  • It is deferred. I don't want deferred. Naturally it killed MSAA instantly and it has the same disadvantages as normal deferred techniques.

So I took the view frustum calculations from the CSM sample and plugged them into the SSM sample and all the problems the SSM one had (except the ugliness) were fixed. Then I plugged it into the port of the SSM into my engine, and it worked just as well, only with massive shadow jittering.

I wasted hours trying to find what the problem was and I finally narrowed it down to world coordinates. As long as you are around point (0, 0, 0) the jittering compensation works. My world center on a small map is around (2560, 0, 2560), so jittering is a huge problem. I have no idea how to fix this.

You have no idea how much I worked on shadows! Saturday night I even missed a concert because after working the entire Saturday on shadows, I continued working until 4:40 in the morning. Then I woke up at 9:30 and continued working on shadows until the afternoon.

I think it is time to face the facts: I will never have good looking shadows! That is very depressing. Even if I fix the jittering and go with a good CSM implementation, CSM is still full of artifacts. It just looks like ass. Roof shadows are all wrong, having slivers of light in areas that should be completely in shadow and the foundation of a stool is so small that due to bias correction, it has a distant and unpronounced shadow. I'll probably add the option to have these ugly ass shadows.

I don't get how shadow mapping is the industry standard for shadows today! Not only are they universally fucking ugly, but they have ruined me for all games. There is not a game out there with shadow mapping where I don't notice the artifacts and get irritated by them.

And stencil shadows are pretty much dead. There is pretty much zero stencil shadow information out there for XNA 4.0, but I am still going to try and get an implementation working. I know they are dead. I know they are not scalable. But Doom 3 managed to pull it off years ago. And Doom 3 got a HD re-release recently. But I'm no John Carmack. Maybe they don't work with large outdoor scenes. I don't know. But still, my game will not look as good as Doom 3 or be as demanding on the system, so if Doom 3 pulled it off, there is a chance that theoretically stencil shadows won't be prohibitively expensive.