Monday, January 19, 2009

Improving Shadows System

recently i've been working on performance issues, and one of the biggest challenges in this area is shadows: how do we make them faster?
the answer to this question is another question:
how much cpu and memory can we sacrifice for this?
the answer to this one is: it depends on the other processes in the game and on your platform.
here i'm working on the pc platform, where memory is not an issue the way it is on ps2/xbox/ps3, so the main question is:
what do you do if you have a big scene with 5000 point lights that need to cast dynamic shadows?
let's say we can determine their visibility very fast, so the bottleneck is shadow generation and not the visibility of the lights. another assumption is that the lights are mostly static, but if at any time we want some of the lights to become dynamic, we can.
so the answer to this question can be one of these:
1. use light maps
2. compute all shadow maps at load time
3. other
well, let's check these methods one by one:
1. light maps are out of the question because they are static: if a light moves, its shadows won't be updated. also, if we have a bridge and we walk beneath it, the bridge's shadow won't be cast on our character, because when the light maps were computed our character wasn't there.
2. this sounds promising. it solves the problems in 1 but has another problem.
each point light uses a cube map for its shadows, which means 6 maps, one per face. so if we take for example a 256 by 256 cube map for every light, we get 6 * 256 * 256 * 4 bytes = 1.5 mb per light (assuming 4 bytes per pixel). this becomes 1.5*5000=7500 mb, about 7.5 gigs! this is huge and of course can't be done in the real world, where memory is limited.
3. so what we need is a solution that solves the problems of both 1 and 2 and takes a reasonable amount of memory. the solution i came up with is to use a cache system for shadows.
this system has a specific amount of memory dedicated to shadows, and based on a priority value computed for each light, it gives cache entries to the most important lights (for example: the lights with a large range that are closer to the viewer).
so basically the system works with x entries that hold shadow maps and assigns them as needed to the right lights. when we assign a shadow entry, we flag the light so it knows it needs to update its shadows. this way we won't spend memory on far lights the viewer won't even see.
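to make the idea concrete, here is a minimal python sketch of sizing the cache from a memory budget and handing out entries by priority. everything named here (`Light`, `priority`, `assign_entries`, the range / (1 + distance) formula) is a made-up assumption for illustration, not the engine's actual code; the post only says that priority favors large range and closeness to the viewer.

```python
import math
from dataclasses import dataclass


def cube_map_bytes(size: int, bytes_per_pixel: int = 4) -> int:
    """Memory for one cube shadow map: 6 square faces."""
    return 6 * size * size * bytes_per_pixel


@dataclass
class Light:
    name: str
    position: tuple          # (x, y, z)
    light_range: float
    has_entry: bool = False
    needs_update: bool = False


def priority(light: Light, viewer: tuple) -> float:
    # Hypothetical heuristic: larger range and smaller distance score higher.
    distance = math.dist(light.position, viewer)
    return light.light_range / (1.0 + distance)


def assign_entries(lights, viewer, budget_bytes, size=256):
    """Give the x best lights a cache entry, where x is what the budget allows."""
    entry_count = budget_bytes // cube_map_bytes(size)
    ranked = sorted(lights, key=lambda l: priority(l, viewer), reverse=True)
    winners = ranked[:entry_count]
    winner_ids = {id(l) for l in winners}
    for light in lights:
        had_entry = light.has_entry
        light.has_entry = id(light) in winner_ids
        if light.has_entry and not had_entry:
            # A freshly assigned entry holds stale data, so flag the light.
            light.needs_update = True
    return winners
```

with a budget of two 256 by 256 cube maps, a call like `assign_entries(lights, viewer, 2 * cube_map_bytes(256))` would give entries to the two highest-priority lights and leave every other light without shadows.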
so when the scene is loaded, the system assigns shadow entries to the most important lights, and when a light tries to generate its shadows we check:
a. does this light have a shadow entry? if not, the light is not important and won't generate shadows.
b. has this light already updated its shadows? if not, the light became dirty or someone flagged it as needing a shadow update (see below).
if we have a shadow entry and the light needs to update its shadows, we get the shadow maps from the cache entry and render the light's shadows into them.
when we're done, we flag the light as updated so next frame we won't update it again.
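the two checks above can be sketched roughly like this in python. the names (`ShadowEntry`, `generate_shadows`, `render_cube_shadow_map`) are hypothetical, and the render function is only a placeholder for the real gpu work of drawing the six cube faces.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ShadowEntry:
    # Stand-in for the 6 cube map faces owned by one cache entry.
    faces: list = field(default_factory=list)


@dataclass
class Light:
    name: str
    entry: Optional[ShadowEntry] = None
    needs_update: bool = True


def render_cube_shadow_map(light: Light, entry: ShadowEntry) -> None:
    # Placeholder: a real engine would render depth for each of the 6 faces.
    entry.faces = [f"{light.name}:face{i}" for i in range(6)]


def generate_shadows(light: Light) -> bool:
    """Return True only if shadow maps were actually regenerated this frame."""
    if light.entry is None:        # check a: no entry, light is not important
        return False
    if not light.needs_update:     # check b: shadows are already up to date
        return False
    render_cube_shadow_map(light, light.entry)
    light.needs_update = False     # next frame we skip this light
    return True
```

calling `generate_shadows` every frame is then cheap for most lights: only a light that owns an entry and is flagged dirty pays for rendering.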
a few things we need to consider:
1. if an entity that casts shadows gets inside the light's bounds, the light must update its shadows so the entity's shadows get in.
2. if the light is moving, we need to update its shadow maps.
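a rough python sketch of these two triggers, assuming the light's bounds are a sphere (position + range) and each entity has an axis-aligned bounding box. those representations and the function names are assumptions made for the example. it also flags the light when an entity leaves its bounds, which the list above doesn't mention but seems necessary so the old shadow gets removed.

```python
from dataclasses import dataclass


@dataclass
class Light:
    position: tuple          # (x, y, z)
    light_range: float
    needs_update: bool = False


def sphere_intersects_aabb(center, radius, box_min, box_max) -> bool:
    """True if the sphere touches the box (closest-point distance test)."""
    dist_sq = 0.0
    for c, lo, hi in zip(center, box_min, box_max):
        closest = min(max(c, lo), hi)   # clamp the center onto the box
        dist_sq += (c - closest) ** 2
    return dist_sq <= radius * radius


def on_entity_moved(light: Light, old_box, new_box) -> None:
    """Trigger 1: a shadow caster entered (or left) the light's bounds."""
    for box_min, box_max in (old_box, new_box):
        if sphere_intersects_aabb(light.position, light.light_range,
                                  box_min, box_max):
            light.needs_update = True
            return


def move_light(light: Light, new_position: tuple) -> None:
    """Trigger 2: the light itself moved."""
    if new_position != light.position:
        light.position = new_position
        light.needs_update = True
```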
ok, so this is the main idea of the system. it is very fast and generic, and like any algorithm it has some small details that need to be done correctly so it works smoothly and handles all the special cases.
before i finish, here is another question:
what if we have a room with 500,000 faces and one point light that casts shadows?
we use this great shadows cache system :)
right, but what if an entity, let's say a small ball with 16 faces, gets inside that room?
hmm, the light needs to update its shadows so the entity's shadows get in.
well, that's right, but that means that for this tiny entity with only 16 faces you need to generate shadows from 500,000+16 faces???
think about it until next time...


Marcus said...

To your last question, is that really an issue? Because having a room with 500,000 faces should not happen that often, right? Also, maybe the cache system has to take into account not only whether some object enters, but also whether that object + light is visible to the user. Though that may be easy to solve by only updating "if(dirty && visible)".

Marcus said...

Another thing just came to me. Maybe you should do 2 Shadowmaps per light, one with only static geometry in it. Should something dynamic enter the light you can copy that image and render the dynamic stuff into it as well.

orenk2k said...

hi marcus
1. the 500,000 faces room is just an example to show that we waste a lot of time when a dynamic entity gets into the room. the 500,000 faces room could be replaced by a real-world case: a room with a few high-detail models will get to 50,000 faces very easily.

2. the cache system does not care whether the light is visible or not; this work is done by the visibility system.
invisible lights won't get the chance to call generate shadows, because the lighting pass is skipped for them.
also, note that if you have a light that needs to generate shadows for a few static entities inside a room, but the entities are placed in the corners so that you can't see them when you look in, the light should still put those entities in the shadow map, even if they are not visible to the viewer!
if not, the light will need to regenerate its shadows every time an entity becomes visible to the viewer. that means, if you enter the room and see entity 1, the light needs to update its shadows; then, if you see entity 2, the light needs to update its shadows again! and so on for each entity that becomes visible. so for 10 entities you will get 10 shadow updates!
also, those entities are static, so you would need to track when they become visible to the viewer and which ones the light has already generated shadows for, which means maintaining another data structure.

3. your idea of using 2 shadow maps is the way to go, but it raises another problem: copying those shadow maps into one (the ping pong method), which is very slow.
i can tell you how i think it should be solved, but i want to save it for the next post ;)
bye and thanks for the reply

Ashkan said...

Interesting read. Speaking of shadows, what's your take on indirect lighting and global illumination? GI contributes to the look and feel of a game much more than simple shadows. Without some method of computing the ambient term, scenes would become too dark and flat like games that are based on id tech 4, which is clearly what you've been inspired by. GI might be a good topic for one of your next posts. :)

orenk2k said...

hi ashkan
gi will definitely add more realism to the scene, but it doesn't come for free. in most cases it takes a lot of gpu power that i'm not willing to sacrifice.
for the ambient term, i'm using ssao, which is also not so cheap.
anyway, there are a few new methods based on ssao that give very nice results and also approximate direct and one-bounce light transport in screen space. for now, gi and other effects are low priority :(
as for my inspiration, i'm inspired by many great game engines such as:
1. idTech 3/4/5
2. CryEngine 1,2
3. unreal engine 3
and many more...
bye and thanks for your reply

Ashkan said...

Hi Oren,

Thanks for the reply.

Have you seen LIGHTSMARK? Here's the link:

It's not flawless and it doesn't provide a solution to the rendering equation, but it's marvelous nonetheless. Given that the demo runs at upwards of 200 fps on my not so powerful machine, it just might change your view of real-time GI.

Keep up the good work. :)

orenk2k said...

i saw their old demo and it wasn't so fast, however the new demo runs fast.
a few things i saw in their demo:
1. the scene is mostly static, only the robot rotates, and this can be done with one mul, so it can also be considered static.
2. one dynamic light
3. shadows aren't so soft. only the penumbra shadows look good, but then the fps drops dramatically.
real-time gi is something that needs a lot of research and testing before it can be integrated into a real game engine.
anyway, there are a few more real-time gi methods, all with pros and cons, but for now i think i need to focus on things other than rendering. maybe after i've fixed the important things on my list, i will have time to dig into some more interesting rendering effects.
thanks and bye