Sunday, January 30, 2011

Lens Effect / Bokeh - Getting Into Details

last time i talked about lens effect/bokeh, this time i talk about how to do it using FFT with information of how to implement it on the GPU.
i will not get into the theory of FFT, if you want to, check out this:

what i will explain here is how to do image processing using FFT and how to achieve a lens effects/bokeh filter.
if we have an image and we apply FFT on it we get this image in something called frequency domain.
after applying FFT every pixel in the image have its own frequency in the frequency domain that compose of a real and imaginary parts (a complex number)
after getting the image in the frequency domain, all we have to do is to decide which frequencies we want to "filter" out and then, apply the inverse FFT to get back to our image domain again, without the "filtered pixels"
that's the main idea.

nice thing with FFT is that its separable, what this means is that to apply FFT in 2D, we can simply apply 2 1D FFT, the first one goes on the rows (horizontal) and the second one goes on the columns (vertical) of the result.
another thing to know is that transforming N points can be done with two N/2 transforms, that is a dived and conquer approach, which helps to reuse computations.
so knowing that, we will use a divide and conquer algorithm called butterflies:
more info can be found here:

you can see really nice diagrams which helps to understand how it is works and how we can create and encode indices and weights for doing FFT on the GPU using textures and ping/pong method.

so if for example we have N=8 points to transform, we have log2(8) =3 steps of butterflies to perform, beginning with computation on four 2 points DFT, then two 4 points DFT and finally one 8 points DFT.

so, assuming you understand what the butterfly does, the steps to do it on the GPU is a matter of doing those steps:
1. create 2 textures (render targets) used for the ping/pong operations, those needed to be with high precision float point because the pixel on the frequency domain is a complex number with greater range than the pixel in the image spatial domain.
2. encode indices & weights into texture (also high precision)
3. do log2(Width) horizontal butterflies steps (using RT's from 1 and bf from 2)
4. do log2(Height) vertical butterflies steps (using RT's from 1 and bf from 2) on the result from 3
lets call those steps GPFFT
note: the log2 in steps 3,4 - this needs your image to with power of two size.
so how all of this nasty things helps us to do image processing on an image? and where is the bokeh stuff?
so here is the beauty:
suppose we have 2 images, one of them is our kernal, lets call it K, and the other is our source image that we wish to apply the effect on, lets call it I
apply GPFFT on K,I (transform them into frequency domain), multiply them together (complex multiply, remember we are using complex number in this domain), and then apply inverse GPFFT (to inverse transform the result) on the result to get a new I' convolved with K.
and that's it, all it matter is how your K looks like (the size and shape: triangle, square, octagon etc) and whats I.
as for bokeh/lens effect, basically it is working on bright pixels so simply filter the brighter pixels from your image with some threshold into other image (this is your I) and do the process on it.
this method works on low systems, DX9 with MRT support.
we can optimize the algorithm a lot using DX11 compute shader and get rid of the log(N) passes of steps 3,4 and perform only on pass each.
hope this post didn't make you headache' i'm starting to get one ;P
that's it for now
cya next time...

No comments: