Wednesday, February 20, 2013

This is not Motion Blur






This motion blur approximation gives me such severe motion sickness that I have to disable it. The problem is that these games often only offer an option to disable all kinds of motion blur (radial blur, ping-pong motion blur, artistic blur), while all I want is to disable per-pixel, velocity-based motion blur.


The origin

Velocity-based motion blur probably became famous after Crysis implemented it. Crytek wasn't the first, though; this kind of technology had already been showcased by the ill-fated Offset engine. And they weren't the first either.

How it works

The standard way of doing it is described in GPU Gems 3, using textures containing velocity vectors. It is also explained in the D3D10 book "Programming Vertex, Geometry, and Pixel Shaders", and there's also a demo on MJP's blog.
Basically, a texture holds where and how far each pixel moved since the previous frame, and the shader blends each pixel with the other pixels along the line traced by this vector.
I actually don't have anything against this approximation technique. Where I differ, though, is in the blending formula.
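In plain Python, the standard averaging loop looks roughly like this (`sample` is a hypothetical stand-in for a texture fetch, reduced to one dimension for illustration):

```python
# Sketch of the standard velocity-based blur: average n samples of the
# colour buffer along the velocity vector, starting at position x.
# `sample` is a hypothetical texture fetch, here just a function of position.
def motion_blur_average(sample, x, v, n):
    total = 0.0
    for i in range(n):
        total += sample(x + v * (i / n))
    return total / n

# With a linear-gradient "image" (identity function), blurring pulls the
# value toward the middle of the sampled segment.
blurred = motion_blur_average(lambda p: p, 0.0, 1.0, 4)
```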


A look at a real camera's blur

How cameras work

Motion blur on cameras is mainly controlled by the ISO speed and the shutter speed (and, to a lesser extent, the aperture).

The shutter speed

The name is misleading, because it should really be called "shutter time". The shutter speed controls how long the camera sensor is exposed to incoming light. Short shutter times can reach thousandths of a second, while long ones can keep the shutter open for more than half a minute.
Short shutter times produce very crisp images, while long shutter times produce very blurry pictures. Extremely long exposure times are often used to produce "light paintings" or otherworldly, surreal effects. There is a whole "long exposure photography" subculture that can absorb your attention for a few hours.
Short shutter times, on the other hand, are very useful for action scenes or for freezing fast-moving objects.



Long shutter time (long exposure). Source: ForestWander Nature Photography, www.ForestWander.com, under CC SA license.

The ISO speed

The ISO controls how sensitive the camera is to incoming light. A low ISO is very useful in very bright scenes, while higher ISOs are needed in low-light conditions to get good shots. However, high ISOs may result in noisier images (those coloured dots in digital cameras) or images with noticeable film grain (analog cameras).

In old analog cameras, the ISO was tied to the density of the layer of silver salts in the photographic film: a lower ISO meant a denser layer, which results in a film that needs more photons to produce a meaningful reaction and leave a "print" (called a latent image). It's like trying to sculpt on paper with light beams: the greater the density (i.e. "thickness"), the more beams you need to sculpt it.
In digital cameras, the ISO speed is controlled by how the CCD is calibrated. An engineer specialized in photography could probably give a more technically detailed answer, but this should suffice.

Picture with noticeable film grain (high ISO). Watch the original here.

Picture with noticeable noise (high ISO 3200, 6.25ms shutter speed, digital camera)
Same picture with almost no visible noise (low ISO 100, 200ms shutter speed, digital camera)
You may have realized by now that ISO and shutter speed are tied together. A short shutter time lets less light through, so as a result a high ISO is required. ISO and shutter speed are in fact so tied together that this is why high ISOs are called "fast".
Long shutter times, on the other hand, let plenty of light reach the sensor/film. If you're not using a low ISO, the picture can quickly overexpose. This is why low ISOs are called "slow".
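A quick sketch of that relationship, modeling total exposure simply as ISO times shutter time (a simplification for illustration, not a real metering formula), shows that the two digital-camera shots above received the same amount of exposure:

```python
# Rough model: exposure ~ ISO * shutter_time.
# This is a simplification for illustration; real metering is more involved.
def exposure(iso, shutter_ms):
    return iso * shutter_ms

# The two example shots above:
noisy = exposure(3200, 6.25)   # high ISO, short shutter: crisp but noisy
clean = exposure(100, 200.0)   # low ISO, long shutter: clean but blurry

assert noisy == clean == 20000.0  # same exposure, noise traded for blur
```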


Aperture

I won't go into much detail on this one. Basically, the aperture controls how wide the camera's lens opens to let light through. It is measured in f-stops. An f-stop of f/1 leaves the lens wide open, while an aperture of f/22 closes it very tightly (the area of the opening is smaller).
Aperture controls the depth of field of the camera. Close-up shots need a wide-open lens (aperture close to f/1), while far shots require a tightly closed lens (an f-number as large as possible, like f/22).

Needless to say, a wide-open lens lets more light through, thus causing less film grain and reducing the need for long shutter times (less motion blur).
A closed-down lens, instead, has a very deep depth of field but needs more time to gather enough light, which yields blurrier pictures, or noisier ones if you choose a higher ISO and a shorter shutter time.

In terms of rendering, we could say the aperture is controlled indirectly by the value of the far plane, and directly by your depth of field shader effect, if you have one.

Aperture f/13
Aperture f/14
Both pictures were taken with the same ISO 400 and 5ms shutter speed. Notice how the noise (film grain for analog cameras) in the background is more noticeable with smaller apertures. Click on the picture to enlarge.


These three elements (shutter speed, ISO speed, and aperture) are called the "exposure triangle".

What we've learned so far

  • Film grain and motion blur are inversely proportional.
    • Film grain is more frequent when using high ISOs.
    • Film grain is hence also more likely in low light conditions.
    • But motion blur appears when using low ISOs!
    • At least one of the games listed at the start is using too much of both effects at the same time!!
  • Close-focused DoF produces less motion blur and less film grain.
    • A wider aperture increases the amount of light going through, letting us shorten the shutter time and lower the ISO without their disadvantages.
    • This is very important if you have a cutscene with DoF. Don't make a close-up shot and pile on lots of blur and film grain.
  • Far-focused DoF produces more motion blur and/or more film grain
    • A smaller aperture decreases the amount of light going through, thus forcing us to either increase the exposure time (longer shutter time -> more motion blur) or increase the ISO (more film grain), or both.
Having all effects active can be an artistic touch, but if you're aiming for photorealistic rendering, keep these points in mind.
Furthermore, we're living in an era where the trend seems to be to blind the player with bloom, slow HDR eye adaptation, gigantic lens flares, and almighty god rays (OK, I'm guilty of that one).
So, since DoF is rarely used, a good rule of thumb is: given the same aperture, lower your motion blur when film grain is kicking in. You don't have to throw everything at the player at once to show off how awesome your post-processing shaders are and how "ree3al!!" it looks. It's tempting, I know. But please fight it. Turning on every effect you know doesn't make the image more realistic.

VERY FAST rotating camera. High ISO 3200, 10ms shutter. Notice a lot of noise and some blur


Camera rotating at the same speed, low ISO 100, 333ms shutter. Notice the lack of noise/grain, and extreme motion blurriness.

The motion blur

OK, enough with the rant. Back to the original topic: the point of all this is that motion blur is caused by light making its way through the lens to the sensor and adding itself to what's already there.
Note the word "add". I didn't say "average". I said ADD.
Motion blur is additive, not an average. This is relative, though, because using a lower ISO is roughly the mathematical equivalent of dividing all the images being added by a constant factor, in which case it starts to look more like an average.

Let's look at some real camera shots, purposely shaken to exaggerate the motion blur effect. Notice that the light streaks are blurrier than the rest of the image (the darker spots) and leave a longer trail.
Most of them were taken at ISO 400.



Now let's try to mimic that effect. I will present a different blending formula that tries to emulate this behavior. Note: this is an empirical method based on observation; I haven't based the formula on any mathematically or physically accurate model.

Below is the typical motion blur postprocessing formula:
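In the notation listed below, it can be written as follows (a reconstruction from the definitions that follow: tone map each sample first, then average):

```latex
RGB = f(x) = \frac{1}{n} \sum_{i=0}^{n-1} HDR\left( pixel\left[\, x + \frac{i}{n}\, v \,\right] \right)
```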

Where:
  • RGB is the final output.
  • HDR() is the hdr tonemapping operation
  • x is the current pixel location (whether in texels or pixels)
  • v is the velocity vector
  • n is the number of sample steps
  • pixel[] is the pixel being addressed.
  • f() is the motion blur operation


While this, in my opinion, is a more correct motion blur formula:
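In the notation listed below, it can be written as follows (a reconstruction from the definitions that follow and from the Cg code at the end of the post: sum first, scale by C, tone map last):

```latex
RGB = HDR\big( f(x) \big), \qquad f(x) = \sum_{i=0}^{n-1} C \cdot p\left[\, x + \frac{i}{n}\, D\, v \,\right]
```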

Where:
  • RGB is the final output.
  • HDR() is the hdr tonemapping operation
  • x is the current pixel location (whether in texels or pixels)
  • v is the velocity vector
  • n is the number of sample steps
  • p[] is the pixel being addressed.
  • C is a constant factor in the range (0; 1]
  • D is a constant factor in the range (0; 1], typically 1-C
  • f() is the motion blur operation
Note: I used the × and * signs interchangeably for multiplication. It was a typographical inconsistency; there are no cross products involved.

Some notable differences & remarks:

1. HDR is done after the motion blur. This ensures that the motion blur result stays in the correct range even when the sum is not a plain average (C ≠ 1/n). This also matches a real-life camera more closely.

2. C is a constant (arbitrary) factor to simulate the ISO speed. When C = 1/n, the formula reduces to a simple average.

3. D is a constant (arbitrary) factor to simulate the shutter speed. Normally it should be inversely proportional to C, so probably either D = 1 - C or D = 1 / (C + 1).

4. C is inside the loop. Although moving it outside could be considered a performance optimization, keep in mind that you're adding HDR values. If you're using real-life values, the sky can easily contain large numbers like "5000". Using a 16-bit float (R16FG16FB16F) render target, even a surprisingly modest step count can overflow the result to infinity.
You'll most likely blur all samples in one pass and convert them to 32-bit floats, in which case you can safely move C outside the loop. But if you're on DX 9.0 hardware, beware of using half: some hardware (GeForce 6/7 series, Intel cards) will perform the arithmetic in FP16 precision (overflow), while other hardware (GeForce 8+, Radeon HD 2000+) will always perform the arithmetic in FP24 or FP32 precision (no apparent overflow).

5. For reasons similar to point 4, you may want to clamp the result before writing it to the render target. Chances are you're going to do another pass to calculate the average luminance for the HDR tone mapping, and you probably don't want an overflow to infinity propagating through it.
Because not all 16-bit float GPUs are IEEE compliant, clamp to a sane high value, like 64992 or 32000 (it should be a multiple of 32).
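The "multiple of 32" advice can be sanity-checked in Python (a side note, not part of the shader): near the top of its range, IEEE 754 half precision can only represent values that are 32 apart.

```python
import struct

def to_half(x):
    """Round-trip a float through IEEE 754 half precision (binary16)."""
    return struct.unpack('e', struct.pack('e', x))[0]

# Near the FP16 maximum (65504), consecutive representable values are
# 32 apart, so a clamp constant should be a multiple of 32.
assert to_half(65504.0) == 65504.0   # FP16 maximum, exactly representable
assert to_half(64992.0) == 64992.0   # multiple of 32, exactly representable
assert to_half(65000.0) == 64992.0   # snaps down to the nearest step
```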

6. Clamping to avoid overflow is not a bad thing per se. If you look at the three pictures earlier, one of them has a highly saturated light source the camera can't handle unless I use a lower ISO. In a similar fashion, just use a lower C value to let the HDR pass handle the saturation gracefully.

Nothing new

This isn't new, nor is it rocket science. In fact, Crytek's SIGGRAPH 2011 talk ("Secrets of CryEngine 3 Graphics Technology") mentions that Crysis 2's motion blur is done in linear space before tone mapping, using 24 taps on PC and 9 taps on consoles, and notes that "bright streaks are kept and propagated", citing Debevec 1998.
Obviously, I wasn't the first one to notice the order of operations was wrong.

The reason why it works is quite obvious. Consider an extremely bright scene with some extremely dark objects. Blending two bright pixels gives 5000 + 5000 = 10000, while blending one dark pixel with one bright pixel gives 5000 + 10 = 5010, which is almost the same pixel it was before.
Blending the two dark pixels gives 10 + 10 = 20, which is 100% different: very blurry. However, if the scene is very bright, tone mapping will favour the pixels with very high luminance, because of the average luminance.
After tone mapping, the bright pixels at 10000 will map to the value "1"; the not-as-bright pixels at 5010 and 5000 will both map to roughly the same value (~0.5); and the dark pixels at 10 and 20 will probably map to nearly the same low value due to rounding, so you won't notice the difference (0.00392 vs 0.00784 if you're using an RGBA8888 framebuffer).
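Here is the same arithmetic as a small Python sketch. The 1/10000 scale is an assumption standing in for the average-luminance-driven exposure of a real HDR pipeline, chosen so the brightest blended value maps to 1.0:

```python
# Additively blend two samples, then tone-scale and quantize to an
# 8-bit framebuffer. SCALE is an assumed exposure factor, not a real
# tone mapping operator.
SCALE = 1.0 / 10000.0

def to_8bit(hdr_sum):
    return round(min(hdr_sum * SCALE, 1.0) * 255)

bright_pair = 5000 + 5000   # two bright pixels blended: 10000
mixed_pair  = 5000 + 10     # bright next to dark: 5010, barely changed
dark_pair   = 10 + 10       # two dark pixels: 20, 100% different in linear

assert to_8bit(bright_pair) == 255                 # saturates to white
assert to_8bit(mixed_pair) == to_8bit(5000)        # indistinguishable
assert abs(to_8bit(dark_pair) - to_8bit(10)) <= 1  # within one 8-bit code
```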

Where's the sample code? Results?

Unfortunately, I'm too busy with too many projects right now to test this formula. I'm usually not keen on releasing something without proof (besides, it convinces people more quickly!). A test application would require me to implement vector-based motion blur, DoF and HDR. I've done these several times in the past, but for different clients under NDAs, so I would have to write another one from scratch. And due to my motion sickness with this particular effect, I have never been much interested in vector-based motion blur to begin with.
I only got into this when I saw that every game I've been shown lately ships with this horrible effect, so I felt I needed to write something about it. Five years ago the effect could be turned off because it was too expensive for most GPUs, so the option was there. But today's consumer cards seem to handle it with no problem, so fewer games allow it to be turned off.
As much as I admire Wolfgang Engel (who wrote the D3D10 book, and also worked on Alan Wake) and Emil Persson (who worked on Just Cause 2), I cannot stand that horrible motion blur effect going on there.

What Crytek is doing is probably not very different from this approach, and they do have pictures and a working implementation that may be worth watching.

I know someone who recently implemented his own version of velocity-based motion blur, so I may be able to convince him to try this approach instead and share the results.

If the math formula jargon scared you away, here's a Cg implementation so you can see what I mean. Warning: untested code.


//Remember to do the HDR tonemapping AFTER you've done the motion blur.
//rtTex is assumed to be sampled in linear space.
//Do *not* compile this code with a flag that turns every float into half,
//unless you move the multiplication by isoFactor inside the loop.
float4 motionBlur( float2 uv, uniform sampler2D rtTex, uniform sampler2D rtVelocities, int numSteps, float isoFactor, float shutterFactor )
{
    float4 finalColour = 0.0f;
    float2 v = tex2D( rtVelocities, uv ).xy;    //Per-pixel velocity vector
    float stepSize = shutterFactor / numSteps;  //i.e. D / n in the formula
    for( int i = 0; i < numSteps; ++i )
    { 
        // Sample the color buffer along the velocity vector. 
        finalColour += tex2D( rtTex, uv );
        uv += v * stepSize;
    } 

    //Clamp the result to prevent saturation when stored in 16 bit float render targets.
    return min( finalColour * isoFactor, 64992.0f );
}

4 comments:

  1. You wrote:
    float2 v = tex2D( rtVelocities, uv ).xy;
    But "v" is never used, am I missing something ?

  2. Ooops. Thank you.
    "uv += stepSize;"

    Should've been:
    "uv += v * stepSize;"

    Fixed.

    1. And by the way, you set rtVelocities as a Sampler2D shouldn't it be float2 ?

    2. Now that I see it it should be Sampler2D but I really don't understand what texture it should be set to
