A Technical Overview
60 frames per second vs. 30 frames per second has been one of the most contested ideas around the web and in print for the last year. Today we will look to see who is right, who is wrong, and who is just plain confused.
Chipmaker (and now boardmaker) 3dfx has been evangelizing gaming at 60 fps since the Voodoo 2 was released. Many have looked down upon 3dfx for this due to the common misconception that humans cannot distinguish framerates over 30 fps, so what is the point of having visuals running at 60 fps? Misconception you say? Yes. In this article we will look behind the technology of games, computers, movies, and television and the physiology and neuro-ethology of the human visual system.
I have seen film students write in to columns about how anything over 24 fps is wasted. Why 24 fps? Movies in theaters run at 24 fps. They seem pretty smooth to me, so why would we need more? Well, let's take a look at movies from the eyes' perspective. First off, you are sitting in a dark movie theater and the projector is flashing a really bright light on a highly reflective screen. What does this do? Have you ever had a doctor flash a bright light in your eye to look at your retina? Most of us have. What happens? A thing called "afterimage". When the doctor turns off the bright light, you see an afterimage of the light (and it is not real comfortable). Movie theaters do the same thing. The light reflected off the screen is much brighter than the theater surroundings. You get an afterimage of the screen after the frame is passed on, so the next frame change is not as noticeable.
Screen refresh is also a very important factor in this equation. Unlike a television or a computer monitor, the movie theater screen is refreshed all at once (the entire frame is instantly projected and not drawn line for line horizontally as in a TV or monitor). So every frame is projected in its entirety all at once. This then leads back to afterimage due to the large neurotransmitter release in the retina.
Perhaps the most important factor in the theater is the artifact known as "motion blur". Motion blur is the main reason why movies can be shown at 24 fps, therefore saving Hollywood money by not having to make the film any longer than necessary (a full feature film shot at 30 fps would need approximately 25% more film than one shot at 24 fps, and that turns out to be a lot of money). What motion blur does is give the impression of more intervening frames between the two actual frames. If you stop a movie during a high action scene with lots of movement, the frame that you see will have a lot of blur, and any person or thing will be almost unrecognizable, with very blurry detail. When it is played at the full 24 fps, things again look good and sharp. The human eye is used to motion blur (more on that phenomenon later), so the movie looks fine and sharp.
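The film-length comparison above is simple arithmetic, and a quick sketch makes it concrete. The function names and the 120 minute running time below are made up for illustration; the only inputs taken from the article are the 24 fps and 30 fps rates.

```python
from fractions import Fraction


def extra_film_fraction(base_fps: int, new_fps: int) -> Fraction:
    """Fraction of additional frames needed when shooting at new_fps instead of base_fps."""
    return Fraction(new_fps, base_fps) - 1


def frames_for_runtime(minutes: float, fps: int) -> int:
    """Total number of frames in a film of the given running time."""
    return round(minutes * 60 * fps)


extra = extra_film_fraction(24, 30)
print(f"{float(extra):.0%} more frames at 30 fps")  # 25% more frames at 30 fps

# Hypothetical 120 minute feature, chosen only as an example.
print(frames_for_runtime(120, 24))  # 172800
print(frames_for_runtime(120, 30))  # 216000
```

Since every frame takes the same length of film stock, the extra frames translate directly into extra film.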
TV, Video Tape, and DVD
TVs run at a refresh rate of 60 Hz. This is not bad for viewing due to the distance we usually sit from the TV, the size of the phosphors on the average set, and the distance between phosphors (from .39 mm for a high end set to .5 mm and higher for cheaper models). This is actually quite big and fuzzy for most of us, but as long as we are not running any kind of productivity software (such as word processing) and are just watching movies at least 6 feet from the TV, that is just fine.
Now TV transmissions, video tape, and DVD play at 30 fps. The increase from movies is due mostly to the environment that the TV is watched in. It is usually quite a bit brighter than a movie theater, and most importantly a TV does not do a full screen refresh; rather, each frame is drawn line by line horizontally by an electron gun hitting the phosphors in the screen. So basically each frame is drawn twice by the TV (60 refreshes per second, 30 frames per second). Now because the frame rate is half the refresh rate, transitions between frames go a lot smoother than if you had, say, a 72 Hz refresh and a movie playing at 30 fps. Don't ask me why; it is due to wave behavior, which is higher level physics, and I can't go into that without making this a 30 page paper. Suffice it to say, the physics behind this make video and DVD look very smooth.
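One way to see why 30 fps sits more comfortably on a 60 Hz display than on a 72 Hz one is to count how many refreshes each frame stays on screen. The sketch below is a simplified model, not the physics alluded to above: it assumes each refresh simply shows the most recent frame that is due, and it ignores interlacing. The function name is invented for the example.

```python
def refreshes_per_frame(refresh_hz: int, fps: int, frames: int = 5) -> list[int]:
    """Count the refreshes during which each of the first `frames` source frames is shown.

    Assumption: refresh k happens at time k / refresh_hz and displays
    source frame floor(k * fps / refresh_hz).
    """
    counts = [0] * frames
    k = 0
    while True:
        frame = (k * fps) // refresh_hz  # integer math avoids rounding trouble
        if frame >= frames:
            break
        counts[frame] += 1
        k += 1
    return counts


print(refreshes_per_frame(60, 30))  # [2, 2, 2, 2, 2]
print(refreshes_per_frame(72, 30))  # [3, 2, 3, 2, 2]
```

At 60 Hz every frame is held for exactly two refreshes. At 72 Hz the frames are held for an uneven mix of two and three refreshes, so motion does not advance at a steady pace.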
Motion blur again is a very important part of making videos look seamless. With motion blur, those two refreshes per frame give the impression of two frames to our eyes. This makes a really well encoded DVD look absolutely incredible. Another factor to consider is that neither movies nor videos dip in frame rate when it comes to complex scenes. With no frame rate drops, the action is again seamless.
Games on the Computer
This is the second toughest part of this article. TV and movies are easy to understand, and the technology behind them is also easy to understand. Computers and the way games are presented to us are a lot more complex (the most complex part is the actual physiology/neuro-ethology of the visual system).
First off, the hardware used for visualization (namely the monitor) is a very fine piece of equipment. It has a very small dot pitch (distance between phosphors) and the phosphors themselves are very fine, so we can get exquisite detail. We set the refresh rates at over 72 Hz for comfort (flicker free). This makes a very nice canvas to display information on. Unfortunately, because it is so fine, it can greatly magnify flaws in the output of a video card. We will get into refresh in the section on the human eye.
Let us start with how a scene or frame is set up by the computer. Each frame is put together in the frame buffer of the video card and is then sent out through the RAMDAC to the monitor. That part is very easy, nothing complex there (except the actual setup of the frame). Now each frame is perfectly rendered and sent to the monitor. It looks good on the screen, but there is something missing when that action gets fast. So far, programmers have been unable to make motion blur in these scenes. When a game runs at 30 fps, you are getting 30 perfectly rendered scenes. This does not fool the eye one bit. There is no motion blur, so the transition from frame to frame is not as smooth as in movies. 3dfx put out a demo that runs half the screen at 30 fps, and the other half at 60 fps. There is a definite difference between the two scenes, with the 60 fps looking much better and smoother than the 30 fps.
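To put rough numbers on the difference between the two halves of such a demo, consider how far an object jumps between consecutive, perfectly sharp frames. This is a small sketch with made-up figures: the 640 pixel screen width and the half-second crossing time are assumptions chosen for illustration, not values from the 3dfx demo.

```python
def frame_time_ms(fps: float) -> float:
    """Time each frame stays on screen, in milliseconds."""
    return 1000.0 / fps


def pixels_per_frame(speed_px_per_s: float, fps: float) -> float:
    """Distance an object moves between two consecutive frames."""
    return speed_px_per_s / fps


# Hypothetical object crossing a 640 pixel wide screen in half a second.
speed = 640 / 0.5  # 1280 pixels per second

for fps in (30, 60):
    print(f"{fps} fps: {frame_time_ms(fps):.1f} ms per frame, "
          f"{pixels_per_frame(speed, fps):.1f} px jump per frame")
# 30 fps: 33.3 ms per frame, 42.7 px jump per frame
# 60 fps: 16.7 ms per frame, 21.3 px jump per frame
```

With no blur to bridge the gap, the object at 30 fps skips twice as far between frames, which is one way to read why the 60 fps half looks smoother.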
The lack of motion blur with current rendering techniques is a huge setback for smooth playback. Even if you could put motion blur into games, it really is not a good idea. We live in an analog world, and so we receive information continuously. We do not perceive the world through frames. In games, motion blur would cause the game to behave erratically. Take a game like Quake II: if motion blur were used, there would be problems calculating the exact position of an object, so it would be really tough to hit something with your weapon. With motion blur in a game, the object in question would not really exist in any of the places where the "blur" is positioned. With perfectly drawn frames, objects can always be calculated at set places in space. So how do you simulate motion blur in a video game? Easy, have games go at over 60 fps! Why? Read the section on the human eye.
Variations in frame rate also contribute to games looking jerky. In any game, there is an average frame rate. Rates can go as high as the refresh rate of your monitor (70+), or they can drop into the 20s and 30s. This can really affect the visual quality of the game, and in fast moving ones can actually be detrimental to your gameplaying performance. One of the great ideas that came from the now defunct Talisman project at Microsoft was the ability to lock frame rates (so the rate goes neither above nor below a certain framerate). In the next series of graphics cards, we may see this go into effect.
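The upper half of a frame rate lock can be approximated in software by waiting out whatever is left of each frame's time budget. The sketch below is only an illustration of that general idea and is not how Talisman worked; the function names are invented here. Note that a cap like this cannot stop the rate from dropping when a frame takes longer than its budget to render.

```python
import time


def sleep_needed(render_seconds: float, target_fps: float) -> float:
    """Seconds to wait after rendering so the frame fills its time budget.

    Returns 0.0 when the frame already overran the budget.
    """
    budget = 1.0 / target_fps
    return max(0.0, budget - render_seconds)


def run_capped(render, target_fps: float, frames: int) -> None:
    """Call render() `frames` times, never faster than target_fps."""
    for _ in range(frames):
        start = time.perf_counter()
        render()
        elapsed = time.perf_counter() - start
        time.sleep(sleep_needed(elapsed, target_fps))


if __name__ == "__main__":
    # Stand-in render function that does nothing, capped at 60 fps.
    run_capped(lambda: None, target_fps=60, frames=3)
```

A frame that renders in 10 ms at a 60 fps target waits roughly another 6.7 ms, while a frame that takes 30 ms gets no wait at all and the rate simply falls.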
The Human Eye
(and Visual Cortex)
Here is where things get a little interesting, and where we will see that humans can perceive up to 60+ fps.
Light is focused onto the retina of the eye by the lens. Light comes in a steady stream and not pulses (ok, so this is a little wrong, but we are not talking about the dual nature of light, where it acts as both a particle -photon- and a wave). Again, we live in an analog world, where information is continuously streamed to us. The retina interprets light in several ways with two types of cells. Rods and Cones make up the receiving cells for light. Intensity, color, and position (relative to where the cell is on the retina) are the information transmitted by the retina to the optic nerve, which then sends that info to the Visual Cortex for it to be translated to our conscious self (whoa, went from science to philosophy in one step!).
Rods are the simpler of the two cell types, as they really only interpret position and intensity. Rods are essentially color blind, and are referred to as transmitting in black and white. The black and white is not really true; rather, it is just the intensity of the light hitting the cell. Rods are also very fast due to the basic nature of intensity. The amount of neurotransmitter released basically tracks the amount of light that is stimulating the rod. The more light, the more neurotransmitter. Rods are also much more sensitive than cones. How is this proven? Microscopic examination of the retina shows that there is a much greater concentration of rods on the outer edges.
A simple experiment that you can do yourself is to go out on a starry night and look at the stars out of your peripheral vision. Pick out a faint star from your periphery and then look at it directly. It should disappear, and when you again turn and look at it from the periphery, it will pop back into view.
Cones are the second cell type, and these are much more complex. There are three basic parts to them that absorb different wavelengths of light and release differing amounts of different neurotransmitters depending on the wavelength and intensity of that light. Basically there are three receptors in a cone that absorb red, green, and blue wavelengths of light. Each of these receptors releases a different neurotransmitter for its color, with the amount of neurotransmitter depending on the intensity of the wavelength. Purple is a combination of blue and red, so the red and blue receptors would release differing amounts of neurotransmitter, while the green wouldn't release any. This information then passes on to your visual cortex and we "see" purple. Cones are much less efficient than rods due to their more complex nature. They are also a little slower to react to changes in light and are not as sensitive as rods (see the experiment above). Cones largely make up the center of the retina and the fovea (focal point of the retina).
The optic nerve is the highway along which information passes from the eye to the visual cortex in the brain. This nerve is just a pathway, and does no processing on its own. Its bandwidth is actually really huge, so a lot of information can be passed on. Nerve impulses also travel at over 200 mph to the brain, so it is nearly instantaneous for information to be received from the eye (since the optic nerve is only about 2 cm to 3 cm long).
The visual cortex is where all the information is put together. Humans only have so much room in the brain, so there are some tricks it uses to give us the most information possible in the smallest, most efficient structure. One of these tricks is the property of motion blur. We cannot get away from the phenomenon because it is so important to the way we perceive the world. In the visual cortex we can theorize the existence of what I call the motion blur filter. Because the eye can only receive so much information, and the visual cortex can only process so much of that, there needs to be a way to properly visualize the world. This is where it gets tough.
Take for example a fast moving object. The faster it goes, the more it blurs (be it a snowflake or a train). Why does this happen? Let's take the example of a snowflake. At any time it has a fixed position in the universe, no matter what speed it goes at (unless it starts to get relativistic, then we go into some strange physics, but that is not applicable to what we are talking about). Let's say at 5 mph, we see the snowflake in perfect detail as it falls to the ground. Now we hop into a car and go 55 mph. Can we see the detail of the snowflake? No, it is just a streak to us. Has the snowflake changed itself? Of course not. If we had a really fast camera with a fast shutter speed, it would see the snowflake in perfect detail. Now due to the speed at which our eyes/visual cortex can process information, we cannot see the snowflake in detail. A bird such as an eagle would be able to see more detail and not so much of a streak because it only has rods (it is color blind) and the distance from the eyes to its highly specialized visual cortex is 1/16th the distance of ours. This leads to more information being pumped into the visual cortex. So what would look like a streak to us would look like a fast moving snowflake to the eagle.
If we didn't have the ability to produce motion blur, we would see the snowflake pop in and out of existence at high speeds. We would first see it one place, then it would disappear and pop into existence several feet beyond depending on the direction it is going. Is this a good thing? No, we would have a hard time figuring out the direction of the snowflake and have many problems with perceiving movement in three dimensional space. With motion blur we get the impression of continuity where our hardware cannot distinguish fine detail while the object is moving at high speeds.
Contrary to the belief that we cannot distinguish anything over 30 fps, we can actually see and recognize speeds up to 70+ fps. How can you test this? You can quickly do this with your monitor at home. Set the refresh rate to 60 Hz and stare at it for a while. You can actually see the refreshes and it is very tiring to your eyes. Now if we couldn't see more than 30 fps, why is it that flicker free is considered to be 72 Hz (refreshes per second)? You can really tell if the refresh is below 72 Hz by turning your head and looking at the screen through your peripheral vision. You can definitely see the screen refreshes then (due to rods being much more efficient and fast).
We as humans have a very advanced visual system. While some animals out there have sharper vision, there is usually something given up with it (for eagles there is color, for owls it is the inability to move the eye in its socket). We can see in millions of colors (women can see up to 30% more colors than men, so if a woman doesn't think your outfit matches, she is probably right, go change), we have highly movable eyes, and we can perceive up to and over 60 fps. We have the ability to focus as close as an inch, and as far as infinity, and the time it takes to change focus is faster than the fastest, most expensive auto-focusing camera out there. We have a field of view that encompasses almost 170 degrees of sight, and about 30 degrees of fine focus. We receive information constantly and are able to decode it very quickly.
So what is the answer to how many frames per second we should be looking for? Anything over 60 fps is adequate, and 72 fps is maximal (anything over that would be overkill). Frame rates cannot drop from that 72 fps, though, or we will start to see a degradation in the smoothness of the game. Don't get me wrong, it is not bad to play a game at 30 fps, it is fine, but to get the illusion of reality, you really need a frame rate of 72 fps. What this does is saturate the pipeline from your eyes to your visual cortex, just as reality does. As visual quality increases, it really becomes more important to keep frame rates high so we can get the most immersive feel possible. While we still may be several years away from photographic quality in 3D accelerators, it is important to keep the speed up there.
Looks like 3dfx isn't so full of it.