Futuremark has introduced its latest version of 3DMark, 3DMark 2003. Taglined “The Gamers’ Benchmark”, 3DMark03 is intended to guide computer gaming enthusiasts on the performance they can expect from modern graphics hardware. In this report, we’ll take a look at 3DMark03 and its ability to predict game performance.
The primary goal of any benchmark is to arm the consumer with the right information to make the best possible purchase decision. As the gamers’ benchmark, 3DMark03 must emulate as closely as possible the kind of experience that the gaming enthusiast will have on their machine. It must exercise graphics hardware in the same manner that consumers’ games will. The graphics features, rendering paths, and effects must all emulate games, or the consumer will be misinformed and their expectations misguided.
3DMark03 combines custom artwork with a custom rendering engine that creates a set of demo scenes that, while pretty, have very little to do with actual games. It is much better termed a demo than a benchmark. The examples included in this report illustrate that 3DMark03 does not represent games, can never be used as a stand-in for games, and should not be used as a gamers’ benchmark.
The ultimate injury to the consumer from such a benchmark is three-fold. First, of course, the consumer is misled. A purchase decision based on unrepresentative data will lead consumers to the wrong conclusions. Second, it causes graphics hardware manufacturers to focus attention and engineering resources on optimizing for artificially fabricated cases that are atypical of games. Such optimizations generally do nothing to improve real game performance, and provide no benefit to the consumer. Finally, the extra engineering effort focused on such benchmarks reduces the effort available for the one activity that does benefit consumers: improving the actual gaming experience.
The examples below are illustrative of the areas where 3DMark03 deviates significantly from actual game behavior.
Game Test 1 – World War II airplane battle
The first game scene is intended to represent DirectX 7.0 (DX7) applications. As a result, it uses no pixel shaders, and uses vertex shaders only to mimic the vertex processing done in DX7-style fixed function transform and lighting. The majority of the geometry in the scene is used in creating the aircraft models. The remainder of the scene consists of a small number of triangles in the background elements: a “skybox” and a ground plane.
Unfortunately, Futuremark chose a flight simulation scene for this test. This genre not only makes up a small fraction of the game market (approximately 1%), but also relies on a simplistic rendering style. Further, the specific scene chosen is a high altitude flight simulation, which is representative of only a small fraction of that 1%. For any given frame in the scene, up to 90% of the pixels in the frame are single textured. This occurs because the majority of the scene is the low poly-count, single textured “skybox”, painted to look like sky and clouds.
The documentation for 3DMark03 gives some detail about the four-layer multitexture used on the airplanes. Regrettably, all this effort is lost, because the airplanes cover so few pixels on the screen that the four layers of multitexture are insignificant. Game Test 1 is, essentially, a single texture fill rate test. No modern games, even DX7 games, are completely dominated by this kind of simple rendering technique.
It’s curious that this test is so simplistic. Popular games such as Unreal Tournament, Shogo, Quake3, and even Quake2 all make heavy use of multitexture throughout their scenes. Additionally, the last generation of 3DMark (3DMark 2001) used scenes that made extensive use of multitexture for both foreground and background elements. Here are some excerpts from last year’s 3DMark01 documentation for the DX7 style scenes (games 1 through 3):
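To put a rough number on this, the sketch below computes the average number of texture layers per pixel in a frame. It takes the 90% single textured figure cited above and assumes, as a deliberately generous upper bound that is our own and not from the 3DMark03 documentation, that every remaining pixel uses the full four layers. The function name and parameters are hypothetical.

```python
def average_texture_layers(single_fraction: float, multi_layers: int = 4) -> float:
    """Mean texture layers per pixel for one frame.

    single_fraction: share of pixels that are single textured (0.0 to 1.0).
    multi_layers: layers assumed for every remaining pixel. This is a
    hypothetical upper bound; real frames may use fewer on some pixels.
    """
    if not 0.0 <= single_fraction <= 1.0:
        raise ValueError("single_fraction must be between 0 and 1")
    return single_fraction * 1 + (1.0 - single_fraction) * multi_layers


if __name__ == "__main__":
    # 90% single textured, the figure cited for the skybox-dominated frames.
    print(f"{average_texture_layers(0.9):.2f} layers per pixel on average")
```

Even under that generous assumption, the average works out to about 1.3 layers per pixel, which is why the test behaves like a single texture fill rate test.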
- Game 1 (Car Chase)
  - The red truck is gloss mapped (requires three texture layers).
  - The landscape has two texture layers in low detail, three in high detail.
- Game 2 (Dragothic)
  - The dragon has two texture layers.
  - The village has two texture layers in low detail, three in high detail.
- Game 3 (Lobby)
  - The room has two texture layers everywhere, one color map and a multiplicative lightmap.
Game Test 1, by contrast, is not representative of modern games. It is certainly not what the “gamer” consumer needs for guidance. Using such a test in a purchasing decision is akin to buying a home audio system based on its AM radio capability. Every audio system includes an AM radio, but it is rarely, if ever, a key element of the listening experience.
Game Test 2 – First Person Shooter
Game Test 3 – Fantasy
For all intents and purposes, game tests 2 and 3 are the same test. They use the same rendering paths and the same feature set. The sole difference between these tests appears to be the artwork. This fact alone raises some questions about the breadth of game genres addressed by 3DMark03. These two tests attempt to duplicate the “Z-first” rendering style used in the upcoming first-person shooter game, “Doom 3”. They have a “Doom-like” look, but use a bizarre rendering method that is far from Doom 3 or any other known game application. This method makes for an interesting demo, but is so inefficient that no game would ever employ it. This is best exemplified by the shadow calculation method used in these tests. These tests attempt to use the shadow technique used in Doom 3, called stencil shadow volumes. This is a multiple pass algorithm that is done for all objects in the scene. The passes in 3DMark03 look like this:
For every object:
- Pass 1 (Early Z)
  - Skin Object in Vertex Shader
  - Pixel Shader writes Z, RGB = ambient, and Alpha = perspective Z
For every object:
- Pass 2 (Stencil Shadow Volume calculation)
  - Set stencil to increment/decrement
  - Skin Object in Vertex Shader
  - Stencil extrusion calculation
  - No Pixel Shader
- Pass 3 (Lighting)
  - Skin Object in Vertex Shader
  - Pixel Shader (lighting) writes RGB = color
Note that every one of these passes skins the object again in the vertex shader. It’s unfortunate that 3DMark03 does not truly emulate Doom 3 or any other game by skinning each object only once per frame, caching the skinned result, and using that cached result in the multiple passes required for shadows. This would have been a balanced approach that allows both the vertex and pixel/raster portions of the graphics engine to run at full speed. Designing hardware around the approach used in 3DMark03 would be like designing a six-lane on-ramp to a freeway for the freak case that someone might drive an earthmover onto it. Wasteful, inefficient benchmark code like 3DMark03 forces these kinds of designs, which do nothing to benefit actual games.
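The cost of re-skinning can be sketched with a simple count of skinning operations per frame. This is a back-of-the-envelope model under stated assumptions, not Futuremark's or any game's code: the function names are hypothetical, and the `lights` parameter is our own addition, since the pass listing does not say how often passes 2 and 3 repeat. With `lights=1` the model matches the listing exactly as written.

```python
def skin_ops_reskin_every_pass(objects: int, lights: int = 1) -> int:
    """Skinning operations per frame when every pass re-skins the object.

    Pass 1 skins each object once; passes 2 and 3 each skin it again.
    `lights` is a hypothetical multiplier for passes 2 and 3.
    """
    return objects * (1 + 2 * lights)


def skin_ops_cached(objects: int, lights: int = 1) -> int:
    """Skinning operations per frame when each object is skinned once and
    the cached result is reused by every later pass, so the count does not
    depend on `lights`."""
    return objects


if __name__ == "__main__":
    objects = 10  # illustrative object count, not measured from 3DMark03
    for lights in (1, 2, 4):
        wasteful = skin_ops_reskin_every_pass(objects, lights)
        cached = skin_ops_cached(objects, lights)
        print(f"lights={lights}: {wasteful} skins vs {cached} cached "
              f"({wasteful // cached}x the vertex work)")
```

In this model, re-skinning costs three times the skinning work of the cached approach for the passes as listed, and the gap widens if passes 2 and 3 are repeated.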
One last comment on the shadow code in game tests 2 and 3 – it appears to be incorrect. It generates artifacts on the shadowed characters. In the image below, note the dark triangles on the arm of the woman, and shoulder of the Troll. These triangles are being shaded as though they are in shadow, but are clearly in the light. This is an artifact of 3DMark03’s shadow algorithm, not the graphics hardware.
Finally, the choice of pixel shaders in game tests 2 and 3 is also odd. These tests use ps1.4 for all the pixel shaders in the scenes. Fallback versions of the pixel shaders are provided in ps1.1 for hardware that doesn’t support ps1.4. Conspicuously absent from these scenes, however, are any ps1.3 pixel shaders. Current DirectX 8.0 (DX8) games, such as Tiger Woods and Unreal Tournament 2003, all use ps1.1 and ps1.3 pixel shaders. Few, if any, are using ps1.4. This makes it difficult to conclude that these tests actually represent any of today’s DX8 games. Again, there’s no need to look any further than the previous version of 3DMark (3DMark 2001) as an example. There, we were introduced to the original nature scene, which was billed as a DX8 scene, and was based solely on ps1.1 and ps1.3 shaders. Last generation’s 3DMark had a reasonable link between its tests and actual games. Unfortunately, the new generation of 3DMark is clearly headed in a different direction, and is misleading consumers by focusing on a feature that is virtually non-existent in DX8 games.
Game Test 4 – Nature Scene
This year’s 3DMark has a new nature scene. It is intended to represent the new DirectX 9.0 (DX9) applications targeted for release this year. The key issue with this game scene is that it is barely DX9. Seven of the nine pixel shaders in this test are still ps1.4 from DX8. The same issues about ps1.4 shaders described above apply here. Only two of the pixel shaders are the new ps2.0. Consumers believing that this test gives a strong look at the future of games will find it provides a brief glimpse at best.
The most dangerous aspect of this for consumers is that 3DMark03 attempts to give the perception that it’s a bona fide DX9 benchmark. Consumers will use the score generated by 3DMark03 as their gauge for DX9 applications. The reality is that the 3DMark03 score is generated from all four game tests, and game test 4 is only a small fraction of that score. Further, only a small fraction of game test 4 has any DX9 components to it. As a result, the amount of DX9 represented in the 3DMark03 score is negligible. It’s not a DX9 benchmark.
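As a rough illustration of how small that share can be, the sketch below multiplies the weight of game test 4 in the overall score by the fraction of its pixel shaders that are ps2.0 (two of nine, as noted above). Both the function and the equal one-quarter weighting are hypothetical: this report does not state Futuremark's actual score weights, and shader count is only a crude proxy for how much of the rendering work is DX9.

```python
from fractions import Fraction


def dx9_share_of_score(gt4_weight, ps20_shaders=2, total_shaders=9):
    """Estimate the DX9 share of the overall score.

    gt4_weight: fraction of the overall score contributed by game test 4.
    This is an assumption supplied by the caller, not a published figure.
    ps20_shaders / total_shaders: ps2.0 shaders out of all pixel shaders
    in game test 4 (two of nine).
    """
    return Fraction(gt4_weight) * Fraction(ps20_shaders, total_shaders)


if __name__ == "__main__":
    # Hypothetical: each of the four game tests contributes equally.
    share = dx9_share_of_score(Fraction(1, 4))
    print(f"DX9 share of score: {share} (about {float(share):.1%})")
```

Under that hypothetical equal weighting, the DX9 portion comes to one eighteenth of the score, or under 6%.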
3DMark03 has combined some pretty artwork into a set of four nice demo scenes. But these scenes are so distant from real game applications that 3DMark03 just doesn’t make it as a benchmark. Lacking any similarity to actual games, it misrepresents the gaming experience, and doesn’t arm the consumer with the right information to make a purchase decision. It also forces hardware vendors to waste valuable engineering resources focusing on artificially fabricated cases that will never benefit real games. In the end, the consumer loses.
So, where do you find a true gamers’ benchmark? How about running actual games? Most popular games include a benchmark mode for just this purpose. Doom 3, Unreal Tournament 2003, and Serious Sam Second Encounter are all far better indicators of current and upcoming game performance. And, because the vendors of these games have licensed their game engines to other game developers, you can expect that the next generation of games will have these game engines at their core. Today’s consumers no longer have to rely on artificial attempts at mimicking games. In most cases, consumers’ favorite games already have a ready-made benchmark built right in.