Shader Model 3.0: Crytek's FarCry Gets Acceleration

We just got a long-awaited update for one of today’s most popular and eye-catching games. The promising technologies introduced with the GeForce 6-series graphics processors do bring additional performance to the new architecture. Find out whether Pixel Shaders 3.0 and Vertex Shaders 3.0 can really help the NVIDIA GeForce 6800 beat ATI RADEON X800 graphics cards in the most graphics-intensive application available today.
UPDATE: X-bit labs has added scores in FSAA and anisotropic filtering mode to the article.

by Anton Shilov, Alexey Stepin
07/02/2004 | 05:28 AM

UPDATE: Added a section with performance numbers for the "Eye-Candy" mode with full-scene antialiasing and anisotropic filtering activated. More comments in the "Conclusion" section.

Introduction

Whether you are a gamer or a technology fan, you most probably know the game called FarCry. The title has been in development for about three years and has changed what we expect from a high-quality computer game, with exceptional graphics, breathtaking realism and stunning special effects. Its heavy use of new DirectX 9 technologies has clearly put FarCry on the list of the best titles of the decade so far. Moreover, the game, developed by a team that was not previously known for any big titles, can successfully compete with the upcoming Doom III and Half-Life 2.


FarCry, "Pier" level

Unfortunately for some end-users, the frequent use of pixel and vertex shaders has a dramatic impact on performance for graphics cards that cannot perform certain operations quickly. Our experience with FarCry so far shows that the game clearly benefits from ATI-based graphics cards, which have exceptionally fast pixel shader ALUs as well as high clock speeds, while GeForce FX graphics cards typically show rather low performance in FarCry. The newly introduced GeForce 6800-series products were massively faster than their predecessors, but were still behind competing RADEON products. However, NVIDIA’s GeForce 6800-series graphics processors have trump cards, namely Pixel and Vertex Shaders 3.0, that could eventually drive the GPUs into the leading position over the RADEON X800-series visual processing units.

X-bit labs managed to get FarCry patch 1.2 along with Microsoft DirectX 9.0c and a special ForceWare driver, version 61.45. The patch, which is to be released any day now, sports optimizations for NVIDIA GeForce 6-series graphics cards and implements Pixel Shaders 3.0 in order to maximize performance in the game, while DirectX 9.0c and the new ForceWare drivers fully support Shader Model 3.0 and are also likely to emerge shortly. Using all of this, we ran an extensive array of FarCry benchmarks in an attempt to figure out what benefit Shader Model 3.0 can provide. The results proved to be extremely interesting, and the performance gains pleased us.


Shader Model 3.0 in FarCry: No New Effects so Far

The FarCry game was originally developed with DirectX 9.0 in mind. However, nowadays there are so many different graphics cards on the market that game engines have to be scalable enough to satisfy gamers with DirectX 8.1 graphics cards and, at the same time, to run faster or produce a higher-quality image on graphics cards whose specifications go beyond DirectX 9.0.


FarCry with Shader Model 3.0 and HDR, screenshot from nVNews.net

Crytek, the developer of FarCry, recently showcased a special demo of FarCry that exposed loads of Shader Model 3.0 capabilities, such as displacement mapping. While the screenshots impressed everyone, it looks like we still have to wait a while until the game is able to show us all of its glory with pixel and vertex shaders 3.0.


FarCry with Shader Model 3.0 and HDR, screenshot from nVNews.net

What Crytek is doing now is implementing Shader Model 3.0 where possible in order to improve performance on hardware that supports it. It is not clear whether the whole game already uses SM 3.0 capabilities to maximize speed or whether only a number of selected levels do, but the fact that the developer of such a complex game has jumped on the SM 3.0 bandwagon means that, at the very least, pixel and vertex shaders 3.0 are unlikely to fall into oblivion.


Shader Model 3.0 in FarCry: Bring the Speed!

According to Crytek, FarCry uses instancing for trees and grass in open scenes of the game, particularly on the “Training” and “Regulator” levels, and uses Pixel Shaders 3.0 instead of Pixel Shaders 2.0 in order to save rendering passes in indoor scenes, particularly on the “Research” and “Volcano” levels.

 
FarCry with Shader Model 3.0 and HDR, screenshot from nVNews.net

Instancing is one of Shader Model 3.0’s key features. Currently, games face limits on the number of unique objects they can display in a scene, often not because of graphics horsepower but because of the CPU-side overhead of either storing or submitting many slightly different variations of the same object. For instance, a forest is made up of trees that are often similar to each other, but each is in a different position and has a different height, branch length, leaf colour, and so on. In order to add the desired variation, developers have to choose between storing many separate copies of the tree, each slightly different, or making expensive render state changes in order to rotate, scale, colour and place each tree.

Instancing allows the programmer to store a single tree, plus several other vertex data streams that specify the per-instance colour, height, branch size and so on. For instance, a single 1000-vertex tree model would contain the vertex positions and normals, and a 200-element vertex stream would contain the per-instance positions, colours, heights and branch length values. Instancing allows the programmer to submit a single draw call that renders all 200 trees, using the same data for the basic tree shape but varying it through the per-instance streams.
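The difference in CPU-side submissions described above can be sketched in a few lines of Python. This is only an illustration under our own assumptions: `CountingDevice`, `Instance` and both draw functions are hypothetical names invented for the sketch, not part of Direct3D or of FarCry's engine, and the "device" does nothing except count calls.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Per-instance data that would live in the second vertex stream."""
    position: tuple       # (x, y, z) placement of this tree
    colour: tuple         # (r, g, b) leaf tint
    height: float         # scale applied to the shared tree mesh
    branch_length: float  # per-tree branch length


class CountingDevice:
    """Hypothetical stand-in for a graphics device: it only counts submissions."""

    def __init__(self):
        self.draw_calls = 0
        self.vertices_processed = 0

    def draw(self, mesh_vertex_count, instance_count=1):
        # One call from the CPU, however many instances it covers.
        self.draw_calls += 1
        self.vertices_processed += mesh_vertex_count * instance_count


def draw_without_instancing(device, mesh_vertex_count, instances):
    # One draw call per tree; a real renderer would also change
    # render state (transform, colour) before every call.
    for _ in instances:
        device.draw(mesh_vertex_count)


def draw_with_instancing(device, mesh_vertex_count, instances):
    # A single draw call; the per-instance stream supplies the variation.
    device.draw(mesh_vertex_count, instance_count=len(instances))


if __name__ == "__main__":
    # The 1000-vertex tree and 200 instances mirror the example in the text.
    forest = [Instance((float(i), 0.0, 0.0), (0.2, 0.6, 0.2), 1.0, 0.5)
              for i in range(200)]
    naive, instanced = CountingDevice(), CountingDevice()
    draw_without_instancing(naive, 1000, forest)
    draw_with_instancing(instanced, 1000, forest)
    print(naive.draw_calls, instanced.draw_calls)
```

The GPU processes the same number of vertices in both cases; what instancing removes is the per-tree submission overhead on the CPU side.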

Another way to improve performance in some situations is to employ one complex Pixel Shader 3.0 program instead of multi-pass Pixel Shader 2.0 rendering, saving precious rendering passes. Crytek said that complex indoor per-pixel lighting in FarCry is now done [in the SM 3.0 path] using Pixel Shader 3.0 in a single pass instead of multiple passes using Pixel Shaders 2.0.
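To make the idea of saving passes concrete, here is a minimal Python sketch for a single pixel. It rests on simplifying assumptions of our own: one light per pass in the multi-pass case, a plain diffuse term, and hypothetical function names. FarCry's actual shaders are not public and will differ.

```python
def light_contribution(albedo, light):
    # Simple diffuse term: surface albedo * light intensity * clamped N.L
    return albedo * light["intensity"] * max(light["n_dot_l"], 0.0)


def shade_multi_pass(albedo, lights):
    """Assumed SM 2.0-style path: the geometry is drawn once per light and
    each pass is additively blended into the framebuffer."""
    framebuffer = 0.0
    passes = 0
    for light in lights:
        framebuffer += light_contribution(albedo, light)  # additive blend
        passes += 1                                       # geometry re-submitted
    return framebuffer, passes


def shade_single_pass(albedo, lights):
    """Assumed SM 3.0-style path: one longer shader loops over all lights,
    so the geometry is drawn only once."""
    total = 0.0
    for light in lights:  # the loop runs inside the shader
        total += light_contribution(albedo, light)
    return total, 1


if __name__ == "__main__":
    lights = [
        {"intensity": 1.0, "n_dot_l": 0.5},
        {"intensity": 0.5, "n_dot_l": -0.2},  # faces away, contributes nothing
        {"intensity": 0.8, "n_dot_l": 1.0},
    ]
    print(shade_multi_pass(0.5, lights))
    print(shade_single_pass(0.5, lights))
```

Both functions produce the same pixel value; the single-pass version simply avoids re-submitting the geometry and blending into the framebuffer for every light, which is where the time is saved.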

Most probably there is still quite some headroom for FarCry optimisation using Shader Model 3.0, but at this point only instancing and single-pass per-pixel lighting are reported to be implemented. Both do deliver additional performance and show that Shader Model 3.0 is not a mere “tick-box feature”, but something that can be used, and is likely to be used, by game developers.


Image Quality Comparison

Before we proceed to the benchmarks, let us perform a brief image quality comparison between ATI RADEON X800 XT and NVIDIA GeForce 6800 Ultra using the latest patch and drivers.

We used five scenes from the “Training”, “Pier”, “Catacombs”, “Archive” and “Volcano” levels, as shown below. Game quality settings were set to the maximum level; anisotropic and trilinear optimizations were enabled for the NVIDIA-based graphics card.

ATI RADEON X800 XT vs. GeForce 6800 Ultra IQ comparison, FarCry 1.2

ATI RADEON X800 XT, Pier level

NVIDIA GeForce 6800 Ultra, Pier level

ATI RADEON X800 XT, Volcano level

NVIDIA GeForce 6800 Ultra, Volcano level

ATI RADEON X800 XT, Archive level

NVIDIA GeForce 6800 Ultra, Archive level

This brief analysis shows that there are no substantial image quality differences between the ATI RADEON X800 XT and NVIDIA GeForce 6800 Ultra graphics chips. It means that the speed gains in the game were achieved by implementing more efficient shaders and other rendering mechanisms rather than through image quality tradeoffs.

ATI RADEON X800 XT vs. GeForce 6800 Ultra IQ comparison, FarCry 1.2

ATI RADEON X800 XT, Catacombs level

NVIDIA GeForce 6800 Ultra, Catacombs level

ATI RADEON X800 XT, Training level

NVIDIA GeForce 6800 Ultra, Training level

The only thing that ATI and NVIDIA graphics cards render differently is shadows.

ATI RADEON X800 XT

NVIDIA GeForce 6800 Ultra

The ATI RADEON X800-series produces smoother shadows in FarCry than the NVIDIA GeForce 6800-series.

NVIDIA seems to have a driver bug that results in very rough shadows in the game. This is unlikely to affect performance seriously, as shadows in FarCry hardly require any significant computing power, but it is a negative point for the new set of drivers.


FarCry 1.2: Testbed and Methods

In order to show our readers the benefits Shader Model 3.0 can provide to the GeForce 6800-series graphics cards, we decided to run benchmarks using six demos recorded on the “Training”, “Research”, “Regulator”, “Pier”, “Catacombs” and “Volcano” levels. Each demo was meant to reflect performance in actual gameplay.

Please keep in mind that NVIDIA’s drivers, the FarCry 1.2 patch and Microsoft DirectX 9.0c have not been publicly released at this time. Therefore, the results we achieved today may differ from those produced by the final versions of the software.

What you should keep your eyes on at this time is the speed boost delivered by Shader Model 3.0 to the GeForce 6800-series. While not all games can be optimised using Shader Model 3.0, those that are complex and graphics-intensive are likely to be optimized and to get a speed increase comparable to the one FarCry got.


Graphics Cards’ Performance, Pure Mode

Training Level

The “Training” level contains a lot of grass, trees and water, and is representative of a significant number of similar levels in the game.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

While on this particular level the use of Shader Model 3.0 did not bring a really big benefit to the GeForce 6800-series, the GeForce 6800 Ultra is still either on par with or faster than the RADEON X800 XT. Meanwhile, the RADEON X800 PRO is seriously behind the GeForce 6800 GT in all resolutions except 1600x1200, where ATI’s $399 offering beats NVIDIA’s $399 offering.

Note that at 1280x1024 and 1600x1200 the SM 3.0 optimizations gave the GeForce 6800-series a strong boost in minimal fps, meaning that actual gameplay improved significantly.

It is necessary to point out that the GeForce 6800, available today at around $299 or higher, is significantly faster than the RADEON 9800 XT, which costs practically the same.


Research Level

The “Research” demo contains a battle in a cavern with a multitude of light sources, all requiring thorough per-pixel calculations. ATI’s RADEON X800 chips render the scenes in a number of passes using Pixel Shaders 2.0. NVIDIA’s GeForce 6800 processors render in a single pass using Pixel Shaders 3.0.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

The advantage Shader Model 3.0 brings to the GeForce 6-series graphics cards in this test is not large, but it can be seen clearly. The result is that the GeForce 6800 Ultra solidly outperforms the ATI RADEON X800 XT, the GeForce 6800 GT proves to be faster than the RADEON X800 PRO, and the GeForce 6800 beats the RADEON 9800 XT.


Regulator Level

On the “Regulator” level there is, again, complex per pixel lighting implemented indoors. As in the previous case, the RADEON X800-series requires a number of passes for the lighting, while the GeForce 6800-series requires only one.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

In this case NVIDIA’s latest GPUs got a speed boost that allowed the chips to outperform their RADEON X800 competitors. We see the GeForce 6800 GT getting a very significant speed increase with SM 3.0 and taking the lead over ATI’s RADEON X800 PRO.

The RADEON 9800 XT is again outperformed by the GeForce 6800 graphics card by a significant margin, which is what the new technologies are here for.


Pier Level

There are loads of plants and water on the “Pier” level. Crytek said that instancing is used on plants, but remained tight-lipped about any possible optimisations for the water shaders.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

While the speed gain Shader Model 3.0 provided for the GeForce 6800-series of graphics cards seems to be insignificant, the GeForce 6800-series beats the competition more often than the RADEON X800-series manages to leave its rivals behind. As usual, have a look at the GeForce 6800 GT, which succeeds in beating the RADEON X800 PRO in the majority of cases, and the GeForce 6800, which outperforms the RADEON 9800 XT.


Catacombs Level

“Catacombs” is a level that NVIDIA and Crytek demonstrated as one that benefits from Shader Model 3.0 in terms of image quality. While right now there is no visual difference between the SM 3.0 path and the SM 2.0 path, the newer path does bring some performance benefit.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

Surprisingly, average fps decreased as a result of using the SM 3.0 path. However, we see a clear improvement in minimal fps, something that affects gameplay dramatically and is perhaps even more important than average fps.
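How average fps can fall while minimal fps rises is easy to show with a small Python sketch. The frame times below are made up purely for illustration and are not our benchmark data; we also assume that minimal fps is derived from the single slowest frame, which may differ from how a given benchmarking tool reports it.

```python
def fps_stats(frame_times_ms):
    """Return (average_fps, minimal_fps) for a list of per-frame render times."""
    total_seconds = sum(frame_times_ms) / 1000.0
    average_fps = len(frame_times_ms) / total_seconds  # frames per second overall
    minimal_fps = 1000.0 / max(frame_times_ms)         # rate during the slowest frame
    return average_fps, minimal_fps


if __name__ == "__main__":
    # Hypothetical run A: mostly fast frames with one long stall.
    run_a = [10.0, 10.0, 10.0, 40.0]
    # Hypothetical run B: slower on average, but perfectly even.
    run_b = [20.0, 20.0, 20.0, 20.0]
    print(fps_stats(run_a))
    print(fps_stats(run_b))
```

In this made-up case run B has the lower average but the higher minimum, which is the kind of result that feels smoother in actual play.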

In the “Catacombs” demo we see a clear lead for the RADEON X800 XT over the GeForce 6800 Ultra. At the same time, the GeForce 6800 GT still manages to slightly outperform the RADEON X800 PRO, while the GeForce 6800 easily beats the RADEON 9800 XT.


Volcano Level

“Volcano” is yet another level where loads of per-pixel lighting calculations need to be performed. NVIDIA’s GeForce 6800-series benefits from single-pass rendering here, while the ATI RADEON X800-series needs to do multi-pass rendering on the level.

 


Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

This level shows the significant speed improvement the GeForce 6800-series graphics processors gain from using Pixel Shaders 3.0. Whereas earlier NVIDIA’s latest GPUs were on par with ATI’s newest offerings, the boost provided by Pixel Shaders 3.0 hands the performance crown to NVIDIA’s GeForce 6800-series over ATI’s RADEON X800 visual processing units.

Still, ATI’s RADEON X800 XT managed to deliver higher minimal fps than the GeForce 6800 Ultra, while the RADEON X800 PRO managed to stay in line with the GeForce 6800 GT.


Graphics Cards’ Performance, FSAA + Anisotropic Filtering Mode

We ran the same six demos in “eye candy” mode with FSAA 4x and anisotropic filtering 16x (8x for the GeForce FX 5950 Ultra) enabled from the drivers to find out the speed gain Shader Model 3.0 brings to the GeForce 6800-series graphics cards under a high workload. Trilinear and anisotropic filtering optimizations were turned on for the GeForce graphics cards in order to match ATI’s High-Quality settings.

Training Level

As mentioned earlier, the “Training” level features geometry instancing optimizations for the grass and trees found all around the level. Since the demo features loads of plants, graphics processors have to do a tremendous amount of antialiasing work when FSAA is enabled.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

The speed improvement the GeForce 6800-series graphics cards got from the Shader Model 3.0 optimizations is not dramatic, but it comes without any image quality degradation. We should point out a rather tremendous minimal fps boost for all GeForce 6800-series chips, which greatly affects overall gameplay.

However, geometry instancing did not allow the GeForce 6800 Ultra to take the leading position in the “Training” demo: the RADEON X800 XT demonstrated superiority in all resolutions. The RADEON X800 PRO and the RADEON 9800 XT seem to be slightly slower than their competing offerings, the GeForce 6800 GT and the GeForce 6800.


Research Level

The “Research” level got some per-pixel lighting optimizations. Let’s see whether the single-pass per-pixel lighting advantage of NVIDIA’s GeForce 6800 family of graphics processors can bring it to the top.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

Pixel shaders 3.0 did not bring any major performance increase in “eye-candy” mode. Still, the SM 3.0 path does bring a speed gain to the GeForce 6800-series based graphics cards.

As in the previous case, the RADEON X800 XT maintains its leadership position, while the RADEON X800 PRO lags behind the GeForce 6800 GT. Rather surprisingly, the RADEON 9800 XT manages to outperform the GeForce 6800 in all resolutions.


Regulator Level

On the “Regulator” level there is, again, complex per pixel lighting implemented indoors. As in the previous case, the RADEON X800-series requires a number of passes for the lighting, while the GeForce 6800-series requires only one.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

Shader Model 3.0 gave the GeForce 6800 family of graphics processing units only a small performance boost. The advantage was just enough for the GeForce 6800 Ultra to take the lead at 1024x768, while at 1280x1024 and 1600x1200 the RADEON X800 XT managed to outperform its competitor at the $499 price-point. The GeForce 6800 GT is slightly ahead of the RADEON X800 PRO, just as the GeForce 6800 is a bit faster than the RADEON 9800 XT.


Pier Level

As previously noted, Crytek implemented instancing on plants, such as trees and grass. While the “Pier” level contains a lot of trees and grass, it is also very complex from a geometry standpoint and requires fast pixel processors to render water.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

Even though the increase the SM 3.0 path provided to the GeForce 6800-series is not substantial, it can be clearly noticed.

The RADEON X800 XT is the clear winner here, while the GeForce 6800 GT performs as fast as the RADEON X800 PRO. Only the GeForce 6800 managed to outperform its rival from ATI.


Catacombs Level

Our demo of the “Catacombs” level was recorded inside the catacombs and contains a battle with monsters. As a result, there is nothing for the Shader Model 3.0 path to optimise, as there are practically no plants on the level and very few light sources.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

As in some previous cases, the RADEON X800 XT wins the competition with the GeForce 6800 Ultra, while the RADEON X800 PRO and the RADEON 9800 XT cannot keep up with the GeForce 6800 GT and the GeForce 6800 respectively.


Volcano Level

“Volcano” is a level that gains an immense speed advantage from the implementation of pixel shaders 3.0.




Note: minimal fps are marked with white numbers on the diagrams, black numbers represent average fps.

Thanks to the SM 3.0 boost, the RADEON X800 XT does not outperform the GeForce 6800 Ultra by a substantial margin, while the GeForce 6800 GT and the GeForce 6800 are generally faster than ATI’s competitors at $399 and $299 price-points.


Conclusion

The main takeaway from the new FarCry 1.2 benchmarks is that NVIDIA’s GeForce 6800-series graphics processors do have some speed in reserve thanks to their support for Shader Model 3.0. Even though the gains are not that dramatic at this time, in other games and on other levels of FarCry the speed increase may be more significant. With no sacrifice of image quality, Pixel Shaders 3.0 can boost the speed of certain graphics cards: this is what the benchmarks tell us today.


FarCry with Shader Model 3.0 and HDR, screenshot from nVNews.net

Unfortunately, we did not get any new effects with the Shader Model 3.0 path like those that were showcased earlier. Perhaps Crytek will add them to FarCry later, or maybe they are being saved for the upcoming FarCry Instincts title.

The benchmark results achieved today in “Pure Mode” show that the GeForce 6800 Ultra graphics card is the fastest performer in FarCry in the majority of cases when it uses the Shader Model 3.0 path. While the RADEON X800 XT delivers performance very close to that of its competitor, there are very few cases where it can outperform the GeForce 6800 Ultra when full-scene antialiasing and anisotropic filtering are disabled.

The slightly less powerful options, the GeForce 6800 GT and the GeForce 6800, also prove to be extremely successful rivals for the RADEON X800 PRO and the RADEON 9800 XT respectively, according to the FarCry benchmarks in “Pure Mode”. In fact, both GeForce 6800-series offerings at the $399 and $299 price-points leave their rivals behind by quite a substantial margin and can therefore be considered the best options for FarCry in terms of price/performance ratio.

The results of testing in “Eye-Candy” mode, with full-scene antialiasing and anisotropic filtering enabled, tell us that the ATI RADEON X800 XT is a more powerful option than the NVIDIA GeForce 6800 Ultra under intensive load. Nevertheless, the GeForce 6800 GT and the GeForce 6800 leave the RADEON X800 PRO and the RADEON 9800 XT behind. Even though the speed advantage is not really big, it is noticeable and supports the thesis that NVIDIA’s $299 and $399 offerings are superior in terms of price/performance ratio for the FarCry game.

For an unknown reason, the performance of RADEON X800-series graphics products dropped slightly in FarCry version 1.2 in “Pure Mode”. The cause of the drop is not clear, but it may indicate that ATI also has some speed headroom with its latest family of graphics processors, and the final judgement is yet to be made…
