by Tim Tscheblockov
05/12/2003 | 07:04 AM
Although over half a year has already passed since the announcement of NVIDIA GeForce FX 5800/5800 Ultra, the first graphics cards based on these chips started selling only a few months ago.
The graphics cards based on NVIDIA GeForce FX 5800 Ultra helped the company to regain the title of the performance leader, but still didn’t become very popular. There were several reasons for that: first of all, they were very expensive compared to ATI RADEON 9700 Pro based graphics cards, and secondly, they featured a number of serious drawbacks: high heat dissipation of the graphics core as well as of the graphics memory chips, absolutely unbearable noise, severe requirements for the computer PSU...
The launch of the new ATI VPU aka RADEON 9800 Pro (R350) has aggravated the situation for NVIDIA in the performance graphics cards sector even further. The company’s competitor looks much more attractive than NV30 from any viewpoint: performance, quality, price, noise, heat dissipation, etc. And if performance is just a question of time for NVIDIA, that is, of how fast they manage to optimize their drivers, then all other drawbacks of the NVIDIA GeForce FX 5800 Ultra (see our NVIDIA GeForce FX 5800 Ultra Review for more details) cannot be eliminated by any driver optimization. These particular drawbacks of the NV30 solution probably pushed NVIDIA to speed up the launch of their next graphics chip generation aka NV35.
So, today NVIDIA announces the next generation of its graphics processors. NV35 chips officially called NVIDIA GeForce FX 5900/5900 Ultra should replace NV30 in the performance graphics cards sector.
In our today’s article we will take a look at four graphics cards based on the latest ATI and NVIDIA chips: RADEON 9700 Pro (R300), RADEON 9800 Pro (R350), NVIDIA GeForce FX 5800 Ultra and, of course, NVIDIA GeForce FX 5900 Ultra. The latter super heavyweight from NVIDIA will certainly get most of our attention :)
Well, it is evident that you shouldn’t expect any revolutionary innovations to be introduced in NV35, as NVIDIA had too little time to make any radical changes to the chip architecture. Besides, the mere name of the new chip indicates that the company didn’t aim at any goal like that: the new graphics processor is “just” a faster implementation of the NV30 architecture, just like NV25 (GeForce4) is in fact just a faster version of NV20 (GeForce3).
So, what does the new NV35 look like?
First, please take a look at the features table we composed for NV35 and NV30:
NVIDIA GeForce FX 5800 Ultra (NV30) / NVIDIA GeForce FX 5900 Ultra (NV35):

Number of transistors: 125 million / 130 million
Graphics memory controller: 128bit DDR II / 256bit DDR
Graphics memory frequency: 1000MHz (500MHz DDR) / 850MHz (425MHz DDR)
Peak memory bus bandwidth: 16.0GB/s / 27.2GB/s
Max graphics memory size: 128MB / 256MB
AGP interface: AGP 3.0 4x/8x / AGP 3.0 4x/8x

Pixel pipelines, pixel shaders:
  Pixel pipelines [*1]: 4, 8 [*2] / 4, 8 [*2]
  Texturing units per pipeline [*1]: 2, 0 [*2] / 2, 0 [*2]
  Max number of textures during multi-texturing: 16 / 16
  Texture filtering types: bi-linear, tri-linear, anisotropic (both chips)
  Max anisotropy level: 8x / 8x
  Pixel shaders version: 2.0+ / 2.0+

Vertex pipelines, vertex shaders:
  Vertex shaders version: 2.0+ / 2.0+

Full Screen Anti-Aliasing:
  Number of samples: 2 (OGSS, OGMS), … / 2 (OGSS, OGMS), …

Technologies aimed at higher memory bandwidth efficiency:
  Hidden Surfaces Removal (HSR): + / +
The first thing that catches your eye is the fact that in the new NV35 the company has finally implemented a 256bit memory controller, having given up DDR II in favor of the DDR standard. With this one stone NVIDIA has killed a whole flock of birds. Let’s count them now.
Firstly, doubling the memory bus width increased its bandwidth. The bandwidth didn’t double, though, but grew by “only” 70%, because there are still no DDR chips working at 1000MHz (500MHz DDR).
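The arithmetic behind that 70% figure is easy to check: a quick sketch, assuming the nominal frequencies of 1000MHz effective for NV30’s 128bit DDR II bus and 850MHz effective for NV35’s 256bit DDR bus:

```python
def peak_bandwidth_gb_s(bus_width_bits, effective_mhz):
    # bytes per transfer x transfers per second, expressed in GB/s (1 GB = 10^9 bytes)
    return bus_width_bits / 8 * effective_mhz * 1e6 / 1e9

nv30 = peak_bandwidth_gb_s(128, 1000)  # GeForce FX 5800 Ultra: 16.0 GB/s
nv35 = peak_bandwidth_gb_s(256, 850)   # GeForce FX 5900 Ultra: 27.2 GB/s
print(nv30, nv35, round((nv35 / nv30 - 1) * 100))  # the ~70% growth
```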
Secondly, replacing DDR II chips with less expensive DDR SDRAM will reduce the future production cost of the new graphics cards.
Thirdly, the common DDR SDRAM chips dissipate less heat, which will allow giving up huge active and passive cooling solutions. For instance, the graphics cards based on the newest ATI chips do not have any graphics memory cooling at all.
And finally, the changes made to the memory controller resulted in changes to the caching schemes, which definitely got even more fine-tuned this way, and the closely connected texture compression, frame buffer and Z-buffer compression algorithms got significantly enhanced. By the way, NVIDIA claims that the updated compression algorithms, which have been called Intellisample HCT (High Resolution Compression Technology), provide up to 50% performance increase in “hard” modes using full-screen anti-aliasing and anisotropic filtering.
Other differences between NV30 and NV35 are marked with [notes] symbol in the table above. Let’s discuss them here:
To be able to calculate shadows faster with the help of this feature, gaming engines should certainly be “aware” of it, because they will need to calculate the depth bound values and transfer them to the accelerator. So far, the existing 3D games and benchmarks with dynamic shadows based on the stencil buffer do not use this feature of the new NV35.
Now please meet the new graphics card from NVIDIA.
The NVIDIA GeForce FX 5900 Ultra reference graphics card, which we managed to get for review, looks very impressive:
You can see in the pictures that NVIDIA has finally given up the odious FlowFX cooling system and replaced it with a more traditional though still impressively big cooler with a radial flow air pump:
This cooler works in two modes on NVIDIA GeForce FX 5900 Ultra. Each mode has its own fan rotation speed. Moreover, the card automatically turns the high-speed mode on and off depending on the temperature reported by the integrated monitoring system. However, even in high-speed mode the new cooler is much quieter than the notorious FlowFX.
Nevertheless, even though they gave up the FlowFX cooling system the card will still take two PCI slots in your system: first, the cooler is quite high and second, the card is equipped with a double-side heatsink, which also takes the reverse side of the card:
The cooler is slightly shifted from the center of the graphics chip: the fan axis is located more to the left of the graphics chip, so that the fan blows the air stream to the heatsink ribs located right above the chip and hence cools the core more efficiently:
To cool down the graphics memory chips, there are separate passive aluminum heatsinks:
The heatsink on the front side of the card features special ribbed parts cooled down by the airflow from the active chip cooler. There are also special heat conductive pads, which fit between the heatsink and the memory chips surface:
There is a simpler looking passive heatsink at the bottom of the card:
The graphics memory heatsinks are connected with each other with the help of tricky spring screws, which prevent you from damaging the graphics card in case you apply too much physical effort:
Without the heatsinks on the graphics memory chips the card looks pretty funny: the memory chips are located in pairs around the graphics core, just like on Matrox Parhelia. This layout allows reducing the signal line length, thus eliminating most distortions and EMI:
The graphics card components – the core and the memory – consume quite a bit of power thus requiring stable power supply and voltage. So, no wonder that their voltage regulators occupy at least one third of the PCB. It looks as if the PCB has become somewhat longer than NVIDIA GeForce FX 5800 Ultra reference graphics cards only because of these voltage regulators:
By the way, on the photo above you can actually see the already traditional connector for additional power.
The heart of this graphics card is the NV35 (NVIDIA GeForce FX 5900 Ultra) chip marked as NV35GL, the professional NV35 version:
The graphics processor works at 450MHz, which is a little lower than the core frequency of NVIDIA GeForce FX 5800 Ultra. It looks as if the company considered the performance potential of NV35 with 256bit memory bus so high, that they even dared to reduce the graphics processor frequency a little bit, thus having reduced its power consumption and heat dissipation and having increased the stability.
The graphics memory chips used on our card are 32bit DDR SDRAM pieces from Hynix in BGA package:
The clock cycle time of these chips equals 2.2ns, so the nominal frequency makes 908MHz (454MHz DDR). On NVIDIA GeForce FX 5900 Ultra these chips work at a slightly lower frequency: 850MHz (425MHz DDR).
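The nominal frequency follows directly from the cycle time marked on the chips; a small sketch of the arithmetic:

```python
cycle_time_ns = 2.2                       # chip marking: 2.2ns cycle time
nominal_mhz = 1000 / cycle_time_ns        # 1000 MHz-ns / 2.2 ns ≈ 454.5 MHz real clock
ddr_effective_mhz = 2 * int(nominal_mhz)  # two transfers per clock: 908 MHz effective
print(ddr_effective_mhz)
```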
These memory chips dissipate much less heat than those DDR II ones from Samsung used in NVIDIA GeForce FX 5800 Ultra based graphics cards. As a result, when we ran the tests in an open PC case without any additional cooling solutions the memory chips remained just slightly warm even after two hours of work.
Another very interesting thing: the graphics card is equipped with 16 chips like that, which should seemingly make the overall bus width equal to 512bit (32bit x 16). But this is not true, of course: the card uses 2 memory banks sharing the 256bit bus, with a total size of 256MB!
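A quick sanity check of the memory organization (the 128Mbit, i.e. 16MB, chip density is our assumption; the 256bit bus and 256MB total are as stated above):

```python
chips = 16
chip_bus_bits = 32
banks = 2                                        # two banks share one physical bus
bus_width_bits = chips * chip_bus_bits // banks  # 256bit, not 512bit
chip_size_mb = 16                                # assuming 128Mbit (16MB) chips
total_memory_mb = chips * chip_size_mb           # 256MB on the card
print(bus_width_bits, total_memory_mb)
```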
The card is equipped with a D-Sub, DVI and a combo TV-In/TV-Out. The signal for digital monitors is formed by an external TMDS-transmitter from Silicon Image:
And the encoding/decoding of the TV-signal is performed by SAA7108 chip from Philips:
Well, it’s high time we installed the card and started testing it.
When installing the card you should connect additional power supply, otherwise the card working frequencies will drop down to 250MHz/500MHz and the “card protection system” implemented via NVIDIA System Sentinel utility will bring up the following window:
When we just started testing the card, the newest driver version available was NVIDIA Detonator 43.80. Just like all previous driver versions, this driver located all graphics card settings in the Desktop Properties window. Most pages with these settings look quite ordinary; however, a few of them are worth dwelling on here.
First of all, I would like to draw your attention to 3D settings:
Here you can choose between the quality and performance when using anisotropic filtering (you have Performance, Balanced and Quality modes, the latter offered by default). You can also enable full-screen anti-aliasing (2x, Quincunx, 4x, 4xS for Direct3D, 6xS for Direct3D and 8x). Besides, you can force anisotropic filtering (2x, 4x and 8x modes).
If you change the Coolbits key in the registry with the help of the RivaTuner utility, you will get the page where the graphics chip and memory frequencies for NVIDIA GeForce FX 5900 Ultra can be adjusted. Just like NVIDIA GeForce FX 5800 Ultra, the new card has different nominal frequencies in different working modes: in 2D they are 300MHz for the chip and 850MHz for the memory...
In 3D they are 450MHz and 850MHz respectively:
Finally, here is an absolutely new page: Hardware Monitoring. This panel shows graphics core and memory temperatures taken from the hardware monitoring diodes. You can set the maximum allowed core temperature here. On reaching this maximum temperature, the core frequency will be automatically reduced:
Shortly before the launch of the new NVIDIA GeForce FX 5900 Ultra a new driver version appeared: Detonator 44.03, which also supports the new upcoming Quadro FX professional graphics cards:
NVIDIA Quadro FX 3000 will evidently be based on the “professional” GeForce FX 5900 Ultra, and Quadro FX 500 – on NV31 or NV34 (GeForce FX 5600 or GeForce FX 5200 respectively).
Besides the new graphics cards support, the 44.03 driver also features a slightly different 3D settings page:
As you can see from the screenshot above, the quality-performance modes have been renamed again: Performance mode turned into High Performance, while former Balanced mode is now called just Performance.
Luckily, or unluckily, our express testing showed that the 44.03 driver version doesn’t differ from the 43.80 version in either image quality or performance: the results were absolutely identical. This allowed us to complete the tests without any rush with the 43.80 driver.
Besides, we expect NVIDIA to announce a new driver shortly after the NVIDIA GeForce FX 5900 Ultra product launch. This will be a driver from 50.xx series: Detonator FX, which will ensure a performance growth for all GeForce FX graphics cards. And it means that all the results obtained with the previous driver versions as well as those we are going to discuss today will no longer be valid.
Our testbed was configured as follows:
We used the following software:
We used default driver settings in synthetic benchmarks.
As usual we will begin with the polygon fill rate. The first test is the Fill Rate test from the 3DMark2001 SE package. The test displays 64 semi-transparent surfaces with one texture on each:
To estimate the efficiency of caching systems and the efficiency of frame buffer and Z-buffer compression, we forced FSAA 2x and 4x during the tests.
The results show that ATI RADEON 9700 Pro and RADEON 9800 Pro do not react so strongly to the enabled full-screen anti-aliasing as NVIDIA GeForce FX 5800 Ultra and 5900 Ultra.
To draw each pixel the chips have to read the old color value from the frame buffer and write the new one into it. Of course, the situation may change in real applications, but in this synthetic benchmark ATI chips appeared better prepared.
Another interesting observation: despite lower core frequency, NVIDIA GeForce FX 5900 Ultra proves faster than GeForce FX 5800 Ultra. It should be the enhanced caching algorithms and higher memory bus bandwidth that help NVIDIA GeForce FX 5900 Ultra to beat the predecessor here.
During multi-texturing NVIDIA chips show higher results due to their ability to lay two textures per clock. ATI RADEON 9700 Pro and RADEON 9800 Pro feature 8 pixel pipelines but both of them need an extra clock to lay each texture. As a result, all four chips appear in equal conditions: in this benchmark they lay maximum 8 textures per single surface. NVIDIA chips using 4 pixel “pipelines” configuration process 4 pixels every 4 clocks, while 8-pipeline chips from ATI – 8 pixels every 8 clocks. In the long run, NVIDIA GeForce FX 5900 Ultra and NVIDIA GeForce FX 5800 Ultra win here due to their higher clock frequencies: 450MHz and 500MHz against 380MHz and 325MHz respectively.
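The arithmetic behind that conclusion can be sketched as follows: peak multi-texturing texel rates at the cards’ 3D clock frequencies, with the pipeline configurations described above:

```python
def peak_mtexels(pipelines, tmus_per_pipe, core_mhz):
    # peak multi-texturing texel rate in Mtexels/s: pipelines x TMUs x clock
    return pipelines * tmus_per_pipe * core_mhz

rates = {
    "GeForce FX 5800 Ultra": peak_mtexels(4, 2, 500),  # 4000 Mtexels/s
    "GeForce FX 5900 Ultra": peak_mtexels(4, 2, 450),  # 3600 Mtexels/s
    "RADEON 9800 Pro": peak_mtexels(8, 1, 380),        # 3040 Mtexels/s
    "RADEON 9700 Pro": peak_mtexels(8, 1, 325),        # 2600 Mtexels/s
}
print(rates)
```

With equal pipes-times-TMUs products, the NVIDIA chips come out ahead purely on clock frequency, exactly as the test shows.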
Here the memory bus is not loaded too much, because the chips lay not one but 8 textures over a single semi-transparent surface that is why the frame buffer values should be read/written 8 times more rarely. As a result, higher working frequency of NV30 compared with NV35 played a crucial role here and NVIDIA GeForce FX 5800 Ultra outperformed the newcomer.
The next test displays a polygon “covering” the entire screen. There are from 0 (no texture, pixel color is obtained as an interpolation of polygon vertex colors) to 4 textures sized as 512x512 superposed over this polygon. This benchmark allows estimating which pixel “pipelines” configuration NVIDIA chips use in every case:
Both GeForce FX 5800 Ultra and GeForce FX 5900 Ultra behave the same way: as graphics accelerators using 4 classical pixel pipelines with 2 TMUs each.
It is remarkable that higher core frequency of GeForce FX 5800 Ultra allows it to outpace the newcomer only when there are no textures to be laid. In all other cases enhanced caches help NV35 to beat the predecessor.
Now let’s move to the most exciting part of this section: pixel shaders performance. First comes Pixel Shader Speed test from 3Dmark2001 SE package:
NVIDIA GeForce FX 5900 Ultra is faster than the predecessor here, however, there are no pixel processor enhancements to thank for that: the card owes its victory to higher memory bus bandwidth and more efficient caches.
The situation remains unchanged even with a more complex pixel shader: NVIDIA GeForce FX 5900 Ultra is slightly ahead of NVIDIA GeForce FX 5800 Ultra.
However, if ATI graphics cards were just a little ahead of NVIDIA’s competitors in the first test, then here RADEON 9800 Pro and RADEON 9700 Pro are much farther ahead. This situation has already been described in our NVIDIA GeForce FX 5800 Ultra Review.
So, the DirectX8 Pixel Shaders performance tested with 3DMark2001 SE does not indicate that there have been any changes made to the integer unit of the NV35 pixel processor compared with NV30.
Now let’s see what happens with DirectX9 Pixel Shaders:
Wow! Even NV30, which used to be at least twice as slow as R300, is now running neck and neck with the rival: NVIDIA GeForce FX 5800 Ultra and ATI RADEON 9700 Pro are almost equally fast now! And NVIDIA GeForce FX 5900 Ultra working at a lower chip frequency outperforms ATI RADEON 9800 Pro!
Well, we can admire NVIDIA’s persistence in combating weak spots of its chips by optimizing the drivers and thus achieving much better results in 3DMark03.
No doubt that Pixel Shader 2.0 test from 3DMark03 is not an illustrative example of NVIDIA’s pixel processors performance. Luckily, the polygon fillrate test also allows checking the pixel shaders performance:
The simplest Pixel Shaders 1.0 and 2.0 are performed equally fast by all chips, with the results being in clear correspondence with the graphics chips working frequencies: NVIDIA GeForce FX 5900 Ultra falls behind NVIDIA GeForce FX 5800 Ultra proportionally to their frequencies difference, as well as ATI RADEON 9800 Pro outpaces RADEON 9700 Pro according to their core frequencies difference.
In case of a more complex Pixel Shader 2.0, ATI chips performance is quite predictable: forced lower precision of floating point calculations (16bit per component instead of 32bit) doesn’t tell on the result, because R300 and R350 feature fixed calculations precision, which is always equal to 24bit per component.
NVIDIA chips showed more interesting results. With forced 50% precision of floating point calculations, NVIDIA GeForce FX 5800 Ultra and NVIDIA GeForce FX 5900 Ultra perform according to their clock frequencies, i.e. per clock the chips are equally fast here.
The most interesting things start happening when we shift to 32bit per component: while NVIDIA GeForce FX 5800 Ultra naturally slows down, NVIDIA GeForce FX 5900 Ultra shows even better results!
Well, the results obtained let us suppose that NVIDIA has really increased the FPU performance in the new NV35, as they claimed. But the changes touched only upon 32bit precision: GeForce FX 5900 Ultra doesn’t slow down when shifting to full-precision calculations.
The first test is High Polygon Count from 3DMark2001 SE package:
The results show that NVIDIA GeForce FX 5900 Ultra outperforms ATI RADEON 9800 Pro, but yields to NVIDIA GeForce FX 5800 Ultra. The performance difference between the two NVIDIA solutions corresponds exactly to the difference in their clock speeds, which indicates that there haven’t been any changes made to Vertex processors of the new NVIDIA GeForce FX 5900 Ultra.
Vertex Shaders tests prove this supposition: NV35 Vertex Shaders unit hasn’t been enhanced in any way and remained the same as that of NV30.
This way, the lag behind ATI chips got even worse now: NVIDIA GeForce FX 5900 Ultra works at a lower clock frequency than the predecessor, while ATI RADEON 9800 Pro runs at a higher frequency than RADEON 9700 Pro.
The use of anisotropic texture filtering allows achieving much higher texture clarity at a large distance from the object and at small angles of inclination. This way the scenes built with anisotropic filtering become more life-like and closer to reality. It is remarkable that this function can be enabled in almost any 3D application, because nothing is required from the gaming engine or system components: all additional work is done by the graphics processor.
One of the most pleasing peculiarities of the NVIDIA GeForce3 / GeForce3 Titanium / GeForce4 Titanium was excellent anisotropic filtering quality, and one of the most discouraging ones – extremely high performance drop when this function was activated.
No wonder, that NVIDIA paid due attention to anisotropic filtering when they developed new CineFX architecture and GeForce FX graphics chips trying to reduce the performance drop without too big quality losses.
As we have already seen during NVIDIA GeForce FX 5800 Ultra test session, the company managed to cope with this task very well: GeForce FX chips support three different anisotropic filtering modes offering different speed and quality. Even in the fastest mode, the performance drop is 2-3 times lower than it used to be while the image quality remains more than acceptable.
This progress was achieved due to the chip developers, who introduced a few improvements into the anisotropic filtering algorithm (see our NVIDIA GeForce FX 5800 Ultra Review for more details), and the software developers, who prepared a number of driver optimizations. The general idea of the software optimization implies reducing the maximum anisotropy level for particular textures or polygons, which definitely leads to a performance increase. Look here: why should the whole scene undergo anisotropic filtering, if we could leave aside those textures or polygons which do not need it? This texture is so blurred that no filtering will improve its clearness, and that polygon is at such a big angle that you cannot see anything on it anyway.
The implementation of anisotropic filtering in NVIDIA GeForce FX 5900 Ultra is very interesting from the “driver” point of view in the first place, because its hardware algorithms were completely borrowed from NV30, and all performance improvements can only result from software optimizations.
So, let’s start with tri-linear filtering. The screenshots below are taken from a small test program displaying a cone with the base in the screen surface and a remote vertex. The sides of the cone are covered with “chess” texture. Besides that MIP levels are highlighted:
And now here are a few enlarged fragments with forced 8x anisotropic filtering in the driver:
You can clearly see that in Balanced and Performance modes tri-linear filtering has degraded almost to pure bi-linear filtering: smooth transitions between the MIP-levels got down to narrow bands 2-3 pixels wide. Besides that, the level of detail in Performance mode has been partially lowered: the first MIP-level border has moved closer.
Since the tests of NVIDIA GeForce FX 5800 Ultra with the 42.68 driver, tri-linear filtering has become much worse. To prove this point we offer you a few more screenshots taken from our NVIDIA GeForce FX 5800 Ultra Review:
It is evident that further degradation of tri-linear filtering in the 43.80 driver is a way to increase GeForce FX chips performance when both anisotropic filtering and tri-linear filtering are used.
However, there will hardly be any more changes in the driver concerning tri-linear filtering, as this optimization resource has been completely exhausted. Even now you can sometimes see the MIP-level borders in the Balanced and Performance modes. So there is only one option left: to disable tri-linear filtering, as we see with ATI R300/R350 in the Performance mode of anisotropic filtering. Hopefully, NVIDIA will not do the same thing.
Well, and now let’s take a closer look at the image quality provided by the new NV35 and the differences between Performance, Balanced and Quality modes. As an example let’s take a Serious Sam: The Second Encounter scene:
Below you can see the enlarged fragments of the screenshot taken with OpenGL and Direct3D:
Quality mode, OpenGL:
Quality mode, Direct3D:
We see no differences between OpenGL and Direct3D modes. In fact there shouldn’t be any, because we know that all optimizations are disabled in Quality mode.
Balanced mode, OpenGL:
Balanced mode, Direct3D:
Here we also can’t notice any big differences between the screenshots taken in OpenGL and Direct3D. However, on the third screenshot from the left you can notice that the “stone” texture has become slightly blurred compared to the Quality mode.
Performance mode, OpenGL:
Performance mode, Direct3D:
On the first two screenshots taken in OpenGL you can see texture compression artifacts: in Performance mode with OpenGL the driver forces texture compression.
You can see on other screenshots that the level of detail has become much lower compared to the Quality and Balanced modes. However, in case of OpenGL the images are a little more clear-cut.
Now let’s estimate the quality of anisotropic filtering by NVIDIA GeForce FX 5900 Ultra compared with that by ATI RADEON 9800 Pro. Again we will use Serious Sam: The Second Encounter:
The game uses OpenGL API, tri-linear filtering is enabled, standard GFX: Extreme Quality settings are used. The only exception here was the anisotropic filtering mode, which we set manually for both graphics cards.
NVIDIA GeForce FX 5900 Ultra was tested with 8x anisotropic filtering in Quality, Balanced and Performance modes. ATI RADEON 9800 Pro was tested with 8x anisotropic filtering in Quality mode. The screenshots with highlighted MIP-levels are also provided for a more illustrative analysis.
In Balanced and Performance modes NVIDIA GeForce FX 5900 Ultra has clearly visible MIP-levels borders, which we surely do not see in Quality mode. However, even in Quality mode NVIDIA GeForce FX 5900 Ultra provides less clear-cut images than ATI RADEON 9800 Pro.
Have you noticed the “dirty” colors of the highlighted MIP-levels by NVIDIA GeForce FX 5900 Ultra? Where do they come from?
These “dirty” colors were formed by two differently highlighted MIP-levels. Despite our expectations, the “stone” surface, which we used to study the image quality, is formed by two textures: the base “stone” texture and one more texture with much lower level of detail adding some bright spots to the pavement, which prevents the surface from looking monotonous from a bigger distance.
It looks as if NVIDIA GeForce FX 5900 Ultra were using a different anisotropic filtering mode for this particular texture, so that the MIP-levels of the two textures do not coincide, and in case of highlighted MIP-levels they overlap, thus creating mixed colors.
Well, it does make sense to set a lower level of anisotropy or to disable anisotropic filtering at all for such a stretched and inexpressive brightness texture, because it will increase the performance without any quality losses.
What does the whole thing mean? It indicates that a new optimization method has appeared in addition to the one used since the times of NVIDIA GeForce4: anisotropic filtering optimization based on geometric data analysis, known as polygon-wise determination. The new method is referred to as texture-wise determination.
When the texture is loaded, the driver must analyze its clearness and set the maximum anisotropy level for the texture depending on the results of this analysis. Moreover, Quality, Balanced and Performance modes use different levels of simplification.
This angle of inclination is very inconvenient for anisotropic filtering algorithms of ATI RADEON 9800 Pro, so that the new ATI solution shows even worse quality than NVIDIA GeForce FX 5900 Ultra in Performance mode.
Here ATI RADEON 9800 Pro and NVIDIA GeForce FX 5900 Ultra in Quality mode show almost the same results.
It is remarkable that anisotropic filtering optimization algorithms seem to be working only in OpenGL so far. To prove this point we would like to offer you two screenshots of the same Serious Sam: The Second Encounter scene taken with enabled 8x anisotropic filtering with highlighted MIP-levels. You can see OpenGL Performance mode on the left and Direct3D Performance mode on the right:
Performance mode in OpenGL
Performance mode in Direct3D
On the screenshot taken in OpenGL you can clearly see different colors for each MIP-level (the base textures are laid with 8x anisotropic filtering while less detailed textures use regular bi-linear filtering, as we can see from the picture). And in Direct3D the MIP-levels of all textures coincide and form an exact picture of 8x anisotropic filtering.
In Quality mode the optimization should be not so aggressive, or it should be completely disabled. Let’s check it out now. On the left - Quality mode in OpenGL, on the right – Quality mode in Direct3D:
Quality mode in OpenGL
Quality mode in Direct3D
Well, MIP-levels colors do get mixed together, but only at the most far away parts of the scene.
Well, now we have a good idea of some optimizations that may appear in Detonator FX. First, they will definitely continue improving the algorithms aimed at determining the maximum anisotropy level for textures and polygons in OpenGL, and second, these algorithms may also appear for Direct3D.
And in the meanwhile we can estimate how well the new algorithms help NVIDIA chips to perform anisotropic filtering with the smallest performance losses.
To measure the performance hit with enabled anisotropic filtering in OpenGL we ran the tests in Quake3 Arena:
The results are just brilliant! The performance drop in Balanced mode is half of that in Quality mode, and in Performance mode the losses are ten times smaller now.
For a better comparison, here is the result of ATI RADEON 9800 Pro in Quality mode:
ATI RADEON 9800 Pro provides smaller performance hit in Quality mode compared with NVIDIA GeForce FX 5800 Ultra/5900 Ultra in Quality mode. However, ATI’s algorithm for anisotropic filtering features one serious drawback, which NVIDIA GeForce FX chips do not have: there are some “inconvenient” angles, where anisotropic filtering quality is lower.
To evaluate the performance hit during forced anisotropic filtering in Direct3D we used Unreal Tournament 2003:
There are no optimizations in the Direct3D part of the driver yet like the ones we saw in the OpenGL part. Therefore, the performance hit has hardly become any lower here.
In Unreal Tournament 2003 with enabled anisotropic filtering ATI RADEON 9800 Pro “loses” more than in Quake3 Arena, but this value is still smaller than what we saw from NVIDIA GeForce FX 5900 Ultra in Quality and even in Balanced mode.
So, our investigation of anisotropic filtering quality and speed showed that the new Detonator FX is very badly needed. The anisotropic filtering optimization algorithms used in OpenGL part of the driver ensure a significant reduction of performance losses retaining acceptable image quality. Therefore, the implementation of the same optimizations in the Direct3D part could be very efficient.
The FSAA part of NVIDIA GeForce FX 5900 Ultra is hardly that much different from what we saw in its predecessor.
That is why I decided to check only the quality of polygon border smoothing in case of 2, 4 and 8 samples. The screenshots were taken in Serious Sam: The Second Encounter for OpenGL and Direct3D:
Here are enlarged fragments for your convenience. The top row stands for 2x, 4x, 8x in OpenGL, and the bottom row – for 2x, 4x, 8x in Direct3D:
For a better comparison, have a look at the quality provided by SMOOTHVISION 2.1 2x, 4x and 6x on ATI RADEON 9800 Pro:
It is evident that only the 8x mode from NVIDIA can compete with ATI’s 4x and 6x modes more or less successfully.
By the way, it would be more correct to say “modes” instead of “mode”, because you can clearly see from the screenshots that 8x in OpenGL and 8x in Direct3D are different modes providing different polygon border smoothing quality. And since these modes combine supersampling and multi-sampling, not only the polygon border smoothing will be different, but also the texture processing.
To prove this point here are a few more screenshots from Serious Sam: The Second Encounter:
These are a few larger image fragments with enabled 8x full-screen anti-aliasing in OpenGL:
And now the same fragments with enabled 8x full-screen anti-aliasing in Direct3D:
The screenshots showed that the textures are much clearer-cut with OpenGL 8x FSAA.
If you look at the leaves of the tree created with the help of transparent textures, you will see that in case of OpenGL 8x anti-aliasing they were formed by 2x2 supersampling, while in Direct3D 8x anti-aliasing only 2x1 supersampling was involved.
Summing up our observations I would like to illustrate 2x, 4x and 8x anti-aliasing modes by NVIDIA GeForce FX 5900 Ultra with the following scheme:
The most “advanced” anti-aliasing mode is offered by OpenGL 8x: here we have supersampling of four 2x multisampling blocks. This is cool, and beautiful! :)
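The decomposition above can be sketched in a few lines of code. This is only an illustration of the article’s observations, not an official NVIDIA specification: the supersampling grids come from the transparent-texture screenshots, and the 4x multisampling factor assumed for the Direct3D 8x mode is simply what is needed to reach 8 total samples.

```python
# Hypothetical decomposition of GeForce FX 5900 Ultra FSAA modes, following
# the article's observations: each mode combines an NxM supersampling grid
# with K-sample multisampling. The Direct3D 8x multisampling factor is an
# assumption, chosen so the total sample count comes out to 8.
MODES = {
    "OpenGL 8x":   {"ss": (2, 2), "ms": 2},  # 2x2 supersampling of 2x multisampled blocks
    "Direct3D 8x": {"ss": (2, 1), "ms": 4},  # 2x1 supersampling, 4x multisampling (assumed)
    "4x":          {"ss": (1, 1), "ms": 4},  # pure multisampling
}

def total_samples(mode):
    """Total samples per final pixel: supersampling grid times multisamples."""
    sx, sy = MODES[mode]["ss"]
    return sx * sy * MODES[mode]["ms"]

def texture_cost(mode):
    """Relative texturing cost: only the supersampled passes re-shade textures."""
    sx, sy = MODES[mode]["ss"]
    return sx * sy

for mode in MODES:
    print(mode, total_samples(mode), texture_cost(mode))
```

Note how both 8x modes produce eight samples per pixel, but OpenGL 8x re-samples textures four times against only twice for Direct3D 8x — which matches the crisper textures seen on the OpenGL screenshots.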
Now let’s estimate how big the performance hit will be with enabled FSAA:
NVIDIA GeForce FX 5900 Ultra suffers only half the performance hit in the most widespread 4x anti-aliasing mode. The reasons are more than evident: unlike NV30, NV35 features a faster 256bit memory bus and enhanced caching schemes with Z-buffer and frame buffer compression.
In fact, we could expect NVIDIA GeForce FX 5900 Ultra to show an even bigger advantage in 8x mode; however, the supersampling algorithm used in this mode doesn’t let these expectations come true. Unlike multisampling, supersampling loads the graphics core in the first place, and here NV35 boasts no advantages over its predecessor. As a result, NVIDIA GeForce FX 5900 Ultra is only a tiny bit faster than NVIDIA GeForce FX 5800 Ultra, and the performance losses remain discouraging.
ATI RADEON 9800 Pro uses “pure” multisampling, that is why even in the toughest 6x mode no unpleasant surprises occur.
To estimate the performance hit during forced full-screen anti-aliasing in Direct3D we will again turn to Unreal Tournament 2003:
Very interesting: 4x full-screen anti-aliasing on NVIDIA GeForce FX 5900 Ultra causes almost the same performance hit as 2x FSAA on NVIDIA GeForce FX 5800 Ultra.
ATI RADEON 9800 Pro works almost as efficiently with enabled FSAA as NVIDIA GeForce FX 5900 Ultra.
So, with the launch of the new NVIDIA GeForce FX 5900 Ultra, full-screen anti-aliasing via multisampling has become almost “free”: the maximum performance hit we noticed with 4x FSAA was only 30.7%.
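For clarity, the performance hit quoted throughout this section is simply the relative frame-rate drop with FSAA enabled. A minimal sketch, with made-up frame rates chosen to reproduce the 30.7% figure:

```python
# Performance hit with FSAA enabled, as a percentage of the no-FSAA frame
# rate. The frame rates below are hypothetical illustration values; the
# article's measured maximum for 4x FSAA was 30.7%.
def fsaa_hit(fps_no_aa, fps_aa):
    return (1 - fps_aa / fps_no_aa) * 100

# Example: a hypothetical drop from 120 fps to 83.16 fps
hit = fsaa_hit(120.0, 83.16)
print(f"{hit:.1f}%")
```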
However, we can’t say the same about the most desired 8x FSAA mode: 8x FSAA in OpenGL and Direct3D eats up from 60% to 80% of performance. So, NVIDIA GeForce FX 5900 Ultra hasn’t become any better than its predecessor in 8x full-screen anti-aliasing modes.
Quality settings in Quake3 Arena looked as follows: 32-bit texture color and frame-buffer depths, maximum amount of textures and objects, tri-linear filtering is on, texture compression is off.
Without full-screen anti-aliasing and anisotropic filtering the graphics cards show relatively close results, because they are limited by the CPU speed and the overall system performance. However, in 1600x1200 we can already draw some conclusions: NVIDIA GeForce FX 5900 Ultra is much faster than NVIDIA GeForce FX 5800 Ultra, and ATI RADEON 9800 Pro gets right between them in Quality mode.
When we enable full-screen anti-aliasing, NVIDIA GeForce FX 5900 Ultra moves far ahead of its predecessor thanks to the 256bit memory bus. Moreover, the higher the resolution, the bigger the gap between them, so that in 1600x1200 NV35 turns out twice as fast as NV30. ATI RADEON 9800 Pro and ATI RADEON 9700 Pro land between NVIDIA’s solutions: the ATI based graphics cards are faster than GeForce FX 5800 Ultra thanks to their 256bit memory bus, but can’t compete with GeForce FX 5900 Ultra because of the latter’s higher working frequencies.
When we enable anisotropic filtering, the memory bus bandwidth becomes less important, while the anisotropic filtering speed and chip frequency come to the forefront. Nevertheless, the lower working frequency didn’t prevent NVIDIA GeForce FX 5900 Ultra from defeating its predecessor in the same anisotropic filtering optimization modes. The low performance hit of the ATI chips with anisotropic filtering enabled allowed them to leave their rivals, NVIDIA GeForce FX 5900 Ultra and NVIDIA GeForce FX 5800 Ultra, behind when the latter use the Quality mode.
Finally, when both anisotropic filtering and full-screen anti-aliasing are enabled, NVIDIA GeForce FX 5900 Ultra stays stably ahead of GeForce FX 5800 Ultra. ATI RADEON 9800 Pro manages to outperform NVIDIA GeForce FX 5900 Ultra in Quality mode in 1024x768 and 1280x1024 resolutions thanks to its fast anisotropic filtering algorithms; however, in 1600x1200 NVIDIA’s monster takes the lead.
For our tests we used 32bit color and Quality settings. We also used standard GFX: Extreme Quality settings including maximum image quality, namely maximum supported level of anisotropic filtering.
Therefore the tests in Serious Sam: The Second Encounter were run only in two modes: without full-screen anti-aliasing and with enabled 4x anti-aliasing.
Here NVIDIA GeForce FX 5900 Ultra is just a little ahead of the predecessor, which was determined mainly by the anisotropic filtering speed.
ATI chips support up to 16x anisotropy, twice the maximum of NVIDIA chips. That is why ATI RADEON 9700 Pro and RADEON 9800 Pro found themselves in harder testing conditions, which definitely affected their results.
Enabled full-screen anti-aliasing pushes forward NVIDIA GeForce FX 5900 Ultra, which has the highest memory bus bandwidth of all the graphics cards tested. ATI RADEON 9800 Pro and ATI RADEON 9700 Pro performed quite well compared with GeForce FX 5800 Ultra; however, only the 9800 Pro managed to beat the old rival.
Settings: Texture Detail: Highest, World Detail: Highest, Character Detail: Highest, Physics Detail: Normal, Character Shadows: ON, Dynamic Lighting: ON, Detail Textures: ON, Projectors: ON, Decals: ON, Coronas: ON, Decal Stay: Normal, Foliage: ON, Trilinear Filtering: ON. We ran the Antalus flyby-scene.
In 1280x1024 and 1600x1200 resolutions, where the CPU and the overall system performance no longer limit the graphics card speed that much, NVIDIA GeForce FX 5900 Ultra manages to get slightly ahead of GeForce FX 5800 Ultra. ATI RADEON 9700 Pro and ATI RADEON 9800 Pro fall behind.
When we enable full-screen anti-aliasing, the results are determined by the graphics bus bandwidth in the first place. That is why the cards with 256bit memory bus leave GeForce FX 5800 Ultra with its 128bit memory bus behind. The leadership belongs to NVIDIA GeForce FX 5900 Ultra, of course, due to its enormous memory bus bandwidth.
However, as soon as anisotropic filtering is on, the memory bus bandwidth no longer plays the crucial role. As a result, NVIDIA GeForce FX 5800 Ultra catches up with the newcomer thanks to its higher core frequency: 500MHz against 450MHz of GeForce FX 5900 Ultra.
ATI RADEON 9800 Pro appears just a little slower than NVIDIA GeForce FX 5800 Ultra and GeForce FX 5900 Ultra when they work in Quality mode, however, RADEON 9700 Pro with considerably lower clock frequencies falls quite far behind them.
When we force anisotropic filtering and full-screen anti-aliasing, the memory bus bandwidth starts influencing the results together with the anisotropic filtering speed, and the higher the screen resolution, the bigger this influence grows. As a result, NVIDIA GeForce FX 5900 Ultra pulls farther ahead of its predecessor as the resolution grows. ATI RADEON 9700 Pro and ATI RADEON 9800 Pro, which also feature high memory bus bandwidth, manage to outpace NVIDIA GeForce FX 5800 Ultra in high resolutions.
This test ensures complex workload for the tested graphics accelerators: it uses DirectX8 pixel and vertex shaders, has scenes with a lot of polygons, and requires high fillrate.
As a result, the laurels are won by NVIDIA GeForce FX 5900 Ultra. Then follows GeForce FX 5800 Ultra, and ATI’s solutions take the last places.
With full-screen anti-aliasing enabled, NVIDIA GeForce FX 5900 Ultra is certainly very far ahead of its predecessor. However, strange as it might seem, ATI based graphics cards with higher memory bus bandwidths still fail to surpass GeForce FX 5800 Ultra.
Forced anisotropic filtering apparently didn’t work on NVIDIA GeForce FX 5900 Ultra and GeForce FX 5800 Ultra: this is the only explanation I can find for the zero performance hit when anisotropic filtering was enabled.
ATI’s chips were honestly performing anisotropic filtering, as we see from the results. However, in this case the comparison doesn’t make much sense, so we will not take this result into account.
The tests ran at 32-bit texture color and frame-buffer depths; Z-buffer color depth equaled 24 bit. We used dual buffering and Pure Hardware T&L mode.
There is a certain element of randomness in the Game 1 scene, that is why the results are quite diverse. The situation is aggravated by the fact that the CPU and the overall system performance affect the results in Car Chase a lot. Nevertheless, the tests with forced full-screen anti-aliasing allowed us to draw a few conclusions.
First, NVIDIA GeForce FX 5900 Ultra is enormously faster than GeForce FX 5800 Ultra when tested with anti-aliasing enabled. ATI chips also outperform GeForce FX 5800 Ultra, though not that greatly, because their memory bus bandwidth is a little lower than that of NVIDIA GeForce FX 5900 Ultra.
Second, the results suggest that all Direct3D anisotropic filtering optimizations work for the NVIDIA GeForce FX family only in Performance mode, and these are most likely the optimizations based on scene geometry analysis. Replacing tri-linear filtering with a combination of bi-linear and tri-linear filtering doesn’t make much sense in 3DMark2001, because this test set doesn’t use any tri-linear filtering at all, and texture sharpness analysis works only in OpenGL so far.
In the Dragothic test we don’t see much of a difference between the Quality, Balanced and Performance modes on NVIDIA GeForce FX 5800 Ultra and NVIDIA GeForce FX 5900 Ultra. So the major advantage over the predecessor comes only from the higher memory bus bandwidth. By the way, we can’t help mentioning the excellent results of ATI RADEON 9800 Pro here: it performed very well with full-screen anti-aliasing enabled and yielded just a tiny bit to the leader, GeForce FX 5900 Ultra.
Here NVIDIA GeForce FX 5900 Ultra, ATI RADEON 9800 Pro and ATI RADEON 9700 Pro pulled far ahead of NVIDIA GeForce FX 5800 Ultra when anti-aliasing was enabled.
In the last gaming test of the 3DMark2001 SE test set we would like to dwell on the work of NVIDIA solutions in Performance mode with forced anisotropic filtering. It is remarkable that 3DMark2001 SE Nature turned out to be the hardest test for ATI RADEON 9800 Pro and RADEON 9700 Pro: having surpassed NVIDIA GeForce FX 5800 Ultra when full-screen anti-aliasing was enabled, both ATI solutions failed to retain their advantage once anisotropic filtering was forced. As a result, they didn’t perform that greatly in the “hardest” test.
All in all, 3DMark2001 SE results once again confirmed the advantage of NVIDIA GeForce FX 5900 Ultra over the predecessor, which we have already seen in the previous tests.
ATI RADEON 9800 Pro and RADEON 9700 Pro graphics cards turned out more successful in 3DMark2001 SE than in other gaming benchmarks. One possible reason is that 3DMark2001 SE doesn’t use tri-linear filtering, which means that with forced anisotropic filtering the ATI chips worked in Performance mode, which is faster than Quality.
However, we wouldn’t disregard other factors as well, such as the well-polished Direct3D part of the drivers, or the availability of some special driver optimizations for 3DMark2001 SE.
The tests were run with the following settings: Pixel Processing: None, Texture Filtering: Optimal, Max Anisotropy: 1, Vertex Shaders: Optimal, Repeat Tests: Off, Fixed Framerate: Off.
The results are absolutely identical to the theoretical expectations for NVIDIA GeForce FX 5900 Ultra and NVIDIA GeForce FX 5800 Ultra: the performance with enabled anisotropic filtering is determined by the graphics chip working frequency, and the performance with anti-aliasing – by memory bus bandwidth.
ATI based graphics cards performed quite well. ATI RADEON 9800 Pro pleased us most: even though its clock frequency is much lower than that of the new NVIDIA solution, it managed to fall only a few fps behind NVIDIA GeForce FX 5900 Ultra in Quality mode.
Here the memory bus bandwidth influences the testing participants’ performance much more. The performance difference between NVIDIA GeForce FX 5900 Ultra and NVIDIA GeForce FX 5800 Ultra is evident even without FSAA.
NVIDIA GeForce FX 5900 Ultra and GeForce FX 5800 Ultra have finally managed to take advantage of their architectural peculiarities: the test uses dynamic shadows calculated with the help of the stencil buffer, and NVIDIA chips can fill this buffer twice as fast using their 8-“pipeline” configuration.
ATI RADEON 9800 Pro managed to please our eye only when we enabled anisotropic filtering: the performance hit by ATI chips in case of forced anisotropic filtering appeared very low here.
This test is based on the same engine as the previous one, that is why the overall picture hardly changed. However, there is still one exception: ATI RADEON 9800 Pro and RADEON 9700 Pro performed even better with anisotropic filtering enabled.
ATI RADEON 9800 Pro and RADEON 9700 Pro retain their excellent performance level with enabled anisotropic filtering and not very high performance level with enabled anti-aliasing.
However, when both these functions are enabled simultaneously, ATI solutions perform as fast as NVIDIA solutions.
As for the performance difference between NVIDIA GeForce FX 5900 Ultra and NVIDIA GeForce FX 5800 Ultra, nothing new can be found here. NVIDIA GeForce FX 5900 Ultra is much faster than the predecessor when FSAA is enabled. However in case of anisotropic filtering the gap is no longer that huge.
If we set the well-known Coolbits key in the registry to 3 or enable graphics card overclocking on the driver level in RivaTuner, then among the available pages with graphics card settings there will appear “Clock Frequencies” page. With these settings we can try increasing the core and memory frequencies of our GeForce FX 5900 Ultra.
The maximum frequencies at which the card still worked stably were 490MHz for the core and 900MHz (450MHz DDR) for the memory. Compared with the nominal frequencies in 3D mode – 450MHz/850MHz (425MHz DDR) – this result is more than modest: the frequency grew by only 8.9% for the graphics core and 5.9% for the graphics memory.
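The gain percentages follow directly from the frequencies quoted above; a one-liner to verify the arithmetic:

```python
# Overclocking gains computed from the frequencies quoted in the article:
# core 450 -> 490 MHz, memory 850 -> 900 MHz (effective DDR rates).
def gain_percent(nominal, overclocked):
    return (overclocked / nominal - 1) * 100

core_gain = gain_percent(450, 490)
mem_gain = gain_percent(850, 900)
print(f"core +{core_gain:.1f}%, memory +{mem_gain:.1f}%")
```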
The benchmarks results are also not that impressive:
Well, no one had expected the first NVIDIA GeForce FX 5900 Ultra reference graphics card to work overclocking wonders. Moreover, bearing in mind that the graphics memory as well as the graphics core of the new NVIDIA solution work at frequencies close to the maximum, we can suppose that mass graphics cards based on the new chip will boast only slightly higher overclocking potential.
Well, it is evident that NVIDIA GeForce FX 5900 Ultra is not only a wonderful replacement for GeForce FX 5800 Ultra, but also today’s most powerful gaming accelerator. The launch of NV35 helped NVIDIA to win back the title of the gaming 3D graphics leader, because the fastest solution from ATI – RADEON 9800 Pro – turned out slower than NVIDIA GeForce FX 5900 Ultra in most tests.
The shift to a 256bit memory bus and the return to DDR SDRAM provided NVIDIA GeForce FX 5900 Ultra with a number of important advantages over its unlucky predecessor. First, with its memory working at a high 850MHz (425MHz DDR), GeForce FX 5900 Ultra boasts today’s fastest memory bus, with a peak bandwidth 70% higher than that of the GeForce FX 5800 Ultra memory bus: 25940MB/sec against 15258MB/sec. Together with the improved caching algorithms and enhanced frame buffer and Z-buffer compression, this fast memory bus doubles the performance of GeForce FX 5900 Ultra with FSAA enabled compared with GeForce FX 5800 Ultra.
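Both bandwidth figures follow from the bus width and clock rate; a short sketch reproducing the article’s numbers (MB here means 2**20 bytes, as the 25940/15258 MB/sec figures imply):

```python
# Peak memory bandwidth from bus width and clock rate. DDR transfers data
# twice per clock, hence transfers_per_clock=2.
def bandwidth_mb_s(clock_mhz, bus_bits, transfers_per_clock=2):
    bytes_per_sec = clock_mhz * 1e6 * transfers_per_clock * bus_bits / 8
    return bytes_per_sec / 2**20  # binary megabytes, as in the article

fx5900 = bandwidth_mb_s(425, 256)  # 425MHz DDR, 256-bit bus
fx5800 = bandwidth_mb_s(500, 128)  # 500MHz DDR II, 128-bit bus
print(round(fx5900), round(fx5800), f"+{(fx5900 / fx5800 - 1) * 100:.0f}%")
```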
Second, the DDR SDRAM memory chips from Hynix used on NVIDIA GeForce FX 5900 Ultra feature lower heat dissipation than the DDR II chips used on GeForce FX 5800 Ultra, which means that mass graphics cards will be able to do without massive memory heatsinks, or without any heatsinks for the memory at all.
And finally, DDR memory chips are cheaper, which will definitely reduce the graphics cards production cost.
The new NVIDIA chip boasts one more very important enhancement compared with NV30: NV35 performs DirectX9 pixel shaders using 32bit floating point calculations much faster. Of course, this improvement is of no use for contemporary games, because they don’t involve DirectX9 shaders yet. However, this is a very good start for the future, and with NV35 the company eliminated the weak spot of NV30.
Of course, NVIDIA GeForce FX 5900 Ultra based graphics cards are intended for 3D gaming enthusiasts: the expected pricing for these solutions will be $499. However, unlike NV30, NV35 chips will be manufactured in mass quantities (if a High-End product can ever be mass), that is why the new graphics cards prices will go down little by little.
This way, the launch of NVIDIA GeForce FX 5900 Ultra should please not only hardware enthusiasts but also the many users who do not demand that much from their graphics accelerators. The prices of NVIDIA based solutions will probably go down after this launch, and the same may now happen to ATI based solutions as well.