There Will Be Speed: The Clash of Modern Multi-GPU Technologies

This article is devoted to the situation with contemporary multi-GPU systems. We are going to introduce to you Nvidia’s response to ATI Radeon HD 3870 X2 - the new dual-chip GeForce 9800 GX2 graphics adapter that should replace the long-term leader of the high-end 3D segment - GeForce 8800 Ultra. We take a look at 4-way ATI CrossFire and compare it against Nvidia's new implementation of Quad SLI. Which quad is better? Let's find out!
UPDATE: Benchmark results of 3-way ATI CrossFire added.

by Alexey Stepin, Yaroslav Lyssenko, Anton Shilov
03/24/2008 | 05:01 PM

The concept of a top-performance multi-processor graphics subsystem is one of those ideas that get completely forgotten only to emerge again after a while. 3dfx’s SLI, ATI’s Rage Fury MAXX and S3’s MultiChrome were just a few attempts to realize that concept. Some of them were successful, while others, like the XGI Volari Duo, were a complete failure, but the evolution of multi-GPU solutions eventually gave birth to two viable technologies, ATI CrossFire and Nvidia SLI, which took a small, yet stable, niche on the market of top-class gaming systems.

As we have repeatedly written in our reviews, multi-GPU technologies could not take root in the inexpensive solutions sector. It is generally better to buy one top-end card instead of two mainstream ones. One of the reasons is the specifics of today’s game engines (HDR, complex shaders with loops, branches and dependencies, multiple-pass or deferred rendering, post-processing, etc.) that don’t allow getting the highest performance from multi-GPU systems without application-specific software optimizations. But the user doesn’t want to know anything about these difficulties. He wants the graphics card he has bought to deliver its maximum performance in every game. And that’s something that today’s multi-GPU solutions can’t offer. Consequently, they can’t avoid the status of an “expensive toy for enthusiasts”. Well, the most popular games do get the necessary optimizations quickly, which is an additional factor that keeps multi-GPU technologies afloat. There is another factor, though: speed, for those who do not mind the price or stability concerns.

Sometimes the performance of GPUs doesn’t grow fast enough to keep up with the system requirements of the newest games. We can recall Crysis, which cannot run at an acceptable speed on any contemporary system if you select the highest level of detail. Creating a new GPU is a complex, long and expensive process, and there can be periods when the GPU developer doesn’t have a graphics core with sufficient performance at the moment. On the other hand, the current market of 3D graphics forgives no delays. If you delay today, you will suffer huge losses or will even have to leave the market altogether tomorrow. The developers have to find ways to maintain the performance growth and win some time. That’s when the multi-GPU concept comes in handy. As you know, the graphics department of Advanced Micro Devices, the former ATI Technologies, did not have a graphics core capable of matching Nvidia’s G92 and had to resort to CrossFire technology, creating the ATI Radeon HD 3870 X2 graphics card with two RV670 chips on board. Our tests showed that the card had serious potential. Even though it did not win all the tests, the ATI Radeon HD 3870 X2 became the fastest single graphics card available, allowing ATI to claim technological superiority after a long while.

The competition on the consumer 3D graphics market never ends, though, and Nvidia has introduced a new solution to knock the Radeon HD 3870 X2 off the throne. Not having a new-generation graphics core right now, the company followed the same route as ATI when creating its new GeForce 9800 GX2. After all, Nvidia already had experience developing dual-processor graphics cards.


Multi-GPU Solutions in the Past, Present and Future

Again, this multi-GPU approach must be a temporary thing until ATI and Nvidia are prepared to launch their next-generation GPUs. We produced arguments in favor of this point of view in our Radeon HD 3870 X2 review. In brief, with single-chip designs the developer company ensures the freedom of price and market maneuvering for itself. On one hand, the company can utilize defective cores by disabling some subunits in inexpensive graphics cards. On the other hand, as the tech process improves, the frequency potential and, accordingly, performance grow. This helps extend the product range to better serve the end-user. A transition to homogeneous multi-chip solutions is no good for either party: for the manufacturer because such graphics cards are expensive to manufacture and support, and for the buyer because there would be few solutions to choose from, and they would be more expensive in comparison with single-chip solutions with similar characteristics.

Well, the single-core concept has a serious drawback, too. The manufacturers cannot increase the performance of homogeneous graphics cores infinitely because it makes the chip too complex and too hungry for power. We’ve talked about the downsides of homogeneous multi-chip solutions a lot in our reviews. Perhaps the future belongs to heterogeneous Multi-Chip Modules. The concept isn’t new, actually. Texture processors and rasterization units were designed as individual chips back at the times of the 3dfx Voodoo and Voodoo 2. This approach was also employed in professional 3D accelerators from 3Dlabs in which, for example, a geometrical coprocessor could be an individual chip.

On a new level, this could be a set of dies of varying functionality combined in a single package and connected with internal high-speed interfaces. For example, one die contains a command processor, ROPs, memory and system bus controllers, and DVI/HDMI/DisplayPort controllers, and the other die contains TMUs and universal execution units. Alternatively, the basic die may only contain a command processor and controllers of I/O interfaces while everything else is implemented as additional dies. This would provide even more flexibility in configuring such a solution. Performance can thus be increased easily by installing additional execution units into the chip package. This approach is less profitable than the single-die architecture, but looks preferable to the homogeneous multi-GPU concept irrespective of implementation (MCMs or several chips with different functionality). The main advantage of a heterogeneous MCM/Multi-GPU product over homogeneous MCM and classic multi-GPU ones is that it always delivers maximum performance and does not depend on software optimizations. No wonder, as it is actually an ordinary graphics processor whose functional units are distributed among several physical chips. This approach seems to be the most progressive, but it’s too early yet to predict its future.

In this review we’ll describe the Nvidia GeForce 9800 GX2 graphics card and 4-way SLI configurations, but we’ll also cover the current state of multi-GPU technologies to see what benefits they can bring to the gamer. We’ll discuss classic dual-chip solutions as well as graphics subsystems with multiple GPUs, including quad-processor Nvidia Quad SLI and ATI CrossFireX configurations.


Nvidia GeForce 9800 GX2

Work Principles and Technical Specifications

We can’t expect anything particularly special from the new card as it uses the well-known G92 core. The specs of the GeForce 9800 GX2 are as follows:

It’s clear that Nvidia’s new card is just an SLI tandem made out of two GeForce 8800 GTS 512MB with different clock rates and with a PCI Express switch to ensure that SLI technology will work on any mainboard. The GeForce 9800 GX2 is no different from the ATI Radeon HD 3870 X2 in this respect. Interestingly, the clock rates of the GPUs are lowered relative to the 8800 GTS 512MB: from 650/1650MHz to 600/1500MHz. The GPU clock rates of the ATI Radeon HD 3870 X2 are, on the contrary, increased in comparison with the single card. It should also be noted that ATI’s card has a single-PCB design, and the 55nm RV670 is more economical and runs cooler than the G92 in full configuration.

The main advantage of the GeForce 9800 GX2 over the ATI Radeon HD 3870 X2 is that it has two times the amount of TMUs and features a scalar architecture of the ALU array, which is more effective in today’s games due to support on the game developers’ part. We can also note the card’s support of PCI Express 2.0 and the lack of support for DirectX 10.1: AMD’s solution is just the opposite. Neither standard is widespread yet, so the availability or lack of such support doesn’t matter much for today’s games. Theoretically, Nvidia’s card should be faster, especially as it is positioned in a more expensive price category than the ATI Radeon HD 3870 X2 ($599-649 against $449), but we should first see how well SLI is implemented. If the technology is not supported properly, the user will only get the performance of a reduced-frequency GeForce 8800 GTS 512MB out of his GeForce 9800 GX2 at a much higher price. We’ll check this out in our gaming tests, of course.

The Nvidia GeForce 9800 GX2 works like any other SLI tandem but the GPUs are located within the same graphics card and share one PCI Express x16 slot. That’s the way the GeForce 7950 GX2 used to be designed:

The GeForce 9800 GX2 has only one external MIO port, which is why the maximum configuration is a 4-way SLI system with two 9800 GX2 cards. ATI CrossFireX has the same limitation these days.

As for the way the multi-GPU technology operates, we don’t have accurate info about it. Nvidia seems to have followed ATI in abandoning all rendering modes other than AFR. SLI technology used to support SFR mode in which each GPU would render one half of the frame, and a Quad SLI system could use a hybrid SFR on AFR mode, in which each pair of GPUs would render one frame using SFR and the frames were output alternately in AFR mode. The use of SFR is currently obstructed by the spread of sophisticated rendering techniques, so the transition to AFR looks reasonable. This mode doesn’t require meticulous software optimizations and can often do without them altogether. However, some reviewers don’t think that SFR mode is forgotten completely like Scissor and SuperTiling modes of ATI CrossFireX.

Among the drawbacks typical of AFR, the delayed response of the game to the user’s actions can be mentioned. This is going to be almost imperceptible with two GPUs, though. With four GPUs, the delay may be conspicuous, affecting your playing comfort. We’ll check out this problem below.
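To illustrate how AFR distributes work and why the response delay grows with the number of GPUs, here is a minimal Python sketch. It is our own simplified model, not a description of Nvidia’s or ATI’s drivers: it assumes that frames are handed out to the GPUs in strict rotation, that every GPU needs the same time per frame, and that a frame therefore stays in flight for as many display intervals as there are GPUs. The function names are hypothetical, and driver queues and CPU time are ignored.

```python
def afr_gpu_for_frame(frame_index: int, gpu_count: int) -> int:
    """Return the index of the GPU that renders a frame under round-robin AFR."""
    return frame_index % gpu_count


def afr_render_latency_ms(displayed_fps: float, gpu_count: int) -> float:
    """Estimate how long one frame is in flight (idealized assumption).

    With N GPUs working in rotation, frames reach the screen every
    1000 / displayed_fps milliseconds, but each individual GPU has
    N such intervals to finish its own frame.
    """
    frame_interval_ms = 1000.0 / displayed_fps
    return gpu_count * frame_interval_ms


# Frames 0..7 on a four-GPU system go to GPUs 0, 1, 2, 3, 0, 1, 2, 3.
schedule = [afr_gpu_for_frame(i, 4) for i in range(8)]

# At the same displayed frame rate, four GPUs keep a frame in flight
# twice as long as two GPUs do in this model.
latency_two = afr_render_latency_ms(60.0, 2)
latency_four = afr_render_latency_ms(60.0, 4)
```

In this model the delay at a fixed displayed frame rate is proportional to the GPU count, which is consistent with the expectation that a lag barely noticeable with two GPUs may become conspicuous with four.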


HybridPower

Hybrid CrossFire and HybridPower are logical additions to traditional power-saving technologies such as ATI PowerPlay, supported by the ATI Radeon HD 3000 family. While PowerPlay dynamically manages the frequency and voltage of the GPU and turns off functional subunits that are idle at a given moment, HybridPower and Hybrid CrossFire work completely differently. If supported by the mainboard, this technology allows using the graphics core integrated into the chipset to output the 2D image, disabling the discrete graphics cores and thus reducing the power consumption of the platform.

Although the power draw of one G92 chip is rather low in 2D mode, this capability may come in handy if the system includes two GeForce 9800 GX2 cards because they carry a total of 4 such chips. We estimate the power draw of a GeForce 9800 GX2 Quad SLI system at 140W and more even in idle mode. If the idle chips are disabled, the platform may get more economical.

HybridPower itself is a subset of Hybrid SLI technology, but the ability to work together with an integrated graphics core is not very important for such a powerful graphics card as the GeForce 9800 GX2. That’s why this interesting new card doesn’t support the other part of the technology, GeForce Boost.


Gainward Bliss 9800 GX2 1024MB

Package and Accessories

Nvidia’s new dual-processor solution is represented by the Gainward Bliss 9800 GX2 1024MB graphics card in this review. We’ll now tell you about its packaging and accessories. Importantly, the number 1024 shouldn’t mislead you. The card does not have two 1GB sets of memory chips. The total amount of graphics memory available to 3D applications is 512 megabytes, as with the standard GeForce 9800 GX2.

Gainward didn’t develop a new package design, and the box the card comes in is quite familiar to us:

It is a mild and even plain design, but we don’t think it’s appropriate for a top-class expensive product. The package protects the card against possible damage during transportation and storage well enough, though. Besides the card proper, the box contains the following:

The accessories are scanty. There are no additional cables or adapters. You will find a copy of Tomb Raider: Anniversary in the box – it was included with every product from Gainward we reviewed recently. The lack of DVI-I → HDMI and 8-pin PCI Express adapters is quite a serious problem. Not all modern PSUs offer an 8-pin power adapter for graphics cards, but the GeForce 9800 GX2 doesn’t work if you plug a 6-pin connector into its 8-pin header.

So, while the package of the Gainward Bliss 9800 GX2 1024MB is rather good, the accessories are no good at all, especially as the product belongs to the highest price category. We think the vendor should have made its flagship product more appealing to the potential buyer.


PCB Design

The new card is as large as the Nvidia GeForce 8800 GTX and ATI Radeon HD 3870 X2 but occupies two slots, like the Nvidia GeForce 7950 GX2.

You may remember that the GeForce 7950 GX2 used to be criticized for certain design flaws, particularly for the poor cooling of the GPU installed on the bottom PCB. The developer has corrected those flaws in the new GeForce 9800 GX2. As opposed to the predecessor card, the PCBs of the new one face each other, and the chips give their heat away to the single heatsink in between them. Nvidia considers this design optimal as it simplifies the wiring of the PCBs and reduces the thermal load on the graphics card to achieve higher GPU clock rates.

We disagree with this statement, though. Yes, with this design each GPU heats up only its own PCB, but the GPUs also heat each other through the common heatsink, which negates the possible positive effect. As indirect proof of our point, the new card’s clock rates are reduced considerably relative to the GeForce 8800 GTS 512MB, let alone the GeForce 9800 GTX. The achievement of a higher memory frequency through optimal PCB wiring doesn’t sound like a strong argument, either. Yes, the memory of the reference Radeon HD 3870 X2 is clocked at a lower frequency, but PowerColor’s version of this card features GDDR4 clocked at 1125 (2250) MHz, and the single-PCB design was not a problem for it.

It took us quite a lot of time to take our GeForce 9800 GX2 apart in order to see its internals.

The PCB being so large, we guess both the graphics chips could be accommodated easily on one PCB, especially as the PCI Express switch employed by Nvidia is far smaller than the PLX PEX8547 installed on the ATI Radeon HD 3870 X2. Each PCB has a large shaped hole for the cooler (located inside the sandwich of two PCBs) to get fresh air from. So, the theoretically available area is not utilized very efficiently notwithstanding Nvidia’s claims about optimized wiring. The PCBs are connected with two flexible cables and fastened together with metallic poles.

Each PCB uses a dedicated high-frequency three-phase power circuit based on a Volterra VT1165MF controller. Such controllers are also installed on the ATI Radeon HD 3870 X2. Tiny Intersil ISL6269CRZ chips are responsible for the memory. The use of a high-frequency digital power circuit helped do without large electrolytic capacitors. It is important for the GeForce 9800 GX2 considering its high component density. Each PCB is equipped with a power connector: the bottom one carries a 6-pin plug and the top one, an 8-pin PCI Express 2.0 plug. As opposed to the Radeon HD 3870 X2, you cannot do with two 6-pin cables – the card won’t start up. Near the 6-pin power plug there is a 2-pin connector for the audio-over-HDMI feature which has already become standard.

The plastic casing makes it difficult to plug the power cables into the GeForce 9800 GX2 because the slit for the connector is very narrow. A connector with a large lock just won’t fit into the plug. This refers to our Enermax Galaxy DXX EGX1000EWL power supply, for example. We just had to cut the locks off the PSU’s connectors. It is also impossible to unplug a connected power cable without a thin-tipped screwdriver for the same reason. This problem is mentioned on Nvidia’s official website in the list of PSUs certified for use with the GeForce 9800 GX2.

The PCI Express switch installed on the bottom PCB is marked as BR04-300-A2.

We couldn’t find its specs, but we know it supports PCI Express 2.0 and can switch 48 lanes. The GPUs are marked as G92-450-A2 whereas the GeForce 8800 GTS 512MB carries chips marked as G92-400-A2. This might mean a new version of the G92 with a higher frequency potential if it were not for the same revision number. Both chips are dated the 52nd week of 2007, i.e. late December. As we mentioned above, the GPUs have a main domain frequency of 600MHz and a shader domain frequency of 1500MHz. The GPU configuration is standard with 128 ALUs, 32 (64) TMUs, and 16 ROPs.

The PCBs of our GeForce 9800 GX2 carry GDDR3 memory manufactured by Samsung. These K4J52324QE-BJ1A chips have a capacity of 512Mb (16Mbx32), a voltage of 1.9V and a rated frequency of 1000 (2000) MHz. The memory is indeed clocked at this rated frequency, providing a bandwidth of 64GB/s to each GPU.

The total amount of graphics memory is 1024MB, but 3D applications can only use 512MB due to the specifics of homogeneous multi-chip architecture. As a result, the new card may prove slower than Nvidia’s GeForce 8800 GTX/Ultra and GeForce 8800 GTS 1024MB in some games that need more than 512MB of memory.
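The memory figures above can be cross-checked with simple arithmetic. The Python sketch below is only an illustration under stated assumptions: the number of chips per GPU is inferred from the capacities quoted in the text (we did not take it from a datasheet), the bus width is assumed to be the sum of the chips’ 32-bit interfaces, and the helper names are ours.

```python
CHIP_CAPACITY_MBIT = 512     # density of one K4J52324QE chip, from the text
CHIP_INTERFACE_BITS = 32     # "x32" organization, from the text
MEMORY_PER_GPU_MB = 512      # 1024MB on the card, split between two GPUs
EFFECTIVE_RATE_MT_S = 2000   # 1000 (2000) MHz GDDR3
GPU_COUNT = 2


def chips_per_gpu(memory_mb: int, chip_mbit: int) -> int:
    """Number of chips needed for one GPU (inferred, not counted on the PCB)."""
    return memory_mb * 8 // chip_mbit


def bandwidth_gb_s(bus_bits: int, rate_mt_s: float) -> float:
    """Peak bandwidth: bytes per transfer times millions of transfers per second."""
    return bus_bits / 8 * rate_mt_s / 1000


def memory_totals_mb(per_gpu_mb: int, gpu_count: int) -> tuple:
    """Return (installed, usable) memory.

    Assumption: each GPU keeps its own copy of the data, so a 3D
    application effectively sees the memory of one GPU only.
    """
    return per_gpu_mb * gpu_count, per_gpu_mb


chips = chips_per_gpu(MEMORY_PER_GPU_MB, CHIP_CAPACITY_MBIT)
bus_bits = chips * CHIP_INTERFACE_BITS
per_gpu_bandwidth = bandwidth_gb_s(bus_bits, EFFECTIVE_RATE_MT_S)
installed_mb, usable_mb = memory_totals_mb(MEMORY_PER_GPU_MB, GPU_COUNT)
```

Under these assumptions the sketch arrives at eight chips and a 256-bit bus per GPU, and the resulting per-GPU bandwidth agrees with the 64GB/s quoted above, while the usable memory stays at 512MB despite 1024MB being installed.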

The reference GeForce 9800 GX2 is equipped with two DVI-I and one HDMI port, but the latter can be replaced with a DisplayPort. The top PCB has a seat for a translator chip of that interface. The card doesn’t support analog video output, which is normal for today. However, there is an empty seat for some connector on the bottom PCB, near the DVI port. Perhaps it can be used to implement the support of S-Video/RCA interfaces.

Besides the connectors, there are two LEDs on the card’s mounting bracket: a dual-color and a blue one. The former indicates power-related problems. It is green when both power cables are attached and red when one or both cables are missing or when both cables have 6-pin connectors. The second indicates a significant drawback of the GeForce 9800 GX2 in comparison with the ATI Radeon HD 3870 X2: it doesn’t support more than one monitor in SLI mode. This LED highlights the Master DVI port you should connect your monitor to in SLI and Quad SLI modes. Nvidia is expected to do away with this limitation in the next versions of ForceWare.


Cooling System

As we noted already, the GeForce 9800 GX2 differs fundamentally in its design from the ATI Radeon HD 3870 X2 as well as Nvidia’s previous-generation dual-chip solutions. A special dual-sided cooler had to be developed for it.

We didn’t take it apart completely because parts of the cooler seemed to be fastened not only with screws but also with thermal glue. We didn’t apply force to avoid damaging the card. Anyway, it is clear that the cooler uses copper soles to take heat off the GPU dies. It also incorporates flat heat pipes and a common heatsink consisting of thin aluminum plates. Dark-gray thermal grease is used as a thermal interface between the copper soles and the GPUs. There are fiber pads soaked in white thermal grease between the cooler’s aluminum bottom and the memory chips. This is all traditional for Nvidia’s products.

A 5.8W Delta BFB1012L blower, familiar to us from other Nvidia products, takes the air from both sides of the graphics card through the shaped holes in the PCBs and cools the heatsink. The cooler’s casing is made from aluminum and makes the card considerably heavier.

The card’s mounting bracket is populated with DVI and HDMI ports, and there are but few slits for exhausting the hot air out of the system case. These slits are rather an aesthetic feature: they are highlighted with bright green LEDs when the card is working, but the air flow that goes through them is rather weak. Most of the hot air is exhausted sideways as the heatsink’s ribs are placed at an angle to the mounting bracket. That’s not the best solution because this hot air will stay in the system case, worsening the thermal conditions in it: the GeForce 9800 GX2 generates quite a lot of heat in 3D mode.

More effective would be the classic positioning of DVI ports in the bottom part of the mounting bracket while the top part could be all dedicated to an exhaust hole like in the GeForce 8800 GTS 512MB or GeForce 8800 GTX. The HDMI port is not a serious advantage because its functionality is easily realized with an adapter. In SLI mode, there is only one active port and the card supports one monitor only.

The above-described cooling system developed by Nvidia for the GeForce 9800 GX2 seems to be not quite good for a card with a heat dissipation of 180-200W. A high speed of the fan is likely to be called for in order for this cooler to be effective and this will affect the card’s noise characteristics. We’ll check this out in the next section.


Nvidia GeForce 9800 GX2: Noise, Temperature, Overclockability, Compatibility

Temperature and noise are very important parameters for every top-league graphics card. It is especially so with the GeForce 9800 GX2, which features a non-standard component layout and is expected to generate more heat than any other card we’ve seen, including the Radeon HD 2900 XT.

The level of ambient noise in our lab was 36dBA and the level of noise at a distance of 1 meter from the working testbed with a passively cooled graphics card inside was 43dBA. Here are the results:

When idle, the GeForce 9800 GX2 is almost silent thanks to the high-quality fan rotating at a reduced speed. In 3D mode the speed management system spins the fan up to keep the temperature within reasonable limits. The card becomes audible even in a closed system case. It can match the notorious reference cooler of the Radeon HD 2900 XT in terms of noise except that you don’t hear the rattle of the fan motor, which makes Nvidia’s cooler somewhat more comfortable.

In 2D mode the GPUs are as hot as 65°C. When you launch a game, the temperature quickly rises to 81°C even when the side panel of the system case is removed. When we closed our Chieftec LCX-01 case (with a 120mm fan on the rear panel), the GPUs got as hot as 90-91°C.

Our attempt to overclock our GeForce 9800 GX2 was a success. We used the nTune tool to increase the GPU frequency to 710MHz. The shader domain frequency was 1750MHz. The memory chips could be overclocked to 1100 (2200) MHz – that’s a good result for chips with a rated frequency of 1000 (2000) MHz. The card was stable at these frequencies, but we didn’t test it in the overclocked mode due to time constraints.
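To put the overclocking result into perspective, the gains can be expressed in percent of the stock frequencies (600MHz for the core, 1500MHz for the shader domain and 1000MHz for the memory, as quoted earlier in this review). The short Python sketch below does just that; the helper name is ours.

```python
def percent_gain(stock_mhz: float, overclocked_mhz: float) -> float:
    """Relative frequency increase over the stock clock, in percent."""
    return (overclocked_mhz - stock_mhz) / stock_mhz * 100.0


# Stock and overclocked frequencies in MHz, taken from the text.
clocks = {
    "core": (600, 710),
    "shader": (1500, 1750),
    "memory": (1000, 1100),
}

gains = {name: percent_gain(stock, oc) for name, (stock, oc) in clocks.items()}

for name, gain in gains.items():
    print(f"{name}: +{gain:.1f}%")
```

By this measure the GPU domains gained roughly 17-18% and the memory gained 10%. These are frequency gains only, since we did not benchmark the card in the overclocked mode.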

As for compatibility, the GeForce 9800 GX2 refused to start up on an Intel Desktop Board D925XCV, which is in fact one of the first mainboards with PCI Express support. Unfortunately, this incompatibility prevented us from measuring the new card’s power consumption. The GeForce 9800 GX2 worked normally on every other mainboard we tried it on.


Multi-GPU: Problems and Limitations

When we tested systems with a single dual-chip graphics card, we didn’t face any serious problems. Both the ATI Radeon HD 3870 X2 and the new Nvidia GeForce 9800 GX2 worked fine, except for the standard issue of all homogeneous dual-chip graphics solutions: no performance improvement in games that lack multi-GPU support in the drivers.

However, we ran into a whole bunch of issues with ATI CrossFireX and 4-way SLI systems. Among them are the following ones:

Besides, we have witnessed Nvidia 4-way SLI and ATI 4-way CrossFireX systems overheat multiple times, so it is not only the drivers that should be blamed for unstable operation.

As you see, the performance of contemporary multi-processor graphics systems in the consumer segment is still far from ideal. The developers managed to achieve acceptable stability for dual-chip configurations, but with more than two graphics cores the situation leaves much to be desired. These systems suffer not only from performance and compatibility issues but also from a lot of stability problems. At this time, they are mostly an expensive toy for computer enthusiasts with unlimited budgets who love playing with computer hardware, rather than a stable and reliable elite gaming platform for everyday gaming fun.


Testbed and Methods

To test the gaming performance of contemporary multi-GPU platforms we put together the following testbeds:

According to our testing methodology, the drivers were set up to provide the highest possible quality of texture filtering and to minimize the effect of software optimizations used by default by both AMD/ATI and Nvidia. Also, to ensure maximum image quality, we enabled antialiasing of transparent textures: Adaptive Anti-Aliasing/Multi-sampling for ATI Catalyst and Antialiasing – Transparency: Multisampling for Nvidia ForceWare. As a result, our ATI and Nvidia driver settings looked as follows:

ATI Catalyst:

Nvidia ForceWare:

For our tests we used the following games and benchmarks:

First-Person 3D Shooters

Third-Person 3D Shooters

RPG

Strategies

Synthetic Benchmarks

We selected the highest possible level of detail in each game using the standard tools provided in the game’s own menu. The games’ configuration files weren’t modified in any way. The only exception was Enemy Territory: Quake Wars, where we disabled the built-in frame rate cap of 30fps. Games supporting DirectX 10 were tested in that mode.

Besides Nvidia GeForce 9800 GX2 we have also included the results for the following single graphics accelerators:

Since this article is devoted not only to Nvidia GeForce 9800 GX2, but also to contemporary multi-GPU platforms in general, we have also tested the performance of the following configurations:

The tests were performed in the following resolutions: 1280x1024/960, 1600x1200, 1920x1200 and 2560x1600. If the game didn’t support the 16:10 display format, we set the last two resolutions to 1920x1440 and 2048x1536 respectively.

Single graphics cards were not tested in resolutions beyond 1920x1200 because they are evidently unable to ensure acceptable performance in most cases. At the same time, we didn’t test multi-processor configurations in 1280x1024, since they will not be able to reveal their potential in this resolution because of CPU limitations.
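The resolution rules above can be summarized in a few lines of Python. This is only a sketch of our reading of those rules, not the actual test scripts: the function name and its flags are hypothetical, the 1280x1024/960 step is written simply as 1280x1024, and we assume that “beyond 1920x1200” means dropping the top step of the list in 4:3 games as well. Whether a dual-chip card counts as a single card or as a multi-processor configuration is left to the caller.

```python
BASE_RESOLUTIONS = ["1280x1024", "1600x1200", "1920x1200", "2560x1600"]

# 4:3 substitutes for games without 16:10 support, from the text.
FALLBACK_4_3 = {"1920x1200": "1920x1440", "2560x1600": "2048x1536"}


def resolutions_for(supports_16_10: bool, multi_card: bool) -> list:
    """Return the list of resolutions one configuration is tested in."""
    resolutions = list(BASE_RESOLUTIONS)
    if not supports_16_10:
        resolutions = [FALLBACK_4_3.get(r, r) for r in resolutions]
    if multi_card:
        # The lowest resolution is skipped: it is CPU-limited.
        return resolutions[1:]
    # Assumption: single cards skip the top step of the list.
    return resolutions[:-1]


single_wide = resolutions_for(supports_16_10=True, multi_card=False)
quad_4_3 = resolutions_for(supports_16_10=False, multi_card=True)
```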

We used “eye candy” mode wherever it was possible without disabling HDR/Shader Model 3.0/Shader Model 4.0. Namely, we ran the tests with 16x anisotropic filtering and 4x MSAA antialiasing enabled. We enabled them from the game’s menu. If this was not possible, we forced them using the appropriate settings of the ATI Catalyst and Nvidia ForceWare drivers. Performance was measured with the games’ own tools, and the original demos were recorded if possible. Otherwise, the performance was measured manually with the Fraps utility version 2.9.1. We measured not only the average speed, but also the minimum speed of the cards where possible.
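For readers who want to process their own measurements, here is a minimal Python sketch of how an average and a minimum frame rate can be derived from a list of frame timestamps. It is an assumption-laden illustration, not a description of how Fraps or the games’ built-in tools work internally: the input is assumed to be a plain list of timestamps in milliseconds, and the minimum is taken as the lowest frame count among complete one-second windows. The function name is ours.

```python
def fps_stats(timestamps_ms):
    """Return (average_fps, minimum_fps) for a list of frame timestamps in ms."""
    if len(timestamps_ms) < 2:
        raise ValueError("need at least two frame timestamps")
    start = timestamps_ms[0]
    elapsed_ms = timestamps_ms[-1] - start
    if elapsed_ms <= 0:
        raise ValueError("timestamps must be increasing")

    # Average: number of frame intervals divided by the elapsed time.
    average_fps = (len(timestamps_ms) - 1) * 1000.0 / elapsed_ms

    # Minimum: fewest frames that started within any complete second.
    full_windows = int(elapsed_ms // 1000)
    if full_windows == 0:
        return average_fps, average_fps
    counts = [0] * full_windows
    for t in timestamps_ms:
        window = int((t - start) // 1000)
        if window < full_windows:
            counts[window] += 1
    return average_fps, float(min(counts))


# Synthetic example: one second at 50fps followed by one second at 25fps.
sample = list(range(0, 1000, 20)) + list(range(1000, 2000, 40)) + [2000]
average, minimum = fps_stats(sample)
```

With this synthetic input the average is 37.5fps while the minimum is 25fps, which shows why the two figures should be read together.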


Performance in First-Person 3D Shooters

Battlefield 2142

This game doesn’t support display resolutions of 16:10 format, so we use resolutions of 1920x1440 and 2048x1536 pixels (4:3 format) instead of 1920x1200 and 2560x1600 for it.

The new card from Nvidia isn’t brilliant in this test: it is just slightly ahead of the GeForce 8800 GTS 512MB and even finds itself behind the latter at 1280x1024. The ATI Radeon HD 3870 X2 is not that fast, but stays less than 10% behind the GeForce 9800 GX2. At resolutions of 1600x1200 and 1920x1440 it ensures the same playing comfort, but generates less heat, doesn’t require an 8-pin power connector, and supports multi-monitor configurations.

As for the multi-GPU solutions based on more than two GPUs, ATI’s CrossFireX technology looks advantageous. Having worse technical specs, the quad-processor system from ATI is ahead of Nvidia’s configuration at every resolution save for 2048x1536. In the latter case the difference is only 6%, though. The minimums of speed we’ve recorded should not be taken seriously because they can vary wildly when it comes to multi-GPU technologies. We didn’t spot a serious delay of response but sudden fluctuations of speed are perceptible and even annoying at times. Even if this behavior is due to insufficiently optimized drivers and can be corrected in the future, it is no good for systems that claim to be elite gaming platforms with unprecedented level of performance.

BioShock

BioShock doesn’t support FSAA when running in Windows Vista’s DirectX 10 environment. That’s why we benchmarked the cards without FSAA.

SLI technology works properly on the single GeForce 9800 GX2, ensuring a good performance increase at resolutions of 1600x1200 and higher. The card becomes the new leader among single graphics cards. The ATI Radeon HD 3870 X2 is only 15% behind the GeForce 9800 GX2 in terms of average performance while having the same minimum speed. Considering the lower price (by $150 at least), the ATI solution is a preferable buy.

We can’t say anything good about the Quad SLI system: the pair of GeForce 9800 GX2 cards delivers lower performance than one such card. CrossFireX technology works perfectly: the system with two Radeon HD 3870 X2 cards is over 30% faster than the single card at 1920x1200 and ensures excellent performance for 2560x1600.


Call of Juarez

The in-game benchmarking tools do not support 2560x1600 resolution. We have to limit our test to 1920x1200.

It’s a difficult situation. On one hand, the Nvidia GeForce 9800 GX2 becomes the first single graphics card to allow playing Call of Juarez DX10 Enhancement Pack with the highest level of detail at least at 1280x1024. On the other hand, the specifics of the memory controller and/or inefficient memory management in the driver make Nvidia’s solutions, including the tandem of two GeForce 9800 GX2, fail at the higher resolutions.

ATI’s cards are free from that problem, and the CrossFireX subsystem feels all right at any resolution. The performance of the tandem of two Radeon HD 3870 X2 is especially impressive: the average frame rate of over 40fps at 1920x1200 is fantastic for such a resource-hungry application as Call of Juarez. The low minimum speed spoils the impression somewhat, but like in Battlefield 2142 it varies widely and doesn’t affect your playing comfort much.

Call of Duty 4: Modern Warfare

Nvidia’s new GeForce 9800 GX2 shows superb scalability, being about two times as fast as the single GeForce 8800 GTS 512MB despite the somewhat lower frequency of the GPUs. With the addition of a second card, the 4-way SLI subsystem adds 40-50% more performance, which is an excellent result too, because the scalability of a homogeneous GPU solution worsens greatly as you add more GPUs into the system.
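The scalability figures above can be turned into an efficiency estimate with a few lines of Python. The sketch below works with normalized performance rather than measured frame rates: the single GeForce 8800 GTS 512MB is taken as 1.0, the GeForce 9800 GX2 as 2.0 (“about two times as fast”) and the 4-way SLI system as 40-50% above that. The function name is ours, and the difference in GPU clock rates between the cards is ignored.

```python
def scaling_efficiency(relative_performance: float, gpu_count: int) -> float:
    """Share of the ideal N-fold speedup actually achieved, in percent."""
    return relative_performance / gpu_count * 100.0


single_gpu = 1.0               # GeForce 8800 GTS 512MB, our baseline
dual_gpu = 2.0 * single_gpu    # GeForce 9800 GX2, "about two times as fast"
quad_low = dual_gpu * 1.4      # 4-way SLI, +40%
quad_high = dual_gpu * 1.5     # 4-way SLI, +50%

dual_efficiency = scaling_efficiency(dual_gpu, 2)
quad_efficiency_low = scaling_efficiency(quad_low, 4)
quad_efficiency_high = scaling_efficiency(quad_high, 4)
```

On these rounded figures the two-GPU card reaches about 100% of the ideal speedup, while the four-GPU system reaches about 70-75%, which is still a strong result for a homogeneous multi-GPU configuration.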

Alas, the lack of software optimizations means a lack of performance benefits for the ATI Radeon HD 3870 X2 and the 3-way/4-way CrossFireX systems.


Crysis

Hoping to see acceptable gaming performance at least on multi-GPU systems, we changed the level of detail from Very High to High for every setting except the Shaders option, which affects the image quality the most. Even in this case we decided to use no FSAA.

Unfortunately, even this measure didn’t help to improve the gaming performance here. We didn’t achieve the frame rate of 60fps, desired for playing a first-person shooter, on any of the graphics subsystems. The best we could do was 40fps and the minimal performance still remained very low. We cannot possibly talk about any gaming comfort with fluctuations like that. I have to point out that multi-GPU systems with more than 2 graphics processors didn’t work correctly here. Moreover, Nvidia GeForce 9800 GX2 Quad SLI platform refused to work in 2560x1600 resolution. Hopefully, new drivers will fix this problem.

Enemy Territory: Quake Wars

The frame rate is fixed at 30fps in this game as this is the rate at which the physical model is being updated at the server. Thus, this 30fps speed is the required minimum for playing the game.

The single GeForce 9800 GX2 boasts high efficiency. It is 50-62% ahead of the single GeForce 8800 GTS 512MB depending on resolution, which is a very good result.

The 3- and 4-way SLI and CrossFireX systems and the ATI Radeon HD 3870 X2 do not have anything great to boast: they do not outperform single graphics cards and sometimes even lose to them. Besides, Nvidia GeForce 9800 GX2 Quad SLI proved unstable in 2560x1600: every time we launched the game it demonstrated different average performance ranging from 11fps to 50fps, which can hardly be considered stable.

That’s just another example of the imperfect state of multi-GPU technologies even though they’ve been around for quite a while. In fact, this proves that the homogeneous multi-GPU concept has fundamental drawbacks which cannot be eliminated even theoretically, at least with the current 3D rendering techniques.


Half-Life 2: Episode Two

The ATI Radeon HD 3870 X2 shows better scalability than the GeForce 9800 GX2 but the latter offers higher performance. For example, the GeForce 9800 GX2 has a lead of 40% at 1920x1200, which justifies the price difference between these two cards. The ATI solution allows playing at 1920x1200 with 4x FSAA comfortably, though.

The 4-way CrossFireX platform is the winner among top-performance systems delivering over 80fps at 2560x1600. The GeForce 9800 GX2 Quad SLI suffers from poor driver optimizations, being slower than the single such card. We should acknowledge that the single GeForce 9800 GX2 is fast enough for normal play even at 2560x1600 with FSAA.

S.T.A.L.K.E.R.: Shadow of Chernobyl

The game doesn’t support FSAA when you enable the dynamic lighting model, but loses much of its visual appeal with the static model. This is the reason why we benchmarked the cards in S.T.A.L.K.E.R. using anisotropic filtering only.

The single GeForce 9800 GX2 is better than the single ATI Radeon HD 3870 X2, but it comes from a higher price category after all. The ATI solution shows better scalability relative to the single card, though. Unfortunately, it is too slow at 2560x1600.

Nvidia’s 4-way SLI is no good again. There is no effect from adding a second GeForce 9800 GX2 into the system and enabling Quad SLI mode. The speed even lowers a little. The 4-way CrossFireX system, on the contrary, works correctly. Although it cannot reach the level of the single GeForce 9800 GX2, its performance is quite sufficient for comfortable play at resolutions up to 1920x1200. So, ATI’s multi-GPU support is better in some applications than Nvidia’s, although the latter has more experience in that field.


Performance in Third-Person 3D Shooters

Lost Planet: Extreme Condition

The GeForce 9800 GX2 is the leader among single-card solutions, ensuring a comfortable frame rate at 1280x1024 and 1600x1200.

Nvidia’s solutions boast almost linear scalability in this game. For example, the GeForce 9800 GX2 is 115% ahead of the GeForce 8800 GTS 512MB at 1600x1200. When two more GPUs are added, the system’s performance increases by 80% more. The only exception is 2560x1600 resolution where the Quad SLI system feels that its 512MB of graphics memory is not enough.

ATI’s solutions have serious problems, obviously due to the lack of driver optimizations. You get a twofold performance hit when you enable CrossFire, and this includes the single ATI Radeon HD 3870 X2.

Tomb Raider: Legend

Interestingly, the single GeForce 9800 GX2 is no better than the GeForce 8800 GTS 512MB but the tandem of two such cards ensures a considerable performance increase, up to 85%.

Unfortunately, the minimum speed worsens when there are more GPUs in the graphics subsystem. It is high enough with the GeForce 8800 GTS 512MB at 1920x1200, but the GeForce 9800 GX2 Quad SLI has a very low minimum speed, which affects your playing comfort. The same goes for 2560x1600 mode: the average frame rate is high but the minimum speed is very low.

ATI’s CrossFireX technology is no good in this test. Its performance lowers with every additional GPU in the system. The best result in the ATI camp comes from the single Radeon HD 3870.


Performance in RPG

Hellgate: London

The scalability of modern multi-GPU solutions usually worsens as more GPUs are added into the system, but it’s just the opposite here starting from 1600x1200 mode: the GeForce 9800 GX2 is 35% ahead of the GeForce 8800 GTS 512MB while the pair of GeForce 9800 GX2 brings about a 100% performance boost! We don’t observe this at 1280x1024 where the GeForce 9800 GX2 is two times as fast as the single G92-based card.

ATI’s multi-GPU technology turned out to be a disappointment here, having demonstrated minimal performance improvement compared with the single ATI Radeon HD 3870 X2 and having failed altogether in the 4-way CrossFireX configuration. We didn’t manage to get any results for the latter, because every time we tried launching the game it would freeze even in the start-up menu.

The Elder Scrolls IV: Oblivion

The game loses much of its visual appeal without HDR. Although some gamers argue that point, we think TES IV looks best with enabled FP HDR and test it in this mode.

The ATI Radeon HD 3870 X2 works impeccably although there is but a small effect from a second RV670 chip in closed game environments – the tandem can’t show its real worth. The GeForce 9800 GX2 goes far ahead of the GeForce 8800 GTS 512MB only from 1920x1200 and can be even slower than the single-chip card at lower resolutions.

The effect from 4-way SLI and 3/4-way CrossFireX configurations is negative here. The minimum speed is low, especially with ATI’s solutions.

It’s different in the open game scenes. The new solutions from Nvidia, especially the Quad SLI config, suffer a terrible reduction of minimum speed: you can’t play with slowdowns like these. The CrossFireX systems are never slower than 25fps while the pair of Radeon HD 3870 X2 cards ensures superb performance even at 2560x1600.

From a practical standpoint, this behavior of CrossFireX shows it as an immature technology. It suffers terrible reduction of minimum speed in a simpler case, but delivers superb results in a more complex one. That’s the consequence of insufficient software optimizations.


Performance in Strategies

Company of Heroes: Opposing Fronts

The new add-on to Company of Heroes is tested in DirectX 10 mode only since it provides the highest quality of the visuals.

The new version of ATI Catalyst improves the result of the ATI Radeon HD 3870 X2 considerably. This card now delivers a twice better minimum speed than the GeForce 8800 GTS 512MB at 1280x1024 and provides a high level of comfort. The support of more GPUs is not implemented properly and leads to a reduction of both average and minimum speed.

The GeForce 9800 GX2 has certain problems here. Although its average frame rate is as high as 60fps at 1280x1024, its minimum speed is very low. Such fluctuations of performance coupled with the delayed response to the user’s actions may make the game unplayable. This also refers to the GeForce 9800 GX2 Quad SLI system whose minimum speed is below 5fps.

Command & Conquer 3: Tiberium Wars

Because the game has a frame rate limiter, you should look at the minimum speed of the cards first.

We can’t see a leader here because every graphics card reaches the frame rate limit.

World in Conflict

The new graphics cards from both companies are not quite good in this test. The Nvidia GeForce 9800 GX2 has a lower minimum speed than the GeForce 8800 GTS 512MB while the ATI Radeon HD 3870 X2 is far slower than the ordinary Radeon HD 3870. The 4-way CrossFireX system provides a performance growth, yet it is not big enough for normal play. To cut it short, the most expensive and powerful multi-GPU solutions available today are perfectly useless for this game.


Performance in Synthetic Benchmarks

Futuremark 3DMark05

We have noted in our previous reports that 3DMark05 is already unable to serve as a true benchmark for modern graphics cards. This test session is yet another example: there is a difference of only 1200 points between the fastest and slowest participants while the overall performance level is as high as 17,000-18,000 points. Against our expectations, it is not the GeForce 9800 GX2 Quad SLI platform, but the 4-way CrossFireX system based on two Radeon HD 3870 X2 that wins this test.

The first test is indicative of how old 3DMark05 is: except for the single-chip solutions, every graphics subsystem delivers the same performance. We can only see that the ATI Radeon HD 3870 X2 is somewhat slower than the GeForce 9800 GX2 at high resolutions.

Quad SLI technology doesn’t work in the second test. The 4-way CrossFireX system offers no performance benefits notwithstanding its huge computing capability provided by 1280 execution units. The latter system can be viewed as a leader, though.

The struggle between the quad-processor systems from ATI and Nvidia doesn’t produce a winner up to the resolution of 2560x1600 where Nvidia gains the upper hand, probably due to the significant advantage in texture processor performance. The single GeForce 9800 GX2 is the leader among single cards for the same reason. It’s also clear at 1280x1024 that CrossFire technology features better scalability than Nvidia’s SLI.


Futuremark 3DMark06

The system with two Radeon HD 3870 X2 cards working in 4-way CrossFireX mode wins 3DMark06, too. Its opponent, the system with two GeForce 9800 GX2, fails the test, being but slightly ahead of the single GeForce 9800 GX2. The results of the single dual-chip cards are also indicative of higher scalability of ATI’s technology.

The modern multi-GPU platforms seem to hit the performance ceiling in the SM2.0 tests. The expensive and hot GeForce 9800 GX2 Quad SLI configuration also betrays poor driver optimizations since its result is lower than that of the single GeForce 9800 GX2 and equals the GeForce 8800 GTS 512MB only.

According to the SM3.0/HDR tests, the ATI CrossFireX platform has better scalability than Nvidia’s SLI. Moreover, the most expensive and supposedly the fastest Quad SLI configuration is but a little ahead of the single dual-chip card from Nvidia and is no match to ATI’s 4-way CrossFireX config that easily scores 8,000 points.

Oddly enough, the GeForce 9800 GX2 Quad SLI system doesn’t show compatibility problems in the individual SM2.0 tests. Although its scalability isn’t perfect, it competes with the 4-way CrossFireX system successfully and even beats it at 2560x1600.

Nvidia’s solutions have no chances in the shaders-heavy SM3.0/HDR tests, yet it is in these tests that they show good performance scalability. The Radeons are superior even at 2560x1600 whereas Nvidia only wins the 1280x1024 resolution of the first test.


Conclusion

Having benchmarked all modern top-performance multi-chip consumer-level solutions, we have to make a disappointing conclusion: the current state of multi-GPU technologies is still far from ideal.

Multi-chip graphics systems do not ensure maximum performance even in our selection of tests, whereas there are many more gaming titles available in the market these days. In other words, you can invest over $1000 to get the highest speed possible in a particular game, but there is a big chance that the speed will be no higher than with your older graphics subsystem. It may even drop. Moreover, the graphics card Nvidia positions as a replacement to the old GeForce 8800 Ultra sometimes failed to outperform the much cheaper GeForce 8800 GTS 512MB or was just insignificantly faster. The same is true for the ATI Radeon HD 3870 X2, though. We can certainly call Nvidia GeForce 9800 GX2 a formal leader, but it is not 100% superior to the previous 3D flagship product.

So what can a gaming enthusiast expect from modern multi-GPU platforms offered by ATI and Nvidia? Unfortunately, it is much easier to say what he or she shouldn’t expect. And you shouldn’t expect stable and flawless operation at least with the currently available drivers. When you decide on getting yourself a pair of GeForce 9800 GX2 or ATI Radeon HD 3870/3870 X2 graphics cards, you have to keep in mind that you will not only inevitably see no performance boost in some cases, but also will have to combat system instability, freezing, visual artifacts and system overheating.

A single graphics accelerator will be a better buy in most cases, even though it may have two GPUs onboard, at least if your monitor doesn’t support resolutions beyond 1920x1200. The decision to go with ATI Radeon HD 3870 X2 or Nvidia GeForce 9800 GX2 depends totally on the type of games you intend to play and on the available developer support. I would like to specifically point out Call of Juarez - Nvidia GeForce 9800 GX2 turned out the first single graphics accelerator to ensure acceptable level of gaming performance in this title with maximum level of detail settings. However, GeForce 9800 GX2 is much more expensive than ATI Radeon HD 3870 X2: $599/$649 vs. $449. It also features worse acoustic characteristics, consumes more power and dissipates more heat. However, we should also say that Nvidia’s dual-GPU product boasts broader compatibility due to Nvidia’s actively promoted “The Way It’s Meant To Be Played” initiative among computer game developers, which in the end may become a determinative factor for the consumer.
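The price gap is easier to weigh against the measured performance leads when it is expressed as a percentage. The snippet below is a small sketch using only the prices quoted above; the helper name is ours.

```python
def price_premium(price, reference_price):
    """Relative extra cost of `price` over `reference_price` (0.25 means 25% more)."""
    return price / reference_price - 1.0


# Prices quoted in the text, in US dollars.
RADEON_HD_3870_X2 = 449
GEFORCE_9800_GX2 = (599, 649)

for price in GEFORCE_9800_GX2:
    premium = price_premium(price, RADEON_HD_3870_X2)
    print(f"${price} is {premium:.1%} more than ${RADEON_HD_3870_X2}")
```

The GeForce 9800 GX2 thus costs roughly a third to nearly a half more than the Radeon HD 3870 X2, depending on which of the two quoted prices applies.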

If, however, you have a 30-inch monitor that supports 2560x1600 resolution, then your choice is clear: ATI 4-way CrossFireX outperforms the similar solution from Nvidia or runs at comparable speed offering acceptable gaming performance in such titles as Battlefield 2142, BioShock, Half-Life 2: Episode Two, The Elder Scrolls IV: Oblivion and Company of Heroes: Opposing Fronts. Nvidia GeForce 9800 GX2 Quad SLI platform, however, leads in Call of Duty 4, S.T.A.L.K.E.R.: Shadow of Chernobyl and Tomb Raider: Legend. In other games, both quad-GPU configurations either work incorrectly or cannot provide acceptable performance in 2560x1600 resolution. So, the total score would be 5:3 in favor of AMD/ATI that offer better compatibility, scalability and fewer technical issues for the users.
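The tally at 2560x1600 can be reproduced directly from the two lists of titles named above. This is merely a bookkeeping sketch: the dictionary layout is ours, and the ATI list counts games where 4-way CrossFireX either wins or runs at comparable speed with acceptable performance, as stated in the text.

```python
# Titles named in the text for the 2560x1600 comparison of quad-GPU systems.
results_2560x1600 = {
    "ATI 4-way CrossFireX": [
        "Battlefield 2142",
        "BioShock",
        "Half-Life 2: Episode Two",
        "The Elder Scrolls IV: Oblivion",
        "Company of Heroes: Opposing Fronts",
    ],
    "Nvidia GeForce 9800 GX2 Quad SLI": [
        "Call of Duty 4",
        "S.T.A.L.K.E.R.: Shadow of Chernobyl",
        "Tomb Raider: Legend",
    ],
}

# Count the titles credited to each platform.
score = {platform: len(titles) for platform, titles in results_2560x1600.items()}

for platform, points in score.items():
    print(f"{platform}: {points}")
```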

All in all, the situation in the multi-GPU market didn’t really change much since the introduction of Nvidia SLI. Of course, there have been significant improvements since then but they are all of local type. From the global standpoint, all issues typical of homogeneous multi-GPU remained untouched and we tend to believe that things will hardly change dramatically in the near future, since new games requiring software optimizations will continue to come out. In other words, multi-GPU systems cannot fully mature with the current architecture.

As for Nvidia GeForce 9800 GX2, it retained all the advantages and drawbacks of Nvidia GeForce 7950 GX2 taking into account the new technologies, of course. Some of them are there because of the peculiarities of contemporary multi-GPU implementation, and some will be fixed in the future. For example, they will implement multi-monitor configurations support in SLI mode. For a gamer that has over $600 at his disposal, it is a good choice: in the worst case the card will perform at the level of a GeForce 8800 GTS 512MB, which is pretty good, and with proper driver optimizations it can even demonstrate unattainable performance. Nevertheless, those who already own a GeForce 8800 Ultra shouldn’t really hurry to computer stores just yet.

Gainward Bliss 9800 GX2 1024MB Summary

Gainward Bliss 9800 GX2 1024MB is the first 9800 GX2 graphics card we got our hands on. It differs from the competitor solutions on the same chips only by the cooler sticker, because all these products are manufactured under Nvidia’s strict supervision on one large contract fab. On the one hand, it means that Bliss 9800 GX2 1024MB is as good as analogous solutions from other vendors, but on the other we have to state that it has no unique features that could help distinguish it among others.

Although Gainward Bliss 9800 GX2 1024MB made an overall good impression, we have to point out its very scarce accessories bundle: although GeForce 9800 GX2 boasts very high-quality HD video post-processor, the manufacturer included no software HD player, not even an HDMI cable for connection to a TV. All in all, Gainward Bliss 9800 GX2 1024MB is an excellent graphics card from the high price range, but don’t expect to get the whole bunch of goodies for your buck: a free Tomb Raider game and a pair of adapters is all you can count on.

Highs:

Lows:
