ATI Crosses the Swords: Multi-GPU CrossFire Technology Previewed

ATI Technologies has been supplying the aerospace industry with graphics processors for multiprocessor graphics sub-systems for about half a decade, whereas its first and only dual-chip consumer 3D accelerator, the Rage Fury MAXX released in 1999, has been nearly forgotten. Following NVIDIA and its multi-GPU technology called SLI, ATI Technologies today reintroduces multi-processor graphics technology for consumers, targeting enthusiasts and gamers who are willing to pay almost any price for the highest performance possible.

by Anton Shilov
05/30/2005 | 05:35 PM

No matter how high the speed in 3D games is, there are always people who want it higher still. The usual answer is the constant evolution of graphics chips, which has promised a leap in performance every six to twelve months since the late nineties, when graphics chip designers started a race that continues today. But there is something that can deliver next-generation performance here and now, without waiting for evolution: multi-GPU.

 

Multi-GPU: Seduction for Unbeatable 3D Graphics Performance

Both ATI Technologies and NVIDIA Corporation deliver astonishing speed with their RADEON X8 and GeForce 6800 graphics processors these days. Nevertheless, there are demanding applications that do not run perfectly even on the top graphics cards. Those applications will surely run blisteringly fast with all the eye-candy set to the max on next-generation graphics cards, but what if someone wants to play those games with amazing quality right now? Graphics card designers have known the answer for ages: just put two graphics cards into one system and get performance close to unbelievable.

  
3dfx Voodoo 2 SLI

Ages ago, in 1997 to be precise, 3dfx Interactive offered a technology called Scanline Interleaving (SLI) that allowed users to install two graphics cards in one system (or, for professional applications, an equal number of chips on one board) and get higher performance in 3D games. The idea found its supporters, and graphics chip designers realized the power of the multi-chip approach: 3dfx later released its Voodoo 5 family, which could sport up to four VSA-100 graphics chips, while ATI put two Rage 128 Pro chips onto one board.


ATI Rage Fury MAXX

       

In fact, both the 3dfx Voodoo 5 and the ATI Rage Fury MAXX were virtually forced into existence by the very high performance that NVIDIA’s GeForce 256 and GeForce 2 GTS products had to offer. Neither the Voodoo 5 nor the Rage Fury MAXX became popular, but they did formally showcase the ability of 3dfx and ATI to compete against NVIDIA.


3dfx Voodoo 5 5500

There were strong rumors about a possible dual-RADEON 256 graphics card intended to beat the GeForce 2 Ultra and GeForce 3, but ATI never released such a product for the consumer market, possibly due to its high cost. Multi-RADEON configurations were probably used in some aerospace simulators, though. Starting from late 2001, ATI’s RADEON 8500 chips were used in simulators by Evans & Sutherland; then, in late 2002, ATI announced that its RADEON 9700 PRO graphics chips were used in other E&S simulation systems. In mid-2003 ATI said its graphics processors were used in multi-GPU solutions by SGI.


ATI RADEON 256 MAXX prototype

For some time no multi-GPU solutions were available for the consumer market, until XGI Technology launched its two-chip Volari Duo product lineup in late 2003. The Volari Duo V8 Ultra was another graphics card whose development was forced by the performance of ATI RADEON 9800- and NVIDIA GeForce 5900-series based devices. Needless to say, XGI’s product did not win benchmarks and did not become widespread.


Evans & Sutherland ATI RADEON 9700-based graphics accelerator

NVIDIA’s GeForce Scalable Link Interface (SLI) made a lot of marketing noise in mid-2004 and was finally adopted by numerous gamers seeking extreme speed in early 2005, demonstrating that there is indeed a market for this kind of solution despite a price of about $800 for the graphics cards alone. Even though the GeForce 6800 Ultra delivered performance close to that of the RADEON X800 XT Platinum Edition, NVIDIA apparently could not resist the temptation to add another GPU and beat ATI’s top offering by a substantial margin. Putting aside the issues SLI may have, NVIDIA succeeded: the absolute fastest graphics solution today comes from its camp. Naturally, ATI did not want to eat humble pie: it has yielded to the same temptation and now brings its own multi-GPU technology to the market to prove its excellence.


NVIDIA GeForce 6 SLI

There is a huge difference between multi-GPU technologies “then” and “now”. Earlier they were either professional products or were made out of necessity because the competition had superior single-chip offerings. Today professional solutions remain multi-chip, whereas consumer multi-GPU products are made not because single-chip cards perform poorly, but because there is a relatively small market of gamers (perhaps about a quarter of a million configurations yearly) who are willing to buy graphics sub-systems at virtually any cost.

No More Graphics Cards, Gaming Platforms Are Offered Instead

Intel is not alone in its view of the future of computer usage: the top two designers of graphics cards also believe that the best computing experience is possible only with a tailored computer platform rather than with a set of hardware from different designers. With the launch of proprietary multi-GPU technologies, ATI and NVIDIA also demonstrate their commitment to the platform approach.

Throughout the history of the PC, the components of IBM PC-compatible computers have generally been compatible and interchangeable, which contributed to the widespread popularity of such PCs. For system makers as well as end-users it was very convenient to have compatibility between different components developed and made in the same timeframe (with some exceptions) and still get an excellent experience. Nearly a quarter of a century after the IBM PC was introduced, this paradigm is being called into question: leading designers of computer components advise installing hardware developed by the same company for optimal compatibility, stability and performance.

NVIDIA’s multi-GPU technology SLI, unveiled in mid-2004 and actually launched in December 2004, was originally meant to operate on Intel’s Tumwater platform, but NVIDIA then started to recommend its own nForce4 SLI instead. In fact, virtually no SLI-enabled systems featuring the Tumwater chipset have shipped commercially, which may indicate certain issues during qualification. System makers and computer users were quick to adopt the NVIDIA nForce4 SLI series, probably due to excellent qualification results.

ATI, which is not generally known for enthusiast-oriented chipsets, now also recommends its own RADEON XPRESS 200 CrossFire Edition core-logic for systems running graphics cards in multi-GPU CrossFire mode. While the company’s representatives indicate that the RADEON X8 graphics cards themselves are compatible with any chipset, including Intel Tumwater, NVIDIA nForce4 SLI and possibly even VIA PT890 Pro, the best experience and compatibility would only be achieved on RADEON XPRESS 200 CrossFire mainboards. While the Markham, Ontario-based graphics company is expected to validate third-party chipsets for CrossFire eventually, it is obvious that the market is strongly advised to use ATI’s own CrossFire gaming platform: two graphics cards and a mainboard, all powered by ATI.

Generally speaking, although it is a bit harder to sell a set of chips than a single chip, putting platforms on the market gives a number of strong benefits to silicon designers:

At the end of the day, customers receive a somewhat better solution, whereas vendors spend less and get higher revenue and profits. Given the complexity of today’s graphics sub-systems, the platform approach seems reasonable enough, even though it poses some danger to companies that specialize in only one particular type of product.

The bottom line is that if you want a premier graphics sub-system – consider not only graphics cards, but also a mainboard that is based on a chipset developed by the designer of your graphics processing units.

ATI RD480, ATI RD400 Enter the Scene

For the CrossFire platform ATI offers two new RADEON XPRESS 200 CrossFire Edition chipsets code-named RD480 and RD400 for AMD and Intel processors respectively. Both are re-worked versions of already introduced RADEON XPRESS 200-series core-logic products (this time without integrated graphics) with some new capabilities that allow ATI to call the products “built for the enthusiast”.


ATI RD400 mainboard by Sapphire

ATI’s new RADEON XPRESS 200 CrossFire Edition North Bridges for AMD and Intel processors can split a PCI Express x16 interconnection between two slots, providing an x8 connection to each, and can be paired over a PCI Express x4 bus with an I/O controller from ATI Technologies or ULi Electronics.

Quite naturally, the new RD480 supports all the latest enthusiast-oriented AMD Athlon 64 processors with up to a 1GHz HyperTransport bus. The revamped IXP450 I/O controller also brings high definition audio, which may be of interest to some enthusiasts, but ATI advises its mainboard partners to use external controllers for Serial ATA II (ATI’s demo platforms use a Silicon Image controller) and Gigabit Ethernet (current demo mainboards use a Broadcom chip).


ATI RD480 mainboard by Sapphire

Not everything is clear about the RD400 core-logic’s support for enthusiast-class Intel processors, such as the Intel Pentium 4 Extreme Edition. ATI’s presentation slides imply that chips with a 1066MHz processor system bus are supported; however, some of the company’s representatives claim that ATI does not have a license for this bus. Just like its predecessor, the RD400 sports a dual-channel DDR or DDR2 memory controller supporting 400MHz and 667MHz respectively. While the IXP450 bridge can be paired with the RD400, external controllers for Serial ATA II and GbE will still be required for mainboards that provide top functionality.

Currently the main advantage ATI’s CrossFire platforms have over the majority of NVIDIA SLI platforms is the lack of a mode selector: the bulk of NVIDIA nForce4 SLI-based mainboards are equipped with a special card that determines whether one or two graphics chips are used.

To sum up, with the RADEON XPRESS 200 CrossFire platforms ATI is trying hard to provide the same level of functionality as rival NVIDIA Corp. offers with its nForce4 SLI products, but without an equally high level of integration. For instance, NVIDIA has built-in SATA II and GbE controllers, whereas ATI has to use external chips, which may be more expensive. Still, if the actual cost of CrossFire mainboards is not much higher than that of SLI platforms, ATI’s approach may be feasible.

ATI produces its new RADEON XPRESS 200 CrossFire Edition chipsets, code-named RD400 and RD480, using a low-power 0.13 micron process technology and suggests that both core-logic sets will provide the “huge overclocking potential” demanded by enthusiasts and overclockers.

ATI CrossFire – RADEON X800, RADEON X850 Ease the Burden for Each Other

ATI’s CrossFire technology currently allows two RADEON X800 or X850 graphics cards to run in parallel on a special mainboard, either to boost performance or to produce higher image quality.

ATI Uses External Interconnection for Multi-GPU

ATI does not seem to have any special logic inside its RADEON X8 graphics processors to enable multi-GPU solutions; instead it uses a special Compositing Engine to blend the parts of images rendered by different graphics cards, together with an external interconnection between the boards.

To get CrossFire working, users will need one RADEON X800 or X850 card and a RADEON X800 or X850 CrossFire Edition graphics card equipped with the Compositing Engine. The DVI-I output of the ordinary graphics card should be connected to the DMS port of the CrossFire Edition card using a special cable provided with CrossFire Edition graphics cards. CrossFire reportedly does not transfer data between graphics cards over the PCI Express bus, even though the company initially planned to use this ability.

Currently CrossFire can only be enabled on a mainboard powered by ATI’s RADEON XPRESS 200 CrossFire Edition chipset, so at this point CrossFire is not certified to work with NVIDIA nForce4 or Intel 945/955-series chipsets. While ATI does not have any special logic that enhances the performance of two PCI Express graphics cards and, in theory, it makes no difference to the graphics cards which chipset is used, in practice ATI demands that a mainboard featuring the RADEON XPRESS 200 CrossFire Edition chipset be used.

ATI’s SuperTiling – Best Load Balancing Ever?

ATI CrossFire offers four new rendering modes with different peculiarities:

SuperTiling – the image is split into 32x32 pixel squares; half of the squares are rendered by one graphics card, the other half by the other. ATI claims that this chessboard-like approach provides superior load balancing, which seems to be correct: it does not require any intelligent driver to balance the load, and since the tiles are relatively small, parts of the image with nearly equal rendering complexity are distributed evenly between the graphics cards. SuperTiling works in Direct3D applications only and requires both graphics cards to store identical frame buffers and process all the geometry for the frame.


SuperTiling example
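
The chessboard split described for SuperTiling can be sketched in a few lines of Python. This is only an illustration: the 32-pixel tile size comes from ATI's description, while the function names and the rule that the top-left tile belongs to card 0 are assumptions made for the example, not details ATI has disclosed.

```python
TILE = 32  # tile edge in pixels, as described by ATI


def tile_owner(x, y, tile=TILE):
    """Return 0 or 1: the card that renders the pixel at (x, y).

    Assumption: tiles alternate like chessboard squares,
    with the top-left tile assigned to card 0.
    """
    return ((x // tile) + (y // tile)) % 2


def tile_counts(width, height, tile=TILE):
    """Count how many tiles each card receives at a given resolution."""
    counts = [0, 0]
    for ty in range(0, height, tile):
        for tx in range(0, width, tile):
            counts[tile_owner(tx, ty, tile)] += 1
    return counts


if __name__ == "__main__":
    # Partial tiles at the bottom edge are counted as whole tiles here.
    print(tile_counts(1600, 1200))
```

Because neighbouring tiles always go to different cards, a complex region of the screen is shared between both GPUs without any per-frame analysis by the driver.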

Scissor – the image is split into two parts; the driver dynamically analyzes the frame and balances the load between the two graphics cards. Scissor mode is supported in both Direct3D and OpenGL applications and works just like NVIDIA’s Symmetric Multi Rendering technology. According to ATI, this approach also does not provide any geometry performance advantage, as both cards process vertices for the whole frame.
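
ATI has not said how its driver decides where to cut the frame in Scissor mode, so the following Python sketch is purely hypothetical. It shows one simple way such balancing could work: after each frame, the share of the screen given to the slower card shrinks a little. The function name, the `gain` factor and the clamping limits are all invented for illustration.

```python
def rebalance_split(split, time_a, time_b, gain=0.5, lo=0.1, hi=0.9):
    """Return the split fraction to use for the next frame.

    split  -- fraction of the frame height currently rendered by card A (0..1)
    time_a -- time card A spent on its part of the last frame
    time_b -- time card B spent on its part of the last frame
    gain   -- hypothetical damping factor; smaller values react more slowly
    """
    total = time_a + time_b
    if total <= 0:
        return split
    # Positive when card A was the slower one, negative when card B was.
    imbalance = (time_a - time_b) / total
    new_split = split * (1.0 - gain * imbalance)
    # Keep both cards busy: never hand (almost) the whole frame to one card.
    return max(lo, min(hi, new_split))


if __name__ == "__main__":
    split = 0.5
    # Card A was slower (12 ms vs 8 ms), so its share of the frame shrinks.
    print(rebalance_split(split, 12.0, 8.0))
```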

Alternate Frame Rendering – a familiar ATI technology that was used with the company’s Rage Fury MAXX graphics cards. In the AFR process, one chip renders even frames while the other chip renders odd frames. Each chip processes triangle setup for its own frame without waiting for the other chip, which makes AFR very efficient in terms of performance scaling. AFR is supported in OpenGL and Direct3D.
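
The even/odd distribution used by AFR is easy to express in code. The Python sketch below simply assigns frame numbers to two cards by parity; the function names are illustrative, and which card takes the even frames is an assumption.

```python
def afr_owner(frame_index):
    """Return 0 for even frames and 1 for odd frames."""
    return frame_index % 2


def afr_schedule(n_frames):
    """Return two lists: the frame numbers rendered by card 0 and by card 1."""
    queues = [[], []]
    for frame in range(n_frames):
        queues[afr_owner(frame)].append(frame)
    return queues


if __name__ == "__main__":
    print(afr_schedule(5))
```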

Note that none of these approaches actually allows geometry scaling, which limits the advantages multi-GPU technology can provide to a couple of general cases:

Generally speaking, modern games are not heavily dependent on geometry performance. The only benchmark that can benefit from additional vertex shader power is probably 3DMark05. Furthermore, current ATI hardware has certain limitations in the pre-vertex shader buffer (which is currently a 256-bit line), which means that even with a single visual processing unit geometry performance may not be limited by the vertex processors themselves.

Super AA – the RADEON graphics cards work independently, each rendering the image using a different FSAA mask. The CrossFire Compositing Engine then blends the two images for enhanced image quality at real-time performance. Super AA mode is enabled through the control panel.

With CATALYST A.I. enabled, the preferred rendering mode is selected for targeted applications automatically. For applications that are not identified in CATALYST A.I., or when CATALYST A.I. is disabled, default multi-GPU rendering modes are offered.

By default either SuperTiling or Scissor modes are applied. Alternate Frame Rendering mode is used for applications identified in CATALYST A.I. (when enabled). When CATALYST A.I. is disabled, graphics processors with 16 pixel processors running Direct3D applications are accelerated by SuperTiling mode, whereas other hardware configurations (e.g., 12-pipe GPUs) are accelerated by Scissor mode.
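
The selection rules above can be summarised as a small decision function. The Python sketch below is one reading of ATI's description rather than actual driver logic; the function name, its parameters and the string labels are assumptions made for illustration.

```python
def default_mode(api, pixel_pipes, catalyst_ai_enabled, app_profiled):
    """Pick a CrossFire performance mode following the rules described above.

    api                 -- "Direct3D" or "OpenGL"
    pixel_pipes         -- number of active pixel pipelines per GPU
    catalyst_ai_enabled -- whether CATALYST A.I. is switched on
    app_profiled        -- whether CATALYST A.I. recognises the application
    """
    # Applications identified by CATALYST A.I. get Alternate Frame Rendering.
    if catalyst_ai_enabled and app_profiled:
        return "AFR"
    # SuperTiling is Direct3D only and is the default for 16-pipe GPUs.
    if api == "Direct3D" and pixel_pipes == 16:
        return "SuperTiling"
    # Everything else (OpenGL, 12-pipe GPUs) falls back to Scissor.
    return "Scissor"


if __name__ == "__main__":
    print(default_mode("Direct3D", 16, False, False))
```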

Different Graphics Cards Work Together

One of the strong points of CrossFire is the ability of different graphics cards to work together, which is very important given the number of ATI RADEON X800 and X850 models already available on the market.


ATI RADEON X850 graphics cards in CrossFire mode

Since a CrossFire Edition graphics card is needed for ATI multi-GPU operation, ATI’s add-in card partners will produce three CrossFire Edition graphics cards:

In general, ATI recommends using CrossFire Edition cards with the corresponding RADEON X800 or X850 chips; however, in theory a RADEON X850 CrossFire card should work fine with RADEON X800 cards, and a RADEON X800 CrossFire card should work well with RADEON X850 cards.


ATI RADEON X850 graphics cards in CrossFire mode

While ATI allows graphics processors with different numbers of pixel pipes to function together, the processor with more pipes will disable its “extra” pixel processors (e.g., if a RADEON X850 PRO works with a RADEON X850 CrossFire Edition, the latter will operate as a 12-pipe GPU). Both cards will continue to operate at their individual clock speeds. However, if one card performs significantly faster than the other, it will throttle its clock speed to remain synchronized.
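
The pipe-matching rule amounts to taking the smaller of the two pipeline counts. As a tiny Python illustration (the function name is made up, and the pipe counts in the usage line follow the X850 PRO example above):

```python
def crossfire_active_pipes(pipes_card_a, pipes_card_b):
    """Both GPUs run with the pipeline count of the narrower one."""
    return min(pipes_card_a, pipes_card_b)


if __name__ == "__main__":
    # 12-pipe RADEON X850 PRO paired with a 16-pipe CrossFire Edition card.
    print(crossfire_active_pipes(12, 16))
```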

Compositing Engine – The Power of the CrossFire

The cornerstone of ATI’s CrossFire technology is the Compositing Engine, which combines the images rendered on two separate graphics cards into one, either to improve image quality or to boost speed. ATI’s Compositing Engine is borrowed from professional solutions, so it should offer pretty high scalability and additional features that will be exclusively available on multi-VPU technology from ATI.


ATI Compositing Engine

The Compositing Engine is a set of three chips: a Field-Programmable Gate Array (FPGA) chip by a third-party designer, a TMDS transmitter and a RAMDAC. ATI programs the FPGA chip to blend either two images rendered using different patterns (for Super AA modes) or parts of images (SuperTiling and Scissor modes).

Currently ATI does not disclose what type of an FPGA it uses, but it says that the chip stores parts of an image inside a built-in buffer and is programmed by ATI CATALYST driver.

ATI CrossFire Super AA – Image Quality Enhanced

Unlike NVIDIA SLI, ATI CrossFire is designed not only to increase performance, but also to improve image quality by offering extreme FSAA modes like 8x, 10x, 12x and even 14x FSAA.


CrossFire SuperAA

Both 10x and 14x modes are hybrid: ATI uses both multi-sampling and super-sampling methods to provide the best image quality possible. As previously noted, ATI renders the frame using a different FSAA pattern on each GPU and then blends the two images together. Given that ATI offers numerous modes, users should find it easy to balance quality and performance. Keep in mind that this is the first time ATI uses super-sampling antialiasing on RADEON X8-series products, so the outcome is very interesting from a performance perspective.
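
The blending step can be illustrated with a short Python sketch. ATI has not published how the Compositing Engine weights the two images, so the equal-weight average below is an assumption, as are the function name and the representation of an image as rows of (r, g, b) tuples.

```python
def blend_images(img_a, img_b):
    """Average two equally sized images pixel by pixel.

    Each image is a list of rows; each row is a list of (r, g, b) tuples
    with integer channels. Assumption: both cards' results are weighted
    equally.
    """
    if len(img_a) != len(img_b) or any(
        len(ra) != len(rb) for ra, rb in zip(img_a, img_b)
    ):
        raise ValueError("images must have the same dimensions")
    return [
        [
            tuple((ca + cb) // 2 for ca, cb in zip(pa, pb))
            for pa, pb in zip(row_a, row_b)
        ]
        for row_a, row_b in zip(img_a, img_b)
    ]


if __name__ == "__main__":
    # The same edge pixel as seen by two different sample patterns.
    card_a = [[(0, 0, 0), (255, 255, 255)]]
    card_b = [[(255, 255, 255), (255, 255, 255)]]
    print(blend_images(card_a, card_b))
```

Since each card samples the scene at different sub-pixel positions, averaging the two results behaves like a single pass with the combined sample set, which is where the higher effective FSAA levels come from.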


Micro-geometry advantages of SuperAA

ATI believes that the main advantage SuperAA provides is improved micro-geometry details and object contours, something which is not always perfect with lower-grade full-scene antialiasing.

Meet the CrossFire System

ATI RADEON XPRESS 200 CrossFire Edition mainboards will be available from ATI’s partners beginning in June, whereas RADEON X850 CrossFire Edition graphics cards will enter production at the end of June and become available in mid-July. RADEON X800 CrossFire Edition cards will be available in early August.


ATI RADEON X850 CrossFire Edition graphics card


Connectors of ATI RADEON X850 CrossFire Edition graphics card

So, the earliest opportunity to lay your hands on ATI’s multi-GPU technology is in mid-July, if everything goes as planned. Currently ATI has a number of systems it demonstrates to journalists around the world along with some performance estimations.


ATI CrossFire demo system

At this point ATI’s multi-GPU systems are based on RADEON X850 XT technology. Unfortunately, the firm does not let the media take screenshots or run benchmarks on the system, so nothing can be said about the performance advantages provided by ATI's CrossFire technology. Furthermore, at least for now the system is pretty loud: neither the RADEON X850 nor the RADEON X850 CrossFire card reduces its fan speed, which will hardly please users’ ears.

Performance Estimations, Future Thoughts


ATI's performance estimations for the CrossFire

Most likely, a multi-GPU system based on two ATI RADEON X850 XT cards will offer higher speed than a single-GPU R520 system. However, the R520 will offer an additional feature-set, such as Shader Model 3.0, and will cost a lot less than two RADEON X850 boards. It is unclear how ATI will position its dual-RADEON X850 products against its own R520, as a top-of-the-range R520 is unlikely to cost more than $549, which is the price of the RADEON X850 XT CrossFire Edition. That card will be available in mid-July, which is also currently the timeframe for ATI to start talking about the R520 officially.


ATI's performance estimations for the CrossFire

All in all, ATI’s CrossFire in its current implementation may not become widespread, partly due to the lack of future-proof technologies (Shader Model 3.0, HDR, etc.) and partly due to the proximity of the ATI R520 and NVIDIA G70 releases, which will force the RADEON X850 CrossFire Edition to co-exist with next-generation GPUs soon after its launch. It may turn out that CrossFire will achieve popularity only once R520 configurations hit the market, whereas what is launched today will be found only in the PCs of hardcore gamers who want the best "right here and right now".

ATI CrossFire – the Gamers’ Dream?

ATI CrossFire promises everything needed for a premium-class graphics solution: stunning image quality, broad compatibility and high performance, three things that gamers not constrained by budgets pay attention to. ATI’s finest 14x antialiasing, which is only available on the company’s multi-GPU solutions, is probably the feature that makes the money spent on an extra graphics card worthwhile, as superb image quality may be even more important than extreme performance in currently available games.

By contrast, competitor NVIDIA Corp. currently does not offer any additional eye-candy features with its SLI multi-GPU technology over single-GPU products (except that 8xS FSAA becomes usable at high resolutions). Still, NVIDIA seems to be close to releasing its next-generation GeForce 7800-series product, which may present advanced antialiasing capabilities in addition to already existing advantages, such as Shader Model 3.0 and UltraShadow technology.

From a technology standpoint, ATI’s CrossFire seems to have enough potential for the future. Because it is based on a Compositing Engine similar to what is used in professional multiprocessor graphics applications, such as simulators, the technology should scale well and seems to be relatively cost-efficient.

Not having final hardware at the time of the announcement is an indisputable drawback of ATI’s technology launch. While on paper we do see some clear advantages ATI has to offer, including:

… in reality everything still needs confirmation and actual testing, as we still do not know:

Furthermore, multi-GPU solutions naturally have their own disadvantages mentioned in our NVIDIA SLI Preview and Review: high price of two graphics cards and supporting components, power requirements, support of high resolutions (higher than 1600x1200) by games and relatively short time of being the top solution due to development of future visual processing units.

One thing the formal launch of the CrossFire technology shows is that there is no point in waiting for better solutions like G70, R520, etc., as they are coming out constantly and there is always a temptation to add another board for even greater performance. If you want the best, consider something that is already on the market right here and right now (and keep in mind that CrossFire is not available right now); new graphics technology is, in general, not something you should wait for.