ATI RADEON X1000: Brand-New Graphics Architecture from ATI Explored

ATI's new family of graphics processors is finally born. It took the company a couple of years to develop Shader Model 3.0 graphics architecture and several month of re-spinning the chips to deliver products that run at nearly extreme clock-speeds. Let's find, what the RADEON X1800 XT, 1800 XL, 1600 XT and 1300 PRO are capable of!

by Alexey Stepin , Yaroslav Lyssenko, Anton Shilov
10/05/2005 | 12:05 PM

Both leading GPU manufacturers have a habit of announcing their new-generation solutions at the beginning or end of a year and they both usually do that simultaneously, with a difference of just a couple of months. We had this scenario in 2003: the ATI R350 was announced on March 12 while NVIDIA unveiled its NV35 on May 12. The next year, NVIDIA introduced its NV40 on April 14, and ATI Technologies retorted with its R420 on May 4. This tradition was broken this year, however.

 

Until the last summer ATI managed to be the performance leader with its RADEON X850 XT and XT Platinum Edition graphics cards which outperformed the GeForce 6800 Ultra in a number of applications. NVIDIA’s only argument was the multi-GPU SLI technology. But the tables were turned on June 22, 2005: the NVIDIA G70 processor officially marked the birth of the GeForce 7 product series (for details see our article called NVIDIA GeForce 7800 GTX: New Architecture Exposed ). Not into any revolutions, NVIDIA based its GeForce 7800 GTX graphics card on the best qualities of the GeForce 6 architecture. Thanks to its 24 improved pixel pipelines and 8 vertex processors that card was instantly the fastest consumer 3D solution. Also important was the fact that NVIDIA could provide the new chip in mass quantities and this strengthened its position in the high-end sector. NVIDIA even grabbed some OEM contracts from ATI’s hands, particularly to supply the GeForce 7800 GTX to Dell for use in high-performance computers. A little later the company also rolled out its GeForce 7800 GT graphics card which did superbly in our tests, too (for details see our article called NVIDIA GeForce 7800 GT: Full-Throttle Graphics for $449).

No answer came from ATI Technologies, although anxiously awaited by the public. With GeForce 7 products coming to market in a billow, the balance between the two manufacturers was shaken. NVIDIA got on top as ATI didn’t try to challenge it with a new product. ATI’s new-generation graphics processor was first expected at June, then in July and August and September.

This long delay was due to manufacture-related problems. First prototypes of the new GPU, codenamed R520, appeared as far back as December of 2004, but ATI had to spend some time perfecting the new product to achieve an acceptable chip yield.

Each change in a complex electronic chip takes a lot of time and money, usually a few months and a few million dollars, so the delay of the R520 seems a natural, even though regrettable event. But all the problems being finally solved, we can at last hear ATI answer to NVIDIA today, on October 5, 2005. Can ATI dethrone NVIDIA or the GeForce 7 family remains unrivalled? Has ATI become superior from the technological point of view with the release of the new GPU series? You’ll learn everything right now, in this review!

Please Meet: ATI RADEON X1000

ATI didn’t have much freedom in choosing the name – “RADEON X900” was the only position left vacant in ATI’s former product nomenclature. Unlike NVIDIA, ATI Technologies used three-digit numbers in the product names of the RADEON X series. But the marketing men from ATI found a simple and elegant way to bypass this problem by adding the number 1000 to the numeric markings of the new products. Thus, ATI’s new graphics processors got the names of RADEON X1800, RADEON X1600 and RADEON X1300. Thus, the names are indicative that we deal with a new-generation architecture, and the company has also reserved enough names for its future products.


RADEON X1800 core

This time “new” in “new-generation” really means it. While NVIDIA’s G70 was a greatly improved NV40, the RADEON X1000 is a completely new architecture that has little in common with ATI’s previous generations of GPUs. By the way, the senior model of the family, RADEON X1800 (R520) chip, is more complex than the NVIDIA G70: 320 against 302 million transistors!

Note that the RADEON X1600 (RV530) targeted for the mainstream market segments consists of 157 million transistors, while RADEON X1300 (RV515) is claimed to be the first value chip that is built of about 100 million transistors.

The architecture has become more complex due to a number of improvements, including:

A lot of rumors about the new RADEON – sometimes absolutely incredible – have been around all that time, but now we can offer you the precise information:

As you see, ATI Technologies really got to frequencies above 600MHz with its new 0.09-micron tech process although the process doesn’t involve low-k dielectrics. The R520 is no worse than the G70 in any other technological aspect, excepting the number of pixel processors. But the higher frequency should make up for that deficiency. Moreover, the memory of the senior RADEON X1800 works at a huge frequency of 750 (1500) MHz which can theoretically yield an up to 48GB/s bandwidth (ATI quotes a smaller number, 42GB/s).


RADEON X1800 XT graphics card

ATI tried to make the new architecture as flexible as possible and the graphics processor is split into several components which can be combined in any fashion in a particular GPU model.

From now on, different RADEON X1000 models will differ not only in the number of pixel and vertex processors, but in other details as well. It should ensure an optimal price/performance ratio for each model. As usual, low-performance versions of the new GPU have “RV” in their names.

The RADEON X1000 series will be represented with the following graphics card models:

As you see, the new family spans the entire market, from entry-level to high-end products. You should be aware, though, that not all models from the list will be available on the day of the announcement.

ATI Technologies promised it would have begun mass shipments of RADEON X1800 XL, RADEON X1300 PRO and RADEON X1300 by October 5, 2005. The faster RADEON X1800 XT is going to be available in mass quantities in a month, on November 5. On the 30-th of November, two mainstream models, RADEON X1600 XT and RADEON X1600 PRO, will join the rest of the family on the shop shelves.

The company seems to have problems still with the yield of RADEON X1800 chips capable of working at frequencies above 500MHz. They must have decided to delay the release of the top-end model a little to avoid unnecessary excitement and speculation.

All RADEON X1000 processors are currently manufactured on TSMC facilities, but ATI has already contracted UMC to produce these chips, too, as far as we know. It means ATI will be able to bring more GPUs to the market and to reduce the cost of the final product, too.

All of ATI’s new graphics cards support the CrossFire mode, but the appropriate Master cards are not yet available. When they come out, a RADEON X1800 CrossFire Edition with 512MB of graphics memory is expected to cost $599, and a RADEON X1600 CrossFire Edition, $299. These two cards will be free from the display resolution restrictions of the current version of the CrossFire technology: the new Compositing Engine will permit to use resolutions up to 2048x1536 at a refresh rate higher than 70Hz. As for the RADEON X1300, two such graphics cards will unite into a CrossFire configuration through the PCI Express x16 bus, so there’s no need for a special CrossFire edition of this product.

Pixel Processors

ATI focused on distributing the load among the various functional units of the chip, so the new RADEON X1000 architecture is a multi-threaded or, in ATI’s terms, Ultra-Threaded Architecture. The name of the technology sounds like Intel’s Hyper-Threading and its purpose is similar, too: to use the available computational resources of the processor in the most efficient way and to minimize the time when the processor’s execution devices are idle.


ATI RADEON X1000 ultra-threaded architecture

The RADEON X1000 (R5xx) architecture has some points of similarity with both RADEON 9000 (R3xx) and RADEON X800 (R4xx) architectures as well as with the completely new architecture employed in the Xbox 360 GPU, but ATI’s new processors have a number of unique traits that have no analogs in the other chips.

For example, the RADEON X1000 GPUs have an integrated intelligent switching unit, the so-called Ultra-Threading Dispatch Processor, which is to optimally distribute the load among the quads of pixel processors (each quad consists of four pixel processors, each of which can process a shader for a 2x2-pixel block in a single clock cycle) and the texture-mapping units. Particularly, the Ultra-Threading Dispatch Processor divides the pixel processing workload into small threads of 4x4 pixels. It can also determine the moments of idleness of some pixel processors in the quads and assign them new tasks. When further execution of the shader requires some not yet ready data, the arbiter processor halts the thread until the data is received thus freeing the ALUs for other threads and masking the texture sampling latency, for example, for textures stored either in cache or memory. According to ATI, this architecture helps to achieve a 90% efficiency of the pixel processors on any shader.

Quick switching between the threads requires storing the intermediate data of each thread, and ATI uses special registers (General Purpose Register Array) connected at high speed with the pixel processors as in earlier ATI’s GPUs. It’s not quite clear yet how many registers there are in the RADEON X1800, X1600 and X1300, and how sensitive the GPUs are to the degree of complexity of pixel shaders.

Complying fully with the Shader Model 3.0 standard, ATI’s new solutions fully support loops, branches and subroutines. The flow control helps them execute virtually infinite shaders. The RADEON X1000 family processors do all executions in 128-bit floating-point format which minimizes the possibility that round-off errors accumulate and worsen the image quality.

The number of simultaneously executed code threads has become bigger, but the size of each thread has been reduced to 4x4 pixels. This helps to achieve a higher efficiency at dynamic branching as illustrated by the next diagram:


Thread size and dynamic branching efficiency

The advantages of ATI’s approach are obvious: the dynamic branching efficiency degenerates greatly at big thread sizes and it becomes downright unprofitable with 64x64-pixel threads. The senior model, RADEON X1800 (R520), can execute up to 512 threads of shader code simultaneously while the weaker models are limited to 128 simultaneous threads.

A special dedicated branch execution unit is another interesting feature of the RADEON X1000. Executing one flow control instruction (conditions, loops, subroutines) per clock cycle this unit greatly reduces the load on the main ALUs. Shaders that use flow control instructions are executed in fewer cycles than usually. This may bring a considerable performance increase with version 3.0 pixel shaders over NVIDIA’s solutions.


RADEON X1800 pixel shader, ALUs, and texture address units

Since contemporary games make wide use of pixel shaders, ATI put an emphasis on high pixel shader performance of the new GPUs. As you remember, the pixel pipelines of the GeForce 7 were also improved since NVIDIA’s earlier GPUs.

The goal was achieved by increasing the number of ALUs. Each pixel processor of the R520 has 2 scalar and 2 vector ALUs, capable of executing up to 4 instructions per clock cycle (2 ADD-type instructions + modifier, 2 ADD/MUL/MADD-type instructions).

The new RADEON is also the first GPU in which the texture-mapping units and the texture addressing units communicate with the shader processors not directly, but through the Ultra-Threading Dispatch Processor. This must have been done as another optimization measure for the whole graphics core, mostly to hide the texture sampling latency. It’s just more efficient to coordinate all the units from a single “control center”!

ATI Technologies says the total performance of the RADEON X1800 (R520) equals 83Gflops whereas NVIDIA claims 165Gflops for the G70. This is two times higher than the performance of the ATI chip, but the comparison is probably incorrect. The speed of the GeForce 7800 GTX was measured on MADD instructions, and we don’t know how ATI measured the performance of their card.

Vertex Processors

The vertex processors of the RADEON X1000 are designed in much the same manner as in the NVIDIA GeForce 7 architecture. Each processor consists of two units, vector and scalar, with the only difference that the G70’s vertex processors have 32-bit ALUs, while the vector ALU of a vertex processor of the RADEON X1000 is 128-bit. This feature allows emulating the central processor on the GPU.


RADEON X1800 vertex shader engine

The new vertex processors can perform two instructions per clock cycle. An ordinary shader may be as long as 1024 instructions or virtually infinite with flow control instructions. Of course, the vertex processors of the RADEON X1000 are fully compliant with the Shader Model 3.0 specification.

Theoretically, the vertex shader performance of the RADEON X1800 XT should be much higher than that of the GeForce 7800 GTX since its vertex processors work at a much higher frequency. But we can only know this for sure after we’ve tested the card in appropriate benchmarks.

Memory Controller

ATI endowed its new graphics processors with a completely redesigned memory controller. The internal memory bus of RADEON X1800 has acquired ring topology and consists of two 256-bit bidirectional ring buses, while the ring topology of the RADEON X1600 implies two bi-directional 128-bit buses.


RADEON X1800 Ring Bus diagram,
with a typical memory read sequence highlighted

Ring buses go around the entire die and help to simplify and optimize its interconnects. The chip components can thus be connected in the shortest way. Coupled with the dispatch unit, this solution minimizes latencies and signal distortion at memory write operations. Thanks to the Ring Bus technology, the RADEON X1800/X1600 can work with high-frequency memory like GDDR4, for example, while a traditional architecture wouldn’t support GDDR4 due to interference in the not optimally wired connectors inside the GPU.

The memory is connected to the buses at the so-called Ring Stops. There are four such stops in total; each has two 32-bit access channels. For comparison: the memory of the RADEON X850 connects to the controller through four 64-bit channels. Each Ring Stop can give out data to the requesting client, according to the memory controller’s instructions.

The Ring Bus memory subsystem works simply. A client sends a data request to the memory controller which is located in the center of the chip. The memory controller uses a special algorithm to determine the priority of each request, giving the highest priority to those that affect the performance the most. Then it sends an appropriate request to the memory chips and sends the data along the Ring Bus to the Ring Stop nearest to the requesting client. From the Ring Stop the data arrives to the client. A so-called Write Crossbar Switch is located around the controller proper for optimal memory access – it makes sure the requests are distributed evenly.

The operation algorithm of the new controller can be programmed from the driver, so its operation can be improved further in the future. Moreover, ATI has a theoretical opportunity to program the controller for a specific application and create an appropriate profile in the Catalyst driver.

The cache has become fully associative, i.e. any cache line can store the contents of any location in the external memory.


Caches comparison

The frequency being the same, an associative cache works more efficiently than a direct-mapped cache. Thus, the new architecture has a great performance reserve for applications critical about the graphics memory subsystem bandwidth. In other words, the RADEON X1000 is expected to perform well in high resolutions and/or with enabled full-screen antialiasing and anisotropic filtering.

The HyperZ technology has also been improved and a more sophisticated algorithm is now employed to identify invisible surfaces that are to be removed. ATI says the new algorithm is 50% more efficient than in the RADEON X850.

Note that although RADEON X1300 doesn’t support Ring Bus as well as programmable memory requests arbiter, it uses other techniques intended to improve the memory bandwidth of the RADEON X1000 family.

HDR: Speed AND Quality

The new generation of ATI’s graphics processors fully supports high dynamic range display modes, known under the common name HDR.

One HDR mode was already available in RADEON X800 family processors, but game developers didn’t appreciate that feature much. We also described HDR in detail in our review of the NV40 processor which supported the OpenEXR standard with 16-bit floating point color representation developed by Industrial Light & Magic (for details see our article called NVIDIA GeForce 6800 Ultra and GeForce 6800: NV40 Enters the Scene ).

OpenEXR was chosen as a standard widely employed in the cinema industry to create special effects in movies, but PC game developers remained rather indifferent. The 3D shooter Far Cry long remained the only game to support OpenEXR and even this game suffered a tremendous performance hit in the HDR mode. Resolutions above 1024x768 were absolutely unplayable. Moreover, the specifics of the implementation of HDR in NVIDIA’s graphics architecture made it impossible to use full-screen antialiasing in this mode (on the other hand, FSAA would just result in an even bigger performance hit). The later released GeForce 7800 GTX, however, had enough speed to allow using OpenEXR with some comfort, but it still didn’t permit to combine it with FSAA.

ATI Technologies took its previous experience into account when developing the new architecture and the RADEON X1000 acquired widest HDR-related capabilities, with various – and even custom – formats. The RADEON X1000 GPUs also allows you to use HDR along with full-screen antialiasing. This is of course a big step forward since the NVIDIA GeForce 6/7, but do the new GPUs have enough performance to ensure a comfortable speed in the new HDR modes? We’ll only know this after we test them, but at least we know now why the R520 chip, the senior model in ATI’s new GPU series, came out more complex than the NVIDIA G70. The above-described architectural innovations each required its own portion of transistors in the die. As a result, the R520 consists of 320 million transistors – the most complex graphics processor today! – although it has 16 pixel pipelines against the G70’s 24.

New FSAA and Anisotropic Filtering Methods, 3Dc+

As you know from our reviews, NVIDIA implemented a transparent texture antialiasing technique in its G70 chip. ATI also developed an analog for the RADEON X1000 and called it Adaptive Antialiasing. Improving the visual quality of transparent textures in such objects as wire fences, foliage, etc., Adaptive Antialiasing can be used in combination with all the other types and modes of antialiasing the RADEON X1000 architecture supports, including Temporal AA and Super AA. This is also accompanied with full HDR support.

The anisotropic filtering algorithm from the previous ATI products has also undergone some improvements. The new mode, Quality AF, uses the so-called Area-Aniso algorithm to achieve a better texture filtering quality than before. However, as we have repeatedly argued in our reviews, it’s difficult or near impossible to see a slight image quality improvement in modern dynamic 3D games. While the difference between tri-linear and anisotropic filtering is obvious, the difference between 8x and 16x AF is hardly discernable. We will show you below how the new anisotropic filtering algorithm improves the visuals in a real game and will also measure the performance hit the new graphics cards suffer with enabled Adaptive AA and Quality AF.

One more technology which directly affects the image quality has also been improved further in the new RADEONs. We mean the normal map compression (3Dc). Its new version has acquired a plus sign in the name and can compress textures that are used, for example, as lighting and shadow maps, HDR textures, material properties, etc. 3Dc+ provides a compression coefficient of 2 to 1 with such textures and 4 to 1 with two-channel textures. If you don’t know what it’s all about, here’s its purpose in brief: the 3Dc (and now 3Dc+) technology helps to improve the level of detail of 3D models by using high-resolution normal maps rather than increasing the number of their polygons. Why 3Dc but not something else? The DXTC technology, for example, provides an 8 to 1 texture compression, but it doesn’t suit for normal maps where per-pixel precision is necessary. The quality of the final rendering would suffer.

ATI Avivo: New Era in Video Processing

Even the RADEON 9700, the world’s first DirectX 9-compatible GPU, already had some abilities to process video streams with the help of pixel shaders, but no progress has been made in the ATI camp since then. The video-processing skills of RADEON X800/X850 chips were rather poor if compared with GPUs from NVIDIA or S3. The RADEON X1000 architecture comes to change this situation. Developing the new GPU series, ATI caused a mini-revolution in video-processing, too, by endowing the new-generation GPUs with hardware H.264 and VC-1 encoding and decoding capabilities. These two formats are the basis of Blu-Ray and HD-DVD standards, respectively.


Avivo engine

Besides that, the Avivo Display Engine – this is the name of the new video-processing technology from ATI – includes two independent 10-bit engines each of which supports overlays, high-quality gamma correction, color correction, image scaling and de-interlacing. The signal is encoded into a TV-friendly format with a Xilleon chip which was originally developed for hi-end home HDTV devices.

Such powerful video-processing capabilities seem to be the most appealing in the RADEON X1300 graphics card. Thus inexpensive model has the lowest heat dissipation and power consumption in the family and thus can be an optimal choice for a high-quality home multimedia center.

We’ve been talking about the RADEON X1000 architecture in general so far. The next section is concerned with particular graphics card models that represent the new architecture. We want to express our sincere thanks to ATI Technologies for providing us with the following cards:

We will scrutinize these devices in a number of theoretical tests and will also check them in the new FSAA and anisotropic filtering modes developed and implemented by ATI Technologies.

The New RADEON: A Family Portrait

RADEON X1800 XT

We’ll start with the senior model, RADEON X1800 XT. This graphics card is much longer than the RADEON X850 and similar in size to the GeForce 7800 GTX.

Such a long PCB isn’t just a whim of the developers. The chip consisting of 320 million transistors and the memory working at 750 (1500) MHz frequency consume a lot of power call for appropriate power circuitry, which occupies the entire rear part of the RADEON X1800 XT. Unlike on the GeForce 7800 GTX, the power transistors are placed in a single vertical row and are covered with a narrow heatsink. The GeForce 6800 Ultra used a similar design, by the way. The power elements of the circuit are controlled with a multi-phase Volterra VT1103 controller (you can see it below the 6-pin additional power connector on the snapshot). The rest of the card’s surface is concealed by the massive dual-slot cooling system which we dismantled to get a better view:

Memory chips on ATI’s graphics cards used to be placed in the shape of the letter L, but this time the engineers had to use the same placement as NVIDIA came to employ since GeForce FX. This measure was necessary to ensure stable operation of the memory at frequencies above 1200MHz. But even this placement of the chips required an ingenious wiring between the GPU and memory. The developers met their goal, even though they made the PCB very complex.

The card carries eight GDDR3 chips marked as Samsung K4J52324QC-BJ12. According to the specification, these 512Mbit chips have an access time of 1.25 nanoseconds and work at 2.0V voltage. The chips are rated to work at 800 (1600) MHz frequency, but the card clocks them at 750 (1500) MHz. Eight 512Mbit chips yield a total of 512 megabytes of graphics memory – this is the most powerful model in the RADEON X1800 series.

The left part of the PCB is not particularly interesting. Besides the ordinary DVI-I and S-Video connectors, there is a Rage Theater chip here which is responsible for capturing video from external sources. This solution looks somewhat odd in combination with the Avivo, but for some reason ATI didn’t use their more progressive Rage Theater 200 chip, equipped with 12-bit ADCs, even in the new-generation graphics cards. Well, the Rage Theater anyway ensures nearly the same video capture quality as the Philips SAA7115HL chip NVIDIA uses in its GeForce 7800 GTX.

A little higher you can see an empty seat, probably intended for an additional TMDS transmitter (to connect to high-resolution TFT panels across a DVI interface). Higher still, there’s a connector that looks like an old VESA feature connector that you could see on almost all graphics cards in the past. We are not sure about its purpose – maybe CrossFire configurations will be united with a special cable through such connectors in the future? The reverse side of the PCB doesn’t have anything of interest. Besides numerous small elements, there is a metal back-plate which prevents the PCB from bending under the cooler’s weight.

You can see the effect of the new 0.09-micron tech process with your own eyes here. Although the ATI R520 includes more transistors than the NVIDIA G70 does, the die size of the new GPU is noticeably smaller and is comparable to that of the 0.13-micron R480.

The R520 chip on our sample of the card was manufactured on the 37-th week of 2005, i.e. around the middle of September. This is indirect evidence of the manufacturing problems with the new GPU – ATI seems to have begun the mass production of the final revision of the R520 quite recently. The chip has a metal frame that protects it against chipping. The GPU frequency is 625MHz on the XT version of the RADEON X1800.

The cooling system deployed on the RADEON X1800 XT is nothing else but a slightly modified cooler from the RADEON X850 XT which should be known to you from our earlier reviews. The central point of this cooler is a copper heatsink which directly contacts the GPU die through a thin layer of dark-gray thermal paste. The memory cools down by giving its heat out to the massive aluminum base of the cooler through elastic pads. Like on the RADEON X850 XT, a blower drives air through the heatsink ribbing and exhausts it to the outside.

This cooler differs from the RADEON X850’s one in having a bigger heatsink and a different shape of the base. The bottom part of the casing is painted white and is adorned with a picture of the ATI symbol – the girl called Ruby with a sword in hand. Unfortunately, we have some grave apprehensions about the acoustic characteristics of the cooler. If the blower works at a higher speed than on the RADEON X850 XT, the noise will hardly be acceptable, especially since the closed casing works as a resonator. But we’ll check this shortly in the appropriate section of the review.

RADEON X1800 XL

The less powerful, XL-indexed RADEON X1800 uses the same PCB as the senior model but differs in the design of the cooling system and the clock rates. Like the RADEON X1800 XT, the RADEON X1800 XL is equipped with a Rage Theater chip.

This graphics card uses Samsung K4J55323QG-BC14 chips (256Mbit, 1.4ns access time, 1.8V voltage, 700 (1400) MHz rated frequency). The RADEON X1800 XL clocks its memory at 500 (1000), i.e. below its official rating. Eight 256Mbit chips give you a total of 256 megabytes of memory. Like on the RADEON X1800 XT, the memory chips are located on one side of the PCB only; there are no empty seats on the reverse side of the PCB. So, unlike NVIDIA with its GeForce 7800 GTX, ATI varies the amount of graphics memory in its two top-end models by replacing 256Mbit chips with 512Mbit ones.

This R520 chip is a little older than the one we have seen on the RADEON X1800 XT. This one is dated the 32-th week of 2005. We suspect it to be an earlier revision of the R520 chip, unable to work at frequencies above 500-550MHz.

The cooling system differs from the one installed on the RADEON X1800 XT. It is based on two U-shaped heat pipes that evenly distribute the heat from the GPU-contacting base in the thin copper ribbed section. The memory gives its heat out through the aluminum casing. The straight-bladed fan takes air from inside the PC case and drives it through the ribbing section and back into the system case. In fact, the air stream cools just the heatsink on the power circuit elements.

Using the heat pipes ATI made the cooling system of the RADEON X1800 XL fit into one slot, but the fan is probably noisy. We already met such a fan on the PowerColor X800 XL graphics card and were not pleased with it (for details see our article called PowerColor X850 XT and PowerColor X800XL Graphics Cards Review ). Like on the RADEON X1800 XT, the cooling system is adorned with a picture of Ruby.

RADEON X1600 XT

The RADEON X1600 XT is much smaller than the senior models, its PCB being comparable to the RADEON X700 PRO in size.

The card has a seat for a Rage Theater chip, but the chip itself is missing, unlike on the RADEON X1800. It’s probably up to the graphics card manufacturer to put a VIVO chip on this card, and the standard version of the RADEON X1600 XT will come without VIVO functionality. Like the RADEON X1800 XT/XL, the RADEON X1600 XT is equipped with two DVI-I and one S-Video connector.

The RADEON X1000 architecture comprises several independent units, so the RV530 chip differs from the R520 not only in having fewer pixel and vertex processors. The new mainstream solution from ATI also has fewer texture-mapping units, render back-ends and Z-compare units. The maximum number of simultaneously executed threads of shader code is reduced from 512 to 256, too. The resulting chip turned to be compact and economical.

The power circuit of the RADEON X1600 XT is simple if you compare it with that of the RADEON X1800, and consists of fewer elements. Some of these elements are grouped in the top right corner of the PCB, but the switching transistors are located at the left part of the card and at an angle of 45 degrees relative to the GPU. The RADEON X1600 XT uses two programmable PWM controllers RichTek RT9232 (both located on the reverse side of the PCB) to control the power supply of the GPU and memory, respectively. Also on the reverse side of the PCB there is a seat for an additional TMDS transmitter and a cooler’s back-plate.

The memory chips are placed in the traditional way, but again only on the front side of the PCB. The card carries four Samsung K4J52324QC-BJ12 chips (the same chips as installed on the RADEON X1800 XT). Four 512Mbit chips mean 256MB of memory and the memory bus width is 128 bits since the chips have a 16Mx32 structure. It’s not quite clear why ATI put 1.25ns chips on this card – they could have used cheaper K4J52324QC-BJ14 chips (rated for 700 (1400) MHz) since the specified memory frequency of this card is 690 (1380) MHz. We suspect that the expensive chips are only employed on engineering samples, while off-the-shelf RADEON X1600 XT cards will come with cheaper memory.

The cluster-like architecture and the 0.09-micron tech process helped to reduce the RV530 die almost to the size of the RV380 (RADEON X600) – and still squeeze 12 pixel and 5 vertex processors into it! That’s impressive. The chip is hardly older than the R520 installed on our RADEON X1800 XT. It was manufactured during at the 36-th week of 2005, and the text “Eng Sample” indicates that this is probably not a final revision. The GPU packaging is not equipped with a protective frame, as on the RADEON X1800, but it is not very necessary since the cooler on the RADEON X1600 is lighter than on the RADEON X1800 XT/XL.

The cooling system of the RADEON X1600 consists of a copper foundation with soldered-up ribs under a black plastic casing. The configuration of the fan blades differs from the classic fan of the RADEON X800 XT card: the straight vertical blades are driving the air stream sideways from the fan which is the most optimal solution for the employed cooling system. The fan is connected to the PCB with two wires and there is no tachometer (which is present on the RADEON X1800 XL). The RADEON X1600 XT uses a simplified fan-speed control algorithm. The casing of the cooling system bears no pictures like those you’ve seen on the top-end models – maybe this is just not the final version of the card? The fan will probably be noisy at high speeds.

RADEON X1300 PRO

The RADEON X1300 PRO doesn’t differ from the RADEON X1600 XT at first sight, yet there are really a number of significant differences between these two graphics cards.

Unlike other members of the RADEON X1000 family, the RADEON X1300 PRO is equipped with a single DVI connector, the place of the other being occupied by a 15-pin D-Sub output. That’s why there’s no possibility to put a second TMDS transmitter here. A VIVO chip can be installed, but it is missing on our sample of the card, just like on our RADEON X1600 XT.

The power circuit is visibly simplified, and the wiring is different around the memory chips. Unlike on the RADEON X1600, GDDR2 memory is installed here, and on both sides of the PCB. Infineon’s HYB18T256161AF-25 chips have a 16Mx16 structure, so eight such chips were required for the 128-bit memory bus. The capacity of one chip being 256Mbit, the total amount of onboard graphics memory is 256 megabytes. These 2.5ns chips are rated for 400 (800) MHz frequency and are clocked exactly like that by the card.

There are fewer resistors on the RV515 case than on the RV530. The die area is also smaller, since the chip contains only 4 vertex and 2 pixel processors and the number of Z-compare unit is reduced from 8 to 4. The chip on our sample of the card was manufactured on the 32-th week of the current year, just like the GPU of our sample of the RADEON X1800 XL, but here we most probably deal with a final revision of the chip. Just as described by the specification, it is clocked at 600MHz. The cooling system is exactly the same as the one deployed on the RADEON X1600 XT, so there’s no sense in repeating the description.

Power Consumption: 100W Is Not the Limit

We couldn’t disregard such important topic as the power consumption of the new graphics card family. We make the corresponding tests for all four ATI newcomers.

We used a special testbed based on a modified mainboard that allowed connecting special measuring devices to the 12V and 3.3V power lines leading to the PCI Express x16 slot. To measure how much power the graphics accelerator consumes through the external connector, we used an adapter equipped with special shunt and connectors. Our testbed was configured as follows:

To create realistic testing conditions we used the game 3 benchmark from 3DMark05 test suite run in 1600x1200 resolution with enabled FSAA 4x and AF 16x. This test loads the pixel processors really heavily that is why in this test the graphics accelerator works in conditions close to those in many contemporary games. So, the results of our measurements were close to real life, too. Here is what we obtained during this test session:

RADEON X1800 XT consumes almost twice as much power as RADEON X1800 XL. Moreover, this graphic adapter beat all the records for the single graphics cards and even exceeded the power consumption of the 24 pipeline GeForce 7800 GTX! Of course, the higher clock frequency and complexity of the new chip played an important role here. But I can also suppose that the main contribution to this sky-high power consumption rate of the new RADEON X1800 XT was made by the 512MB of memory working at 750 (1500) MHz with 2.0V voltage.

It could also be the case that the memory voltage was increased even beyond that value to ensure maximum stability of the solution, but unfortunately, we cannot check if this supposition is true or not, because there are no utilities that could monitor the voltage level of the RADEON X1000. Two RADEON X1800 XT graphics card working in CrossFire mode can consume up to 225W of power, so if you have a 400-450W PSU, it will hardly be enough in this case. You will most likely need a power supply unit that can produce up to 550-600W in continuous mode, because a top-end gaming system with two RADEON X1800 XT graphics cards will also have a bunch of other power-hungry components, such as a powerful CPU and a couple of hard drives.

RADEON X1800 XL demonstrated considerably lower power consumption rates – only 59W, which is about the level of GeForce 7800 GT. The latter uses a 0.11micron GPU with 20 pixel processors onboard, however the higher working frequency and greater chip complexity of the ATI’s solution makes up for that easily. Unlike RADEON X1800 XT, the onboard graphics memory of the RADEON X1800 XL works 1.5 times slower, and its voltage level is only 1.8V. Besides, there are only 256MB of memory against 512MB by the top-end model. This is another argument proving that one of the major power consumers on the RADEON X1800 XT is the graphics memory. We could theoretically check this out by overclocking the RADEON X1800 XL GPU to the level of XT, however, we failed to do it this time, because neither RivaTuner, nor PowerStrip can work with the RADEON X1000 family at this point. RivaTuner 15.7 simply doesn’t recognize the graphics cards at all, and PowerStrip 3.61 displays the memory working frequency incorrectly and doesn’t allow adjusting the GPU frequency at all.

The more predictable results were obtained for the mainstream representatives of the RADEON X1000 family:

11W difference between the 12-pipeline RADEON X1600 XT and a 4-pipeline RADEON X1300 PRO probably results from the fact that the latter uses 8 GDDR2 memory chips, while RADEON X1600 XT is equipped with 4 GDDR3 memory chips.

Noise and 2D Image Quality

Unfortunately, only RADEON X1800 XT and RADEON X1800 XL demonstrated acceptable level of noise. Their cooling fans worked at their maximum speed only during the system boot-up. A few seconds later the cooling fan on the RADEON X1800 XT slowed down a little bit, and in another short while its rotation speed would again drop down to the minimal level. At this point the generated noise was minimal, nearly negligible. During the test session the fan rotation speed on the RADEON X1800 XT did increase a little bit, however, the noise level still remained within a comfortable range, even though all our tests were carried out in an open testbed.

RADEON X1800 XL acted very similarly with that only difference that the rotation speed of its cooler fan didn’t change as smoothly as by RADEON X1800 XT, so that the cooler would either work pretty loudly or pretty quietly: nothing in between. The overall impression was pretty comfortable. Sometimes, it even felt better than in case of the XT model, because the generated noise was free from that typical “plastic” sound of the ATI’s dual-slot cooling solutions. Anyway, RADEON X1800 XT and RADEON X1800XL would sound exactly the same in a closed system case.

RADEON X1600 XT and RADEON X1300 PRO left a completely different impression: their fans have always been working at the highest rotation speed and generated a lot of unbearable noise, which got especially annoying after you have been listening to it for couple of hours. The major noise component was the high-frequency squealing of the fan, so even in a closed system case the overall noise picture would hardly get any different. Luckily, the graphics card makers do not have to stick with the reference design of the cooling system when they develop their products, so I would expect most RADEON X1600 XT and RADEON X1300 PRO based graphics cards to acquire less noisy cooling solutions. I believe that RADEON X1300 PRO will look especially attractive if equipped with a passive fanless cooler based on heatpipes: this card has every chance to become a great multimedia center solution in this case.

All four graphics accelerators produced very good 2D image quality in resolutions up to 1800x1440x75Hz, which is typical of up to 99% of all contemporary graphics cards.

Testbed and Methods

We used the following tesbed for our today’s test session with the new graphics accelerator family from ATI:

ATI and NVIDIA drivers were configured as follows:

ATI CATALYST:

NVIDIA ForceWare 78.01:

Before we pass over to the results obtained in theoretical benchmarks, let’s discuss the advantages of the new FSAA and anisotropic filtering algorithms introduced by ATI in the new RADEON family.

Anisotropic Filtering Quality

It has already become a tradition that before we get to test graphics cards performance we check the quality of anisotropic filtering and tri-linear filtering they provide. Especially since we have a very good reason to do it this time: ATI declared the new enhanced anisotropic filtering algorithm for its RADEON X1000 products family, which we simply cannot overlook.

RADEON X1800

RADEON X800

GeForce 7800

Trilinear

Aniso 16x

Aniso 16x
Quality

As we see, there is not so much difference in quality compared with the tri-linear filtering by RADEON X850 XT. As for the tri-linear filtering by GeForce 7800 GTX, it reveals slightly harsher transitions between the mip-levels, although I wouldn’t regard it as something drastic, really.

As we see from these pictures, RADEON X1800 XT does boast somewhat better quality mode than any of its competitors or predecessors. However, you should keep in mind that this is not the default mode and it needs to be enabled specifically, so it can potentially slow down the performance.

I have to draw your attention to the fact that we haven’t found any real evidence pointing at the significant advantage of the enhanced AF mode over the standard AF mode. In other words, there is no big difference in the image quality of real games between the enhanced anisotropic filtering mode of the new RADEON X1800 XT and the standard anisotropic filtering of the new ATI solutions as well as of the other graphics cards.

Full-Screen Anti-Aliasing Quality

Besides the enhanced AF quality, ATI also announced that the new RADEON X1000 family would boast enhanced full-screen anti-aliasing algorithm for alpha textures.

NVIDIA GeForce 7800 GTX

ATI RADEON X1800 XT

ATI RADEON X1800 XT


no FSAA

 
no FSAA

 


FSAA 4x + TMS


FSAA 4x


aFSAA 4x


FSAA 4x + TSS

 

 


FSAA 8xS + TMS

 
FSAA 6x


aFSAA 6x


FSAA 8xS + TSS

 

 

Unfortunately, just like in case of Transparent AA by GeForce 7800, Adaptive FSAA of the new RADEON X1000 cannot bring any image quality improvement to Far Cry game. At least, we didn’t notice any visible differences.

Now let’s take a look at Half-Life 2 game.

NVIDIA GeForce 7800 GTX

 ATI RADEON X1800 XT

ATI RADEON X1800 XT


no FSAA

 
no FSAA

 


FSAA 4x + TMS

 
FSAA 4x


aFSAA 4x


FSAA 4x + TSS


 


FSAA 8xS + TMS


FSAA 6x

 
aFSAA 6x


FSAA 8xS + TSS

 

 

As we can see from the screenshots, adaptive anti-aliasing of transparent textures works fine on RADEON X1000, however, the actual image quality improvement is not that significant, just like in case of alpha-textures multi-sampling by NVIDIA GeForce 7 (TMS, transparent multi-sampling). I have to stress that the Adaptive FSAA of the new RADEON X1000 is of much better quality than the similar mode by GeForce 7800 GTX, however it is still much lower than what the competitor’s TSS (transparent textures super-sampling) would provide.

I would also like to say that adaptive anti-aliasing of alpha textures by RADEON C1800 XT may sometimes lead to their complete removal. In fact, it could be a drive issue, because the anti-aliasing masks can be set on the software level for ATI RADEON solutions.

So, the laurels for the best FSAA quality, in at least certain cases, will remain with NVIDIA for now.

Performance Hit in New Image Quality Modes

Of course, any image quality improvement cannot be free, so let’s take a closer look at the performance of RADEON X1800 XT in different quality modes.

As we can see, the performance of our hero in new FSAA and anisotropic filtering modes is quite up to the mark for comfortable gameplay. The only exception if Far Cry Pier level.

If you wish you may also compare the results of the new RADEON X1800 XT against the GeForce 7800 GTX tested in about the same conditions (for details see our article called NVIDIA GeForce 7800 GTX: New Architecture Exposed ), although we believe it is not quite correct to compare these two side by side.

Performance in Theoretical Benchmarks: RADEON X1800 XT/XL

For our theoretical testing session we used the following benchmarks:

Since this review is devoted not only to RADEON X1800 graphics solution, but also to RADEON X1600 and RADEON X1300, we performed three rounds of theoretical tests instead of one. Let’s start with the top representatives of the new graphics processor family: RADEON X1800 XT and RADEON X1800 XL. We will compare them against the following competitor solutions and predecessors:

Fillrate

This time we used a new version of Marko Dolenc’s Fillrate Tester, which was kindly submitted by the developer. Compared with the previous version, the new Fillrate Tester offers much broader testing features.

Judging by the pure fillrate results, RADEON X1800 XT demonstrates unattainable performance due to very high chip and memory subsystem working frequencies. RADEON X1800 XL looks not so overwhelmingly impressive, but it also outperforms slightly the GeForce 7800 GTX with its relatively modest working frequencies.

Since RADEON X1000 family cannot process doubled number of Z values per clock like GeForce 6/7 families would do, NVIDIA solutions lead the race here.

After that when we have only one texture, the performance of RADEON X1800 XT and GeForce 7800 GTX and GeForce 6800 Ultra levels out, while RADEON X1800 XL falls behind RADEON X850 XT PE for some reason. With two textures involved the situation changes dramatically: the fillrate of GeForce 6800 Ultra drops down almost to the level of RADEON X1800 XL, which finally starts outperforming the predecessor, RADEON X850 XT Platinum Edition. Top solutions from ATI and NVIDIA run almost neck and neck, with GeForce 7800 GTX being just a little bit faster. It is most likely to indicate that NVIDIA’s product boasts better caching algorithms.

When the number of textures rises to three, RADEON X1800 XT starts falling badly behind the rivals despite its very high clock frequency. Looks like better caching algorithms and 24 pipelines of GeForce 7800 GTX help a lot in this case. Adding another texture doesn’t change anything: GeForce 7800 GTX is the indisputable leader, while RADEON X1800 XT managed to outperform only GeForce 6800 Ultra. RADEON X1800 XL and RADEON X850 XT Platinum Edition finish the race side by side.

Well, everything indicates that RADEON X1800 XT/XL cannot do the texturing as good as NVIDIA GeForce 7800 GTX/6800 Ultra. ATI’s newcomers prove efficient only with no or just one single texture to be laid. RADEON X1800 XT can blame the fewer pixel processors for its failure in this test. However, as for the performance of the RADEON X1800 XL, which turned out slower than GeForce 6800 Ultra and sometimes even slower than RADEON X850 XT PE, it looks much more complicated than that. Was it the caching issue? Shall we blame raw unfinished drivers? Hard to tell. Although, it is still too early to make any conclusions; let’s see how the new GPU family will cope with the pixel shader tests, particularly, since ATI paid special attention to improving the shader performance of its newcomers.

Pixel Shader Performance

The new version of Fillrate Tester includes a lot of shader benchmarks. We selected only the ones with full precision pixel shaders: half-precision is no longer acute nowadays.

RADEON X1800 XT copes with the simple shaders version 1.1-2.0 almost as well as GeForce 7800 GTX. However, once the shader length increases, our hero starts falling far behind the rival. When it comes to the most complex shader of this test, PS 2.0 Per Pixel, GeForce 7800 GTX turns almost twice as fast as RADEON X1800 XT! RADEON X1800 XL finally manages to get ahead of GeForce 6800 Ultra, and to significantly reduce the gap to RADEON X850 XT Platinum Edition, although it cannot outpace the predecessor. Unfortunately, RADEON X1800 cannot boast much here: GeForce 7800 GTX does look better in this test.

Xbitmark results prove this point once again: the overall performance of RADEON X1800 XT is lower or the same as that of GeForce 7800 GTX.

Although there are a few exceptions. One of them occurs during the processing of multi-pass shaders (27-Pass Fur) and shaders using dynamic branching. In both these cases RADEON X1800 XT is undefeated. Moreover, in case of the hardest Heavy Dynamic Branching, it is twice as fasta s the rival.

In other words, Ultra-Threading architecture and potentially bigger number of time registers do provide the performance advantage in those cases where it is theoretically possible: during multi-pass rendering and processing of complex shaders with branching. Looks like Ultra-Threading and intellectual memory controller ensured impressively high results in 27-Pass Fur test. Unfortunately, NPR shader (hatch, 10 textures ps3) refused to run on RADEON X1000, and all Xbitmark displayed at that point was an empty screen.

As we have just found out from the simple pixel shader processing tests, RADEON X1800 can sometimes lose to NVIDIA solutions. However, the shader test version 1.1 included with the 3DMark 2001 SE benchmarking suite shows that RADEON X1800 XT and GeForce 7800 GTX work almost equally fast.

A similar test for pixel shaders version 1.4 shows that RADEON X1800 XT is a leader in low resolution and then slows down to the level of its major competitor in 1600x1200.

RADEON X1800 XT is very far behind GeForce 7800 GTX: ATI’s new solutions cannot withstand the more pixel processors of their rival. Maybe it is not just the number of pixel processors, but also the raw drivers, because RADEON X1800 XL is the slowest of all in this test, which shouldn’t be happening keeping in mind its extremely high clock frequencies. Again we can see that the new memory controller does have its positive effect on the performance of the new ATI solutions: as the resolution increases, RADEON X1800 doesn’t lose speed as dramatically as NVIDIA products.

A similar test from the 3DMark05 benchmarking suite once again proves that RADEON X1800 works very well with shaders rich in complex texturing. In our case the shader of the rock consists of two color maps, two normal maps and uses Lambertian diffuse shading. Only 24 pixel processors allows GeForce 7800 GTX to retain the leading positions in this test. RADEON X1800 XL also performs quite well here, yielding only to the elder brother.

Vertex Shader Performance

3DMark 2001 SE may be working not quite correctly with RADEON X1800 and prerelease CATALYST driver version: the results demonstrated by the XT and XL models are exactly the same. Moreover, the performance doesn’t drop as the resolution increases.

High frequencies and 8 vertex processors make RADEON X1800 XT and XL the winners in 3DMark03 vertex shaders processing.

Simple Vertex Shader test deals with transformation and lightning of a few models with high polygon count. There are 6 million vertexes in this test with only one light source. The shaders used here belong to Shader Model 1.1. In this case RADEON X1800 XT is an indisputable winner. RADEON X1800 XL is also ahead of all other testing participants although the gap to the nearest follower is not as significant as between the top model and the others.

Complex Vertex Shader test is a much more complicated benchmark, because each blade of grass on the meadow shown in this test is processed individually. RADEON X1800 XL gives in here for some reason and lets the RADEON X850 XT Platinum Edition take the second prize. RADEON X1800 is still beyond any competition.

From the technical point of view, the geometrical Xbitmark test is very similar to Simple Vertex Shader test: we have to transform and light a few models with high polygon count, however this tie the number of light sources involved varies from 0 to 8. As the number of light sources increases, the performance of RADEON X1800 XT/XL doesn’t drop down as dramatically as that of the other testing participants. It means that the new ATI solutions boast highly efficient vertex processors and memory subsystem.

Fixed T&L Emulation

With only one light source involved, RADEON X1800 XT is at the head of the race, while RADEON X1800 XL is at the very end of it, most likely because its clock frequency is quite low for architecture of this kind.

As the number of light sources increases up to 8, the overall picture changes, but still remains quite strange: RADEON X1800 XT suddenly starts yielding to GeForce 7800 GTX, which contradicts the results of the geometric Xbitmark test. RADEON X1800 XL, on the contrary, turns out capable of defeating GeForce 6600 and RADEON X850 XT Platinum Edition.

I believe the CATALYST drivers are not ready yet for efficient work with T&L emulation.

The point sprites performance test is just another vertex processor test. And it seems to be working just fine: the results demonstrated by RADEON X1800 prove its leadership in geometry processing.

Relief Rendering and Other Tests

RADEON X1800 XT doesn’t cope that well with the EMBM relief rendering. Besides, it appears no faster than RADEON X1800 XL in lower resolutions, which automatically indicates that the results are incorrect and require verification.

During Dot3 relief processing, RADEON X1800 XT competes with GeForce 7800 GTX, while RADEON X1800 XL runs as fast as its predecessors.

This test allows us to estimate how well-balanced the CPU-driver-GPU connection really is. Its engine uses real physical model, pixel shaders version 1.1 and vertex shaders version 1.4. From the better balance prospective, the leader here will be GeForce 6800 Ultra: it has the flattest graph of all. Speaking about the fps rates, the first place will belong to RADEON X1800 XT, and the second one - to GeForce 7800 GTX, and the third one to RADEON X1800 XL.

Performance in Theoretical Benchmarks: RADEON X1600 XT

We will compare the RADEON X1600 XT against the following rivals:

Fillrate

Unlike RADEON X1800, RADEON X1600 XT behaves completely differently in the fillrate test. The most remarkable result is the Z pixel fillrate: RADEON X1600 XT can process double number of Z values per clock, just like GeForce 6800, because RV530 features four texturing units and 8 Z-Compare units.

Throughout the entire test our hero yields only to the 12-pipeline NVIDIA solution but proves faster than RADEON X700 PRO. In the last case the advantage is not dramatic: it must the only 4 texturing units that prevent it from doing better this time.

Pixel Shader Performance

RADEON X1600 XT copes with simple pixel shaders version 1.1 as well as 2.0 with equal amount of effort, which is quite logical, actually. Since there are only four texturing units, the shader processors waste clock cycles waiting for the data from them. So, the advantages of12 pixel processors start to show off only when the shader gets longer. That is why the newcomer’s performance hardly drops throughout the whole test, unlike the performance of GeForce 6800 and RADEON X1800. The only exception is PS 2.0 Per Pixel shader.

According to a number of tests from the Xbitmark suite, the availability of only 4 TMUs prevents RADEON X1600 XT from showing what it is really capable of. It is especially evident in case of NPR shader (hatch, 8 textures), which name speaks for itself. RADEON X1600 XT shows its best performance in shaders with complex math1ematics and dynamic branching. In the first case RADEON X1600 XT benefits from high working frequency (590MHz against 325MHz by GeForce 6800), and in the second case from Ultra-Threading architecture.

The pixel shader 1.1 test from the 3DMark 2001 SE benchmarking suite doesn’t use a lot of textures, so the four TMUs of our RADEON X1600 XT are hardly the limiting factor here. As a result, the newcomer is the fastest at processing pixel shaders 1.1 in this particular test, in contrast to Fillrate Tester, where it yielded to the other testing participants.

RADEON X1600 XT manages to reach only the level of GeForce 6800 in the pixel shader 1.4 test, which looks quite logical to me: there are quite a few textures in this benchmark, so the four texturing units of the new RADEON can certainly limit its performance significantly.

I have to admit that the failure in the 3DMark03 test dealing with pixel shader 2.0 performance looks quite strange to me. The workload in this test is mostly mathematical, and since RADEON X1600 XT has 12 pixel processors working at 590MHz, its results should theoretically be much higher. Maybe the raw CATALYST drivers are to blame here.

A similar test, this time from the 3DMark05 suite, puts everything back in its place: RADEON X1600 XT becomes the leader, although it doesn’t get too far ahead of GeForce 6800. This hints at certain drawbacks of the modular RADEON X1000 architecture: in trying to make RV530 as compact and economical as possible, ATI seems to have deprived it of too many useful things.

Vertex Shader Performance

Both GeForce 6800 and RADEON X1600 XT feature five vertex processors, but the newcomer from ATI runs them at a significantly higher frequency, which explains its excellent results in this test.
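Since the unit counts are equal, the theoretical vertex-throughput ratio reduces to the clock ratio; a trivial check using the clocks quoted earlier:

```python
# Equal number of vertex units, different clocks: peak vertex throughput
# scales with (units x clock), so the ratio is simply the clock ratio.
x1600_xt = 5 * 590   # vertex units x core clock, MHz
gf6800   = 5 * 325
print(f"Theoretical vertex throughput advantage: {x1600_xt / gf6800:.2f}x")  # ~1.82x
```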

The same picture can be seen in the corresponding 3DMark03 test.

An even bigger performance advantage over the previous-generation solutions can be observed in the Simple Vertex Shader test from the 3DMark05 suite. Here RADEON X1600 XT proves more than twice as fast as GeForce 6800.

This advantage is slightly smaller in Complex Vertex Shader test, although RADEON X1600 XT remains an indisputable leader.

With no light sources involved, the performance of GeForce 6800 and RADEON X1600 XT is about the same. The latter gets ahead of its competitor with 4 light sources in the scene and achieves its best result with 8 light sources activated. This is another piece of evidence of how efficient ATI’s new vertex processors are.

Fixed T&L Emulation

With only 1 light source RADEON X1600 XT yields to RADEON X800 PRO, though mostly in higher resolutions.

The same scene lit by 8 light sources also poses no problem for the new RADEON X1600 XT, even though RADEON X800 PRO is still a little bit faster. You should keep in mind that RADEON X800 PRO was tested on a different platform, so it is not quite correct to compare its results against those of graphics accelerators with the PCI Express interface. Moreover, as we have already mentioned, the results obtained from the RADEON X1000 family in 3DMark 2001 SE do not seem trustworthy at this point.

In the point sprites scene the 5 vertex processors of RADEON X1600 XT perform almost as fast as the 6 vertex processors of RADEON X800 PRO, the major difference being that the vertex processors of the new solution work at a much higher frequency. The newcomer also has a more efficient memory controller: despite its 128-bit memory bus, its performance doesn’t drop as greatly as that of RADEON X800 PRO when the resolution increases.
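For reference, the raw bandwidth gap between the two memory buses is easy to quantify. The effective memory clocks below are our assumptions for the reference cards, as this article doesn’t quote them:

```python
# Peak memory bandwidth = bus width in bytes x effective data rate.
def bandwidth_gb_s(bus_bits: int, effective_mhz: float) -> float:
    return bus_bits / 8 * effective_mhz * 1e6 / 1e9

# Assumed reference clocks: 1380 MHz effective GDDR3 on RADEON X1600 XT,
# 900 MHz effective on RADEON X800 PRO.
print(f"RADEON X1600 XT, 128-bit: {bandwidth_gb_s(128, 1380):.1f} GB/s")  # ~22.1
print(f"RADEON X800 PRO, 256-bit: {bandwidth_gb_s(256, 900):.1f} GB/s")   # ~28.8
```

In other words, the older card actually enjoys more raw bandwidth; it is the smarter controller that keeps the X1600 XT from degrading as quickly at high resolutions.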

Relief Rendering and Other Tests

RADEON X1600 XT copes quite well with EMBM relief rendering; at any rate, it shows the best result in this test.

During Dot3 relief processing, RADEON X1600 XT, RADEON X800 PRO and GeForce 6800 performed almost equally fast; only RADEON X700 PRO fell far behind them.

It looks like RADEON X1600 XT is not as well-balanced as RADEON X800 PRO and GeForce 6800: its performance in the Ragtroll test drops rapidly as the screen resolution grows. Will ATI’s newcomer behave like that in real games, too? We are going to look into it in our upcoming articles devoted to the gaming performance of the new ATI RADEON X1000 solutions.

Performance in Theoretical Benchmarks: RADEON X1300 PRO

We selected the following rivals for our RADEON X1300 PRO:

Since we did not have these graphics cards in our lab at the moment, we emulated them by reducing the clock frequencies of RADEON X700 PRO, RADEON X600 XT and GeForce 6600 GT, respectively, to the appropriate levels.

Fillrate

RADEON X700 demonstrates the highest theoretical fillrate: its GPU works at 420MHz and is equipped with 8 pixel pipelines, while RADEON X1300 PRO, working at 600MHz, can only boast 4 pixel processors and 4 texturing units.
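The theoretical single-texture numbers behind that statement are simply pixels per clock multiplied by the core clock:

```python
# Theoretical fillrate = pixel pipelines x core clock (MHz) = Mpixels/s.
cards = {
    "RADEON X700 (8 pipes @ 420 MHz)":      8 * 420,
    "RADEON X1300 PRO (4 pipes @ 600 MHz)": 4 * 600,
}
for name, mpix in cards.items():
    print(f"{name}: {mpix} Mpixels/s")   # 3360 vs. 2400
```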

The new architecture allows RADEON X1300 PRO to get ahead of its rivals during single-texture processing, and with two textures it also performs well enough. However, the third and fourth textures do not let it retain the leading position any more: the laurels go to GeForce 6600. At the same time, during four-texture processing RADEON X700 ran out of cache capacity and fell down to the level of the budget GeForce 6200.

Pixel Shader Performance

The 8 pixel processors of GeForce 6600 working at 300MHz prove almost exactly as efficient as the 4 pixel processors of RADEON X1300 PRO working at twice the frequency, which is no surprise: 8 × 300MHz and 4 × 600MHz give the same aggregate throughput. The only exception is the PS 2.0 4 Registers shader; in all other cases the performance of the two products is identical.

All in all, we observe a certain parity between RADEON X1300 PRO and GeForce 6600. However, just like in the previous case, the last three shaders, which use dynamic branching and work within Shader Model 3.0, demonstrate ATI’s indisputable victory.

Thanks to its incredibly high clock frequency, RADEON X1300 PRO outperforms all the competitors, including RADEON X700 and GeForce 6600.

Interestingly, our hero fell behind GeForce 6600 in the benchmark measuring pixel shader 1.4 performance.

We also faced the same low-performance problem with our RADEON X1300 PRO in the Pixel Shader 2.0 benchmark included in the 3DMark03 suite. I assume that ATI’s driver developers are the ones responsible for this dramatic performance hit.

Keeping in mind the problems revealed in the previous benchmarks, we were pleasantly surprised by the pretty good results obtained in 3DMark05: RADEON X1300 PRO managed to get ahead of its predecessors and leave all the competitors behind.

Vertex Shader Performance

Due to its very high clock frequency, RADEON X1300 PRO is seriously ahead of GeForce 6600 and RADEON X700, which is far from a logical outcome, I should say.

The same test from the 3DMark03 suite reveals a much more realistic result: the 2 vertex processors of RADEON X1300 PRO do not let it catch up with GeForce 6600, let alone RADEON X700, which boasts 6 fully-fledged vertex processors, as you know.

It is interesting that in the case of a simple vertex shader 2.0, RADEON X1300 PRO is again victorious and undefeated. It is probably the fast rasterization unit, a consequence of the solution’s high clock frequency, that determines this success.

The high core frequency and 128-bit vector ALUs allow RADEON X1300 PRO to show pretty decent performance during complex vertex shader processing. Nevertheless, it is not enough to catch up with RADEON X700 and its 6 vertex processors.

The same is true for the geometric Xbitmark test, where the newcomer yields only to RADEON X700.

Fixed T&L Emulation

The scene is not complex enough to take real advantage of the extended vertex processing capabilities of RADEON X1300 PRO. Just like in a number of other geometry benchmarks, it yields to GeForce 6600 and RADEON X700 here.

However, with 8 light sources RADEON X1300 PRO is slightly ahead of GeForce 6600, although it is still behind RADEON X700.

The new ATI solution does pretty well in the point sprites modeling test in almost all resolutions.

Relief Rendering and Other Tests

Thanks to the difference in clock frequencies, RADEON X1300 PRO performs as fast as GeForce 6600 and RADEON X700 in EMBM rendering, although I have to admit that GeForce 6600 still looks a little more attractive.

Another relief rendering mode, Dot3, places RADEON X1300 PRO ahead of RADEON X700, but GeForce 6600 still remains unattainably far ahead.

RADEON X1300 PRO seems to be a pretty well-balanced solution here, yielding almost nothing to its predecessor, RADEON X700.

Performance during Video Playback

To evaluate the performance of today’s testing participants during the playback of different video formats, we used Windows Media Player 10 with the patch enabling DirectX Video Acceleration for WMV HD content. For the tests we used the following movies:

All RADEON X1000 family members demonstrated identical performance here.

The new ATI graphics accelerators proved most efficient during WMV HD playback: the CPU utilization was only 41% in the most complex scene of the movie, where a huge flock of birds rose up into the sky from the lake surface. The minimum CPU workload in this test was 15%, which is more than twice as good as the results obtained by GeForce 6/7 and RADEON X850 XT Platinum Edition. The average CPU utilization during the test movie playback was 20%-25%, which can surely be called an excellent result. Judging by this test, we can call the new RADEON X1000 the best solution for HD content playback.

The playback of a movie in DivX format was less successful: the new ATI product family showed the highest CPU utilization rates in this test overall. The minimum CPU workload, however, equaled 8%, which is better than the 10%-11% of GeForce 6/7 but worse than the 2% of RADEON X850 XT Platinum Edition.

As for the playback of a standard DVD, the newcomers did really well there, leaving no chance to the other testing participants.

Well, as we can see from the results of these few tests, ATI’s Avivo technology works perfectly well, being especially efficient during HD and DVD content playback. Moreover, the results obtained for the 1080p movie can even be called sensational: there have been no graphics solutions yet that would cope so well with such a resource-hungry format. On the first GeForce 6800 Ultra modifications, which had the video processor disabled, the CPU utilization could be as high as 100%; now, thanks to ATI’s Avivo technology, this number has dropped to 20%-30%. At the same time, the CPU utilization during DivX playback proved higher than that of the competitors, although it is still not high enough to cause you any serious inconvenience. I think the final CATALYST version for RADEON X1000 may be free from this drawback.

It is remarkable that all RADEON X1000 models demonstrated the same performance during video playback. It looks like the Avivo processor works at its own frequency, which is the same for all members of the new family and isn’t tied to the GPU clock rate.

Conclusion

Without any doubt, from the architectural point of view ATI RADEON X1000 is the most advanced graphics architecture for the PC to date. The RADEON X1800 – ATI’s newest top-of-the-range offering – not only sports Shader Model 3.0, High Dynamic Range and image quality enhancement capabilities – such as quality area anisotropic filtering and adaptive antialiasing – but also provides a measure of future-proofing by featuring hardware-accelerated H.264 video decoding, really speedy pixel shader 3.0 branch execution, and an efficient, reprogrammable Ring Bus memory controller.

When working on the RADEON X1000 product line in general and the top-of-the-range X1800 XT in particular, ATI paid special attention to the practical value of each of the above-mentioned features. According to the company’s representatives, care was taken that all the features offered by the new architecture can actually be taken advantage of, so that nothing will be wasted. Today we saw that Shader Model 3.0 support was really implemented to good effect: just look at how RADEON X1600 XT manages to leave behind the far more expensive and better-equipped GeForce 7800 GTX in dynamic branching and pixel shader 3.0 processing. In addition, according to our preliminary observations, ATI’s new RADEON X1000 family copes perfectly well with multi-pass rendering and full-screen antialiasing.

Taking into account the very efficient architecture – namely the Ultra-Threading dispatch processor, the reprogrammable memory controller, high clock frequencies and a great number of execution units (ALUs) – we expect RADEON X1800, RADEON X1600 and RADEON X1300 to show worthy, competitive results in our gaming test session.

However, the picture is not completely cloudless. The fastest product in the new family, RADEON X1800 XT, consumes a lot of power and sometimes still fails to outperform its competitor in complex pixel shaders 2.0, because GeForce 7800 GTX carries more pixel processors onboard.

RADEON X1800 XT as well as RADEON X1800 XL are built on relatively long PCBs and use massive cooling systems, which can hardly be regarded as an advantage, although we cannot call their competitors compact either. And since system cases have grown bigger over the last couple of years, it is only the dual-slot design of the top model’s cooler that may be considered too bulky and inconvenient. RADEON X1600 XT and RADEON X1300 PRO are designed on more compact and practical PCBs, although their reference cooling systems could use some improvement in terms of noise.

We will not draw any conclusions about the prospects of these new graphics solutions in today’s market yet: that will only be possible after they go through our detailed gaming test session. So, what I am driving at is: stay tuned for more articles on the new ATI RADEON X1000 product line!