by Anton Shilov , Alexey Stepin
01/24/2006 | 08:28 PM
The famous Radeon X1800 product series from ATI Technologies was delayed significantly from its original launch date, and when it hit the market, its architectural advantages over the competing GeForce 7800 series were not obvious to consumers. While ATI’s Radeon X1800 XT was significantly faster than the GeForce 7800 GTX in a variety of benchmarks, it was left in the dust by the GeForce 7800 GTX 512, a product that was never widely available but that created “the right” effect for the whole 7800 lineup during the holiday season.
On the 24th of January, 2006, ATI tries to recapture leadership in both performance and availability with the Radeon X1900 series: more than 5000 units will be available for purchase on the first day, 60% of them XTX flavours, with more than 50,000 graphics cards already made. Perhaps the numbers are not really impressive, but we should keep in mind that we are talking about offerings that will cost $549 (Radeon X1900 XT 512MB), $599 (Radeon X1900 XT CrossFire Edition 512MB) and $649 (Radeon X1900 XTX 512MB), demand for which is unlikely to number in the hundreds of thousands, especially right after the New Year parties.
The new Radeon X1900 series not only offers instant availability, but also improved performance amid a moderate increase in die size and transistor count. The newcomer features 48 pixel shader processors, three times more than the Radeon X1800 XT and two times more than the GeForce 7800 GTX. But the question is whether modern games truly need extremely high pixel shader performance. Read on to find out whether, and by what margin, the Radeon X1900 XTX can beat Nvidia’s latest GeForce 7800 GTX 512 in the broadest set of benchmarks available on the Internet.
Unlike microprocessors, whose performance has been stagnating for several years now, graphics processing units (GPUs) consistently improve their speed substantially at least once a year. This happens not only because developers of graphics chips can implement new architectures much more easily than designers of central processing units (CPUs), but also because graphics performance is very easy to scale.
Speed in 3D applications essentially depends on a few general hardware factors: the number of execution units and their clock speeds. Since graphics processing is highly parallel, more units mean better performance, but since there are plenty of complex operations to do, there will always be demand for higher performance from every single arithmetic logic unit (ALU); hence, efficiency and high clock speed are always advantages. Given that graphics processors have again and again improved clock speeds and the number of execution engines over the most recent decade, we can only expect GPUs to continue doing so going forward.
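The two scaling levers above can be sketched in a few lines of Python. This is a back-of-the-envelope illustration only; the clock figures are round numbers, not vendor specifications:

```python
def peak_ops_per_second(num_units: int, clock_hz: float,
                        ops_per_unit_per_clock: int = 1) -> float:
    """Theoretical peak throughput: units x clock x ops issued per unit per clock."""
    return num_units * clock_hz * ops_per_unit_per_clock

# Doubling the unit count doubles the peak; raising the clock speeds up every unit.
base   = peak_ops_per_second(16, 650e6)   # 16 units at 650MHz
wider  = peak_ops_per_second(32, 650e6)   # twice the units
faster = peak_ops_per_second(16, 1300e6)  # twice the clock
```

Both levers give the same theoretical gain; in practice adding units is the cheaper route for a GPU designer, which is exactly what the parallel nature of graphics work permits.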
But constant performance improvements cannot be made without understanding the demands of modern games. Years back, games created effects using multi-texturing, and graphics chips needed as many texture mapping units as possible to demonstrate leading performance. Today, games generate eye-candy using pixel shaders that consist of arithmetic and texture instructions. Recent trends show that modern pixel and vertex shaders tend to be a lot more math-intensive than those of a couple of years ago; thus, to offer a high-performance GPU, engineers need to shift their efforts towards increasing the number of engines that perform mathematical tasks.
The difficulty for chip designers is that it is pretty hard to determine the demands of future games while keeping performance in current titles high. As a consequence, we are seeing two ideologically different types of graphics processors from ATI Technologies and Nvidia Corp. The former unveils its Radeon X1900-series GPU that features 48 pixel shader processors, 8 vertex shader processors, 16 render back-ends (ROPs) and 16 texture units; the latter is expected to offer its GeForce 7900 series, which sports 32 pixel shader processors, 32 texture address units and 16 ROPs, this spring. As we see, ATI hopes that modern games will demand more arithmetic power than texture operations, whereas Nvidia seems sure that there is equal demand for texture and math power in new titles.
Which approach is better for today and tomorrow? We need to test both to find out.
Games tend to become more and more glorious in their visuals. There is a tremendous difference in quality between GLQuake, released a decade ago, and Half-Life 2: Lost Coast, released months back. It is no secret that pixel shaders significantly simplify the development of modern games, and it is obvious that next-generation titles will only have more of them. The question for hardware developers is what the next-generation pixel shaders will look like. It is completely clear that pixel shaders will tend to be more mathematically intensive, as we mentioned above, but what remains unknown is the ratio between arithmetic and texture operations required by future titles.
In fact, we still do not know the demands of even current games: developers are tight-lipped about the technologies they use to create effects. We have seen very few software suites that can actually determine the bottlenecks of current games – one is ATI’s plugin for Microsoft’s PIX, another is Nvidia’s PerformanceHud. A problem with such suites is that game developers protect their titles from being analyzed by third parties with such software. That said, we have to believe what hardware developers report about the requirements of modern games, based on claims allegedly made by game developers.
ATI claims that even current games like F.E.A.R. require a GPU to perform up to 7 arithmetic operations for every texture operation, whereas the ratio for Splinter Cell: Chaos Theory is 8 arithmetic operations to 1 texture operation. Keeping this in mind, ATI’s latest graphics processors – the Radeon X1600 and Radeon X1900 series – feature three times more pixel shader processors than texture address units. The company believes that a 3:1 ratio of arithmetic to texture units provides the ideal balance for current and future 3D performance.
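The logic behind ATI’s 3:1 choice can be made concrete with a simple Python sketch. This is our own illustrative model, not ATI’s methodology: assuming each unit retires one instruction per cycle, the resource that needs the most cycles for a given shader’s instruction mix is the bottleneck:

```python
def limiting_resource(alu_instructions: int, tex_instructions: int,
                      alu_units: int, tex_units: int) -> str:
    """Which resource limits a shader, assuming one instruction per unit per cycle."""
    alu_cycles = alu_instructions / alu_units
    tex_cycles = tex_instructions / tex_units
    if alu_cycles > tex_cycles:
        return "ALU-bound"
    if tex_cycles > alu_cycles:
        return "TEX-bound"
    return "balanced"

# A 7:1 shader (the F.E.A.R.-like mix ATI quotes) on a 3:1 chip
# (48 ALUs, 16 texture units) is still ALU-bound:
print(limiting_resource(7, 1, 48, 16))  # ALU-bound
# A 3:1 shader is perfectly balanced on the same chip:
print(limiting_resource(3, 1, 48, 16))  # balanced
```

In other words, even with 48 pixel processors, shaders as arithmetic-heavy as the ones ATI cites would keep the ALUs busier than the texture units, which is why the company sees 3:1 as a floor rather than an excess.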
While ATI’s approach generally seems correct, since the company has to consider transistor count and die size for its chips, texture fetch performance and bandwidth are still crucial for graphics processors. For instance, when creating a realistic material, a pixel shader has to do a number of lookups into large textures, which means that performance limitations will come from an insufficient number of texture address units and/or insufficient memory bandwidth. While game developers could reduce their titles’ demand for high memory bandwidth and texture operations by using more pixel shaders to create certain effects, they seem to have been reluctant to do so for a while, as there are a lot of graphics processors for which a high mathematical load may mean significant performance degradation; hence, ATI’s new chips won’t necessarily become the winners across the board just now.
However, when games acquire mathematically intensive effects, such as parallax occlusion mapping, and the main performance limiter becomes pixel shader performance, the new approach to GPU design is likely to show its advantages.
One of ATI’s main achievements with the Radeon X1000 architecture was easy scalability: adding more pixel shader processors is now much easier for ATI and does not require designing a new GPU from scratch, which greatly simplifies adding performance to the part.
Having developed an architecture that swaps the traditional pixel pipeline for a set of blocks controlled by a special ultra-threading dispatch processor, ATI gave us a glimpse into the future of graphics processors as well as into its ability to boost the performance of the Radeon X1000 chips.
The Radeon X1900 graphics chip, also known under the code-name R580, is a reincarnation of the Radeon X1800 visual processing unit (VPU) with 48 pixel shader processors, a hierarchical Z-buffer (HyperZ) enlarged by 50%, larger general-purpose register arrays, and the Fetch4 feature, which accelerates lookups of single-component textures by a factor of four: when sampling textures with single-component values (such as shadow maps), Fetch4 allows four values from adjacent addresses to be fetched simultaneously, effectively quadrupling the texture sampling rate.
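To make the Fetch4 idea tangible, here is a toy Python model of it. This is our own illustration of the concept, not ATI’s hardware interface: a conventional fetch returns one single-component texel per request, while a Fetch4-style request gathers the 2x2 quad of neighbouring values in one go:

```python
def fetch1(texture, x, y):
    """Conventional lookup: one single-component texel per request."""
    return texture[y][x]

def fetch4(texture, x, y):
    """Fetch4-style lookup: one request gathers the 2x2 quad of adjacent texels."""
    return (texture[y][x],     texture[y][x + 1],
            texture[y + 1][x], texture[y + 1][x + 1])

# A tiny single-component "shadow map" as a nested list of depth values.
shadow_map = [[0.0, 0.25],
              [0.5, 1.0]]

quad = fetch4(shadow_map, 0, 0)  # four depth values in a single lookup
```

Percentage-closer filtering of shadow maps needs exactly such a quad of neighbouring depth values per pixel, which is why delivering all four per request quadruples the effective sampling rate for that workload.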
Radeon X1800 (left) and Radeon X1900 (right) internal architecture in brief. Click to enlarge
Pure pixel shader power of the R580 increased threefold – or by 200% – over the R520, whereas the transistor count increased by about 60 million, or by approximately 20%. It remains to be seen how the actual gaming performance gain compares with that 20% increase in transistors, but we have to keep in mind that the Radeon X1900 was architected for future games, so the immediate benefit may be smaller than the long-term one.
Let’s have a look at the specifications of the Radeon X1900:
The new RADEON X1900 XT/XTX doesn’t look much different than the RADEON X1800 XT at first:
Radeon X1800 XT 512MB. Click to enlarge
Radeon X1900 XTX. Click to enlarge
The PCBs of the RADEON X1900 and X1800 are not identical, but a careless observer may be misled into supposing that the new graphics cards from ATI use the older PCB design.
Take a look at the rear part of the card where the power circuit is placed: this section is fully populated, from top to bottom, on the RADEON X1900 XT. The RADEON X1800 XT carries 5 switching transistors and 5 inductors (though versions of the card with more of these elements exist), and one set of seats is left empty. There are 7 such seats on the RADEON X1900 XT, and all of them are occupied because the R580 consumes more power than the R520.
Radeon X1800 XT 512MB with cooler removed. Click to enlarge
Radeon X1900 XTX with cooler removed. Click to enlarge
The reverse side of the PCB shows appropriate changes, too. It carries more tiny capacitors and there is a seat for an additional power element near the memory chips, which is not soldered in, though. With the cooler removed, you can also see a power controller chip in about the same spot, but on the face side of the card, along with a MOSFET and some smaller elements. We guess this circuit is responsible for powering the memory chips. A multi-channel Volterra VT1103 controller is still the heart of the power circuit of the card.
And this is where the difference between RADEON X1900 and RADEON X1800 ends: the left parts of the PCBs are identical. The wiring of the PCB of the RADEON X1800 ensures stable operation of the GPU and memory at high clock rates, so ATI had no need to design a completely new PCB from scratch. We’d like to remind you that most top-end graphics processors are pin-compatible, so graphics card manufacturers may try to use the same PCBs for the R520 and R580. But considering the higher power requirements of the new chip and the constant evolution of PCBs, we suppose that the RADEON X1900 PCB may be used for some RADEON X1800-based products, but not vice versa.
Radeon X1800 XT CrossFire Edition. Click to enlarge
Radeon X1900 XTX CrossFire Edition. Click to enlarge
The same is also true for the RADEON X1900 CrossFire Edition which too differs from the RADEON X1800 XT CrossFire Edition in the power circuit only. We didn’t find any differences in the design of the Compositing Engine which we described thoroughly in our new review of ATI CrossFire technology.
The snapshot shows that the die surface of the R580 is somewhat larger than that of the R520; the former is a square whereas the latter is a rectangle. The R580, in its turn, is smaller than the less complex NVIDIA G70 chip thanks to the 0.09-micron process.
Nvidia G70, ATI R580, ATI R520 visual processing units. Click to enlarge
Such a small area increase is explained by the rather small increase in the number of transistors, from 321 to 384 million. The R580 has 48 pixel shader processors on board as opposed to the R520’s 16, which means the pixel processors don’t require too many transistors. It is easy to calculate that the 48 pixel processors comprise less than 90 million transistors – about a quarter of the total. The rest of the transistors make up the caches, texture units, ring-bus memory controller, ultra-threading dispatch processor, etc.
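The arithmetic behind that estimate is easy to reproduce. A quick Python check, using only the figures quoted above (and remembering that the added transistors also paid for the larger HyperZ buffer and registers, so the per-processor cost is an upper bound):

```python
r520 = 321e6   # transistors in the R520, 16 pixel shader processors
r580 = 384e6   # transistors in the R580, 48 pixel shader processors

added = r580 - r520                       # ~63 million transistors bought the
                                          # 32 extra processors plus the bigger
                                          # HyperZ buffer and register arrays
per_processor_upper = added / 32          # under 2 million each, as an upper bound
all_48_upper = per_processor_upper * 48   # ~94.5 million at most
```

Even the upper bound lands at roughly a quarter of the 384-million total, consistent with the figure in the text once the HyperZ and register growth is subtracted.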
As for the marking, the text “ENG SAMPLE” speaks for itself – this is an engineering sample of the R580 chip. It is dated the 45th week of last year, i.e. early November. ATI Technologies said they had received the first batch of commercial wafers from TSMC at the end of November and that they had working samples of the R580 even before the official announcement of its predecessor, the RADEON X1800 (R520).
The GPU die is protected against damage with a traditional metal frame. Since we are dealing with a RADEON X1900 XT, the graphics processor is clocked at the same frequency as on the RADEON X1800 XT, i.e. at 625MHz. The graphics core frequency of the RADEON X1900 XTX is 650MHz.
Like the RADEON X1800 XT, the RADEON X1900 XT CrossFire Edition uses Samsung K4J52324QC-BJ12 GDDR3 memory. These chips are 512Mbit each, so eight such chips suffice for a total of 512MB of graphics memory with 256-bit access. The access time of the chips is 1.25 nanoseconds; they are rated to work at 2.0V voltage and at 800 (1600) MHz frequency. The memory of the RADEON X1900 XT works at a lower frequency than on the RADEON X1800 XT: 725 (1450) MHz against 750 (1500) MHz.
The memory frequency of the higher-performing RADEON X1900 XTX is 775 (1550) MHz and Samsung’s 1.1ns K4J52324QC-BJ11 chips are employed. These chips can theoretically be clocked at 1800MHz, but ATI decided to keep to more conservative settings. It is quite possible that ATI’s partners will come up with “updated” versions of RADEON X1900 XTX with higher chip/memory frequency at some moment in the future. Such graphics cards may appear along with the G71 chip (GeForce 7900), an updated version of the G70 processor, or earlier as “extreme” versions of the card.
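For reference, peak memory bandwidth follows directly from the bus width and the effective data rate. A quick Python check of the figures above (taking 1GB as 10^9 bytes):

```python
def bandwidth_gb_s(bus_bits: int, effective_mhz: float) -> float:
    """Peak bandwidth = bus width in bytes x effective transfers per second."""
    return bus_bits / 8 * effective_mhz * 1e6 / 1e9

x1900_xt  = bandwidth_gb_s(256, 1450)  # Radeon X1900 XT:       46.4 GB/s
x1900_xtx = bandwidth_gb_s(256, 1550)  # Radeon X1900 XTX:      49.6 GB/s
gtx512    = bandwidth_gb_s(256, 1700)  # GeForce 7800 GTX 512:  54.4 GB/s
```

The GeForce 7800 GTX 512 thus enjoys roughly a 10% bandwidth advantage over the Radeon X1900 XTX, which matters in the texture-fetch-limited cases discussed later in this review.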
The cooling system of the RADEON X1900 XT/XTX is the same as the RADEON X1800 XT’s: a blower pumps air from inside the PC case through the thin-ribbed copper heatsink with a massive sole and then exhausts it outside. The heatsink contacts the GPU die through a layer of dark-gray thermal paste; the memory chips give their heat away to the cooler’s aluminum sole via elastic rubber-like thermal pads. This cooling system is quite efficient and quiet at the lowest speed of the fan. When the fan speed increases, the card becomes noticeably louder, the plastic casing acting as a resonator.
Power consumption is by far not the least important of a modern graphics card’s parameters, so each new device we receive in our labs has to pass an appropriate test. The appetite of the Radeon X1900 XT was measured on a special testbed with the following configuration:
We measured the power consumption of the card with a Velleman DVM850BL digital multimeter (0.5% measurement accuracy). To put a peak 3D load on the card we ran the first SM 3.0 graphics test from 3DMark06 in a loop at 1600x1200 with 4x FSAA and 16x AF enabled. Then we created an extremely high 2D load by launching the 2D Transparent Windows test from Futuremark PCMark05. Here are the results:
The Radeon X1800 XT had been considered the most voracious graphics card until now, consuming over 110W when running heavy 3D applications. The new Radeon X1900 XT eats up more, but only by a negligible 8 watts, despite its 48 pixel shader processors. The senior model in the series, the Radeon X1900 XTX, is going to consume a little more, perhaps 2-4 watts extra, considering the relatively small difference in the clock rates of the two cards. The results also suggest that the graphics processor is not the single heavy consumer on a modern graphics card. We are sure a big portion of those 120 watts falls on the 512 megabytes of high-speed GDDR3 memory, although we don’t have the means to verify this supposition.
Since the Radeon X1900 XT differs but very little from the Radeon X1800 XT in terms of power consumption, the power supply recommendations remain the same. If you want to use a Radeon X1900 XT or XTX, you will need a high-quality 450-500W PSU. And if you are into multiple GPU configurations and want to run two Radeon X1900 cards in CrossFire mode, you will need a 500-550W or better power supply, the same as you would need for a CrossFire system with two Radeon X1800 XT.
ATI Radeon X1900 XT/XTX graphics cards are equipped with the same cooling system as the ATI Radeon X1800 XT, so their acoustic characteristics are identical. The fan works at full speed only at system startup and then steps down to its minimum speed, becoming almost completely silent. The fan speed may occasionally increase in 3D mode, but the noise remains in a comfortable range even then, although there is a rather irritating “plastic” tone due to resonance in the cooler’s sealed casing. The fan seems ready to speed up even more if the GPU die temperature rises further, but it didn’t do so even once during our tests. The card was quiet on our open testbed, so it is going to be perfectly silent in a closed PC system case.
We couldn’t check the overclockability of our sample of the card, although ATI Technologies had given us the latest version of the WinClk utility. The program just did not recognize the installed RADEON X1900, while other overclocking tools do not support the new series of graphics cards from ATI as yet.
Both the ordinary version of Radeon X1900 XT/XTX and the Radeon X1900 XT CrossFire Edition yielded an impeccably-looking 2D picture in all resolutions supported by our test monitor, including its highest 1800x1440@75Hz mode.
We tested the performance of the ATI Radeon X1900 series on this platform:
We set up the ATI and Nvidia drivers in the same way as always:
We select the highest graphics quality settings in each game, identical for graphics cards from ATI and Nvidia, except for the flight sim Pacific Fighters, which requires vertex texturing for its SM 3.0 rendering mode. We do not edit the configuration files of the games. Where possible, we use the in-game benchmarking tools to record and reproduce a demo and then measure the playback speed in frames per second. Otherwise we measure the frame rate with the FRAPS utility. Where possible, we measure minimum as well as average fps rates to give you a fuller picture.
We turn on 4x full-screen antialiasing and 16x anisotropic filtering in the “eye candy” test mode from the game’s own menu if possible. Otherwise we force the necessary mode from the graphics card driver. We don’t use the “eye candy” mode at all if the game engine doesn’t support FSAA.
The following graphics cards took part in this test session, besides the Radeon X1900 series:
These games and applications were used as benchmarks:
3D Shooters with First-Person View
3D Shooters with Third-Person View
As we see, the Radeon X1900 XTX is only slightly better than the Radeon X1800 XT in simple pixel shaders despite its massively improved theoretical pixel shader power, which means that the actual performance limiter with simple pixel shaders 1.1, 1.4 and 2.0 is memory bandwidth and caching efficiency, not a lack of computing power. The GeForce 7800 GTX 512 with its lightning-fast 1700MHz memory is substantially ahead of both Radeon products in such cases.
Once pixel shaders get complex, the Radeon X1900 XTX leaves the Radeon X1800 XT behind by a factor of 2 and also manages to outstrip the GeForce 7800 GTX 512, though not that significantly.
Obviously, the Radeon X1900 XT demonstrates excellent results where outstanding mathematical power is required, showing magnificent performance in scenes that include dynamic branching and a lot of math.
Since ATI has implemented special branch execution units in its Radeon X1000 series, the new Radeon X1900 XTX leaves no chance to the GeForce 7800 GTX 512, leaving its rival in the dust with a 3-4x performance lead.
We can see that with shaders that depend on the arithmetic performance of the GPU, the Radeon X1900 XTX may be two times faster than its predecessor. Nevertheless, the shaders originally written for Radeon X800- and GeForce 6800-class hardware are so effortless for the Radeon X1900 XTX and the GeForce 7800 GTX 512 that the main performance limiters turn out to be texture fetches and cache efficiency/memory bandwidth.
The outcome is that even in synthetic benchmarks that reproduce typical workload, the GeForce 7800 GTX 512 showcases pretty high results, which implies that the product is, in fact, very well-balanced.
There is hardly any point in talking about performance in pixel shaders 1.1, where memory bandwidth and cache efficiency play the most significant roles today. This is the last time we publish high-end GPU results in the pixel shader 1.1 test from 3DMark 2001 SE, which was released slightly less than 5 years ago.
In the case of pixel shaders 1.4 the GeForce 7800 GTX 512 has a considerable lead over the Radeon X1900 XTX, but the framerate is so high that nobody is likely to see any difference.
The pixel shader 2.0 test from the 3DMark03 package contains neither high-resolution textures nor tough pixel shaders. We cannot explain why the Radeon X1900 XTX is outperformed by the GeForce 7800 GTX 512 by about 100%, nor can ATI.
The pixel shader 2.0 test from the 3DMark05 and 3DMark06 benchmarks shows a light moving over a rough surface. Generally speaking, the test is not mathematically heavy, but it is dependent on texture fetches and, thus, on memory bandwidth.
The Radeon X1900 XTX is virtually two times faster than the Radeon X1800 XT, but the GeForce 7800 GTX 512 manages to come very close to the leader.
Surprisingly, the same shader from the 3DMark06 package performs differently than in 3DMark05.
The Perlin Noise pixel shader 3.0 test of the 3DMark06 computes six octaves of 3-dimensional Perlin simplex noise using a combination of arithmetic instructions and texture lookups. The pixel shader used in this test consists of a total of 48 texture lookups and 447 arithmetic instructions, resulting in a total of 495 instructions and around 9:1 ratio of arithmetic instructions to texture instructions. All texture lookups are made into a single 32 bit 256x256 texture (64KB), which keeps the memory bandwidth requirement reasonably low despite the large number of lookups.
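The instruction mix quoted for the Perlin Noise test is easy to verify. A quick Python check of the figures from the test description:

```python
# Instruction counts quoted in 3DMark06's documentation for the Perlin Noise test.
tex_lookups = 48
arithmetic = 447

total = tex_lookups + arithmetic   # 495 instructions in the shader
ratio = arithmetic / tex_lookups   # ~9.3, i.e. "around 9:1"
```

That 9:1 mix is even more arithmetic-heavy than the 7:1 and 8:1 ratios ATI quotes for F.E.A.R. and Splinter Cell: Chaos Theory, which makes this test a best-case scenario for the R580’s 3:1 ALU-to-texture-unit design.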
Needless to say, with that number of arithmetic instructions, the graphics card with the highest math power will claim the performance lead. Judging by the results, memory bandwidth still influences the benchmark, but it is obvious that the 48 pixel shader processors allow the Radeon X1900 XTX to leave the GeForce 7800 GTX 512 far behind, while the latter leaves the Radeon X1800 XT in the dust thanks to its greater number of pixel processors and higher memory bandwidth.
As with pixel shaders 1.1, the vertex shader 1.1 test shows that performance is limited by something other than the vertex shader performance of modern VPUs.
Approximately 660 thousand triangles are skinned in each frame using vertex shaders 1.1. As we see, there is virtually no scaling across resolutions, which means that the test is actually limited by vertex shader performance or, at least, by something else related to vertex shaders. Generally, the Radeon X1900 XT is ahead here, even though the gap could be larger considering the 100MHz clock-speed difference between the newcomer and the GeForce 7800 GTX 512.
Short and simple vertex shaders 2.0 are performed pretty well by all high-end graphics cards today, but the Radeon X1900 XTX seems to be notably ahead of the competition due to its high clock speeds.
The same vertex shader 2.0 from the 3DMark06 benchmark, however, performs notably better on the Nvidia GeForce 7800 GTX 512 and slightly better on the Radeon X1800 XT. We do not know the reason for that, as Futuremark claims in its documents (1, 2) that these shaders are identical. Moreover, it is unclear which result is correct for the Nvidia GPU.
There is no difference between 3DMark05 and 3DMark06 for the case of the complex vertex shader 2.0:
The gap between the Radeon X1900 XTX and the GeForce 7800 GTX 512 has shrunk significantly compared to the previous case in 3DMark05, even though the former still maintains its lead. The reason may be the central processor, which calculates fractal noise for this test scene, even though Futuremark believes that the CPU impact here is negligible.
To measure the video decoding capabilities of the Radeon X1900 XTX with various formats, we carried out a few tests using the popular Windows Media Player 10 (with the patch that enables DirectX video acceleration for WMV HD content available here) and our standard selection of video clips:
The CPU load during video playback:
As we see, there is no substantial difference between the Radeon X1900 XTX and its predecessor, the Radeon X1800 XT, when it comes to video playback.
We intentionally did not check H.264 decoding this time, as doing so requires a decoder from Cyberlink that can take advantage of the H.264 decoding capabilities of the Radeon X1000 family. The decoder is currently available for an additional fee only, just like Nvidia’s PureVideo decoder.
Using per-pixel lighting, normal maps, bloom and other post-processing effects, Battlefield 2 should show off well the R580’s 32 additional pixel shader processors.
And this is really so! The previously unrivalled GeForce 7800 GTX 512 has got a worthy opponent in this test and it is the Radeon X1900 XTX.
These graphics cards match one another in “pure speed”, and the Radeon X1900 XT even goes ahead in high resolutions at the 4x FSAA + 16x AF settings. The gap is biggest in 1600x1200, where the 100fps barrier is successfully surmounted. The GeForce 7800 GTX 512 and the Radeon X1800 XT still ensure an average performance of about 85fps, though, which is more than enough for comfortable play.
The exceptional performance of the Radeon X1900 XTX in this test is the consequence of its having more pixel shader processors as well as a 50% larger HyperZ buffer.
Besides using stencil shadows, this shooter’s engine works with OpenGL, which is another way of saying the Radeon X1900 XTX just can’t win here. And it is indeed slower than the GeForce 7800 GTX 512. There are quite a lot of pixel shaders in these Chronicles, so the new graphics card from ATI enjoys a considerable speed boost over the previous ultra-high-end model of the Radeon series. The improved HyperZ contributes to the newer card’s better results, too. The Radeon X1900 XTX offers you higher resolutions for comfortable play: 1600x1200 and higher in the pure speed mode and up to 1280x1024 at the “eye candy” settings.
The Call of Duty 2 engine developed by Infinity Ward supports dynamic lighting and shadows, advanced smoke effects, normal maps and other superb visual effects, making the 48 pixel shader processors even more appropriate than in Battlefield 2 .
The Radeon X1900 XTX is like a higher-league performer here. Just compare its speed of 125fps with the GeForce 7800 GTX 512’s 100fps even in the lowest resolution of 1024x768. All the tested resolutions are playable on the new top-end Radeon, including 1600x1200.
It retains its leadership in the “eye candy” test mode and stops very short of 60fps in 1600x1200.
Doom 3 generally runs better on GeForce 7 series graphics cards as it uses dynamic stencil shadows and OpenGL. The Radeon X1900 XTX only has some advantage over the Radeon X1800 XT in high resolutions, yet this is not enough to overtake the GeForce 7800 GTX 512.
It’s better for ATI in the “eye candy” mode: the Radeon X1900 XTX is a mere 5-7% behind the GeForce 7800 GTX 512; this may be due to the difference in their memory bandwidth.
Both these graphics cards are powerful enough for you to play in 1600x1200 resolution, although the Radeon X1900 XTX is formally a little slower than 60fps.
Although not a recent title, Far Cry is still a standard of image quality in a 3D shooter, and many game developers would do well to learn from it. Moreover, Far Cry is one of the few existing games that support floating-point color representation.
We used to refer to the pixel shaders employed on the Pier level as very complex, but they are not a problem at all for a modern top-end graphics card. It is now the performance of the system’s central processor that is the bottleneck in this game; it’s only in 1600x1200 with FSAA and anisotropic filtering enabled that we can confirm the Radeon X1900 XTX is really faster than the Radeon X1800 XT. And of course all top-end graphics cards give you a playable frame rate even in the hardest visual mode of this game.
The Radeon X1800 XT was always slower than the GeForce 7800 GTX and the GeForce 7800 GTX 512 on the Research map in the “pure speed” mode, even though it was originally targeted at executing long pixel shaders. But Nvidia’s superiority in this test has come to an end because the GeForce 7800 GTX 512 now has to share the top position with the new Radeon X1900 XTX, even though it took the latter a threefold increase in the number of the pixel shader processors to achieve this parity.
It’s no different in the “eye candy” mode, except that in 1600x1200 resolution the Radeon X1900 XTX beats Nvidia’s top performer with the help of a more advanced memory controller. Note also that even the slowest graphics card we included into this review, Radeon X1800 XL, delivers an average speed of about 64fps here.
As usual, we checked the new graphics card in Far Cry’s HDR mode. The final version of the game patch that enables HDR for graphics cards of the Radeon X1000 family hasn’t been released yet; the version at our disposal is not very fast, does not allow using FSAA along with HDR, and is unavailable to the public. So the results below should only be considered a preliminary performance estimate.
The Radeon X1800 XT is very slow, while the Radeon X1900 XTX looks quite competitive even against the GeForce 7800 GTX 512. The new solution from ATI Technologies makes up for the imperfections of the patch with its 48 pixel shader processors and other improvements. The resulting performance is close to the desired 60fps in 1600x1200; 1280x1024 resolution is easily playable.
F.E.A.R. is probably the most shader-heavy game of today, and the Radeon X1900 XTX might be expected to beat the GeForce 7800 GTX 512 easily in such a test, but it is not quite so in practice.
The GeForce 7800 GTX 512 wins the lower resolutions and it is only in 1600x1200 that the new graphics card from ATI overtakes it.
The “eye candy” mode is quite a different story – the improved HyperZ technology brings in some dividends. The new Radeon is the best in every resolution and even permits comfortable play in 1280x1024, which has hitherto been unplayable on any graphics card. The Radeon X1800 XT didn’t quite hit the 60fps mark, which is the frame rate you need to fully enjoy a first-person 3D shooter.
As a consequence of the rapid development of graphics hardware, Half-Life 2 has moved down to the same category as Far Cry. These legendary shooters may still be considered standards of quality, but they are not very hard for a modern graphics subsystem anymore.
All today’s top-end graphics cards, except for the Radeon X1800 XL, yield over 75fps in Half-Life 2 . As for the Radeon X1900 XTX, it shows but a glimpse of its potential, being severely limited by the CPU. In 1600x1200, however, it is noticeably faster than the Radeon X1800 XT.
The small tech demo Half-Life 2: Lost Coast is not to be compared with Half-Life 2, as it is much more demanding than the original game, primarily because it uses advanced SM 3.0/HDR-based effects and dynamic shadows. This should be a favorable environment for the Radeon X1900 XTX, which not only has 48 pixel shader processors but also features technologies for accelerating shadow mapping.
This looks like a well-deserved win. The new graphics card is not far ahead of the GeForce 7800 GTX 512, but you’d better compare it with the Radeon X1800 XT: the difference between the senior Radeon X1800 and X1900 models is as big as 45% in 1600x1200!
The Radeon X1900 XTX doesn’t have any advantage over the Radeon X1800 XT due to the simplicity of the game engine; they are both far slower than the GeForce 7800 GTX 512 in all modes and resolutions. On the other hand, the game runs at a comfortable speed on these Radeons even in 1600x1200 with full-screen antialiasing and anisotropic filtering enabled.
It’s only in 1600x1200 that we can make any comparisons because the game’s speed depends too greatly on CPU performance. The Radeon X1900 XTX does not win, but it does not lose a single frame per second to the GeForce 7800 GTX 512, either.
Nothing changes as we switch to the “eye candy” settings, except that the GeForce 7800 GTX 512 manages to win in 1600x1200 resolution thanks to its 850 (1700) MHz memory frequency as well as new drivers. Note that the Radeon X1900 XTX and the Radeon X1800 XT have almost the same speed here. It means the game really values high graphics memory frequency and the improvements in HyperZ and in the sampling of single-component textures have no effect here.
There are some shader-based effects in Serious Sam 2, too. Besides normal maps, it supports parallax mapping and HDR.
The game still exhibits a strong liking for the GeForce 7 architecture and the Radeon X1900 XTX is only a few fps better than the previous model despite the threefold difference in the number of their pixel processors. The Radeon X1900 and X1800 cards can’t even maintain a playable frame rate in 1024x768 with FSAA disabled. Since the Radeon X1800/X1900 architecture has proved its worth in the rest of the games, we are inclined to put the blame on the Serious Sam 2 game engine.
Unreal Tournament 2004 is an even easier application than Far Cry or Half-Life 2, being not very generous with shader-based special effects. The problem of a lower performance ceiling for ATI’s graphics cards can be observed here: all the devices, except for the Radeon X1800 XL, are limited by the CPU. The cards from Nvidia are likewise limited, but their speed ceiling is higher for some unclear reason.
The new Radeon is only visibly faster than the older one in high resolutions; the gap is never bigger than 8-15% which is not enough to overtake the GeForce 7800 GTX 512 but is enough for a min frame rate of 130fps. This is about three times the speed you need to play a third-person-view shooter with comfort, so you shouldn’t be too sad about the lower performance of the Radeon X1900 XTX, especially as it is only 10% slower than the leader.
Splinter Cell: Chaos Theory is among those games that can appreciate the GPU’s having more pixel shader processors. The game makes wide use of Shader Model 3.0 and HDR to achieve a considerably higher image quality and we know that the Radeon X1800 architecture was designed exactly for such operating environments.
In 1280x1024 the Radeon X1900 XTX delivers about the same average speed as the GeForce 7800 GTX 512, but has a higher minimum speed. The newer card also wins at the higher resolution of 1600x1200 and becomes the current performance leader in Splinter Cell: Chaos Theory.
It’s like the “pure speed” mode, but the Radeon X1900 XTX is never slower than the GeForce 7800 GTX 512, not even in the lowest resolution. Moreover, the Radeon provides more comfort in higher display modes by keeping a higher minimum frame rate.
The two senior Radeon models and the GeForce 7800 GTX 512 reach the CPU-imposed speed ceiling in the two lower resolutions of the “pure speed” mode, but in 1600x1200 you can see that the Radeon X1900 XTX still can’t rival the top model from Nvidia. Not that it is a big problem since the cards both keep the frame rate at 73fps and higher, which is more than enough for any racer.
The Radeon X1900 XTX does much better at the “eye candy” settings. Its performance does not get lower than 48fps, which is a little below the min speed of the GeForce 7800 GTX 512, but the average frame rate of the ATI card is, on the contrary, higher. The advantage is the biggest in 1280x1024 resolution where graphics memory frequency is of less importance than in higher display modes.
Since neither the Radeon X1800 XT/XL nor the Radeon X1900 XT supports texture sampling from vertex shaders, they use the SM2.0 rendering path to draw the water surface, while the GeForce 7800 family uses SM3.0. So, ATI’s cards work under easier conditions, yet their performance ceiling is lower, probably because of the less efficient OpenGL driver. This problem, however, doesn’t prevent the Radeon X1900 XTX from delivering a playable frame rate everywhere except in 1600x1200 with FSAA and anisotropic filtering enabled (the minimum speed falls below 25fps then).
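To make the difference between the two rendering paths clearer, here is a rough CPU-side sketch in Python. All names are ours, purely for illustration; the real code would be a vertex shader running on the GPU. With SM3.0 vertex texture fetch, each vertex of the water grid samples a height map; an SM2.0 path cannot touch textures in the vertex stage and has to derive the wave shape from per-vertex arithmetic instead:

```python
import math

# CPU-side sketch of SM3.0 "vertex texture fetch" for water rendering.
# A vertex is (x, y, z, u, v); the texture is a 2D list of heights.

def sample_height(tex, u, v):
    """Nearest-neighbour lookup into a height map addressed by
    normalized (u, v) coordinates in [0, 1]."""
    h, w = len(tex), len(tex[0])
    x = min(int(u * w), w - 1)
    y = min(int(v * h), h - 1)
    return tex[y][x]

def displace_water_grid(grid, height_tex, amplitude=1.0):
    """SM3.0-style path: each vertex samples the height texture
    to obtain its vertical displacement."""
    return [(x, y, z + amplitude * sample_height(height_tex, u, v))
            for (x, y, z, u, v) in grid]

def displace_water_grid_sm20(grid, time, amplitude=1.0):
    """SM2.0 fallback: no texture access in the vertex stage, so the
    wave shape comes from per-vertex math (here, a simple sine)."""
    return [(x, y, z + amplitude * math.sin(x + time))
            for (x, y, z, u, v) in grid]
```

The SM2.0 fallback is computationally simpler per vertex, which is why the article speaks of ATI’s cards working under easier conditions in this test.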
Age of Empires 3 uses SM3.0 special effects, supports HDR and seems to be just the application for the Radeon X1000 to show its best in, but we’ve got nothing like that in practice, at least as concerns the Radeon X1800 XT which is far inferior to the GeForce 7 cards. Featuring more pixel shader processors, the Radeon X1900 XTX is on the same level with the GeForce 7800 GTX in the “pure speed” and better in the “eye candy” test mode. The GeForce 7800 GTX 512 remains unbeaten, though. There must be something wrong with the game engine, similar to what we’ve seen in Serious Sam 2 .
Dawn of War is the opposite of Age of Empires 3 as concerns the performance of the Radeon X1900 XTX relative to the Radeon X1800 XT. There is in fact no difference between these two graphics cards in this game, irrespective of whether we use full-screen antialiasing or not. The performance of the cards as such is sufficient for normal play in any tested mode, but the Radeon X1800/X1900 series cards don’t offer much of a safety margin, whereas the GeForce 7800 GTX cards never slow down below 25fps.
There are few version 2.0 shaders in Aquamark 2, which mostly uses versions 1.1 and 1.4 instead, so this benchmark is poorly suited to testing modern graphics cards. As you can see, the Radeon X1900 XTX and the Radeon X1800 XT do not differ much here. The GeForce 7800 GTX 512 still holds first place; the GeForce 7 architecture, very fast at executing simple shaders, proves to be more efficient in this test.
Final Fantasy XI Official Benchmark 3 has become outdated and is no longer informative when it comes to top-end graphics cards. They all deliver roughly the same performance here, even though the GeForce 7800 family is somewhat faster than the Radeon X1800/X1900. There are no complex pixel shaders here and no dynamic shadows, so the Radeon X1900 XTX can’t profit much from its additional pixel shader processors or other improvements.
The Radeon X1900 XTX doesn’t offer much more performance relative to the Radeon X1800 XT, and that’s natural since only the fourth game test from 3DMark03 uses really complex shaders. Both these Radeons are far inferior to the GeForce 7800 GTX 512 in this test. The GeForce 7 architecture may be viewed as somewhat outdated in a sense, but sometimes that is an advantage, not a drawback!
Let’s examine the 3DMark03 results in more detail.
Everything we’ve said above is wholly applicable to the first game test which doesn’t use any functions beyond the scope of DirectX 7. There’s no use for additional pixel processors here, so the Radeon X1900 XTX behaves exactly like the Radeon X1800 XT.
The second test is more up-to-date as it uses version 1.1 and 1.4 shaders and dynamic stencil shadows. Lacking a technology similar to Nvidia’s UltraShadow II, the Radeon X1900 XTX and Radeon X1800 XT have a lower pure speed than the GeForce 7800 GTX 512 which is capable of processing up to 32 Z-values per clock cycle.
Unlike in the first test, the 48 pixel shader processors are useful indeed and the Radeon X1900 XTX rises above the level of the Radeon X1800 XT to the level of the GeForce 7800 GTX 512.
3DMark03’s third game test produces largely the same picture as the second one since they only differ in the geometry of the scene. There are some minor differences, though. The gap between the senior graphics cards from ATI and the GeForce 7800 GTX 512 is bigger and the Radeon X1900 XTX only overtakes the Nvidia card in 1600x1200 when full-screen antialiasing and anisotropic filtering are enabled.
The Radeon X1900 XTX finds itself in first place in 3DMark03 only in the fourth test. But this test is rather too old to be a good indicator of the prospects of any particular graphics architecture.
The “eye candy” mode results are curious here. ATI’s graphics cards usually do better in this mode thanks to their more efficient graphics memory subsystem, but this time we’ve got quite the opposite: the ring-bus controller and memory clocked 775 (1550) MHz do worse than Nvidia’s traditional controller and 850 (1700) MHz memory.
It is the first time in our tests that a single graphics card has broken the 10,000-point barrier in 3DMark05. Even the GeForce 7800 GTX 512 couldn’t do that! On the other hand, we should have expected it, since this version of 3DMark makes extensive use of version 2.0 pixel shaders and the Radeon X1900 XTX should be at ease executing them – it was designed for exactly such shaders! Even the former flagship of the Radeon series is not far behind the GeForce 7800 GTX 512, although it has only 16 pixel processors against the GeForce’s 24.
More detailed results of the 3DMark05 gaming tests follow on the next page.
The CPU obviously interferes with the results of the first test, and the performance of CrossFire-linked Radeon X1800 XT cards, which yielded no more than 42fps, confirms this fact. The HyperZ unit has been improved in the Radeon X1900 XTX and it also uses the Fetch4 feature, so its performance declines less quickly in higher resolutions than that of the Radeon X1800 XT. As a result, the new Radeon wins this test.
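For reference, Fetch4 lets the GPU grab the four neighboring single-component texels that a 2x2 percentage-closer shadow filter needs in a single texture operation instead of four separate ones. A rough Python sketch of the idea follows; the function names are ours, for illustration only, and the real work happens inside the texture units and pixel shader:

```python
def fetch4(depth_map, x, y):
    """Fetch4-style gather: return the 2x2 block of single-component
    depth texels at (x, y). The hardware performs this as one texture
    operation rather than four."""
    return (depth_map[y][x],     depth_map[y][x + 1],
            depth_map[y + 1][x], depth_map[y + 1][x + 1])

def pcf_2x2(depth_map, x, y, pixel_depth):
    """2x2 percentage-closer filtering: compare the pixel's depth
    against four shadow-map texels and average the pass/fail results
    to get a soft shadow factor in [0, 1]."""
    texels = fetch4(depth_map, x, y)
    return sum(1.0 for d in texels if pixel_depth <= d) / 4.0
```

Since dynamic-shadow tests sample the shadow map heavily for every pixel, quartering the number of fetches is exactly the kind of improvement that shows up at high resolutions.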
It’s all roughly the same in the second test: the Radeon X1900 XTX provides about the maximum performance available on a platform with an Athlon 64 4000+ processor in 1600x1200 while the GeForce 7800 GTX 512 and the Radeon X1800 XT are 20-25% behind it. The new card is also about 20% faster than its main opponent in the “eye candy” mode.
With its complex and numerous pixel shaders, the third game test is a greater stress exactly for the graphics processor, so the Radeon X1900 XTX shows its best, leaving the older Radeon and the GeForce 7800 GTX 512 far behind.
The new Radeon X1900 XTX has no rivals among single ultra-high-end graphics cards at the “pure speed” as well as “eye candy” settings. In the latter case the GeForce 7800 GTX 512 gets closer to the leader due to faster memory.
The new Radeon X1900 XTX doesn’t look like an indisputable winner if we compare the overall 3DMark06 scores: it scores a mere 103 points more than the GeForce 7800 GTX 512. However, the results of Futuremark’s new benchmark should be considered not only as a whole, but also separately, by its two sections: Shader Model 2.0 and Shader Model 3.0/HDR.
The Radeon X1800 XT’s 16 pixel shader processors can’t cope with the load the SM2.0 graphics tests put on them. Only with 32 more such processors does the Radeon X1000 architecture become as fast as the GeForce 7800 GTX 512. It seems the shaders involved are not very complex for the GeForce 7 architecture, yet they are numerous enough for the number of computational units to matter a great deal.
The SM3.0/HDR graphics tests are more sophisticated and there is a gap between the Radeon X1900 XTX and the GeForce 7800 GTX 512, even though surprisingly small. It is probably due to the second SM3.0 graphics test which is rather simple compared with the first one.
Let’s examine the results of each test independently on the next page.
In the first SM2.0 test we have the same picture in both “pure speed” and “eye candy” modes: the GeForce 7800 GTX 512 and the Radeon X1900 XTX have the same performance in low resolutions and the latter is superior in higher ones, thanks to the improved HyperZ and sampling of single-component textures. The new graphics card offers a tremendous performance gain over the Radeon X1800 XT – more than 50% in 1600x1200 resolution! – largely due to the Fetch4 feature the new series supports.
The second SM2.0 test is not as hard on the graphics memory subsystem because it uses a smaller-scale scene, so the advantage of the Radeon X1900 XTX over the GeForce 7800 GTX 512 shows up earlier and more sharply. The new Radeon is again 50% faster than the Radeon X1800 XT, as this test is no less shader-heavy than the previous one.
It is in the first SM3.0 graphics test that the pixel shaders are at their most numerous and complex, and HDR is involved as well, so the GeForce 7800 GTX 512 inevitably falls far behind the new Radeon X1900 XTX with its 48 highly efficient pixel shader processors, even though the latter has to spend some computational resources on filtering the FP16 textures in the shader.
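The extra cost mentioned above arises because the Radeon X1000 texture units cannot filter FP16 textures in hardware, so the shader has to take four point samples and blend them arithmetically. Here is a rough Python sketch of such manual bilinear filtering; the function name is ours, for illustration, and on the GPU this would be shader arithmetic operating on four point-sampled texels:

```python
def bilinear_in_shader(tex, u, v):
    """Manual bilinear filtering: take four point samples around the
    normalized (u, v) coordinate and blend them with arithmetic, as a
    shader must do when the texture units can't filter the format."""
    h, w = len(tex), len(tex[0])
    x = u * (w - 1)
    y = v * (h - 1)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    # Blend horizontally on both rows, then vertically between them.
    top = tex[y0][x0] * (1 - fx) + tex[y0][x1] * fx
    bot = tex[y1][x0] * (1 - fx) + tex[y1][x1] * fx
    return top * (1 - fy) + bot * fy
```

Each filtered lookup thus costs four fetches plus several multiply-adds, which is why the 48 shader processors absorb the penalty so well.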
The second SM 3.0/HDR graphics test is simpler than the first one and is almost free from the pixel shaders that are such a burden for the GeForce 7800 GTX 512. This test is about dynamic shadows and HDR instead, and that’s why the Radeon X1900 XTX cannot break away from its main opponent farther than by 5-10%. And like in the previous cases, the new Radeon is about 50% faster than the Radeon X1800 XT.
ATI’s new Radeon X1900 XTX is definitely among the fastest graphics products available today, according to our test results. Furthermore, it is promised to be available from day one, and the VPU’s designer claims the new part will remain available from then on, not episodically, as some other recently launched products. If the Radeon X1900 XTX and the slightly slower Radeon X1900 XT are indeed widely available from the first day onward, ATI will break the chain of formal announcements in which the hardware existed in the hands of journalists, but not in the computers of end users.
Best for today’s games, best for media playback, best for future games – that’s what the Radeon X1900 XTX is, based on the numbers we have obtained. The newcomer wins in a massive number of benchmarks and loses in some others because certain game engines, mostly OpenGL ones, are better optimized for Nvidia GeForce 7 hardware; but wherever pixel shaders carry an intense arithmetic load, the Radeon X1900 XTX is ahead, and its lead over currently available hardware is only going to increase as time goes by. We would also like to stress that ATI’s hardware is several times faster than its rivals’ when pixel shaders contain branches. It remains to be seen, however, whether game developers will really make heavy use of branching in their next-generation titles.
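To illustrate why branching matters: with efficient dynamic branching, pixels that fail a cheap test can skip the expensive math entirely, instead of the GPU evaluating both sides of the branch and masking out one result. A trivial Python sketch of the idea (names are ours, for illustration only; on the GPU this would be an if/else in a pixel shader):

```python
def shade_pixel(pos, light_pos, light_radius):
    """Pixels outside the light's radius take an early out and skip
    the expensive lighting math; hardware with poor branching would
    pay for the expensive path on every pixel regardless."""
    dx = pos[0] - light_pos[0]
    dy = pos[1] - light_pos[1]
    dist_sq = dx * dx + dy * dy
    if dist_sq > light_radius * light_radius:
        return 0.0                      # cheap path: unlit pixel
    # Expensive path: attenuated lighting (stand-in for heavy shader math).
    return 1.0 - (dist_sq ** 0.5) / light_radius
```

The saving is greatest when neighboring pixels take the same branch, since GPUs execute pixels in groups; scattered branch outcomes erode the benefit.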
Another advantage ATI’s Radeon X1900 series has over the GeForce 7800 is somewhat better anisotropic filtering, as well as higher performance when anisotropic filtering is enabled. Gamers who buy a graphics card for $649 probably have every reason to expect it to be not only future-proof, but also to produce the best image quality possible – and at a decent frame rate.
To demonstrate a serious advantage over the GeForce 7800 GTX 512, ATI had to increase its die size by 20% and push clock speeds up slightly compared to the Radeon X1800 XT. Given that the R580 core is still smaller than the GeForce 7800 GTX 512’s, such an increase can hardly be considered an act of desperation on ATI’s part, and the cost of making Radeon X1900 chips will not be too high.
In short, ATI has recaptured the performance leadership for now, and graphics cards based on the Radeon X1900 XTX are the boards of choice – at the rather steep price point of $649.