Introduction or "Seek and Destroy GeForceFX"
Graphics cards based on the ATI RADEON 9700 PRO chip remain the best and most advanced of all 3D gaming cards available today. But its immediate rival from NVIDIA, GeForceFX, has already been announced, and we expect graphics cards based on it to appear soon. If we set aside the prices and price-to-performance ratios of the new cards and look at performance only, various sources claim that NVIDIA GeForceFX (NV30) may turn out to be 30% faster overall than the present-day leader, ATI RADEON 9700 PRO (R300).
ATI is already preparing a counter-attack in the form of R350, an overclocked and enhanced R300. However, we will see it in the spring of 2003 at the earliest. By that time the situation may have become unfavorable for ATI, with the new GPU from NVIDIA taking the top place in performance.
Anyway, if you own an ATI RADEON 9700 PRO graphics card, you shouldn't worry. These graphics cards have enormous potential in functionality and performance, which becomes especially evident if you increase their speed by overclocking.
The goal of today's experiment is to overclock the ATI RADEON 9700 PRO to its maximum. Ideally, we would like to get as much as a 30% performance gain. If we get it, there will be a graphics card (only one in the world, but here and now) able to "kill" NVIDIA GeForceFX.
Since we will be able to adjust the graphics chip and memory frequencies over a very wide range, an additional goal of this article is to estimate how well balanced our ATI RADEON 9700 PRO is, by checking the performance gain from chip and memory overclocking under various conditions.
The honor of being sacrificed in the name of speed went to the FIC 1stGraphics AT010 graphics card based on the ATI RADEON 9700 PRO chip. By the way, we have already taken a close look at it in our article called CPU Influence on Graphics Card Performance: ATI RADEON 9700 Pro vs. NVIDIA GeForce4 Ti4600.
Cooling System Modification
When overclocking a graphics card, the first thing to take care of is effective cooling.
We decided to use passive cooling for the onboard graphics memory. Each of the eight memory chips had to be covered with a separate heatsink, because the rather tall capacitors between the memory chips prevent one heatsink from covering two adjacent chips. To make heatsinks small enough for our memory chips, we took the ones from the Thermaltake Memory Cooling Kit and sawed each of them into three pieces.
The heatsinks were attached to the memory chips with the adhesive thermal pads from the same Thermaltake Memory Cooling Kit.
That's all we did to improve the cooling of the memory chips.
As for the graphics chip, the standard cooler that our ATI RADEON 9700 PRO based graphics card came with is not suitable for overclocking. We had to look for something more efficient.
The ATI RADEON 9700 PRO graphics chip is built so that heat is removed directly from the die surface. A special metal frame is mounted on the chip package around the die to prevent the cooler from being installed askew. This frame may sometimes cause problems of its own. On some graphics cards the upper edge of the frame sits a fraction of a millimeter above the die surface, which prevents tight contact between the heatsink and the die.
Fortunately, this frame is not soldered to the package, as I erroneously wrote in the recent ATI RADEON 9700 Pro Graphics Card Review, but glued to it. You can easily remove it by gingerly lifting it with something thin, like a knife blade. No sooner said than done:
Now we are ready to install a really good graphics chip cooling system.
After weighing all the pros and cons, we decided to use the Poseidon WCL-02 water cooling system from 3R System. It suited us in both dimensions and power, but did not include a water block for graphics cards and chipsets. So we used a water block provided by our colleague Vyacheslav Zaikin, a Russian water cooling enthusiast. This block, made of solid pure copper, fitted nicely into the Poseidon WCL-02 system. It was mounted with a pressure frame and two long bolts passed through the holes in the card. KPT-8 thermal paste served as the thermal interface.
When everything was ready, the graphics card looked like this:
Power for the ATI RADEON 9700 PRO graphics chip on this card is supplied by a switching regulator built around the SC1175CSW controller from Semtech:
This chip has two independent control channels, but here they operate in parallel. The typical application circuit for this mode is shown below:
The output voltage of the regulator is set by resistors R1 and R12 according to the formula Vout = 1.25 x (1 + R12/R1) (the resistors are numbered according to the typical application circuit). The blue color in the circuit above indicates how the graphics chip voltage can be increased with a shunt resistor R1' connected in parallel with R1.
The nominal graphics chip voltage on this graphics card was 1.5V. After we soldered a 2.7kOhm shunt resistor across pins 18 and 20 of the chip, as shown in the picture…:
…the graphics core voltage rose to 1.75V, which is 16.7% above the nominal value. That is not very much, considering that we cool the graphics chip with a water cooling system and a copper water block.
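The voltage mod can be checked against the regulator formula. The sketch below is only an illustration: the article does not give the values of R1 and R12, so they are inferred here from the two measured voltages, assuming the 2.7kOhm shunt sits directly in parallel with R1 and the formula holds exactly. The function names are ours.

```python
VREF = 1.25  # reference voltage from the article's formula, in volts


def parallel(r_a: float, r_b: float) -> float:
    """Equivalent resistance of two resistors connected in parallel (ohms)."""
    return r_a * r_b / (r_a + r_b)


def vout(r1: float, r12: float) -> float:
    """Output voltage per the article: Vout = 1.25 x (1 + R12/R1)."""
    return VREF * (1 + r12 / r1)


def infer_divider(v_nominal: float, v_modded: float, r_shunt: float):
    """Back out R1 and R12 from the voltages before and after the mod.

    Assumption: the shunt is directly in parallel with R1 and the
    formula is exact; real parts have tolerances.
    """
    ratio_nominal = v_nominal / VREF - 1  # R12 / R1
    ratio_modded = v_modded / VREF - 1    # R12 / (R1 || R_shunt)
    # R1 / (R1 || R_shunt) simplifies to (R1 + R_shunt) / R_shunt
    k = ratio_modded / ratio_nominal
    r1 = r_shunt * (k - 1)
    r12 = r1 * ratio_nominal
    return r1, r12


# Inferred, not measured: these values only follow from the assumptions above.
r1, r12 = infer_divider(v_nominal=1.5, v_modded=1.75, r_shunt=2700.0)
print(f"inferred R1  = {r1:.0f} ohm")
print(f"inferred R12 = {r12:.0f} ohm")
print(f"stock  Vout  = {vout(r1, r12):.2f} V")
print(f"modded Vout  = {vout(parallel(r1, 2700.0), r12):.2f} V")
```

Under these assumptions R1 comes out at about 2.7kOhm, so the equal-valued shunt halves its effective resistance, which doubles the R12/R1 ratio from 0.2 to 0.4 and takes the output from 1.5V to 1.75V.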
Next to the graphics chip voltage regulator there is a graphics memory voltage regulator, covered with an aluminum plate for better heat dissipation. To get to the regulator, you need to remove this plate carefully. Very carefully! The plate is glued to the chips with heat-conductive glue, and you may end up tearing chips off the board if you are too forceful.
Under the plate we found an IRU3037A chip from International Rectifier, which regulates the voltage for the I/O circuits of the graphics memory chips (VDDQ). As with the first regulator, the output voltage here is set by the ratio of the resistors in the feedback circuit. To increase the voltage, you simply need to change this ratio by soldering on an additional shunt resistor.
Before the modification, the graphics memory I/O voltage (VDDQ) was 2.78V. After we soldered a 10kOhm resistor to pins 1 and 4 of the chip, as shown in the picture…
…the output voltage rose to 3.19V.
The aluminum plate turned out to be a real nuisance. Since it had to be put back in place, we first soldered short pieces of wire to the chip pins and then soldered the resistor to the wires.
The nominal core voltage (VDD) of the graphics memory chips was 2.92V. We soldered an additional 2.7kOhm resistor to pins 5 and 7 of the chip, as shown in the picture:
And the voltage increased to 3.32V.
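For comparison with the 16.7% increase on the graphics core, the relative increase on the two memory rails can be worked out from the voltages quoted above. This is plain arithmetic on the article's figures; the helper name is ours.

```python
def percent_increase(before: float, after: float) -> float:
    """Relative increase of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Voltages before and after the mods, as measured in the article (volts).
memory_rails = {
    "VDDQ (memory I/O)": (2.78, 3.19),
    "VDD (memory core)": (2.92, 3.32),
}

for name, (before, after) in memory_rails.items():
    gain = percent_increase(before, after)
    print(f"{name}: {before:.2f} V -> {after:.2f} V (+{gain:.1f}%)")
```

Both memory voltages end up roughly 14-15% above nominal, slightly less than the increase applied to the graphics chip.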
Now, the graphics card is ready for overclocking.
The maximum frequencies at which the extremely overclocked card remained stable were 450MHz for the chip and 800MHz (400MHz DDR) for the memory. In other words, the graphics chip frequency rose by 38.5% over the nominal 325MHz, and the graphics memory frequency rose by 29% over the nominal 620MHz (310MHz DDR).
We considered the graphics card stable if it passed, without any artifacts, the full set of 3DMark2001 SE tests in 1024x768x32 mode, and then four benchmarks in Unreal Tournament 2003 in 800x600, 1024x768, 1280x1024 and 1600x1200 resolutions with default graphics quality settings.
Before we increased the voltages and installed the additional cooling, the graphics card worked well at 380MHz for the chip and 750MHz (375MHz DDR) for the memory. So the increased graphics chip voltage brought a significant gain in overclocking potential (from 380MHz to 450MHz), while the graphics memory responded far less to its voltage increase (the frequency rose only from 750MHz to 800MHz). Perhaps the PCB layout of the graphics card limits the maximum supported memory frequency. Or we may simply have been lucky to get a card whose memory already overclocked well and gained little from the extra voltage.
Well, all these discussions are certainly less interesting than the actual tests of the overclocked graphics card. Before proceeding to the results, take a look at our testbed and testing methodology.
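The gain at each stage can be tabulated from the frequencies quoted above. The snippet below is just a worked calculation on those figures; the stage labels are ours.

```python
# Effective clock rates in MHz at each stage, taken from the article.
STAGES = [
    ("nominal", {"core": 325, "memory": 620}),
    ("overclocked, stock voltage and cooling", {"core": 380, "memory": 750}),
    ("overclocked, voltage mods and extra cooling", {"core": 450, "memory": 800}),
]


def gain_percent(base_mhz: float, new_mhz: float) -> float:
    """Clock increase of `new_mhz` over `base_mhz`, in percent."""
    return (new_mhz / base_mhz - 1) * 100.0


nominal = STAGES[0][1]
previous = nominal
for label, clocks in STAGES[1:]:
    print(label)
    for part in ("core", "memory"):
        vs_nominal = gain_percent(nominal[part], clocks[part])
        vs_previous = gain_percent(previous[part], clocks[part])
        print(f"  {part:6s}: {clocks[part]} MHz, "
              f"+{vs_nominal:.1f}% vs nominal, +{vs_previous:.1f}% vs previous stage")
    previous = clocks
```

The second column makes the difference visible: the modifications added about 18% to the core clock on top of the plain overclock, but less than 7% to the memory clock.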
Testbed and Methods
Having assembled the testbed, we couldn't help making a couple of snapshots to show what the "workplace of an extreme overclocker" looks like.
It's an uneasy feeling to see the graphics chip water cooling system and additional fans blowing the air onto the graphics memory chips. But it's all for speed! :)
We used the following testbed:
- FIC 1stGraphics AT010 graphics card based on the ATI RADEON 9700 PRO chip;
- Intel Pentium 4 2800MHz CPU;
- ASUS P4S8X (SiS648) mainboard;
- 512MB PC2100 CL2.5 DDR SDRAM by Samsung;
- IBM DTLA 305030 HDD.
The software we used included:
- Windows XP Professional;
- Windows XP driver v.6200 (Catalyst 2.4) for graphics cards based on ATI chips;
- Unreal Tournament 2003 v.2107.
The extreme overclocking and benchmarking were carried out at a room temperature of 20-25°C.
As a benchmark, we used Unreal Tournament 2003 in resolutions from 1024x768 to 1600x1200 and in four modes: with default graphics quality settings, with forced anisotropic filtering, with forced full-screen anti-aliasing and with both these options enabled.
To estimate how effective graphics chip and memory overclocking turned out to be, we chose two intermediate frequencies each for the chip and for the memory, between the nominal and the maximum values, so that the steps between frequencies were about the same. Overall, this gave us 16 combinations of core and memory frequencies (four by four). The benchmark results in every mode are presented in two diagrams. The first one shows the results with a fixed chip clock rate and varying graphics memory frequency. It helps us evaluate the "profit" derived from memory overclocking. The second diagram shows the results with a fixed graphics memory frequency and varying chip frequency. It helps us estimate the benefits of graphics chip overclocking.
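The test grid can be sketched as follows. Note that the article does not list the intermediate frequencies, so the evenly spaced values generated here are an assumption for illustration only; just the endpoints (325/450MHz core, 620/800MHz memory) come from the text.

```python
from itertools import product


def evenly_spaced(low: int, high: int, points: int) -> list[int]:
    """Return `points` frequencies from `low` to `high`, rounded to whole MHz."""
    step = (high - low) / (points - 1)
    return [round(low + i * step) for i in range(points)]


# Endpoints are from the article; the two intermediate values per axis
# are hypothetical (the actual ones used in the tests are not given).
core_mhz = evenly_spaced(325, 450, 4)
memory_mhz = evenly_spaced(620, 800, 4)

# Every core frequency paired with every memory frequency.
test_grid = list(product(core_mhz, memory_mhz))

print("core  :", core_mhz)
print("memory:", memory_mhz)
print("combinations:", len(test_grid))
```

Fixing one element of each pair and walking along the other gives the two kinds of diagrams described above: one curve per core frequency across memory clocks, and one curve per memory frequency across core clocks.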
Unreal Tournament 2003, Default Graphics Quality Settings
The tests indicate that in 1024x768 overclocking hardly leads to any noticeable improvement. However, as the resolution grows, the performance increase becomes evident.
Judging by the numbers, graphics memory and chip overclocking provide about the same performance gain. This means that the architecture of ATI RADEON 9700 PRO is excellently balanced, at least in Unreal Tournament 2003 with the default settings.
For a more precise estimate, here are diagrams showing the percentage performance gain from overclocking relative to the results of the graphics card at its nominal frequencies.
These diagrams confirm what we said above: the performance gain is hardly over 5% in 1024x768, but in higher resolutions, where the CPU is less of a bottleneck, the gain reaches 20-25% in 1280x1024 and 30-35% in 1600x1200.
If we consider the lines for the 1600x1200 results, graphics chip overclocking appears to be more effective. But we shouldn't forget that the frequency of the extremely overclocked chip was almost 40% higher than nominal, while the graphics memory frequency increased by about 30%. So, relative to the frequency increase, graphics chip and memory overclocking are almost equally efficient. This means that the ATI RADEON 9700 PRO card is well balanced here.
Here are the numbers for this test:
Unreal Tournament 2003 with Forced FSAA
When we enable full-screen anti-aliasing at the highest level (6x), the workload on the graphics card rises sharply and overclocking pays off in all resolutions.
The diagrams suggest that the performance gain from chip overclocking and from graphics memory overclocking is about the same. That is, ATI RADEON 9700 PRO turns out to be a well-balanced card in Unreal Tournament 2003 with forced full-screen anti-aliasing, too.
The numbers for the benchmark results:
Unreal Tournament 2003 with Forced Anisotropic Filtering
When 16x Quality anisotropic filtering is on, ATI RADEON 9700 PRO uses many more texture samples to calculate the resulting color of a textured pixel: up to 128 instead of the 4 or 8 used with bilinear and trilinear filtering. This means the memory bus should be loaded much more heavily, since the texture data travels over it. So it seems quite logical that graphics memory overclocking would prove very effective with anisotropic filtering enabled. But the benchmark results suggest that it is chip overclocking that brings the highest performance gain. We may infer that ATI RADEON 9700 PRO has an optimized texture caching system, so that with anisotropic filtering enabled most texture data is taken from the cache and the memory workload remains about the same. That's why the main limiting factor here appears to be the texturing units of the graphics chip, which have to perform a lot of extra calculations to average the color of a large number of texture samples. So graphics chip overclocking turned out to be more rewarding than memory overclocking when anisotropic filtering is enabled.
All this points to some imbalance in ATI RADEON 9700 PRO when anisotropic filtering is enabled.
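The figure of 128 samples follows from simple multiplication. The sketch below assumes the worst case, in which every anisotropic probe is a full trilinear fetch; the chip's actual adaptive behavior (fewer probes on surfaces that need less filtering) is not described in the article and is not modeled here.

```python
# Texels read per filtered sample: a 2x2 footprint for bilinear,
# and two such footprints (adjacent mip levels) for trilinear.
TEXELS_PER_PROBE = {"bilinear": 4, "trilinear": 8}


def max_texels_per_pixel(base_filter: str, aniso_level: int = 1) -> int:
    """Worst-case texel reads for one textured pixel.

    Assumption: an Nx anisotropic mode takes up to N probes and each
    probe costs a full base-filter fetch.
    """
    return TEXELS_PER_PROBE[base_filter] * aniso_level


for base_filter in ("bilinear", "trilinear"):
    for level in (1, 16):
        texels = max_texels_per_pixel(base_filter, level)
        print(f"{base_filter:9s} {level:2d}x: up to {texels} texels per pixel")
```

With a trilinear base filter and 16 probes this gives the 128 samples mentioned above, a 16-fold increase in filtering work for the texturing units in the worst case.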
The numbers for this benchmark are given here:
Unreal Tournament 2003 with Forced Anisotropic Filtering and FSAA
When both anisotropic filtering and full-screen anti-aliasing are enabled, overclocking the graphics chip pays off more than overclocking the memory, as anisotropic filtering loads the graphics core more than it does the graphics memory.
This is the "hardest" mode, in which the workload on the graphics card is very high and the CPU doesn't limit the performance of the card at all. So the performance gain here should be the highest, too. The graphs showing the percentage performance gain prove it:
The graphs also indicate that the chip overclocking in ATI RADEON 9700 PRO proved more useful than memory overclocking in the hardest mode (with forced anisotropic filtering and full-screen anti-aliasing).
The performance gain we got from extreme overclocking is quite noticeable: 30-35%. According to the preliminary information, that is exactly the gain ATI RADEON 9700 PRO needs to "kill" NVIDIA GeForceFX :).
The numbers for the last test round are here:
So, in spite of all our apprehensions, the extreme overclocking of ATI RADEON 9700 PRO proved rewarding.
With proper cooling and increased chip and memory voltages, we managed to raise the graphics core and memory clock-rates by 38.5% and 29%, respectively. The maximum frequencies the graphics card would work at were 450MHz for the core and 800MHz (400MHz DDR) for the memory. That's definitely a record-breaking performance! :)
Judging by the results obtained in Unreal Tournament 2003, the performance gain from overclocking reached 30-35%. In other words, it turned out to be roughly proportional to the increase in chip and memory frequencies.
We also learned that the claim about the poor "balance" of ATI RADEON 9700 PRO holds true only in part. With its 256-bit DDR SDRAM memory bus, this graphics chip has "only" eight texture modules, and it is the texturing speed that may become the bottleneck in certain situations. But, as the benchmarks show, ATI RADEON 9700 PRO gets definitely "out of balance" only when anisotropic filtering is enabled. In all other cases, the workload is shared evenly between the chip and the memory bus.
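To put numbers on this balance, the peak memory bandwidth and peak texturing rate can be estimated from the bus width, the number of texture modules and the clock rates given in this article. The sketch assumes each texture module delivers one filtered texel per clock, which is our simplification rather than something stated above, and it ignores caches and compression entirely.

```python
BUS_WIDTH_BITS = 256  # memory bus width, from the article
TEXTURE_UNITS = 8     # texture modules, from the article


def memory_bandwidth_gbs(effective_mhz: float) -> float:
    """Peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return BUS_WIDTH_BITS / 8 * effective_mhz * 1e6 / 1e9


def texel_rate_mtexels(core_mhz: float) -> float:
    """Peak texel rate in Mtexels/s, assuming one texel per unit per clock."""
    return TEXTURE_UNITS * core_mhz


for label, core, memory in (("nominal", 325, 620), ("overclocked", 450, 800)):
    bandwidth = memory_bandwidth_gbs(memory)
    texels = texel_rate_mtexels(core)
    bytes_per_texel = bandwidth * 1000 / texels  # GB/s over Mtexels/s
    print(f"{label:11s}: {bandwidth:.2f} GB/s, {texels:.0f} Mtexels/s, "
          f"{bytes_per_texel:.2f} bytes of bandwidth per texel")
```

The ratio of bandwidth to texel rate changes little between the nominal and overclocked settings, because the core and memory clocks were raised by comparable amounts.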
So, we are ready to meet NV30. Are you? :)
- This research is only an experiment and shouldn't be regarded as a call to take up extreme overclocking and the resoldering of graphics cards.
- These modifications shorten your card's service life.
- Any mechanical modification voids the warranty.
- Should the graphics card or other components be damaged, the users bear complete responsibility for their actions.