NVIDIA GeForce2 GTS Extreme Overclocking Experience


Category: Video

by FastSite

[ 05/22/2000 | 12:00 AM ]

We managed to overclock the NVIDIA GeForce2 GTS reference graphics card to 260MHz core and 440MHz memory frequency by putting the system into a freezer. Impressive as these results are, they also exposed the weak spot of the powerful GeForce2 GTS.



We recently posted a review of today's most powerful graphics accelerator, built on the GeForce2 GTS GPU from NVIDIA. We won't repeat what has already been said about its advantages; we will only mention that the GeForce2 GTS owes its high performance to its architecture and a very high core clock frequency of 200MHz (compare with the 120MHz GeForce256 core). It seems that this GPU has everything necessary for impressive results. However, a powerful graphics processor requires very fast local graphics memory, which must be able to transfer enough data to keep the GeForce2 GTS busy at all times.

The GeForce2 GTS is a graphics processor that performs various operations on constantly updated data arrays stored in the local graphics memory. The bandwidth of that memory therefore has to be large enough that the processor doesn't waste its precious time waiting for data and can cope with its primary task, rendering lifelike 3D graphics, without delays.

Let's take a short tour of the past. In the era of the NVIDIA Riva TNT, local graphics memory bandwidth became the main bottleneck of the graphics subsystem. It was most tangible in 32bit color mode: the practical performance drop in 32bit color compared to 16bit color exceeded an incredible 50%! The same effect can still be observed today, though the gap is tending to narrow, even on today's graphics cards based on the GeForce256 GPU with local DDR SGRAM memory. At the same time, GeForce256 cards with DDR memory are the only more or less well-balanced ones: the power of their graphics processor matches the bandwidth of the local graphics memory bus.

Today's market offers us graphics accelerators based on the new GeForce2 GTS GPU from NVIDIA. However, their performance in 32bit color hasn't improved at all. We again see the same rather sad picture that was so typical of GeForce256 based cards with SDR SDRAM local graphics memory. In fact, the local graphics memory bandwidth is not enough for the most powerful GPU, the GeForce2 GTS, to work to the best of its ability, and the benchmarks prove it.

To give you a better idea of the graphics memory bandwidth problem, let's look at some figures. The list below shows the ratio of graphics core clock frequency to memory clock frequency for today's most popular graphics cards based on NVIDIA chips:

  • Riva TNT2 - 125/150MHz (the ratio is 0.83, which means that every 5MHz of graphics core frequency corresponds to 6MHz of memory frequency)
  • Riva TNT2 Ultra - 150/183MHz (0.82, almost the same ratio as in the previous case)
  • GeForce256 SDR - 120/166MHz (0.72; at first sight the ratio seems a bit better, since every 5MHz of core frequency corresponds to about 7MHz of memory frequency. However, the graphics processor now has four pipelines instead of two, so more calculations can be carried out in parallel, and on the whole the balance is nearly twice as bad as that of the Riva TNT2)
  • GeForce256 DDR - 120/300MHz (0.40, the best ratio of all: 5MHz of core frequency gets 12.5MHz of memory frequency. Even if we take the four-pipeline architecture of the chip into account, the ratio remains the best of all)
  • GeForce2 GTS - 200/333MHz (0.60; as you can see, the ratio got significantly worse compared to the previous chip, especially bearing in mind not only the four-pipeline architecture but also the 8 texture processors of the GPU. Taking all the architectural peculiarities into account, the balance is even worse than that of the Riva TNT2)
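
The ratios in this list are simple divisions and can be reproduced with a short Python sketch. The clock figures and pipeline counts are the ones quoted above. The "weighted" value is only an illustrative assumption (the ratio multiplied by the number of pixel pipelines, ignoring texture processors) meant to show why the four-pipeline chips need more memory clock per core MHz; it is not an established metric.

```python
# Core clock, memory clock (MHz) and pixel pipeline count, as quoted in the list above.
# Memory clocks of DDR cards are effective (double data rate) figures.
CARDS = [
    ("Riva TNT2", 125, 150, 2),
    ("Riva TNT2 Ultra", 150, 183, 2),
    ("GeForce256 SDR", 120, 166, 4),
    ("GeForce256 DDR", 120, 300, 4),
    ("GeForce2 GTS", 200, 333, 4),
]


def clock_ratio(core_mhz, memory_mhz):
    """Core-to-memory clock ratio; a lower value means more memory clock per core MHz."""
    return core_mhz / memory_mhz


def weighted_ratio(core_mhz, memory_mhz, pipelines):
    """Illustrative assumption: scale the ratio by the pipeline count to approximate memory demand."""
    return clock_ratio(core_mhz, memory_mhz) * pipelines


for name, core, memory, pipes in CARDS:
    print(f"{name:16s} {core}/{memory}MHz  ratio {clock_ratio(core, memory):.2f}  "
          f"weighted {weighted_ratio(core, memory, pipes):.2f}")
```

Under these assumptions the weighted value comes out lowest for the GeForce256 DDR and higher for the GeForce2 GTS than for the Riva TNT2, which is in line with the comments in the list.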

So it's clear that the performance of the newest graphics processor from NVIDIA, the GeForce2 GTS, is limited by insufficient local memory bus bandwidth. In other words, the memory bandwidth (333MHz x 128bit, or roughly 5.3GB/sec) is too low to keep the GeForce2 GTS loaded with work, especially at high resolutions and in 32bit color mode.
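
As a rough check of the bandwidth figure, peak memory bandwidth is just the effective memory clock multiplied by the bus width in bytes. The sketch below assumes decimal units (1GB = 10^9 bytes) and a 128bit bus on all three cards, so the result shifts slightly under a different unit convention. The GeForce256 entries use the clocks from the list above and are included only for comparison.

```python
def peak_bandwidth_gb_per_sec(effective_mhz, bus_width_bits=128):
    """Peak memory bandwidth in decimal GB/sec from effective clock and bus width."""
    bytes_per_transfer = bus_width_bits / 8
    return effective_mhz * 1e6 * bytes_per_transfer / 1e9


for name, mhz in [("GeForce256 SDR", 166), ("GeForce256 DDR", 300), ("GeForce2 GTS", 333)]:
    print(f"{name:15s} {mhz}MHz x 128bit = {peak_bandwidth_gb_per_sec(mhz):.1f}GB/sec")
```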

Judging by this data, we can roughly estimate the local graphics memory frequency that would give a better-balanced solution, similar to GeForce256 based cards with DDR memory. Of course, since the GeForce2 GTS carries out a lot of calculations in parallel, our figures may be of purely theoretical value, but they are quite suitable for backing up our evaluation. Matching the 0.40 ratio of the GeForce256 DDR at a 200MHz core gives 500MHz (and that is the minimum, because we didn't consider the 4 texture processors of the NVIDIA GeForce256 against the 8 texture processors of the NVIDIA GeForce2 GTS, which demand even more memory bus bandwidth). This means that for a well-balanced system NVIDIA would have had to use DDR SGRAM clocked at 250MHz or more. However, such memory isn't available on the market yet: DDR memory chips rated for 266MHz exist only as test samples. It is therefore very unlikely that this memory will be mass-produced even by autumn 2000, when everybody expects NVIDIA to announce its NV20. So it becomes necessary either to change the graphics processor architecture or to find new graphics card designs. As we see it, there are four possible ways out:

  • Widening the memory bus from 128bit to 256bit (which will require either changing the card's design or using new memory chips)
  • Making the memory bus multistream, so that several operations can access the memory in parallel using the existing memory modules. In effect, the graphics processor would have several units, each working independently with the local graphics memory via its own separate bus. This idea is pretty complicated in terms of graphics processor and memory bus design.
  • Shifting to multichip solutions, where every graphics processor has its own memory bus and its own local graphics memory, which also provides the desired multistream effect.
  • Radically changing the rendering architecture: switching to the so-called tile rendering architecture.

The first option may appear to be one of the most probable solutions. However, the memory may turn out too expensive and the engineering costs too high.

Something like the second solution can be observed in the Matrox G400 with its Dual Bus. However, it was a very timid attempt to put a new technology into practice, and could hardly be called a success. Nevertheless, we can't deny that this architecture has had a certain effect: the performance drop of the Matrox G400 in 32bit color is much smaller than that of the NVIDIA Riva TNT2 Ultra.

The third solution has already been introduced by 3dfx in its cards based on the VSA-100 chip, which should allow the graphics cards of the Voodoo5 family to show pretty high performance in 32bit color at higher resolutions, i.e. in the worst conditions. Unfortunately, this technology is being broken in on graphics cards equipped with SDR SDRAM memory. Besides, the local graphics memory may easily prove insufficient for proper operation at higher resolutions, especially when larger textures are used.

The fourth solution may turn out to be more realistic than it seems today. A tile rendering architecture significantly reduces the requirements for local graphics memory bus bandwidth. In this case there is no need to look for new (and maybe very expensive) types of memory: the DDR SGRAM already available on the market will do. Moreover, the tile architecture lends itself to massively parallel calculations, which provides really good potential for performance growth. As for practical examples, here you are: the PowerVR 250 from VideoLogic (NEON-250). Gigapixel, which has recently been acquired by 3dfx, is also busy working in the same direction. That is why we have every reason to expect new 3dfx products implementing and developing this technology in the near future.
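
The 500MHz estimate made before the list of four options, and the effect of the first option, can be put into numbers with a small sketch. It assumes that "balance" simply means matching the GeForce256 DDR clock ratio (120/300MHz) and it ignores texture processors and any savings from tile rendering, so treat it as a back-of-the-envelope figure rather than a design rule.

```python
def balanced_memory_mhz(core_mhz, ref_core_mhz=120, ref_memory_mhz=300):
    """Effective memory clock giving core_mhz the same core/memory ratio as the reference card."""
    return core_mhz * ref_memory_mhz / ref_core_mhz


def peak_bandwidth_gb_per_sec(effective_mhz, bus_width_bits):
    """Peak bandwidth in decimal GB/sec: effective clock times bus width in bytes."""
    return effective_mhz * 1e6 * (bus_width_bits / 8) / 1e9


target_mhz = balanced_memory_mhz(200)  # GeForce2 GTS core at its nominal 200MHz
ddr_clock_mhz = target_mhz / 2         # DDR memory transfers data twice per clock

print(f"Balanced effective memory clock: {target_mhz:.0f}MHz (DDR at {ddr_clock_mhz:.0f}MHz)")
print(f"128bit bus at {target_mhz:.0f}MHz: {peak_bandwidth_gb_per_sec(target_mhz, 128):.1f}GB/sec")
print(f"256bit bus at 333MHz: {peak_bandwidth_gb_per_sec(333, 256):.1f}GB/sec")
```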

So we have to admit that although the new GeForce2 GTS GPU from NVIDIA offers a whole bunch of features, slow graphics memory restricts its potential and prevents it from performing to the best of its ability. Graphics cards based on this processor are very ill-balanced products compared to the previous ones on NVIDIA chips. Nevertheless, NVIDIA's marketing department didn't waste its time. With the connivance of 3dfx, which hasn't yet started mass sales of its new accelerators, new competing products on the GeForce2 GTS are appearing on the market little by little and selling at really high prices. In fact, we can't make out why a card whose manufacturing hardly costs a buck more than that of a GeForce256 based one with DDR memory is sold for at least $50 more. We are used to competition in the computer industry, and when it disappears for some reason, changes in pricing policy are inevitable. And of course, it is the end-users who suffer most in this case. We consider it a total disgrace when a new product that is only around 10% faster than the previous one starts selling at the prices of over two years ago (back in 1998 they could ask $350 for a new and very rare graphics accelerator, such as the 3dfx Voodoo2).

However, let's return to real life. Graphics cards based on the NVIDIA GeForce2 GTS are already being manufactured and sold. If a GeForce2 GTS based card is not balanced, how can we improve matters ourselves? We can do a bit, of course: overclock the card, and in particular set the local graphics memory frequency above its nominal value.

In our review of the GeForce2 GTS reference card we touched upon raising the working frequencies of the graphics card's components. Back then we overclocked the card with only a little extra cooling provided by an additional fan. Today we will dwell on the results achieved with extreme overclocking.

First comes the testbed configuration:

  • Intel Pentium III 733MHz CPU;
  • Chaintech 6ATA4 (VIA Apollo Pro 133A) mainboard;
  • 256MB PC133 system memory;
  • IBM DPTA 20GB HDD;
  • Windows 98 SE.

And here is the main tool used for our extreme experiments: an old refrigerator :-) :

The mainboard with all the components on it was put into the freezer, which maintained a temperature of -18°C.

During the tests we monitored the temperature of the mainboard and the CPU. The sensors showed that the mainboard stayed at -8°C and the CPU at only +1°C.

We ran the tests in Quake3 v1.16n, demo002. The graphs below are built for 1024x768x32 and 1280x1024x32 (High Quality mode).

We overclocked the graphics card in two ways:

  • With the memory frequency fixed at 333MHz, we raised the core frequency as far as we could and measured the performance;
  • With the core frequency fixed at 200MHz, we raised the memory frequency as far as we could and measured the performance.

The results are shown in the following graphs:


So we can claim a record of sorts: at a graphics processor clock frequency of 250MHz the peak fillrate reached 1Gpixel per second! What's more, the GeForce2 GTS remained stable even at 260MHz. Although the memory bus bandwidth prevented the graphics processor from working at full power, the performance grew by nearly 15% (at a 260MHz core frequency, i.e. with a 30% overclock).
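
The 1Gpixel per second figure follows from the core clock and the four pixel pipelines mentioned earlier. A minimal sketch, assuming each pipeline outputs one pixel per clock (a theoretical peak, not a measured value):

```python
PIXEL_PIPELINES = 4  # GeForce2 GTS pipeline count as given in the article


def peak_fill_rate_mpixels(core_mhz, pipelines=PIXEL_PIPELINES):
    """Theoretical peak fill rate in Mpixel/sec, assuming one pixel per pipeline per clock."""
    return core_mhz * pipelines


def overclock_percent(nominal_mhz, overclocked_mhz):
    """Frequency increase over nominal, in percent."""
    return (overclocked_mhz - nominal_mhz) * 100 / nominal_mhz


for core in (200, 250, 260):
    print(f"{core}MHz: {peak_fill_rate_mpixels(core)} Mpixel/sec, "
          f"+{overclock_percent(200, core):.0f}% over nominal")
```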

However, the most impressive results were achieved by overclocking the local graphics memory. The maximum we managed to obtain was about 220MHz (440MHz effective frequency). At 450MHz effective frequency we saw artifacts in the form of visible image distortions. The curve showing the performance gain from memory overclocking clearly rises much more steeply than in the previous case with graphics core overclocking. We observed an almost 30% performance gain at 440MHz effective memory frequency (note that the graphics core frequency was set to its nominal value of 200MHz). This result clearly confirms the theoretical suppositions discussed above: a graphics card based on the GeForce2 GTS is more sensitive to an increase in graphics memory frequency than to an increase in graphics core frequency.
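
One way to compare the two experiments is to divide the performance gain by the frequency increase that produced it. The sketch below uses the rounded gains quoted in the text (about 15% for the core, about 30% for the memory), so the results are approximate. The function names are ours and serve only as an illustration.

```python
def overclock_percent(nominal_mhz, overclocked_mhz):
    """Frequency increase over nominal, in percent."""
    return (overclocked_mhz - nominal_mhz) * 100 / nominal_mhz


def scaling_efficiency(performance_gain_percent, frequency_gain_percent):
    """Performance gain per percent of frequency increase (1.0 would be perfect scaling)."""
    return performance_gain_percent / frequency_gain_percent


core_oc = overclock_percent(200, 260)    # core: 200MHz -> 260MHz
memory_oc = overclock_percent(333, 440)  # memory: 333MHz -> 440MHz effective

print(f"Core:   +{core_oc:.0f}% clock, efficiency {scaling_efficiency(15, core_oc):.2f}")
print(f"Memory: +{memory_oc:.0f}% clock, efficiency {scaling_efficiency(30, memory_oc):.2f}")
```

With these inputs the memory overclock scales nearly twice as well as the core overclock, which matches the conclusion drawn from the graphs.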

Conclusion

Despite the extreme overclocking conditions and the use of really powerful additional cooling, we have every right to state that raising the graphics memory frequency has a great effect on the performance of a GeForce2 GTS based graphics card. This holds even with only an ordinary cooling fan (400MHz effective memory frequency proved very easy to achieve). Overclocking the graphics core is also possible, but it doesn't make much sense, because even at its nominal 200MHz the GeForce2 GTS can't show all it's capable of because of the narrow local graphics memory bus.

Nevertheless, we would like to point out that at the maximum frequencies of 260/440MHz the NVIDIA GeForce2 GTS card performed excellently in Quake3: it showed almost 110fps at 1024x768x32bpp and 63fps at 1280x1024x32bpp.

