NVIDIA GeForce 7800 GTX: New Architecture Exposed

NVIDIA has unveiled its brand-new graphics processor and proclaimed it the fastest in the industry. Today we take a look at what the company has to offer in terms of architecture, efficiency and power consumption.

by Alexey Stepin
06/22/2005 | 12:52 PM

Introduction

April 14, 2004, was a remarkable day in the 3D graphics realm. Having previously lost the lead to ATI Technologies, NVIDIA Corporation announced a new graphics processor codenamed NV40. This chip made NVIDIA a technological leader since it was the first consumer graphics solution with such revolutionary technologies as next-generation pixel and vertex shaders (Shader Model 3.0), floating-point color representation and others.

As a sign of departure from the past, NVIDIA abandoned the letters FX in the names of the graphics cards on the new GPU, and GeForce 6800 cards were really brilliant in all the benchmarks, wresting the crown from the RADEON 9800 XT. This was not an easy victory, though. The chip came out very complex, consisting of 222 million transistors, and an acceptable chip yield was only found at frequencies of 350-400MHz. Besides that, the higher heat dissipation made a clumsy and noisy dual-slot cooling system necessary. But even with all these drawbacks the release of the GeForce 6800 Ultra was a big step forward for NVIDIA as well as for the industry at large.

Soon after that, on May 4, ATI Technologies replied with the release of the R420 processor and R420-based graphics cards. Unlike NVIDIA’s, ATI’s approach was evolutionary rather than revolutionary: the RADEON X800 was in fact a greatly improved RADEON 9800 rather than something completely new. That approach was quite justifiable then: the R420 was a rather simple chip (160 million transistors against the NV40’s 222 million), and coupled with new dielectric materials this simplicity allowed ATI to raise the frequency of the new solution to 520MHz, achieving a very high level of performance.

The NV40 and R420 were in fact equals in their basic technical characteristics. Each chip had 16 pixel pipelines and 6 vertex processors, but the RADEON X800 XT was generally faster than the GeForce 6800 thanks to higher operational frequencies. NVIDIA’s card couldn’t use its support of Shader Model 3.0 to its advantage since there were no games capable of using this feature. Even the Far Cry patch that added SM 3.0 to this game didn’t change anything as the same patch also added Shader Model 2.0b which was implemented in competitor processors from ATI.

So, NVIDIA held the crown of the king of 3D graphics, but only for a very short while. Moreover, the difficulties with production of such a complex chip as the NV40 almost immediately resulted in a shortage of GeForce 6800 Ultra cards (well, ATI’s RADEON X800 XT and PRO were not abundant, either). Later on, ATI split the RADEON X800 family into two lines by releasing the high-performance R480 (RADEON X850) and the mass-user-oriented 0.11-micron R430. The maximum frequency of the R480 chip reached 540MHz, whereas the max clock rate of NVIDIA’s NV40- and NV45-based solutions was only 425MHz (on “special edition” graphics cards from certain manufacturers). The top models of NVIDIA’s graphics cards were still inferior in performance to their counterparts from the ATI camp.

The announcement of the multi-GPU SLI technology helped NVIDIA offer more performance than the ATI RADEON X850 XT Platinum Edition could deliver. Yet, a solution consisting of two GeForce 6800 Ultra/GT graphics cards turned out to be too expensive, awkward and power-hungry, and it also required a special mainboard based on the nForce4 SLI chipset. On the other hand, people who wanted the best performance money could buy didn’t care about these things at all, and NVIDIA’s multi-GPU technology became quite popular.

So, by the middle of 2005, ATI’s trump cards were:

NVIDIA had a few aces, too:

In other words, both GPU developers offered products that were the best in some way or another, but neither could offer a chip that was both the fastest and the richest in features. Today NVIDIA and ATI both need a new graphics processor that would return the crown of the absolute leader to one of them. NVIDIA was the first to announce a new-generation GPU, making ATI Technologies hurry up with an answer.

GeForce 7: New Graphics Architecture?

The new graphics architecture was quite expectedly named GeForce 7, and the graphics processor that embodies it is called G70. NVIDIA seems to imply that this is a completely new architecture rather than an improved GeForce 6. Is it really so? Let’s delve deeper into the matter.

First, the new graphics processor from NVIDIA is the most complex GPU for personal computers today, consisting of as many as 302 million transistors. For comparison: the NV40 consists of 222 million transistors, and the ATI R480 of only 160 million. This complexity is quite natural since the new GPU contains 24 pixel pipelines and 8 vertex processors against 16 and 6, respectively, in GPUs of the previous generation. Moreover, the G70 is not just an overgrown NV40: NVIDIA has considerably revised the architecture of the pixel and vertex processors to improve their performance. More about that shortly, but now let’s have a look at the technical specification of the new GPU in comparison with previous-generation models:

(specifications listed in the order: NVIDIA GeForce 7800 GTX | NVIDIA GeForce 6800 Ultra | ATI RADEON X850 XT Platinum Edition)

Manufacturing technology: 0.11 micron | 0.13 micron | 0.13 micron low-k
Number of transistors: 302 mln. | 222 mln. | 160 mln.
Clock frequency: 430MHz | 400MHz | 520MHz
Graphics memory controller: 256bit GDDR3 SDRAM | 256bit GDDR3 SDRAM | 256bit GDDR3 SDRAM
Graphics memory clock frequency: 1200MHz | 1100MHz | 1180MHz
Memory bus peak bandwidth: 38.4GB/s | 35.2GB/s | 37.8GB/s
Maximum graphics memory size: 512MB | 512MB | 512MB
Interface: PCI Express x16 | PCI Express x16 | PCI Express x16

Pixel processors, pixel shaders
Shader model: 3 | 3 | 2.x
Static loops and branching: yes | yes | yes
Dynamic loops and branching: yes | yes | no
Multiple Render Targets: yes | yes | yes
Floating-Point Render Target: yes | yes | yes
Maximum number of pixels per clock cycle: 24 | 16 | 16
Maximum number of Z values per clock cycle: 32 | 32 | 16
Number of texturing samples: 16 | 16 | 16
Texture filtering algorithms: bi-linear, tri-linear, anisotropic, tri-linear + anisotropic (all three cards)
Maximum level of anisotropy: 16x | 16x | 16x

Vertex processors, vertex shaders
Shader model: 3 | 3 | 2.x
Number of vertex processors: 8 | 6 | 6
Static loops and branching: yes | yes | yes
Dynamic loops and branching: yes | yes | no
Reading textures from the vertex shader: yes | yes | no
Tessellation: no | no | no

Full-Screen Anti-Aliasing
FSAA algorithms: ordered-grid super-sampling, rotated-grid multi-sampling, super-sampling + multi-sampling, transparency supersampling | ordered-grid super-sampling, rotated-grid multi-sampling, super-sampling + multi-sampling | rotated-grid multi-sampling, temporal anti-aliasing
Number of samples: 2..8 | 2..8 | 2, 4, 6

Technologies increasing the efficiency of the memory bus bandwidth
Hidden Surface Removal (HSR): yes | yes | yes
Texture, Z-buffer, frame buffer compression: yes | yes | yes
Fast Z-buffer clear: yes | yes | yes

Additional technologies
OpenEXR (HDR): yes | yes | no
Videoprocessor: yes | yes | no

So, the G70-based graphics card is called GeForce 7800 GTX. The new device looks quite imposing, being technologically head and shoulders above the GeForce 6800 Ultra, not to mention the RADEON X850 XT Platinum Edition. There’s only one parameter the ATI card is superior in: the clock rate of the G70 is lower, having increased by only 30MHz over the previous-generation NV40 chip. This is of course a consequence of the high complexity of the chip, but there seem to be no problems with production. NVIDIA says graphics cards on the new GPU will be available on the day of the announcement. That’s good, considering that last year’s announcement of the NV40 was in fact just a marketing event, the actual silicon being unavailable in shops.
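The peak-bandwidth row of the specification is plain arithmetic: the bus width in bytes multiplied by the effective memory clock. A minimal sanity check (simple math, not vendor data):

```python
def peak_bandwidth_gb_s(bus_width_bits, effective_clock_mhz):
    """Peak memory bandwidth: bytes per transfer times transfers per second."""
    return bus_width_bits / 8 * effective_clock_mhz * 1e6 / 1e9

print(peak_bandwidth_gb_s(256, 1200))  # GeForce 7800 GTX -> 38.4 GB/s
print(peak_bandwidth_gb_s(256, 1100))  # GeForce 6800 Ultra -> 35.2 GB/s
print(peak_bandwidth_gb_s(256, 1180))  # RADEON X850 XT PE -> 37.76 GB/s
```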

The table above does not reveal the distinguishing features of the G70 which make it a truly new-gen solution. We will dwell on these features individually. At first sight, the G70 is similar to the NV40, save for the number of pixel and vertex processors:

The number of Raster Operation (ROP) units has remained the same: there are 16 of them in the chip, and the number of texture samples per pass is still 16, or 32 when only Z values are sampled. In other words, the number 24 refers to the number of pixel processors only. So, the GeForce 7800 architecture generally resembles the GeForce 6800, but there are some considerable differences, too.

More than Just a Pixel Pipeline

As we said above, NVIDIA has seriously redesigned the architecture of the pixel pipelines to improve their performance. The developers modeled 1,300 different shaders to expose the bottlenecks of the previous architecture, and the resulting pixel pipeline of the G70 looks as follows:

Each of the two shader units now has an additional mini-ALU (these mini-ALUs first appeared back in the NV35, but the NV40 didn’t have them). It improves the mathematical performance of the processor and, accordingly, the speed of pixel shaders. Each pixel processor can execute 8 instructions of the MADD (multiply/add) type in a single cycle, and the total performance of 24 such processors with instructions of that type is a whopping 165Gflops, three times the performance of the GeForce 6800 Ultra (54Gflops). Loops and branching available in version 3.0 pixel shaders are fully supported.
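The quoted Gflops figures can be sanity-checked: a MADD on a 4-component vector counts as eight floating-point operations, and the G70 issues two such vec4 MADDs per pipeline per clock. A back-of-the-envelope sketch (the NV40 figure appears to assume the 425MHz “special edition” clock, which is our assumption here):

```python
# Back-of-the-envelope shader throughput. One MADD (multiply-add) on a
# 4-component vector counts as 4 * 2 = 8 floating-point operations.

def madd_gflops(pipelines, vec4_madds_per_clock, clock_mhz):
    """Peak Gflops from the MADD issue rate alone."""
    flops_per_clock = pipelines * vec4_madds_per_clock * 4 * 2
    return flops_per_clock * clock_mhz * 1e6 / 1e9

# G70: 24 pipelines, two vec4 MADD-capable shader units each, 430 MHz
g70 = madd_gflops(24, 2, 430)
# NV40: 16 pipelines, one vec4 MADD per clock each, 425 MHz (assumed)
nv40 = madd_gflops(16, 1, 425)
print(f"G70: {g70:.1f} Gflops, NV40: {nv40:.1f} Gflops")
```

The result, roughly 165 against 54 Gflops, matches the three-fold gap claimed above.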

Of course, real-life shaders do not consist of MADD instructions only, but NVIDIA claims the pixel shader performance of the G70 is two times higher than that of the NV40. We will check this claim in our theoretical tests, but the improved pixel pipelines look highly promising. We can expect a considerable performance gain in modern pixel-shader-heavy games.

Vertex Processors

The flowchart of the G70’s vertex processor doesn’t differ from that of the NV40:

A higher speed of processing geometry is achieved by means of more vertex processors (8 against the NV40’s 6) and, probably, through improvements in the vector and scalar units. According to the official data, the performance of the scalar unit has increased by 20-30% in comparison with the NV40, and a MADD instruction is executed in a single cycle in the vector unit. Besides that, the efficiency of cull and setup operations in the fixed section of the geometry pipeline has increased by 30%. We are going to cover these things in more detail below.

On the whole, we can’t call the new architecture from NVIDIA a revolution. It is rather a greatly improved and perfected GeForce 6 which has been the most advanced architecture in the 3D consumer graphics market until today. The GeForce 7 carries the leadership on, once again confirming NVIDIA’s technological superiority.

HDR: More Speed

The support of OpenEXR format that allows outputting an image in an extended dynamic range on the screen first appeared in the GeForce 6800 Ultra. This format is employed by Industrial Light & Magic, a division of Lucasfilm, for creating special effects for modern blockbuster movies.

Alas, this rendering mode requires huge resources, even though it ensures a much better image quality. The first game to support HDR was the popular 3D shooter Far Cry, starting from version 1.3. But in fact, this support of HDR remained more of a marketing trick, since you could not play in this mode even in 1024x768 resolution. For example, with the performance normally ranging from 55 to 90fps on the Training map in different resolutions, the HDR mode yielded no more than 15-30fps. Comfortable play was out of the question. NVIDIA’s SLI technology increased the speed in the HDR mode to more acceptable numbers, but the cost of a system with two GeForce 6800 Ultra/GT cards was very high.

The situation changes with the arrival of the G70, and HDR is going to be more useful for owners of G70-based graphics cards. According to NVIDIA, the GeForce 7800 GTX is 60% faster in this mode than the GeForce 6800 Ultra thanks to the improved texture-mapping units. So it looks like you can enjoy a beautiful high-dynamic-range image in resolutions up to 1280x1024 with one such graphics card, while SLI configurations will make 1600x1200 resolution playable in HDR.

New FSAA Modes

When developing the new processor, NVIDIA didn’t just increase the performance but also paid much attention to improving the image quality. So, the GeForce 7800 GTX has acquired new full-screen antialiasing modes. As you know, ordinary FSAA methods (super-sampling or multi-sampling) do not work on such objects as a wire fence or foliage on trees and those annoying jaggies remain on them, spoiling the whole picture.

From the technological point of view, the above-mentioned objects are usually very simple models consisting of several or even a single polygon; the pattern of leaves or wire is created with an appropriate texture. Since pixels inside a polygon are not smoothed out, full-screen antialiasing can’t help here. But the new antialiasing mode introduced by NVIDIA in the GeForce 7800 GTX can. This mode allows performing blending operations on transparent pixels, so the image quality in such spots is greatly improved. The information about the areas of the texture to be blended is taken from the texture’s alpha-channel.

The new FSAA method exists in two versions: Transparency Adaptive Supersampling and Transparency Adaptive Multisampling. The latter has a lower quality since it uses one texel sample instead of four to calculate a sub-pixel value, but it works faster. Combined with the higher performance of the GeForce 7800 GTX, the new antialiasing methods deliver an almost ideal image quality to the minutest detail.
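The difference between the two modes can be illustrated with a simplified model (a sketch of the idea only, not NVIDIA’s actual hardware logic; `sample_alpha` is a hypothetical stand-in for a texture fetch of the alpha channel): transparency supersampling runs the alpha test at every sub-pixel sample position, while transparency multisampling evaluates it once per pixel and reuses the verdict for all samples.

```python
# Simplified model of the two transparency antialiasing modes.

def sample_alpha(u, v):
    # Toy alpha texture: a "wire" pattern with hard edges (placeholder).
    return 1.0 if int(u * 8) % 2 == 0 else 0.0

def coverage_supersampling(sample_positions, alpha_ref=0.5):
    """Alpha test per sample: fractional coverage smooths texture edges."""
    passed = sum(1 for (u, v) in sample_positions
                 if sample_alpha(u, v) >= alpha_ref)
    return passed / len(sample_positions)

def coverage_multisampling(sample_positions, alpha_ref=0.5):
    """Alpha test once at the pixel centre: coverage is all-or-nothing."""
    centre_u = sum(u for u, _ in sample_positions) / len(sample_positions)
    centre_v = sum(v for _, v in sample_positions) / len(sample_positions)
    return 1.0 if sample_alpha(centre_u, centre_v) >= alpha_ref else 0.0

# Four samples of one pixel straddling a "wire" edge:
samples = [(0.0625, 0.25), (0.0625, 0.75), (0.1875, 0.25), (0.1875, 0.75)]
print(coverage_supersampling(samples))  # 0.5 -- a smoothed edge
print(coverage_multisampling(samples))  # 0.0 -- still a hard edge
```

The fractional coverage produced by the supersampling variant is exactly what removes the jaggies on fences and foliage.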

PureVideo

One of the most interesting features of the GeForce 6 architecture was the special programmable video processor, called PureVideo, meant to offload operations on video streams from the system’s CPU. The first version of this processor, available in the NV40, was originally disabled, probably due to some problems with the tech process. When the processor was later turned on, it transpired that the GeForce 6800 was equipped with the first version of PureVideo, without WMV HD acceleration, while the more advanced second version was available in the GeForce 6600. The second-generation PureVideo processor has the following features:

Only S3 Graphics can currently offer something like that, but its DeltaChrome and GammaChrome chips are rather slow in 3D. Of course, NVIDIA’s second-generation PureVideo processor was implemented in the G70, picking up a few interesting new features along the way, so we’ve got a third generation of programmable video processors from NVIDIA. Besides the above-enumerated functions, the PureVideo processor now supports the following:

These innovations refer to the processing of HDTV content, which is becoming ever more popular nowadays. By the way, you need patch No. 888656 from Microsoft in order to enable the acceleration of WMV HD decoding; it can be downloaded from Microsoft’s website.

GeForce 7800 GTX in Detail

Getting closer to practice, it’s time to take our sample of the GeForce 7800 GTX into our hands. At first sight, this graphics card resembles the GeForce 6800 GT:

  

Both cards use a compact single-slot cooling system, and the component layout of the GeForce 7800 GTX resembles NVIDIA’s earlier products, too. Changes are most visible in the rear part of the PCB where the voltage regulators and other power elements reside. It’s not that the power circuit has become simpler, but there are fewer electrolytic capacitors, and the power elements are placed in three rows rather than in a single line as before. They now form a small rectangle, covered with a thin-ribbed aluminum heatsink. These changes in the power circuit layout have made the PCB of the GeForce 7800 GTX longer, so the new device is obviously the longest graphics card today.

There is an aluminum plate on the reverse side of the PCB, besides the usual bracket for fastening the cooler. The plate is not mere decoration, as one might think. The thing is, the PCB of the GeForce 7800 GTX is designed for 512 megabytes of GDDR3 memory: there are 16 seats for memory chips, 8 on each side of the PCB. But the standard amount of memory on a GeForce 7800 GTX card is 256 megabytes. They could have put eight 256Mbit chips on the face side of the PCB, but NVIDIA preferred to install four chips on either side. So, the above-mentioned plate is a heat-spreader for the four reverse-side GDDR3 chips. Its efficiency may not be very high, but GDDR3 memory features a low level of heat dissipation, and there are only four such chips to be cooled there.

The graphics card uses 1.6ns memory from Samsung rated for 600 (1200) MHz frequency, which is exactly the frequency the memory chips are clocked at on this card. Note that, starting from the GeForce FX 5900, NVIDIA has been placing memory chips in such a way as to make the traces that connect them to the GPU as short as possible. This helps ensure stable operation at high frequencies.
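The relation between a chip’s access-time rating and its clock ceiling is simple arithmetic: 1.6ns per access allows at most 1/1.6ns = 625MHz, so chips running at their rated 600 (1200 effective) MHz have almost no headroom left. A one-line check:

```python
def max_clock_mhz(access_time_ns):
    """Upper clock bound implied by a memory chip's access-time rating."""
    return 1e3 / access_time_ns  # 1/(t in ns) gives GHz; x1000 -> MHz

print(max_clock_mhz(1.6))  # 625.0 MHz ceiling for 1.6 ns GDDR3
```

This is also why the modest memory overclocking results reported later in this review are no surprise.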

We dismantled the cooling system to access the graphics processor:

  

As you see, the die area of the G70 is much larger than that of the NV40, notwithstanding the thinner tech process (0.11 micron against 0.13 micron). No wonder, as 80 million transistors more have been added into the new chip. The surface of the G70 is not mirror-like like the surface of the NV40, but kind of matte, probably due to the difference in the tech processes: the NV40/45 is manufactured at IBM’s East Fishkill facilities, while the G70 comes from TSMC’s fabs. The shape of the dies is different, too: the G70 is a square whereas the NV40 is a rectangle. There is no separate HSI die here – the G70 natively supports PCI Express. Our sample was manufactured during the 16th week of the current year, i.e. somewhere at the end of April, and this indicates that NVIDIA doesn’t have problems with manufacturing the new chip. The symbols “A2” denote a second revision of the chip, and this too makes us hope that the supply will be sufficient.

The cooling system deserves to be discussed separately. It is a variation of the GeForce 6800 GT cooler, but a seriously improved one. The blower drives air through two aluminum heatsinks joined with a U-shaped heat pipe and also cools the heatsink on the power regulators of the card. The heat pipe doesn’t only transfer heat from one heatsink to the other, but also takes heat off the needle section that touches the memory chips. The whole arrangement is covered with a plastic casing, although NVIDIA used to employ a metal casing earlier.

Curiously enough, NVIDIA’s reference snapshots clearly show that the first, or main, heatsink that takes heat off the graphics core is made of copper, while on our sample it was made of aluminum. Why? We suppose that copper coolers will be mounted on advanced versions of G70-based graphics cards, with 512MB of memory and higher clock rates, while the current version of the GeForce 7800 GTX is quite satisfied with the aluminum cooler.

Note also that the fan is connected to the card via four wires rather than the usual two. It looks like the fan is equipped with a tachometer, and the fan speed control system is now more precise and flexible than before. We’ll tell you about its noise characteristics below.

Power Consumption and Heat Dissipation

24 pixel pipelines and 302 million transistors… Such a complex graphics processor should consume huge amounts of power. The NV40 was already so voracious that the GeForce 6800 Ultra required connection to two Molex outputs of the power supply, and NVIDIA recommended 480W and higher power supplies for owners of such graphics cards. We learned later that the recommendation was due to the fact that cheaper power supplies of lower wattage just could not maintain stability of the output voltages well enough. In fact, the GeForce 6800 Ultra had power consumption comparable to that of the ATI RADEON X800 XT Platinum Edition, consuming about 72 watts under load against the RADEON’s 63 watts.

So what about the appetite of the GeForce 7800 GTX? Does it need a more powerful power supply than the GeForce 6800 Ultra? It does not! By using the new, thinner tech process NVIDIA even managed to decrease the consumption. NVIDIA recommends a 350W and higher power supply capable of yielding 22 amperes on the +12V rail for a system with a single GeForce 7800 GTX. It means the GeForce 7800 GTX is more modest in terms of power consumption than the GeForce 6800 Ultra and you don’t have to change your power supply if it already complies with the above-mentioned requirements. Of course, a more expensive power supply will be necessary for a system with two GeForce 7800 GTX (500W and better, 30 amperes on the +12V rail), but such computers are top-end and you are sure to have some money for a high-quality PSU if you’ve already purchased two such cards and a SLI-ready mainboard.

At our labs we have a special testbed that can measure the power consumption of PCI Express graphics cards. It is a specially modded platform of the following configuration:

We measure power consumption not only on the external power line but also on the 12V and 3.3V lines of the PCI Express x16 slot. Thus we get detailed information about power consumption of modern graphics cards.

To load the graphics cards we launched the most complex of 3DMark05’s tests, the third one, and had it running in a loop in 1600x1200 resolution with 4x full-screen antialiasing and 16x anisotropic filtering enabled. The results are quite interesting: the GeForce 7800 GTX with its 302 million transistors, 24 pixel pipelines and 430MHz frequency consumed 80.7 watts under that load, while the much simpler GeForce 6800 Ultra had a consumption of 77.3 watts. That’s just 3.4 watts of difference! So, the 0.11-micron tech process really helped NVIDIA keep the power consumption of the GeForce 7800 GTX on the same level as the previous-generation model. By the way, the RADEON X850 XT Platinum Edition has a result of 71.6 watts, slightly less than the consumption of NVIDIA’s cards, but it consumed much less than the GeForces on the 12V line (31 watts against 39-40 watts) and more on the 3.3V line (5 watts against 2 watts). The power consumption on the external line was similar for all the graphics cards, about 33-35 watts.
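Per-rail figures combine as simple sums of voltage times current, which is exactly what our testbed measures. A sketch of how such a total decomposes (the rail currents below are illustrative assumptions, not our measured data):

```python
def card_power_w(rails):
    """Total board power: sum of voltage x current over every supply rail."""
    return sum(volts * amps for volts, amps in rails)

# Illustrative decomposition of a GeForce 7800 GTX-like load
# (currents are hypothetical, chosen only to show the method):
rails_7800gtx = [
    (12.0, 3.35),  # PCI Express slot, 12 V line   -> ~40.2 W
    (3.3,  0.6),   # PCI Express slot, 3.3 V line  -> ~2.0 W
    (12.0, 3.2),   # external 6-pin connector      -> ~38.4 W
]
print(round(card_power_w(rails_7800gtx), 1))
```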

Interestingly, NVIDIA quotes quite different numbers when talking about the power consumption of the GeForce 7800 GTX: 100-110 watts. Why so? They mean the so-called peak consumption, when all the transistors in the graphics processor are switching simultaneously, which is of course a purely theoretical situation: it is impossible to create such a load in real-life applications. According to our data, NVIDIA’s new graphics processor boasts an excellent efficiency: with all its numerous transistors and pipelines and higher frequencies, it consumes about as much power as its predecessor and does not require a power supply with the wattage of a welder. We can’t but congratulate NVIDIA on such a success!

So, people who own a GeForce 6800 Ultra or GT may go for a GeForce 7800 GTX without worrying about the power supply – the current one will do nicely. In the same way, extreme users who have a SLI configuration of two GeForce 6800 Ultra cards will also be able to use their older PSU with the new cards. If the GeForce 7800 GTX is going to be your first graphics card, or if you currently use a GeForce 6600 GT or a RADEON X700 PRO, you should purchase a high-quality 350 or even 400W power supply. Avoid cheap PSUs from obscure manufacturers! Such devices may be assembled to a simplified design and may not maintain the necessary stability of the output voltages, which can damage your graphics card as well as other system components.

Noise, Overclocking, 2D Quality

The GeForce 7800 GTX behaves just like its predecessors at system start-up: the fan of its cooling system works at its fastest and noisiest until the OS is booted. After that, however, the fan automatically slows down, and the noise diminishes. This graphics card is subjectively quieter than the GeForce 6800 Ultra or GT, and it is quite comfortable to work with in the booted OS. A SLI configuration of two GeForce 7800 GTX cards is louder, of course, but the noise is bearable, especially if you compare it to the noise from two GeForce 6800 Ultra cards. The fan speed control system never made the fan work at its full speed throughout our tests, which means NVIDIA’s improved reference cooler does its job well. Unfortunately, you can’t put a quiet cooler like the Arctic Cooling NV Silencer 5 on the GeForce 7800 GTX because the memory chips on the reverse side of the PCB would be left without any cooling. So, we have to wait for a special version of that cooling device.

We had little hope for good overclocking with the new GeForce: a chip of 300 million transistors, made with a 0.11-micron tech process without special dielectrics, can’t be expected to have a high overclocking potential. Take the RADEON X800 XL as an example: that chip can seldom speed up above 420-430MHz. We were wrong, however, in this particular case. We managed to overclock the GPU on our sample of the GeForce 7800 GTX to 490MHz – a 60MHz frequency gain instead of the expected 20-30MHz. The memory performed worse, overclocking to only 610 (1220) MHz, but that’s natural for 1.6ns chips that had already been working at their rated frequency. Our second sample of the GeForce 7800 GTX overclocked almost as well: to 480/620 (1240) MHz, to be exact.

The quality of the 2D image was excellent in all resolutions up to our monitor’s maximum of 1800x1440@75Hz. We checked this out on a 21” CRT monitor Dell P1130 which differs from the older Dell P1110 in the design of the case. Well, we didn’t expect 2D quality other than excellent from a top-end graphics card like the GeForce 7800 GTX.

Testbed and Methods

The testbed was configured as follows:

Graphics cards:

We also tested SLI configurations based on GeForce 7800 GTX and GeForce 6800 Ultra.

Drivers:

We installed ForceWare 77.62, NVIDIA’s new-generation driver of the so-called ForceWare Release 75, to test the GeForce 7800 GTX. This is the first version of ForceWare to support the new graphics processor from NVIDIA. It differs from the older official driver (version 71.89) in improved HDTV support, added game profiles, full OpenGL 2.0 support, and the option for selecting a SLI mode. The interface of the new driver’s control panel has changed and become handier. We chose the following settings for our tests:

           

The Gamma correct antialiasing and Transparency antialiasing options are available only if a GeForce 7800 GTX is installed in the system; these options are missing for a GeForce 6800 Ultra. The rest of the settings were selected in the same way. We chose the Catalyst A.I. Standard mode in ATI Catalyst 5.6 and set the Mipmap Detail Level to “Quality”. The VSync option was disabled in both drivers.

Anisotropic Filtering Quality

Before running the tests we decided to check out the quality of anisotropic filtering as it was done by the GeForce 7800 GTX using the appropriate function from 3DMark05.

NVIDIA GeForce 7800 GTX

NVIDIA GeForce 6800 Ultra

ATI RADEON X850 XT Platinum Edition

We could see no difference between the GeForce 7800 GTX and GeForce 6800 Ultra when they were doing tri-linear filtering only. With the RADEON X850 XT Platinum Edition the mip levels begin farther away and the transitions between them are smoother than with NVIDIA’s cards.

NVIDIA GeForce 7800 GTX

NVIDIA GeForce 6800 Ultra

ATI RADEON X850 XT Platinum Edition

It’s hard to distinguish between the GeForce 6800 Ultra and GeForce 7800 GTX where anisotropic filtering is concerned, but on a closer inspection you can see that the sharpness of textures is higher with the latter card. Here, the RADEON X850 XT Platinum Edition produces a different picture, with less distinct mip levels. That is, the ATI card is better at anisotropic filtering than the GeForces. So, the GeForce 7800 GTX doesn’t bring us anything new in terms of anisotropic filtering, though its quality is higher than with the GeForce 6800 Ultra. The difference is negligible, especially in real gaming situations.

Transparency Antialiasing Quality

We took screenshots in two popular shooters, Far Cry and Half-Life 2, to see the effect of the new full-screen antialiasing methods.

 

NVIDIA GeForce 7800 GTX

ATI RADEON X850 XT Platinum Edition


no FSAA


no FSAA


FSAA 4x + TMS


FSAA 4x


FSAA 4x + TSS


FSAA 8xS + TMS


FSAA 6x


FSAA 8xS + TSS

Alas, we couldn’t find any great differences between ordinary FSAA and FSAA with enabled Transparency Antialiasing in Far Cry. Well, yes, there is a difference, but it is so negligible that you can notice it only after a careful examination of each screenshot.

NVIDIA GeForce 7800 GTX

ATI RADEON X850 XT Platinum Edition


no FSAA


no FSAA


FSAA 4x + TMS


FSAA 4x


FSAA 4x + TSS


FSAA 8xS + TMS


FSAA 6x


FSAA 8xS + TSS

Half-Life 2 is quite another story: Transparency Antialiasing makes the image much better in this game. We mean the Supersampling mode, while the Multisampling mode produces a less impressive, though still improved, picture, especially in combination with FSAA 8xS. The most beautiful picture can be observed in the FSAA 8xS + Transparency Supersampling mode: even the smallest details are drawn almost ideally. So, the new feature of the GeForce 7800 GTX – Transparency Antialiasing – can really improve the quality of certain details of the image, especially if combined with FSAA 8xS, but how does it affect the speed of the card?

Performance Hit with Transparency Antialiasing

It turned out that TSAA doesn’t affect the performance of the card as negatively as ordinary FSAA modes do. Take a look at the diagram:

As you can see, enabling Transparency Multisampling has practically no effect on performance, even in the basic 8xS mode. This TSAA variety doesn’t improve the image quality as much as Transparency Supersampling does, but even in the latter case the performance hit is less than 5%. So, if you prefer to play games with full-screen antialiasing enabled, you can turn Transparency Antialiasing on without worrying about losing speed.

GeForce 7800 GTX in Theoretical Tests

We performed a full cycle of theoretical tests to reveal the potential of the new graphics processor from NVIDIA. We made use of the following software:

Fill Rate

We traditionally begin our tests with Marko Dolenc’s Fillrate Tester.

The diagram shows that the GeForce 7800 GTX isn’t much faster than the GeForce 6800 Ultra at single-texturing. At multi-texturing, however, their graphs diverge more and the advantage of the new solution becomes evident: the GeForce 7800 GTX is half again as fast as the GeForce 6800 Ultra when mapping three or four textures. Well, it means it does have 24 pixel pipelines!

Note also that the results of the GeForce 7800 GTX are rather far from the theoretical maximum which is probably due to the relatively low memory clock rate.

It’s generally the same picture with disabled Z writes. The RADEON X850 XT Platinum Edition is much slower than the GeForce 7800 GTX as well as the GeForce 6800 Ultra in both cases.

So, the GeForce 7800 GTX behaves similarly to its predecessor, but is evidently limited by the memory performance.

The GeForce 7800 GTX has 16 ROP units rather than 24, so its fill rate when working only with the Z buffer is almost the same as the fill rate of the GeForce 6800 Ultra. The newer solution is ahead because of a minor advantage in the core frequency (430MHz against 400MHz).
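The theoretical ceilings we compare the measured numbers against are just per-clock throughput multiplied by the core clock; a small helper makes the arithmetic explicit:

```python
def gsamples_per_s(units_per_clock, clock_mhz):
    """Theoretical throughput: per-clock output units scaled by core clock."""
    return units_per_clock * clock_mhz * 1e6 / 1e9

# GeForce 7800 GTX at 430 MHz:
print(gsamples_per_s(16, 430))  # colour fill, 16 ROPs    -> 6.88 Gpixel/s
print(gsamples_per_s(24, 430))  # texturing, 24 pipelines -> 10.32 Gtexel/s
print(gsamples_per_s(32, 430))  # Z-only output           -> 13.76 Gsample/s
# GeForce 6800 Ultra at 400 MHz, also 16 ROPs:
print(gsamples_per_s(16, 400))  # colour fill             -> 6.4 Gpixel/s
```

The small gap between 6.88 and 6.4 Gpixel/s is exactly the minor core-frequency advantage mentioned above.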

By the way, the GeForce 6800 Ultra and GeForce 7800 GTX both exceed the theoretical maximum in this test. We don’t know yet how to explain this fact.

Pixel Shader Performance

Marko Dolenc’s Fillrate Tester can also benchmark the performance of a graphics processor with pixel shaders of versions 1.1 and 2.0:

NVIDIA has really managed to achieve a considerable increase in pixel shader performance. The G70 is about 1.5 times faster than the NV40 with complex pixel shaders that create per-pixel lighting. The gap becomes even wider on simpler shaders, yet does not reach twofold. The RADEON X850 XT Platinum Edition outperforms the GeForce 6800 Ultra on simple shaders, but the GeForce 7800 GTX with its 24 pixel processors gives the ATI card no chance. Note also that even the new pixel processors from NVIDIA profit from using half precision in version 2.0 pixel shaders.

The situation doesn’t change when we disable Z writes.

We get a most curious picture after we disable color writes: the GeForce 7800 GTX is more than 1.5 times faster than the GeForce 6800 Ultra on complex pixel shaders, but the gap is much smaller with simple shaders. This must be due to the architectural peculiarity of the GeForce 7800 GTX, which has 24 pixel processors but only 16 ROP units.

So, the new graphics processor from NVIDIA shows its best qualities in doing complex mathematical computations, and shaders that create per-pixel lighting are exactly an example of such a task.

The GeForce 7800 GTX crunches through complex pixel shaders at a tremendous speed, as the results of the shader performance section of the new version of Xbitmark confirm. As expected, the new graphics card enjoys a big advantage over its opponents in subtests that involve complex mathematical computations: in the shaders Dot Product Bump Mapping + Specular + Reflection, Factored BRDF + HDR, Metal + Phong, Wood and especially in both Dynamic Branching shaders. The latter two are very difficult since they consist of more than 170 instructions, but the GeForce 7800 GTX maintains a frame rate above 90fps even on these shaders. By the way, these two shaders and NPR hatch 10textures PS3 are written in Shader Model 3.0, so they don’t run on ATI’s cards.

The advantage of the GeForce 7800 GTX over the GeForce 6800 Ultra is smaller on texture-heavy shaders like Cook Torrance + Texture + Fresnel or 27-Pass Fur, which is executed in 27 passes as its name suggests. This is a consequence of having 16 ROP units against 24 pixel processors. Anyway, the overall performance of the new product from NVIDIA is very high, and it is really the fastest consumer graphics card today.

This test is not the most complex possible. It uses only version 1.1 pixel shaders. The results of the two GeForces differ noticeably only at 1600x1200 resolution.

The pixel shaders from the Advanced Pixel Shader test are more complex and the GeForce 7800 GTX is 1.5 times faster than its opponents here.

The GeForce 7800 GTX enjoys an even greater advantage over the previous-generation solutions in the analogous test from 3DMark03: up to 100% more speed in high resolutions! NVIDIA’s claims about a twofold advantage of the GeForce 7800 GTX at executing pixel shaders seem to be true to life.

The advantage of NVIDIA’s new architecture is also conspicuous in 3DMark05, in all resolutions. The pixel shaders from this test are mathematically difficult, but it is here that the G70 shows its best qualities.

Vertex Shader Performance

Assuming that NVIDIA didn’t make any serious improvements to the architecture of the vertex processors but just increased their number, the results are not surprising at all: the 6 vertex processors of the RADEON X850 XT Platinum Edition work at 540MHz, whereas the 8 vertex processors of the GeForce 7800 GTX are clocked at only 430MHz, so the aggregate throughput of the two cards is nearly the same.
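The reasoning above boils down to simple arithmetic (our own estimate, assuming one vertex operation per processor per clock, which is of course a simplification of the real pipelines):

```python
# Aggregate vertex-unit throughput in millions of unit-clocks per second:
# the rough figure the units-vs-clock comparison rests on.
def vertex_throughput(units, core_mhz):
    return units * core_mhz

g70  = vertex_throughput(8, 430)   # GeForce 7800 GTX
x850 = vertex_throughput(6, 540)   # RADEON X850 XT Platinum Edition

# 3440 vs 3240: the two cards are only about 6% apart on paper.
print(g70, x850, f"{g70 / x850 - 1:.1%}")
```

So on paper the extra two vertex processors of the G70 barely outweigh ATI’s clock speed advantage, which matches the nearly equal geometry results we measured.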

It’s different in the analogous test from 3DMark03: the GeForce 7800 GTX is about 15% faster than the RADEON X850 XT Platinum Edition and 30% faster than the older GeForce.

The Simple Vertex Shader test performs transformation and lighting of six multi-polygonal models, each of which consists of 1 million vertices. There is only one light source in the scene. The load on the vertex processors is high, but the vertex shaders employed are very simple; they could have been fitted into Shader Model 1, which is why the test is called Simple. Here, the RADEON X850 XT Platinum Edition enjoys a certain advantage over the new best product from NVIDIA. The reason is the same – higher core frequency.

The Complex Vertex Shader test is – as its name suggests – much more complex. Vertex shaders are used here to create an infinite sea of grass, each blade being processed independently. The outcome is, however, similar to the previous test: the GeForce 7800 GTX is slower than the RADEON X850 XT Platinum Edition despite having more vertex processors.

The vertex processor test from Xbitmark resembles the Simple Vertex Shader test from 3DMark05, so the results are similar, too.

Thus, the new graphics card from NVIDIA is not as good at processing geometry as it is at executing pixel shaders. However, it is not much worse than the RADEON X850 XT Platinum Edition, so we don’t expect the performance of the GeForce 7800 GTX to degrade much in games with complex geometry.

Fixed T&L Emulation

These tests seem to be obsolete, but we thought it necessary to include them into the review because modern graphics cards emulate the fixed-function transform & lighting unit with the help of vertex shaders.

When there’s a single light source, all modern top-end graphics cards have similar results, with a minor advantage on the side of the RADEON X850 XT Platinum Edition.

The GeForce 7800 GTX takes the lead as soon as we increase the number of light sources to eight, just as in the Vertex Shader test from 3DMark03. Both tests use several light sources, and the G70 architecture shows its best in such conditions.

In this point sprite modeling test, the GeForce 7800 GTX is slower than the RADEON X850 XT Platinum Edition due to its lower core frequency.

Relief Rendering and Other Tests

The GeForce 7800 GTX is far faster than the GeForce 6800 Ultra at rendering relief with the EMBM method, but is slower than the RADEON X850 XT Platinum Edition. The only exception is 1600x1200 resolution, where the new GPU from NVIDIA wins.

NVIDIA’s graphics processors have always been excellent at rendering relief with the Dot3 method, and the G70 is not an exception. It is faster than the previous-generation solutions in all resolutions.

The Ragtroll test is an indicator of the balance of the CPU-driver-GPU chain. It uses the Havok physics engine, version 1.1 vertex shaders, and version 1.4 pixel shaders. The graph of the GeForce 7800 GTX goes almost parallel to the graph of the RADEON X850 XT Platinum Edition, and both graphics cards look somewhat worse-balanced than the GeForce 6800 Ultra. As for the absolute fps rates, the GeForce 7800 GTX wins this test.

Video Playback Performance

We used Windows Media Player 10 with the DirectX Video Acceleration for WMV HD content patch installed to compare the performance of the graphics cards at playing video in various formats. Here are our test clips:

Here are the data about the CPU load during playback:

The CPU load is really smaller with the GeForce 7800 GTX than with the GeForce 6800 Ultra, but the RADEON X850 XT Platinum Edition can boast the same results. The latter must be profiting a lot from its high core frequency – a CPU load of 47% during 1080p playback is very, very good.

The RADEON X850 XT Platinum Edition takes the first place in the DivX playback test. The peak CPU load is only 10% with this graphics card, while NVIDIA’s solutions have a maximum load of 16%.

The GeForce 7800 GTX and the RADEON X850 XT Platinum Edition are both successful with the DVD disc, while the GeForce 6800 Ultra puts a considerably higher load on the CPU.

According to the results of the tests, the GeForce 7800 GTX unloads the CPU during video playback better than its predecessor. At the same time, the results of the new product are analogous to those of the RADEON X850 XT Platinum Edition, so there is no breakthrough in this area. Anyway, the GeForce 7800 GTX ensures a higher playback quality of DVD and other interlaced content thanks to its adaptive de-interlacing algorithms. Unfortunately, we can’t publish any screenshots to illustrate our point yet, but we’re going to do so in our upcoming articles.

Conclusion

So you have read the theoretical part of our review of the new graphics processor from NVIDIA and of the graphics card based on it. But is the GeForce 7800 GTX a truly new-generation solution with a unique architecture? It wouldn’t be correct to answer this question in the affirmative, but it would be an oversimplification to regard the G70 as just a perfected NV40. NVIDIA’s engineers have done a huge amount of work on improving the GeForce 6 architecture, and the GeForce 7 came out a highly potent architecture, especially as concerns processing pixel shaders.

Those 24 improved pixel processors are a force that leaves no chance to GeForce 6800 or RADEON X850 based solutions wherever complex mathematical computations are to be performed. The eight vertex processors make up for the not very high core frequency, and the GeForce 7800 GTX almost matches the geometry performance of the RADEON X850 XT Platinum Edition, the highest-clocked solution of today.

The GeForce 7800 GTX also continues the technological leadership of its predecessor by supporting a number of exclusive technologies. NVIDIA is still the only GPU supplier to offer such functionality as Shader Model 3.0 and High Dynamic Range in its products. ATI Technologies can’t offer anything like that, at least until the release of the R520 and graphics cards on it.

The only weak spots of the NVIDIA GeForce 7800 GTX seem to be the 16 ROP units and the relatively low memory frequency. In some cases the graphics processor may become limited by the number of raster operation units or by the memory bandwidth.

The GeForce 7800 GTX graphics card proper looks like an appealing solution. It is a single-slot device that doesn’t consume too much power. The only disadvantage of this card is its dimensions: it is too long, and you may have problems installing it into a small system case. On the other hand, the GeForce 7800 GTX is a high-end product, and we don’t think anyone will put it into a microATX case or into a barebone system. Such graphics cards are born to live in huge, spacious and well-ventilated cases with a high-quality power supply that real PC enthusiasts certainly have.

The official price of the new product is $599, and that’s not very much for a device of this performance. But we are getting ahead of ourselves – we must check the GeForce 7800 GTX in real games before passing our final verdict. And this is going to be the subject of the second part of our review of NVIDIA’s new graphics processor.