The G80 graphics processor can be viewed as consisting of two parts. We have described the execution core above. The other part is called the Lumenex Engine; it is responsible for sampling and filtering textures as well as for full-screen antialiasing, HDR, and output of the rendered image to the monitor. In other words, this part of the G80 incorporates the texture caches, the memory access interface, the TMUs and the ROPs.
The flowchart of the G80 shows that the 128 stream processors are organized into 8 groups of 16 processors each. Each group has a corresponding texture sampling and filtering unit consisting of 4 TMUs, so the G80 contains a total of 32 TMUs, each of which is designed as follows:
Each TMU contains one sampling unit and two filtering units. The speed of bi-linear and 2x anisotropic filtering is 32 pixels per cycle for each filtering type. Bi-linear filtering of FP16 textures is performed at the same speed, whereas FP16 2x anisotropic filtering runs at 16 pixels per cycle. The GeForce 8800 GTX’s Lumenex Engine is clocked at 575MHz, so the theoretical texture fill rate is 18.4 gigatexels per second (32 texels per cycle at 575MHz) when both bi-linear and 2x anisotropic filtering are in use.
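The fill rate figures above are plain multiplication, and a short Python sketch makes the arithmetic easy to check. The constants come from the text; the function name `fill_rate_gtexels` is ours and purely illustrative.

```python
# Figures quoted in the text for the GeForce 8800 GTX.
CORE_CLOCK_MHZ = 575      # Lumenex Engine clock
GROUPS = 8                # stream processor groups
TMUS_PER_GROUP = 4        # TMUs attached to each group


def fill_rate_gtexels(texels_per_cycle, clock_mhz=CORE_CLOCK_MHZ):
    """Theoretical fill rate in gigatexels per second (illustrative helper)."""
    # texels/cycle * Mcycles/s = Mtexels/s; divide by 1000 for Gtexels/s
    return texels_per_cycle * clock_mhz / 1000


total_tmus = GROUPS * TMUS_PER_GROUP

print(total_tmus)                      # 32
print(fill_rate_gtexels(total_tmus))   # 18.4 (bi-linear or 2x anisotropic)
print(fill_rate_gtexels(16))           # 9.2  (FP16 2x anisotropic, 16 per cycle)
```

The same helper shows that the FP16 anisotropic case, at 16 pixels per cycle, works out to half the peak rate.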
The raster operators (ROPs), also part of the Lumenex Engine, are grouped into 6 sections, each of which can process 4 pixels (with 16 subpixel samples) per cycle. This gives a total of 24 pixels per cycle when both color and Z values are processed. If only the Z-buffer is used, the maximum number of processed pixels is 192 per cycle in normal mode and 48 per cycle with 4x multisampling.
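The ROP numbers can be reproduced the same way. The sketch below is only a back-of-the-envelope check: the per-second rates assume the ROPs run at the 575MHz Lumenex Engine clock mentioned earlier, and all names are illustrative.

```python
ROP_SECTIONS = 6            # ROP sections in the G80
PIXELS_PER_SECTION = 4      # color + Z pixels per section per cycle
Z_ONLY_PER_CYCLE = 192      # Z-only pixels per cycle, as quoted in the text
CLOCK_MHZ = 575             # assumption: ROPs run at the Lumenex Engine clock


def color_z_pixels_per_cycle():
    """Pixels per cycle with both color and Z processed."""
    return ROP_SECTIONS * PIXELS_PER_SECTION


def z_only_pixels_per_cycle(msaa_samples=1):
    """Z-only pixels per cycle; each pixel costs one Z sample per MSAA sample."""
    return Z_ONLY_PER_CYCLE // msaa_samples


def gpixels_per_second(pixels_per_cycle, clock_mhz=CLOCK_MHZ):
    """Convert a per-cycle rate to gigapixels per second."""
    return pixels_per_cycle * clock_mhz / 1000


print(color_z_pixels_per_cycle())                        # 24
print(z_only_pixels_per_cycle())                         # 192
print(z_only_pixels_per_cycle(msaa_samples=4))           # 48
print(gpixels_per_second(color_z_pixels_per_cycle()))    # 13.8
```

Under that clock assumption, 24 pixels per cycle corresponds to a theoretical 13.8 gigapixels per second with color and Z.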
The ROP subsystem supports all kinds of antialiasing: multisampling, super-sampling and transparency antialiasing. In addition to the standard selection of FSAA modes, the new GPU offers 8x, 8xQ, 16x and 16xQ modes, which are discussed below. Antialiasing of textures in FP16 and FP32 formats is fully supported, so the problem of the GeForce 6 and 7 architectures, which could not use FSAA and FP HDR simultaneously, is solved in the GeForce 8.
Nvidia says the memory subsystem of the GeForce 8800 features a new controller, yet it hasn’t changed greatly since the GeForce 7 series. The number of sections has grown from 4 to 6, so the total memory bus width has grown from 256 bits (4x64) to 384 bits (6x64). Support for GDDR4 has been added, although even ordinary GDDR3 at 900MHz (1800MHz effective) provides a bandwidth of 86.4GB/s over the 384-bit bus, so the high frequencies of GDDR4 are not yet called for.
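The bandwidth figure follows directly from the bus width and the effective memory frequency. The Python sketch below shows the calculation; the function name is illustrative, and the 256-bit result is a hypothetical comparison at the same 1800MHz effective clock, not a figure for any particular card.

```python
def memory_bandwidth_gbs(sections, bits_per_section, effective_mhz):
    """Peak memory bandwidth in GB/s (illustrative helper)."""
    bus_bits = sections * bits_per_section     # total bus width in bits
    bytes_per_transfer = bus_bits // 8         # bytes moved per transfer
    # bytes/transfer * Mtransfers/s = MB/s; divide by 1000 for GB/s
    return bytes_per_transfer * effective_mhz / 1000


# GeForce 8800 GTX: 6 x 64-bit sections, GDDR3 at 900 (1800) MHz
print(memory_bandwidth_gbs(6, 64, 1800))   # 86.4

# Hypothetical: the older 4 x 64-bit layout at the same effective clock
print(memory_bandwidth_gbs(4, 64, 1800))   # 57.6
```

At equal memory clocks, the two extra 64-bit sections alone account for a 50% bandwidth increase.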