Anisotropic Filtering
As you know, NVIDIA graphics chips have used the same anisotropic filtering algorithm since GeForce3 (NV20) (see our Leadtek WinFast GeForce3 TD Review). Treating the projection of a pixel onto the polygon as an ellipse rather than a circle, the GeForce3 and GeForce4 GPUs (NV20 and NV25/28) take not one but several bi-linearly filtered points distributed evenly along the major axis of the ellipse and obtain the resulting color by averaging their colors. Depending on the anisotropy level, the number of bi-linearly filtered points is 1, 2, 4, 6 or 8. The sketch below represents the variant with eight points:

The texturing units of NV20 and NV25 can take four samples at a time and perform bi-linear filtering on them at two neighboring MIP-levels simultaneously (eight samples and two bi-linear filtering operations in total). In other words, they perform tri-linear filtering in a single clock.
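To make the averaging step more concrete, here is a minimal Python sketch of it. It is only an illustration of the idea, not NVIDIA's hardware logic. The names `reference_points` and `anisotropic_sample` are hypothetical, the `sample` callback stands in for one bi-linear (or tri-linear) fetch, and placing the points at the midpoints of equal segments of the major axis is our assumption about what "distributed evenly" means.

```python
from typing import Callable, List, Tuple

Color = Tuple[float, float, float]
Point = Tuple[float, float]


def reference_points(center: Point, half_axis: Point, count: int) -> List[Point]:
    """Spread `count` texture coordinates evenly along the ellipse's major axis.

    `center` is the middle of the pixel footprint and `half_axis` is the
    vector from the centre to one end of the major axis (both in texture
    space). Assumption: points sit at the midpoints of `count` equal segments.
    """
    cu, cv = center
    au, av = half_axis
    points = []
    for i in range(count):
        # t is in (-1, 1); for count == 1 it is exactly 0 (the centre).
        t = (2 * i + 1) / count - 1.0
        points.append((cu + t * au, cv + t * av))
    return points


def anisotropic_sample(sample: Callable[[float, float], Color],
                       center: Point, half_axis: Point, count: int) -> Color:
    """Average the filtered colours fetched at every reference point."""
    colors = [sample(u, v) for u, v in reference_points(center, half_axis, count)]
    return tuple(sum(c[k] for c in colors) / count for k in range(3))


if __name__ == "__main__":
    # Stand-in "texture": colour depends on the coordinates only.
    fake_fetch = lambda u, v: (u, v, 1.0)
    print(anisotropic_sample(fake_fetch, (0.5, 0.5), (0.4, 0.0), 8))
```

With `count` set to 1 the function degenerates to a single ordinary fetch at the footprint centre, which corresponds to anisotropic filtering being off.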
With anisotropic filtering, GeForce3 / GeForce4 need one clock to process each reference point, so the highest level of anisotropy may take eight clocks per pixel. The picture below illustrates this case. Every reference point is the result of tri-linear filtering between MIP-levels N and N+1. For your convenience, we put the GPU clock numbers next to the reference points:

It’s a known fact that the texturing units of GeForce3 / GeForce4 work at half of their full capacity when tri-linear filtering is not used: they just do one bi-linear filtering operation per clock.
But what if we “teach” the texturing units to distribute their computational resources so that they do not idle when tri-linear filtering is off?
In this case, every texturing unit, able to take eight texture samples and perform two bi-linear filtering operations per clock, could process two reference points at a time when anisotropic filtering is enabled. The next picture shows this situation: only one MIP-level is used, and the GPU clock numbers are written next to the marked points:

When the GPU follows this scheme, it performs anisotropic filtering alone twice as fast as anisotropic filtering combined with tri-linear filtering. If GeForce FX does use this scheme, then 2x anisotropic filtering without tri-linear filtering, or with tri-linear filtering in the “Aggressive” mode, should be “cost-free” for the GPU in terms of performance: two reference points fit into the single clock that plain bi-linear filtering needs anyway.
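The arithmetic behind this reasoning can be written down as a tiny cost model. It is a sketch of our supposition only, not a description of the real hardware. The function name `clocks_per_pixel` and the `paired` flag are hypothetical, and the model assumes a texturing unit capable of exactly two bi-linear operations per clock, as described above.

```python
import math


def clocks_per_pixel(points: int, trilinear: bool, paired: bool = False) -> int:
    """Texturing-unit clocks needed to filter one pixel.

    points    -- number of reference points (1, 2, 4, 6 or 8)
    trilinear -- True if every point is filtered on two MIP-levels
    paired    -- True for the hypothesized scheme in which the unit
                 handles two bi-linear reference points in one clock
    """
    if trilinear or not paired:
        # GeForce3 / GeForce4 behaviour: one clock per reference point,
        # whether the second bi-linear operation is used or sits idle.
        return points
    # Hypothesized behaviour: both bi-linear operations of a clock
    # are spent on two different reference points.
    return math.ceil(points / 2)


if __name__ == "__main__":
    for n in (1, 2, 4, 6, 8):
        print(n,
              clocks_per_pixel(n, trilinear=True),
              clocks_per_pixel(n, trilinear=False, paired=True))
```

In this model two reference points without tri-linear filtering cost one clock, the same as a single bi-linear fetch, while eight points drop from eight clocks to four.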
Of course, all of this is nothing more than speculation on our part: NVIDIA doesn’t disclose the details of its “fast” anisotropic filtering algorithm. Nevertheless, these suppositions fit quite well with what we see in the “Balanced” and “Aggressive” modes of GeForce FX. In these modes, the number of “true” tri-linearly filtered pixels is small. That is, replacing “true” tri-linear filtering with a reduced version not only cuts the amount of texture data requested by the GPU, but also reduces the computational cost of anisotropic filtering. The performance drop we observe when anisotropic filtering is enabled is also consistent with our “speculations”.
But let’s get back to image quality. Another scene from Serious Sam: The Second Encounter will help us estimate the quality of texture rendering with anisotropic filtering enabled:

For the tests we used the same settings for both graphics cards: “Quality”, 32-bit color. We ran the “GFX: Extreme Quality” add-on.
NVIDIA GeForce FX was benchmarked in “Application”, “Balanced” and “Aggressive” modes.
It’s hard to pick equivalent driver settings for ATI RADEON 9700 PRO: someone would certainly ask why a given slider is set this way and not another :)…
That’s why we take full responsibility for our decision to test this card in two “boundary” modes: “Speed”, when all the driver settings are set to maximum performance:

and “Quality” mode, which means that all settings are set to maximum quality:






