T&L and Vertex Shaders
Let’s see how well the vertex pipelines of NVIDIA’s new chip work. First, we will look at how GeForce FX handles the classic fixed-function T&L of DirectX 7:
With its three vertex pipelines, NVIDIA GeForce FX runs faster than the four-pipeline RADEON 9700 PRO, even at the reduced frequencies. Both chips emulate the T&L functions by means of vertex shaders, and it seems that NV30 retains some “shortcuts” or “rudiments” of a dedicated T&L unit, which would account for its higher performance. Alternatively, it may simply be more efficient at compiling T&L commands into vertex shader commands, so that T&L commands execute at maximum speed.
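To make "emulating T&L with vertex shaders" more concrete, here is a minimal sketch of the per-vertex work that a fixed-function pipeline performs and that a driver-generated vertex program would have to reproduce: one matrix transform plus one directional diffuse light. It is an illustration only, not the code either driver actually generates. The function names, the single-light setup and the assumption that the normal and the light direction are already in the same coordinate space are all ours.

```python
import math

def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]

def normalize(v):
    """Scale a vector to unit length."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def fixed_function_vertex(position, normal, world_view_proj,
                          light_dir, light_color, material_diffuse):
    """Hypothetical emulation of basic T&L for a single vertex.

    position, normal: 3-component lists
    world_view_proj:  combined 4x4 transform matrix (row-major)
    light_dir:        direction from the surface toward the light
    Assumes normal and light_dir share one coordinate space.
    """
    # Transform: object-space position to clip space.
    clip_pos = mat_vec(world_view_proj, list(position) + [1.0])

    # Lighting: Lambertian diffuse term, clamped at zero for
    # surfaces that face away from the light.
    n = normalize(normal)
    l = normalize(light_dir)
    n_dot_l = max(0.0, sum(a * b for a, b in zip(n, l)))

    # Modulate the light color by the material and clamp to [0, 1].
    color = [min(1.0, m * c * n_dot_l)
             for m, c in zip(material_diffuse, light_color)]
    return clip_pos, color

IDENTITY = [[1.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]
```

Each additional light or texture-coordinate transform adds more instructions to such a program, which is why the quality of the T&L-to-shader translation can affect the result as much as the number of vertex pipelines.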
GeForce FX loses ground as soon as we turn to vertex shaders. It has fewer vertex processors than R300 (three against four); moreover, their efficiency turns out to be lower than that of the vertex units in RADEON 9700 PRO.
The picture remains the same with version 2.0 vertex shaders: NVIDIA GeForce FX working at nominal frequencies is a little behind ATI RADEON 9700 PRO. The lag of GeForce FX is not a nice thing, but at least there are no surprises like those we saw in the pixel shader test.
The last synthetic test for today is Ragtroll from 3DMark03:
This test uses version 1.1 vertex shaders to transform the models of the falling trolls. It also calculates collision physics in software. Thus, the engine of the test distributes the workload between the GPU and the CPU of the system.
The results suggest that the test is limited by graphics card performance. ATI RADEON 9700 PRO has higher-performing vertex shader units and surpasses GeForce FX 5800 Ultra in about the same proportion as in the vertex shader tests.
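The reasoning behind "limited by graphics card performance" can be sketched with a simple bottleneck model. It assumes that the CPU (physics) and GPU (vertex processing) stages overlap completely, so the slower stage alone sets the frame rate. The function names and all timings below are hypothetical illustrations, not measurements from this review.

```python
def frame_rate(cpu_ms, gpu_ms):
    """Frames per second when CPU and GPU work overlap fully:
    the slower of the two stages sets the pace."""
    return 1000.0 / max(cpu_ms, gpu_ms)

def limiting_stage(cpu_ms, gpu_ms):
    """Name the stage that bounds the frame rate."""
    return "GPU" if gpu_ms > cpu_ms else "CPU"

# Hypothetical timings: the same CPU physics cost (5 ms) paired
# with a slower and a faster card.
slow_card = frame_rate(cpu_ms=5.0, gpu_ms=10.0)  # GPU-bound
fast_card = frame_rate(cpu_ms=5.0, gpu_ms=8.0)   # still GPU-bound

# If the CPU were the bottleneck (10 ms of physics), both cards
# would score the same no matter how fast their vertex units are.
cpu_bound_a = frame_rate(cpu_ms=10.0, gpu_ms=5.0)
cpu_bound_b = frame_rate(cpu_ms=10.0, gpu_ms=4.0)
```

Under this model, a gap between two cards that mirrors their gap in the pure vertex shader tests points to the GPU as the limiting stage, whereas identical scores would point to the CPU.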