Render Back-End: Now with Accelerated Stencil!
The R600's render back-ends or, in another terminology, raster operators (ROPs) are four rather complex processors capable of performing typical rasterization operations such as blending, antialiasing, and processing the alpha channel, depth buffer, and stencil buffer. Each processor contains 4 alpha channel subunits, 8 depth/stencil subunits, 4 blending subunits, and 16 programmable subunits for antialiasing.
In ordinary terms, the R600 can be said to have 16 ROPs, but this is just an approximation that does not take the peculiarities of the render back-end architecture into account.
Just like the texture processors of the R600, its render back-ends are complex devices consisting of separate subunits that perform different operations. Each render back-end contains:
- 4 subunits for processing alpha channel and fog
- 8 subunits for processing the Z- and stencil buffers
- 4 blending subunits
- 16 subunits for multisampling
Unfortunately, we cannot compare the R600 and G80 in detail here since we don’t have detailed information about their render back-end architectures, but it’s clear that the render back-ends of the new GPU series from AMD can work with the Z-buffer at double speed. With four render back-ends, the Radeon HD 2900 can process 16 pixels with color and Z values per clock cycle, or as many as 32 pixels if only Z values are processed. This is not much in comparison with the GeForce 8800 GTX, which can process up to 192 Z values per clock cycle, but it is an improvement over the Radeon X1950 XTX, which had no means of accelerating Z-buffer processing.
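The per-clock figures above follow directly from the subunit counts. Here is a minimal sketch of that arithmetic in Python. The dictionary keys and function name are our own labels, and we assume that color+Z throughput is limited by whichever of the blending and Z/stencil subunits is scarcer, while Z-only throughput is limited by the Z/stencil subunits alone. This is a simplification for illustration, not a description of how the hardware schedules work.

```python
# Subunit counts per render back-end, as listed in the article.
SUBUNITS_PER_BACKEND = {
    "alpha_fog": 4,
    "z_stencil": 8,
    "blend": 4,
    "msaa": 16,
}
NUM_BACKENDS = 4  # the R600 has four render back-ends


def chip_totals(per_backend, num_backends):
    """Multiply each per-back-end subunit count by the number of back-ends."""
    return {name: count * num_backends for name, count in per_backend.items()}


totals = chip_totals(SUBUNITS_PER_BACKEND, NUM_BACKENDS)

# Assumption: a pixel with color and Z needs one blending subunit and one
# Z/stencil subunit, so the scarcer of the two sets the per-clock rate.
color_z_pixels_per_clock = min(totals["blend"], totals["z_stencil"])

# Assumption: a Z-only pixel needs only a Z/stencil subunit.
z_only_pixels_per_clock = totals["z_stencil"]

print(totals)
print(color_z_pixels_per_clock, z_only_pixels_per_clock)
```

Under these assumptions the script gives 16 pixels per clock with color and Z, 32 pixels per clock with Z only, and 64 multisampling subunits for the whole chip.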
The render back-end architecture of the Radeon HD 2000 is more straightforward than the texture processor architecture. We can assume the Radeon HD 2900 XT to have 16 ROPs. The ability to process only 16 pixels per cycle, as opposed to the GeForce 8800 GTX’s 24 or the GeForce 8800 GTS’ 20, should be compensated by the higher clock rate of the ROPs, which is 740MHz for the Radeon HD 2900 XT as compared with the GeForce 8800 GTX’s 576MHz and the GeForce 8800 GTS’ 513MHz. So, there should be no bottleneck in this part of the R600. We’ll check it out in our tests, though.
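To put the clock rate argument into numbers, here is a rough sketch that multiplies pixels per clock by ROP frequency, using only the figures quoted above. The function and variable names are ours, and the result is a theoretical peak that ignores memory bandwidth, antialiasing load and every other real-world limit.

```python
# (pixels per clock, ROP clock in MHz), as quoted in the article.
CARDS = {
    "Radeon HD 2900 XT": (16, 740),
    "GeForce 8800 GTX": (24, 576),
    "GeForce 8800 GTS": (20, 513),
}


def peak_fill_rate_mpix(pixels_per_clock, clock_mhz):
    """Theoretical peak pixel fill rate in megapixels per second."""
    return pixels_per_clock * clock_mhz


fill_rates = {name: peak_fill_rate_mpix(*spec) for name, spec in CARDS.items()}

# Print the cards from the highest theoretical fill rate to the lowest.
for name, rate in sorted(fill_rates.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {rate / 1000:.2f} Gpixel/s")
```

By this estimate the Radeon HD 2900 XT reaches about 11.84 Gpixel/s, which puts it between the GeForce 8800 GTS (10.26 Gpixel/s) and the GeForce 8800 GTX (13.82 Gpixel/s). The higher clock rate thus more than makes up for the narrower ROP array against the GTS, but not quite against the GTX.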
The antialiasing subunits shouldn’t be a bottleneck, either. Each render back-end of the R600 chip contains 16 such subunits for a total of 64. Being programmable, these subunits ensure high performance and wide capabilities in terms of full-screen antialiasing. We’ll tell you about the new MSAA methods supported by the Radeon HD family below.