GPU Performance Improvements: Which Route to Take?
Unlike microprocessors, whose performance has been stagnating for several years now, graphics processing units (GPUs) improve their speed substantially at least once a year. This happens not only because developers of graphics chips are free to implement new architectures much more easily than designers of central processing units (CPUs), but also because graphics performance is very easy to scale.
Speed in 3D applications essentially depends on two general hardware factors: the number of execution units and their clock-speeds. Since graphics processing is highly parallel, more units mean better performance, but since there are plenty of complex operations to perform, there will always be demand for higher performance from every single arithmetic logic unit (ALU); hence, efficiency and high clock-speed are always advantages. Given that graphics processors have repeatedly raised both clock-speeds and the number of execution engines over the past decade, we can expect GPUs to continue doing so going forward.
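As a rough illustration of the two levers described above, the following sketch computes an idealized peak rate as units multiplied by clock-speed. The function name, the one-operation-per-clock assumption and all of the unit counts and clock-speeds are hypothetical values chosen for illustration; they do not describe any real GPU, and the model ignores memory bandwidth and real-world efficiency.

```python
def peak_ops_per_second(units, clock_mhz, ops_per_unit_per_clock=1):
    """Idealized peak: every unit retires a fixed number of ops each clock."""
    return units * clock_mhz * 1_000_000 * ops_per_unit_per_clock

# Hypothetical baseline chip: 16 units at 500 MHz (illustrative numbers only).
baseline = peak_ops_per_second(units=16, clock_mhz=500)

# Lever 1: double the number of execution units at the same clock-speed.
more_units = peak_ops_per_second(units=32, clock_mhz=500)

# Lever 2: keep the unit count and raise the clock-speed by 50%.
faster_clock = peak_ops_per_second(units=16, clock_mhz=750)

print(more_units / baseline)    # 2.0
print(faster_clock / baseline)  # 1.5
```

In this simplified model both levers scale the peak rate linearly, which is why chip designers can pursue either one, or both at once.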
But constant performance improvements cannot be achieved without understanding the demands of modern games. Years ago games created effects using multi-texturing, and graphics chips needed as many texture mapping units as possible to demonstrate leading performance. Today games generate eye-candy using pixel shaders that consist of arithmetic and texture instructions. Recent trends show that modern pixel and vertex shaders tend to be a lot more math-intensive than those of a couple of years ago; thus, to offer a high-performance GPU, engineers need to shift their efforts towards increasing the number of engines that perform mathematical tasks.
For chip designers, the difficulty is that it is pretty hard to determine the demands of future games while keeping performance in current titles at a high level. As a consequence, we are seeing two ideologically different types of graphics processors developed by ATI Technologies and Nvidia Corp. The former unveils its Radeon X1900-series GPU, which features 48 pixel shader processors, 8 vertex shader processors, 16 render back-ends (ROPs) and 16 texture units; the latter is expected to offer this spring its GeForce 7900-series, which sports 32 pixel shader processors, 32 texture address units and 16 ROPs. As we see, ATI hopes that modern games will demand more arithmetic power than texture operations, whereas Nvidia seems to be sure that there is equal demand for texture and math power in new titles.
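The difference between the two approaches can be expressed as a ratio of pixel shader processors to texture units. The short sketch below works that ratio out from the unit counts quoted above; the dictionary layout and the function name are illustrative assumptions, and the GeForce 7900 figures are the expected ones rather than confirmed specifications.

```python
from fractions import Fraction

# Unit counts as quoted in the article; the GeForce 7900 numbers were
# expectations at the time of writing, not confirmed specifications.
gpus = {
    "Radeon X1900": {"pixel_shaders": 48, "texture_units": 16},
    "GeForce 7900 (expected)": {"pixel_shaders": 32, "texture_units": 32},
}

def math_to_texture_ratio(spec):
    """Return pixel shader processors per texture unit as an exact fraction."""
    return Fraction(spec["pixel_shaders"], spec["texture_units"])

for name, spec in gpus.items():
    ratio = math_to_texture_ratio(spec)
    print(f"{name}: {ratio.numerator}:{ratio.denominator} math-to-texture units")
```

With these figures the Radeon X1900 comes out at 3:1 and the expected GeForce 7900 at 1:1, which is the design split that the rest of the comparison sets out to test.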
Which approach is better for today and tomorrow? We need to test both to find out.