Modern Server Processors
Enough theory! It's time to see how the theoretical principles discussed above are implemented in real products. Let's also narrow our scope: one article obviously can't accommodate descriptions of every processor and platform. So we'll set aside x86 CPUs - they get enough attention anyway - and talk about architectures that are usually just glanced over.
So we will talk about RISC architectures and their potential successor - the Intel Itanium.
Intel Itanium Platform
This is a famed platform. It was once supposed to replace the "outdated and slow" x86 platform. Today there's less certainty about the Itanium being the x86 killer, though.
The main idea of the Itanium is to make the processor perform more work per clock cycle. This is achieved by increasing the number of execution units operating in parallel. The processor's operation follows the VLIW (Very Long Instruction Word) concept. I won't describe it in full detail, just a few basic points:
- CPU performance is increased by making the processor perform more work per clock cycle. This requires optimized code, produced by a special compiler;
- Data and instructions are packed into long “words” and sent for execution;
- There's no (!) instruction-scheduling logic in hardware – it is the compiler that must create an optimal, dense instruction stream. Thus, there's no out-of-order execution either – proper planning of the execution order is also the compiler's job;
- The processor consists of a set of execution units, buffers and cache. All other things being equal, this allows higher performance, because the die area saved on scheduling logic can be used for a larger cache or more execution units;
- The Itanium contains an unusually large register file, 328 registers in total: 128 general-purpose registers, 128 FPU registers, 64 one-bit "predicate" registers (see below) and 8 branch registers. A unique register rotation mechanism is employed to reduce the load on the register file and increase its efficiency.
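The bundling idea above can be sketched in a few lines. The following is a toy, illustrative model (not real IA-64 encoding): a "compiler" greedily packs independent instructions into fixed-width bundles, and an instruction that reads a result produced earlier in the current bundle is pushed into the next one. The three-slot width and the instruction tuples are assumptions for the sketch.

```python
# Toy sketch of VLIW-style instruction bundling (illustrative, not IA-64).
# The compiler, not the hardware, decides which instructions run in parallel.

BUNDLE_WIDTH = 3  # hypothetical: three slots per long instruction word

def pack_bundles(instructions):
    """Greedily pack (dest, src1, src2) instructions into bundles.

    An instruction that reads a register written earlier in the current
    bundle depends on it, so it must start a new bundle.
    """
    bundles = []
    current, written = [], set()
    for dest, *srcs in instructions:
        depends = any(s in written for s in srcs)
        if depends or len(current) == BUNDLE_WIDTH:
            bundles.append(current)
            current, written = [], set()
        current.append((dest, *srcs))
        written.add(dest)
    if current:
        bundles.append(current)
    return bundles

program = [
    ("r1", "a", "b"),    # r1 = a + b
    ("r2", "c", "d"),    # r2 = c + d   (independent -> same bundle)
    ("r3", "r1", "r2"),  # r3 = r1 + r2 (depends on both -> next bundle)
]
print(pack_bundles(program))
```

The first two instructions are independent and share a bundle; the third depends on their results and lands in a second bundle. All of this analysis happens at compile time, which is exactly the work a conventional superscalar CPU does in hardware on every run.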
So the general ideas of the VLIW concept are revealed above - we make the processor perform better by feeding it not chaotic code that the processor's logic then has to reorder on the fly, but code pre-optimized by a special compiler. The problem of efficiency is thus solved beforehand. Intel terms this concept EPIC - Explicitly Parallel Instruction Computing. This concept can be considered a post-RISC concept to some extent.
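One way the compiler makes parallelism explicit is predication (which is where the predicate registers come in): a short branch is converted into straight-line code in which both paths are emitted and a one-bit predicate selects the result, so nothing depends on branch prediction. Here is a hedged Python sketch of the idea; the function names are mine and the semantics are simplified, not real IA-64 behavior.

```python
# Sketch of if-conversion via predication (illustrative, not IA-64 semantics).

def branchy_abs(x):
    # Branching form: on a wide in-order machine, a mispredicted
    # branch here would waste many issue slots.
    if x < 0:
        return -x
    return x

def predicated_abs(x):
    # If-converted form: both paths are computed unconditionally,
    # and the predicate merely selects which result is kept.
    p = x < 0    # a compare writes a one-bit predicate "register"
    neg = -x     # path 1, executes regardless of p
    pos = x      # path 2, executes regardless of p
    return neg if p else pos

print(predicated_abs(-7), predicated_abs(3))
```

Both versions compute the same function; the predicated one trades a little extra work (both paths execute) for a branch-free instruction stream the compiler can schedule densely.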
The Itanium architecture started to be developed about twenty years ago, when Intel found itself obliged to offer an alternative to the leaders of the high-performance CPU sector of those times. It wanted a processor that could be used in top-end servers. Of course, the architecture had to be 64-bit - this requirement followed from the need for a large address space and large amounts of supported memory. It had to be scalable both in frequency and in the number of processors. In the long run, if everything went right, this platform was to oust x86 CPUs (which were lagging behind all other processor architectures in performance). Thus, Intel conceived a smooth transition to the architecture of the future. It would be an architecture where the compiler, rather than hardware, played the crucial part, although hardware solutions would be important, too.