The first point of interest is the original mechanism of "register rotation". As I said above, the Itanium architecture has a large number of architectural registers (328, to be precise). The speed of the register file is crucial to performance in any processor, since most operations involve registers, and the larger the register file, the harder it is to access quickly. Intel solved this problem elegantly by making the register file rotate with a definite period (more precisely, they devised a mechanism that keeps access to the register file acceptably fast). This compensates for the absence of out-of-order execution (which, incidentally, relies heavily on register renaming).
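One way to picture rotation is as a renaming offset applied to every register number, so that values appear to move between registers although nothing is physically copied. The sketch below is a toy model under that assumption, not a description of Intel's hardware; the class `RotatingRegisterFile` and its methods are hypothetical names chosen for illustration.

```python
class RotatingRegisterFile:
    """Toy model of a rotating register file (hypothetical, for illustration)."""

    def __init__(self, size):
        self.size = size
        self.physical = [0] * size  # the actual storage never moves
        self.base = 0               # rename base, changed only by rotate()

    def _physical_index(self, logical):
        # Map a logical register number to a physical slot.
        return (logical + self.base) % self.size

    def write(self, logical, value):
        self.physical[self._physical_index(logical)] = value

    def read(self, logical):
        return self.physical[self._physical_index(logical)]

    def rotate(self):
        # Decrement the base: the value visible as register N
        # becomes visible as register N + 1 after the rotation.
        self.base = (self.base - 1) % self.size


regs = RotatingRegisterFile(8)
regs.write(0, 42)
regs.rotate()
print(regs.read(1))  # 42: same physical slot, new logical name
```

In this model a rotation costs one subtraction regardless of how many registers there are, which is the property that makes the renaming view attractive.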
The second point of interest is the way the processor deals with branches. When the compiler finds a branch in the program code, it leaves marks in special “predicate” registers, of which there are 64. The processor can then use its abundance of execution units to execute both branches, without writing the results into the architectural registers for the time being. Once the branch condition is resolved, the results of the wrong branch are discarded and the results of the right branch are written. This is an interesting solution that smooths out the performance loss in cases when, for example, the system is waiting for the user’s reaction.
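The idea of executing both branches and keeping only one result can be sketched in a few lines. This is only an illustration of the principle, written in Python rather than Itanium assembly; the function names are hypothetical, and the Python interpreter itself still branches internally.

```python
def branchy_abs_diff(a, b):
    # Conventional form: the processor must decide which arm to run.
    if a > b:
        return a - b
    else:
        return b - a


def predicated_abs_diff(a, b):
    # A compare sets a pair of complementary predicates.
    p_true = a > b
    p_false = not p_true

    # Both arms are computed; results are held in temporaries.
    t_taken = a - b
    t_not_taken = b - a

    # Only the result guarded by a true predicate is kept;
    # the other one is simply discarded.
    result = None
    if p_true:
        result = t_taken
    if p_false:
        result = t_not_taken
    return result


print(predicated_abs_diff(5, 3), predicated_abs_diff(3, 5))  # 2 2
```

Both functions return the same value for every input; the difference is that the second form has no point where execution must wait to learn which arm to follow.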
Of course, the effort spent on developing the Itanium architecture should have its reward, and the reward came when the processor's speed characteristics were revealed. Today the 1500MHz Itanium 2 (6MB L3 cache) breaks the barrier of 2000 points in the SPECfp_base2000 test (the leader, the HP Integrity Server rx2600 system, scored 2119 points). The same Itanium shows somewhat humbler results in SPECint_base2000 (1322, again with the HP Integrity Server rx2600 on top), but they are on a level with other processors: only the Pentium 4 and Opteron surpass it in this test.
In other words, the Itanium is the leader (by a wide margin) in algorithms that require a lot of floating-point calculations and one of the leaders in integer algorithms. These facts determine the areas where Itanium-based systems are strong: scientific and CAD applications, and databases, where a large address space and high performance from each individual processor are necessary. Systems based on this processor are also among the leaders in many server performance benchmarks.
The table below shows some info on Itanium processors: their frequencies, L3 cache size and so on:
| Processor | Frequency | L3 cache size* | Max. number of processors |
| --- | --- | --- | --- |
| Itanium 2 for MP and DP servers (workstations) | 1.3GHz, 1.4GHz, 1.5GHz | 3MB, 4MB, 6MB respectively | 4 (on a single bus) |
| Itanium 2 for DP servers (workstations) | | | |
| Itanium 2 Low Voltage for DP servers (workstations) | | | |

* The sizes of the L1 and L2 caches for all Itanium models are 32KB and 256KB, respectively.