Hewlett Packard PA8700
Hewlett-Packard has teamed up with Intel to develop the Itanium platform, but it also has its own 64-bit architecture: the PA8x00 processor family and a platform built around it. The current generation is called the PA8700, and it is the fourth generation of HP's 64-bit architecture. Moreover, after the merger with Compaq, HP found itself the owner of one more 64-bit architecture, the legendary Alpha microprocessor (Compaq, in turn, had inherited the Alpha when it bought DEC, the original developer of the architecture). Thus, a single corporation now has three 64-bit architectures at once - a whim of fortune!
Of course, even the glorious marketing department of Hewlett-Packard was taken aback by the need to differentiate three competing architectures that target roughly the same market sector. With the PA8700 and the Itanium it was all quite clear: according to HP's official doctrine, all modern PA8700-based servers are compatible with the Itanium, and HP will transition to the Itanium in the future. The transition will be easier because the HP-UX operating system runs on both processor architectures. Besides that, the Itanium understands the PA8700 instruction set, so the PA8700 can be considered something like a precursor of the Itanium. I even think that the ability to make the Itanium binary-compatible with the PA8700 is what made HP join the EPIC platform project. As for the Alpha microprocessor, it no longer seems to be needed. Of course, HP will support buyers of the Alpha platform, as it took on Compaq's obligations along with its assets. The processor has no long-term prospects, though. Its development team has already been disbanded (or, rather, it has joined the team of Itanium developers). In other words, once the lifecycle of the existing platforms comes to an end, customers will be offered a transition to the Itanium (there's nothing bad about that, actually, as HP has devised a customer-friendly transition program).
We'll talk about the Alpha soon; for now, let's get back to the PA8700. It is an interesting processor, by the way, although it seldom catches the spotlight. This is a snapshot of its core:
You can notice some nontrivial characteristics of the chip right in the snapshot. For example, the PA8700 has a single-level cache - a strange solution in comparison to other processors. HP's architects often stress that they find one large cache more useful than a multi-level system of caches that requires sophisticated internal arbitration. They also think that the commonly accepted scheme of "one small and fast L1 cache plus a big and slower L2 cache" is well suited only to benchmarks. In real work, they argue, when the application is constantly starved for data and other applications and services are running in the background, it is more effective to have a slower but larger L1 cache with enough capacity to keep the processor supplied with data from memory without stalling. Sadly, this argument will go no further, as the snapshot above shows the last version of the PA8700 processor (the PA8700+ modification, clocked at up to 1GHz).
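HP's argument can be put into numbers with the textbook average memory access time formula. The sketch below is only an illustration: every latency and miss rate in it is invented for the example (none is a PA8700 measurement), and the figures were picked to mirror the reasoning above rather than to prove it.

```python
def amat_single_level(hit_time, miss_rate, memory_latency):
    """Average memory access time (cycles) for one cache backed by memory."""
    return hit_time + miss_rate * memory_latency


def amat_two_level(l1_hit, l1_miss_rate, l2_hit, l2_local_miss_rate, memory_latency):
    """Average memory access time (cycles) for an L1 + L2 hierarchy."""
    # An L1 miss costs an L2 access, plus a memory access if L2 misses too.
    l1_miss_penalty = l2_hit + l2_local_miss_rate * memory_latency
    return l1_hit + l1_miss_rate * l1_miss_penalty


# ASSUMPTION: all figures below are hypothetical, chosen for illustration only.
MEMORY_LATENCY = 150  # cycles to main memory

# Benchmark-like case: a small working set that fits the small, fast L1.
small_ws_two_level = amat_two_level(1, 0.02, 10, 0.10, MEMORY_LATENCY)
small_ws_one_level = amat_single_level(3, 0.005, MEMORY_LATENCY)

# Busy-server case: the small L1 thrashes, the large single cache still holds most data.
large_ws_two_level = amat_two_level(1, 0.30, 10, 0.40, MEMORY_LATENCY)
large_ws_one_level = amat_single_level(3, 0.05, MEMORY_LATENCY)

if __name__ == "__main__":
    print("small working set:", small_ws_two_level, "vs", small_ws_one_level)
    print("large working set:", large_ws_two_level, "vs", large_ws_one_level)
```

With these made-up inputs the two-level hierarchy wins on the small working set and the single large cache wins on the large one, which is exactly the trade-off HP's architects describe. Different inputs can just as easily reverse the outcome.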
So, the 304 sq. mm die is manufactured with 180nm SOI technology and includes 186 million transistors, a big chunk of which make up a dual-ported, four-way set-associative data cache of 1.5MB and a four-way set-associative instruction cache of 0.75MB. Each cache has a 128-bit bus that connects it to the other processor units. By the way, this processor has the biggest L1 cache available today, and no other chip is likely to surpass it in the near future. The core is a superscalar processor with the following execution units: two 64-bit integer addition/multiplication units, two shift/compare units, two floating-point addition/multiplication units, and two separate division/square-root units. That is, there are four ALUs and four FPUs! The eight execution units are complemented by two load/store units. The instruction fetch unit can take up to four instructions from the instruction cache per clock cycle. The microprocessor can perform prefetch and contains (as the picture suggests) an out-of-order execution unit. This unit, 56 instructions deep, tracks dependencies between instructions and data and sends ready-to-execute instructions to vacant execution units. By the way, the PA8700 has one more curious unit: the memory access reorder unit. It groups memory requests so as to minimize the resulting execution time. The branch-prediction unit in the PA8700 keeps the history of 2K previous branches and uses dynamic branch prediction.
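To give an idea of what dynamic branch prediction with a 2K-entry history looks like, here is a minimal sketch. HP's actual scheme is not described here, so the classic table of 2-bit saturating counters is my assumption, as are the indexing by low address bits and all the names in the code.

```python
class TwoBitPredictor:
    """Table of 2-bit saturating counters indexed by low branch-address bits.

    ASSUMPTION: a generic textbook predictor, not the documented PA8700 design.
    """

    def __init__(self, entries=2048):
        # The entry count must be a power of two so a bit mask can index the table.
        assert entries > 0 and entries & (entries - 1) == 0
        self.mask = entries - 1
        # Counter values 0-1 predict "not taken", 2-3 predict "taken".
        self.table = [1] * entries

    def _index(self, pc):
        # Drop the two low bits (4-byte instructions), keep the next bits.
        return (pc >> 2) & self.mask

    def predict(self, pc):
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def accuracy(predictor, trace):
    """Fraction of correct predictions over a list of (pc, taken) pairs."""
    hits = 0
    for pc, taken in trace:
        if predictor.predict(pc) == taken:
            hits += 1
        predictor.update(pc, taken)
    return hits / len(trace)


# Hypothetical trace: a loop branch taken 9 times, then not taken once, 100 times over.
loop_trace = [(0x4000, i % 10 != 9) for i in range(1000)]
loop_accuracy = accuracy(TwoBitPredictor(), loop_trace)

if __name__ == "__main__":
    print("accuracy on the loop trace:", loop_accuracy)
```

On such a loop the predictor mispredicts only the loop exit (and the very first iteration), which is why even a simple dynamic scheme beats static guessing on regular code.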
Overall, this is an interesting processor. Let's check its speed characteristics. It scored 642 in SPECint_base2000 (the PA8700+ model at 875MHz; I couldn't find results for the 1GHz model) and 600 in SPECfp_base2000. We see that it can't compete with modern server processors - now we understand why Hewlett-Packard is so interested in transitioning to the Itanium platform. Moreover, the transition is easy to perform with PA8700-based systems, as they use the same bus architecture as Itanium-based ones. There are even chipsets, like the ZX1000, that support both processors (both systems use a 128-bit bus); the Itanium just works at a higher frequency.
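Since I have no results for the 1GHz model, the figures above can at least be normalized per megahertz. The snippet below is a rough back-of-the-envelope sketch: the linear scaling to 1GHz is purely my assumption and gives an optimistic ceiling at best, because SPEC scores normally grow more slowly than the clock (memory does not get any faster).

```python
# Figures quoted in the text for the 875MHz PA8700+.
CLOCK_MHZ = 875
SPEC_INT_BASE = 642
SPEC_FP_BASE = 600


def per_mhz(score, clock_mhz):
    """Benchmark score per megahertz of core clock."""
    return score / clock_mhz


def linear_scale(score, from_mhz, to_mhz):
    """ASSUMPTION: score scales linearly with clock; treat as an upper bound."""
    return score * to_mhz / from_mhz


if __name__ == "__main__":
    print("SPECint per MHz:", round(per_mhz(SPEC_INT_BASE, CLOCK_MHZ), 3))
    print("SPECfp per MHz:", round(per_mhz(SPEC_FP_BASE, CLOCK_MHZ), 3))
    print("optimistic SPECint at 1GHz:", round(linear_scale(SPEC_INT_BASE, CLOCK_MHZ, 1000)))
    print("optimistic SPECfp at 1GHz:", round(linear_scale(SPEC_FP_BASE, CLOCK_MHZ, 1000)))
```

These are estimates, not measurements, and should not be read as results for the 1GHz part.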