The estimated cost of developing and introducing this architecture was too high even for a semiconductor giant like Intel, so the corporation teamed up with another big shark, Hewlett-Packard. That company had extensive experience in developing 64-bit CPUs (among other things, Hewlett-Packard is the developer and manufacturer of the HP PA-8xxx series processors, which I will discuss later) and a team of software developers - exactly what Intel lacked at that time.
The first version of the processor, Itanium, came out two or three years late (depending on the date from which you count the delay). Intel had warned beforehand that this first CPU would be more of a sample, meant to let the market get to know the new architecture. Well, that's exactly what we saw - there was no excitement about the first Itanium-based systems. Of course, big corporations announced such products, but this was definitely not a commercial product, if only because there was no software to run on it. Although Intel provided a way to run x86 code on the Itanium, this ability was somewhat theoretical: you could get the performance of a Pentium 90 out of an 800MHz Itanium processor and nothing more.
Architecture of Itanium-Based Systems
Intel Corporation is conservative in designing architectures for multiprocessor systems. Up to four Itanium CPUs can be placed on a shared 128-bit bus (400MHz frequency, 6.4GB/s bandwidth). By the way, developing a bus that is both this wide and this fast is not a trivial task. In particular, the bus is divided into electrically independent 8-bit segments, each of which has its own synchronization signal, to avoid cross-talk and other surprises of physics.
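The bandwidth figure quoted above follows directly from the bus width and the transfer rate. Here is a minimal sketch of that arithmetic; the function name is made up for illustration, and the result is a theoretical peak that ignores protocol overhead.

```python
def peak_bandwidth_gb_s(width_bits: int, rate_mhz: float) -> float:
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = width_bits / 8           # 128 bits -> 16 bytes
    return bytes_per_transfer * rate_mhz * 1e6 / 1e9


BUS_WIDTH_BITS = 128   # from the text
BUS_RATE_MHZ = 400     # from the text
SEGMENT_BITS = 8       # width of one electrically independent segment

bus_gb_s = peak_bandwidth_gb_s(BUS_WIDTH_BITS, BUS_RATE_MHZ)
segments = BUS_WIDTH_BITS // SEGMENT_BITS

print(f"{bus_gb_s:.1f} GB/s")   # 6.4 GB/s
print(segments, "segments")     # 16 segments
```

So a 128-bit bus split into 8-bit pieces means sixteen segments, each with its own synchronization signal.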
Computers with more than four processors use switches that join several nodes, each of which consists of several processors on a shared bus. By the way, manufacturers usually install no more than two CPUs on each shared bus, because the bus bandwidth is divided between the processors. In the 4-way variant each processor gets a 1.6GB/s share, while two processors on a shared bus get 3.2GB/s each. To supply the processors with data from system memory at an appropriate rate, multi-channel memory interleaving is employed (typically, eight-channel). PC1600 ECC registered memory is usually used.
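The per-processor shares are a simple division of the 6.4GB/s bus bandwidth, and the same kind of arithmetic shows why interleaving helps. The sketch below assumes an even split between CPUs and a peak rate of 1.6GB/s per PC1600 channel (the standard rating of such modules, not a number given in this article); the function names are hypothetical, and real sustained throughput will be lower than these peaks.

```python
BUS_GB_S = 6.4             # shared bus bandwidth, from the text
PC1600_CHANNEL_GB_S = 1.6  # assumed peak rate of one PC1600 channel


def per_cpu_share(bus_gb_s: float, cpus: int) -> float:
    """Bus bandwidth available to each CPU if it is divided evenly."""
    return bus_gb_s / cpus


def interleaved_peak(channel_gb_s: float, channels: int) -> float:
    """Aggregate peak bandwidth of several interleaved memory channels."""
    return channel_gb_s * channels


for cpus in (2, 4):
    print(f"{cpus} CPUs per bus: {per_cpu_share(BUS_GB_S, cpus):.1f} GB/s each")

memory_gb_s = interleaved_peak(PC1600_CHANNEL_GB_S, 8)
print(f"8 interleaved channels: {memory_gb_s:.1f} GB/s peak")
```

Under these assumptions a single PC1600 channel could feed only a quarter of the bus, which is why several channels have to be interleaved to keep the processors supplied.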
Today, the number of processors in Itanium-based systems can reach as many as 128 (I mean mass-produced servers; nothing prevents you from ordering a cluster with thousands of processors). The number is limited only by diminishing returns: beyond that point, adding processors brings a small performance gain but makes the system much more expensive.