You might think that moving the memory controller into the CPU takes a lot of load off the processor bus, which in this case no longer has to carry data between the CPU and memory. That is partly true, but only for single-processor systems. The Nehalem microarchitecture is meant to be universal: it targets desktop and mobile as well as server solutions. That is why Intel designed a new processor bus that suits multi-processor systems and provides sufficient bandwidth and scalability. Intel engineers had no choice anyway, because the traditional FSB cannot be used in this case. Multi-processor systems built on processors with integrated memory controllers have to use the NUMA (Non-Uniform Memory Access) memory model and hence require a direct high-speed connection between the CPUs.
To accomplish this, they built a special serial interface with point-to-point topology called CSI (Common System Interface), which was later renamed QPI (QuickPath Interconnect). On the technical side, QPI consists of two 20-bit links, one for each direction. In each link, 16 bits are assigned to data and the remaining 4 bits are auxiliary: they are used by the protocol and for error correction. The bus performs up to 6.4 billion transfers per second (6.4GT/s) and delivers 12.8GB/s of bandwidth in each direction, or 25.6GB/s in total.
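The bandwidth figures above follow directly from the link width and the transfer rate. Here is a minimal sketch of that arithmetic; the function name and its parameters are illustrative and not part of any Intel specification, and only the 16 data bits are counted as payload.

```python
def qpi_bandwidth_gb_s(transfer_rate_gt_s=6.4, data_bits=16):
    """Peak payload bandwidth of one QPI direction, in GB/s.

    Only the data bits are counted; the 4 auxiliary bits of each
    20-bit link carry protocol and error-correction information.
    """
    bytes_per_transfer = data_bits / 8  # 16 data bits = 2 bytes
    return transfer_rate_gt_s * bytes_per_transfer


per_direction = qpi_bandwidth_gb_s()
total = 2 * per_direction  # one link in each direction

print(f"{per_direction:.1f} GB/s per direction, {total:.1f} GB/s total")
```

With the default values this reproduces the 12.8GB/s and 25.6GB/s quoted above.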
At its current bandwidth, the new QPI bus can be called the fastest processor bus out there. The old Quad Pumped Bus reaches only 12.8GB/s of total bandwidth at 1600MHz. The HyperTransport 3.0 bus, which is similar to QPI and used in contemporary AMD processors, offers a peak bandwidth of only 24GB/s.
Depending on their market positioning, processors based on the Nehalem microarchitecture may come with one or several QPI interfaces. As a result, each CPU in a multi-processor system may be connected directly to all the other processors, which reduces latency when accessing memory attached to another CPU's controller. CPUs for single-processor desktop systems will have a single QPI link connecting them to the chipset.
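To see why the number of QPI interfaces per processor matters, it helps to count the links a fully connected layout needs. The sketch below is a simplified model and not a description of any actual Intel platform: the function names are hypothetical, and the extra links that connect CPUs to the chipset are deliberately left out.

```python
from itertools import combinations


def full_mesh_links(num_cpus):
    """Return every CPU pair that needs a direct point-to-point link."""
    return list(combinations(range(num_cpus), 2))


def cpu_links_needed(num_cpus):
    """Links each CPU needs to reach all others directly (chipset links excluded)."""
    return num_cpus - 1


for n in (2, 4):
    links = full_mesh_links(n)
    print(f"{n} CPUs: {len(links)} CPU-to-CPU links, "
          f"{cpu_links_needed(n)} per CPU")
```

Under these assumptions, two CPUs need a single link between them, while four CPUs need six links in total and three on each processor for every remote memory access to stay within one hop.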