Part 7: More Bus Bandwidth
Now we come to the processor bus and the memory subsystem of the PowerPC970 platform. The situation here is appealing. The processor bus runs at one fourth of the processor frequency (that is, 450MHz for the 1.8GHz processor). The bus uses DDR signaling, which gives an effective data rate of 900MHz. As the bus is 64 bits wide (there are nuances here, discussed below), we get a theoretical peak bandwidth of 7.2GB/s. IBM says the weighted average bandwidth is 6.4GB/s. That figure is closer to reality, as it accounts for latencies, memory access, and chipset peculiarities. In other words, the PowerPC970 platform has the fastest bus among today's processor architectures.
In fact, however, the bus of the PowerPC970 consists of two one-way buses, each 32 bits wide. That is, you can send 3.6GB/s in each direction, but not 7.2GB/s in one direction (there is a certain similarity with the HyperTransport "system bus" of the Athlon 64 processor). The RAPID I/O bus (that is IBM's name for the whole family of high-speed serial buses, although here it is called Elastic I/O) also has another interesting property, at least in theory: it can change its direction. It takes some time (a few hundred CPU clock cycles) for the bus controller in the chipset to switch the bus into a single-direction mode, in which the bus pumps data one way only. Unfortunately, I don't know whether this mode is implemented in the chipset of Apple's new platform. It would be interesting to measure how the concept performs, and to learn whether the data streams into and out of the processor differ enough to make using the bus this way worthwhile.
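The bus arithmetic above is easy to verify. Here is a minimal sketch in Python that uses only the figures quoted in the text; the function and variable names are illustrative and do not come from any IBM or Apple documentation.

```python
# Back-of-the-envelope check of the processor bus figures quoted in the text.
# All names here are illustrative; only the input numbers come from the article.

def peak_bandwidth_mb_s(clock_mhz: int, transfers_per_clock: int, width_bits: int) -> int:
    """Theoretical peak in MB/s: clock x transfers per clock x bytes per transfer."""
    return clock_mhz * transfers_per_clock * (width_bits // 8)

cpu_mhz = 1800           # 1.8GHz processor
bus_mhz = cpu_mhz // 4   # the bus runs at one fourth of the CPU clock
ddr = 2                  # DDR: two transfers per bus clock

aggregate = peak_bandwidth_mb_s(bus_mhz, ddr, 64)      # both 32-bit halves together
per_direction = peak_bandwidth_mb_s(bus_mhz, ddr, 32)  # a single one-way 32-bit bus

print(f"bus clock:      {bus_mhz} MHz ({bus_mhz * ddr} MHz effective)")
print(f"aggregate peak: {aggregate / 1000:.1f} GB/s")
print(f"per direction:  {per_direction / 1000:.1f} GB/s")
```

The calculation gives a theoretical peak only. It says nothing about IBM's 6.4GB/s weighted average, which depends on latencies and chipset behavior that this sketch does not model.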
The memory subsystem has changed, too. Instead of DDR333 SDRAM (PC2700), the PowerPC970 platform uses dual-channel DDR400 SDRAM. The difference is obvious: 6.4GB/s of bandwidth instead of 2.7GB/s. Note that the G4+ platform could not use the full bandwidth of DDR SDRAM, because its processor bus could transfer only 1.3GB/s (the same situation as on the Pentium 3 platform, where DDR SDRAM brought no benefit). The PowerPC970 platform owes much of its improved performance to the memory subsystem and the faster bus: the previous platform, G4+, used a bus clocked at 166MHz (synchronously with the memory). That is, the memory bandwidth has more than doubled and the bus bandwidth has grown about five times.
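The same kind of check works for the memory comparison. The sketch below assumes a 64-bit data path for each DDR channel and for the G4+ processor bus. These widths are my assumption rather than something stated above, though they are consistent with the 2.7GB/s and 1.3GB/s figures. The helper name is hypothetical.

```python
# Rough comparison of the G4+ and PowerPC970 platforms using the figures in the text.
# Assumption: 64-bit data paths for each DDR channel and for the G4+ processor bus.

def peak_mb_s(mega_transfers: int, width_bits: int = 64, channels: int = 1) -> int:
    """Theoretical peak in MB/s for a given transfer rate, width, and channel count."""
    return mega_transfers * (width_bits // 8) * channels

ddr333_single = peak_mb_s(333)            # old memory: DDR333, one channel
ddr400_dual = peak_mb_s(400, channels=2)  # new memory: DDR400, two channels
g4_bus = peak_mb_s(166)                   # G4+ bus: 166MHz, one transfer per clock
ppc970_bus = peak_mb_s(900)               # PowerPC970 bus: 450MHz DDR, 64 bits in total

print(f"memory: {ddr333_single / 1000:.1f} -> {ddr400_dual / 1000:.1f} GB/s "
      f"(x{ddr400_dual / ddr333_single:.1f})")
print(f"bus:    {g4_bus / 1000:.1f} -> {ppc970_bus / 1000:.1f} GB/s "
      f"(x{ppc970_bus / g4_bus:.1f})")
```

Under these assumptions the bus ratio comes out slightly above five when the 7.2GB/s peak is used, and slightly below five when IBM's 6.4GB/s average is used instead.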
Among other things, the PowerPC970 processor supports SMP (symmetric multiprocessing), so you can easily use these processors in pairs. By the way, Apple actively uses the PowerPC970's ability to work in multiprocessor systems in its marketing campaign. It is very interesting that, in spite of the similarity between the Elastic I/O and HyperTransport buses, the dual-processor system is built on a bus architecture, as in Intel systems. That's curious, although quite reasonable: a bus architecture is usually simpler to implement. Moreover, it is sufficient for desktops and inexpensive workstations, while a NUMA design would require a serious redesign of the operating system. And this is a sore spot: as I mentioned earlier, Apple is already having a painful time moving to the new Mac OS X.
Overall, the PowerPC970 platform looks fine on this front.