For example, the Athlon MP has a shorter pipeline (10 stages) and uses the same manufacturing technology, yet its maximum clock rate is around 2.2GHz, to say nothing of the Xeon at 3.2GHz. Manufacturers of RISC processors could learn something from their x86 counterparts in this respect. Of course, frequency is not the only factor that matters, so let’s see what the UltraSPARC III offers in the way of performance.
Both versions of the CPU contain an on-die memory controller, which makes them similar to the Opteron in this respect. The UltraSPARC III uses SDRAM clocked at 150MHz on a 128-bit bus, for 2.4GB/s of bandwidth, which is quite good even by modern standards. It requires special 144-pin ECC SDRAM modules, which are expensive. Each processor supports up to 16GB of memory. The UltraSPARC IIIi uses DDR SDRAM, but over a narrower 64-bit bus, which makes it about equal to the previous variant in memory performance.
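The 2.4GB/s figure follows directly from the bus width and clock, and the same arithmetic shows why a 64-bit DDR bus ends up roughly level with a 128-bit SDRAM bus. Here is a minimal sketch in Python. The helper name `peak_bandwidth_gb_s` is mine, and the 150MHz clock used for the DDR case is purely an assumption for illustration, since the UltraSPARC IIIi memory clock is not given above.

```python
def peak_bandwidth_gb_s(bus_width_bits, clock_mhz, transfers_per_clock=1):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = bus_width_bits / 8
    transfers_per_second = clock_mhz * 1_000_000 * transfers_per_clock
    return bytes_per_transfer * transfers_per_second / 1_000_000_000


# UltraSPARC III: 128-bit SDRAM bus at 150MHz, one transfer per clock.
sdram_gb_s = peak_bandwidth_gb_s(128, 150)

# UltraSPARC IIIi: 64-bit DDR bus, two transfers per clock.
# ASSUMPTION: the 150MHz clock here is hypothetical, chosen only to
# compare the two bus layouts at an equal clock rate.
ddr_gb_s = peak_bandwidth_gb_s(64, 150, transfers_per_clock=2)

print(f"128-bit SDRAM @ 150MHz: {sdram_gb_s:.1f} GB/s")
print(f"64-bit DDR @ 150MHz (assumed): {ddr_gb_s:.1f} GB/s")
```

At an equal clock, halving the bus width and doubling the transfers per clock cancel out, which is consistent with the two variants being about equal. These are theoretical peaks, not measured throughput.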
The processors are linked by the wide and fast Fireplane bus (128 bits, 150MHz). The UltraSPARC IIIi allows building systems with one to four processors. The UltraSPARC III supports over a thousand processors per system, in a switch-based architecture, of course.
The performance numbers are as follows: 642 points in SPEC_int base 2000 and 1074 in SPEC_fp base 2000. By the way, there is a big difference between the “base” and “peak” modes in the SPEC_fp test (1344 for peak), which suggests that Sun’s compiler is not perfect. Unlike other RISC processors, whose integer and floating-point scores are usually about equal, the UltraSPARC III also shows a noticeable gap between the two. The lack of out-of-order execution probably accounts for this, but that’s only my supposition.
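To put numbers on these two gaps, here is a quick calculation in Python using only the three scores quoted above. The variable names are mine, and the ratios are plain arithmetic rather than official SPEC metrics.

```python
# Scores quoted in the text for the UltraSPARC III.
spec_int_base = 642
spec_fp_base = 1074
spec_fp_peak = 1344

# How much the "peak" result exceeds "base" in the floating-point test.
peak_gain = spec_fp_peak / spec_fp_base - 1

# How far the floating-point base score is ahead of the integer one.
fp_to_int = spec_fp_base / spec_int_base

print(f"SPEC_fp peak over base: {peak_gain:.1%}")
print(f"SPEC_fp base / SPEC_int base: {fp_to_int:.2f}")
```

The peak result is about 25% above base, and the floating-point base score is roughly 1.67 times the integer one.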
Clearly, the UltraSPARC III doesn’t show miraculous performance in SPEC_int base 2000. It does better in SPEC_fp base 2000, but only against other RISC architectures. The Itanium is the leader here, and no one is likely to challenge its superiority in the near future.
In fact, it is the system architecture, rather than processor performance, that is the strong point of Sun’s concept. Thanks to the intelligent switch architecture, a big external cache and wide buses, Sun systems don’t stagger under increasing workloads, and that’s why they enjoy success.
The SPARC V9 instruction set, used in the UltraSPARC III, is freely licensed, so there are clone processors that are compatible with the UltraSPARC III at the instruction-set level. The most popular and successful of them is the SPARC64 GP from Fujitsu-Siemens. Its cache topology is somewhat different: a 16KB zero-level cache takes the place of L1, followed by a 256KB L2 cache (the L1 of the original topology) split into 128KB each for instructions and data, while the off-chip cache becomes an L3 cache with 8MB capacity.
An UltraSPARC IV processor project is underway; it is going to be two UltraSPARC III CPUs on one die. So far, there’s no information about its performance or availability, so I don’t cover it in this article.