Now let’s look at the results of our latency tests.
The memory latencies on this graph are measured in processor clocks, so the results are directly comparable only between CPUs running at the same clock frequency. Even so, a few very interesting things stand out. For example, the L3 cache latency of the Pentium 4 Extreme Edition is not bad at all: it is only twice that of the L2 cache.
To make the comparison fairer, we will convert the latencies from processor clocks into time:
This graph gives us more food for thought. First of all, you notice right away that the memory subsystem latency of the Athlon 64 FX processor is very low, even compared with the dual-channel memory controllers used in Intel’s platforms. This is exactly where the dual-channel memory controller shows its very best. Note that this is far from the limit for it. Registered memory is known to have slightly higher latency than non-registered memory, which is why the upcoming modifications of the Athlon 64 FX memory controller planned for next year will definitely speed up this processor family quite tangibly.
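The conversion from processor clocks to time mentioned above is simple arithmetic: one clock lasts 1 / f nanoseconds at a clock frequency of f GHz. Here is a minimal sketch of it, assuming the function name `cycles_to_ns` and a placeholder cycle count, neither of which comes from our test results:

```python
def cycles_to_ns(cycles: float, clock_ghz: float) -> float:
    """Convert a latency in CPU clock cycles to nanoseconds.

    One cycle lasts 1 / clock_ghz nanoseconds, so the conversion
    is a plain division of the cycle count by the clock in GHz.
    """
    if clock_ghz <= 0:
        raise ValueError("clock_ghz must be positive")
    return cycles / clock_ghz


# Hypothetical illustration: the same cycle count on two different clocks.
# EXAMPLE_CYCLES is a placeholder value, not a measured latency.
EXAMPLE_CYCLES = 220
for name, clock_ghz in [("3.2 GHz CPU", 3.2), ("2.2 GHz CPU", 2.2)]:
    latency_ns = cycles_to_ns(EXAMPLE_CYCLES, clock_ghz)
    print(f"{name}: {EXAMPLE_CYCLES} cycles = {latency_ns:.2f} ns")
```

The same number of clocks takes less time on the faster-clocked CPU, which is why latencies in clocks can be compared directly only between processors running at the same frequency.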
Here are the same data summed up into a table for your convenience:
Table columns: Pentium 4 3.2; Pentium 4 3.2 EE (2MB L3); Athlon XP 3200+; Athlon 64 FX-51.
Table rows: L1 cache, L2 cache, L3 cache and memory latency, each given in cycles and in ns.
The extremely low L1 cache latency is a strong trump card of the Pentium 4 architecture, although this cache is also pretty small: only 8KB. The L1 cache of the Athlon 64 FX is slower but much bigger. The L2 cache latencies of Intel and AMD processors have become nearly identical now that the L2 cache bus has been improved in the Athlon 64 FX. Therefore, the larger cache of the AMD Athlon 64 FX may give it a certain performance advantage over its rival.