Intel Pentium 4 (Prescott)

For comparison's sake, I will also test a Pentium 4 on the Prescott core with a clock rate of 3800MHz and 2MB of L2 cache. It is a single-core processor that supports Hyper-Threading technology, so it is interesting to see how it behaves with the algorithm used in this test session. What surprises can there be? The two virtual processors of this Pentium 4 are physically the same core with the same L1 and L2 caches. This means that both threads can access shared data quickly, because it never has to move between separate caches. The results of reading data from the cache are almost identical irrespective of the data's state (modified or unmodified), so I will publish only two graphs, one for sequential and one for random reading:


Pic.18: Intel Pentium 4 + HT. Sequential reading of non-modified data.


Pic.19: Intel Pentium 4 + HT. Random reading of non-modified data.

No surprises here. Data blocks that fit into the L1 data cache are read with a latency of 4 cycles, which matches the latency of this cache. A latency of 22 cycles, which matches the L2 cache latency, is observed with sequential reading of data blocks up to 2048KB and with random reading of blocks up to 256KB. The latency minimum occurs at a delay chain length of 18 cycles, which equals the length of the replay loop. The latency growth with random reading of data blocks of 512KB and larger is due to the limited TLB size: its 64 entries cover only 256KB of memory. Otherwise, the results are just as they should be for two threads running on a processor with a shared cache.
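The TLB arithmetic above can be checked with a short sketch. It is only an illustration, not part of the test utility used in this article: the function names are hypothetical, and the 4KB page size is an assumption (it is the page size that makes 64 entries cover exactly 256KB). The same sketch converts the measured cycle counts into nanoseconds at the 3800MHz clock rate.

```python
# Illustrative sketch only. Function names are hypothetical, and the 4 KB
# page size is an assumption; the other figures are quoted from the article.

PAGE_SIZE_KB = 4      # assumed: standard 4 KB pages
TLB_ENTRIES = 64      # from the article
CLOCK_MHZ = 3800      # from the article


def tlb_reach_kb(entries: int = TLB_ENTRIES, page_kb: int = PAGE_SIZE_KB) -> int:
    """Amount of memory (in KB) whose pages can all hold a TLB entry at once."""
    return entries * page_kb


def block_within_tlb_reach(block_kb: int) -> bool:
    """True if a data block of this size is fully covered by the TLB."""
    return block_kb <= tlb_reach_kb()


def cycles_to_ns(cycles: float, clock_mhz: float = CLOCK_MHZ) -> float:
    """Convert a latency in core clock cycles to nanoseconds."""
    return cycles * 1000.0 / clock_mhz


if __name__ == "__main__":
    print(f"TLB reach: {tlb_reach_kb()} KB")
    for block_kb in (128, 256, 512, 1024, 2048):
        status = "covered" if block_within_tlb_reach(block_kb) else "exceeds TLB reach"
        print(f"{block_kb:5d} KB block: {status}")
    print(f"L1 latency, 4 cycles:  {cycles_to_ns(4):.2f} ns")
    print(f"L2 latency, 22 cycles: {cycles_to_ns(22):.2f} ns")
```

Under these assumptions, a 256KB block is the largest that random reading can traverse without TLB misses, which agrees with the point where the latency starts to grow on the random-reading graph.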
