
Microarchitecture Comparison: Intel Core vs. AMD K8

Intel’s upcoming processors based on the Core Microarchitecture will compete primarily with AMD K8 CPUs, which are the most advanced solutions available today. Let’s take a close theoretical look at how Intel’s new Core Microarchitecture compares with the good old AMD K8:

 

| | Intel Core | AMD K8 |
|---|---|---|
| L1 data cache | 32 KB | 64 KB |
| L1 instructions cache | 32 KB | 64 KB |
| L1 latency | 3 clock cycles | 3 clock cycles |
| L1 associativity | 8-way | 2-way |
| L1 TLB size | Instructions: 128 entries; Data: 256 entries | Instructions: 32 entries; Data: 32 entries |
| Max. L2 cache | 4 MB for two cores | 1 MB for each core |
| L2 latency | 14 clock cycles | 12 clock cycles |
| L2 associativity | 16-way | 16-way |
| L2 cache bus width | 256 bit | 128 bit |
| L2 TLB size | ? | 512 entries |
| Pipeline | 14 stages | 12 stages |
| x86 decoders | 1 complex and 3 simple | 3 complex |
| Integer execution units | 3 ALU + 2 AGU | 3 ALU + 3 AGU |
| Load/Store units | 2 (1 Load + 1 Store) | 1 |
| FP execution units | FADD + FMUL + FLOAD + FSTORE | FADD + FMUL + FSTORE |
| SSE execution units | 3 (128-bit) | 2 (64-bit) |
This table explains a lot right away. Most importantly, processors with the Core microarchitecture have a “wider” architecture that allows them to process more instructions per clock cycle than CPUs with the K8 microarchitecture. Although the execution units of both competing architectures can process up to three x86 and x87 instructions per clock cycle, Core Microarchitecture should prove more efficient with SSE instructions: K8 processors can execute only one 128-bit instruction per clock, while Core can execute up to three.
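The SSE comparison can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope bound based on the unit counts and widths from the table; the function name is hypothetical, and it assumes every SSE unit can accept any 128-bit operation each clock, which real processors do not guarantee because the units are specialized.

```python
def peak_128bit_ops_per_clock(units: int, unit_width_bits: int) -> float:
    """Upper bound on 128-bit SSE operations completed per clock.

    Assumption: a unit narrower than 128 bits needs several passes
    (or several units working together) to finish one 128-bit operation.
    """
    return units * unit_width_bits / 128


# Figures taken from the comparison table above.
core_peak = peak_128bit_ops_per_clock(units=3, unit_width_bits=128)
k8_peak = peak_128bit_ops_per_clock(units=2, unit_width_bits=64)

print(f"Core: {core_peak:.0f} x 128-bit ops/clock, K8: {k8_peak:.0f} x 128-bit ops/clock")
```

Under these assumptions the bound reproduces the three-to-one ratio quoted in the text.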

Moreover, Core Microarchitecture boasts another great advantage: a more advanced decoding system. Together with the four decoders, macrofusion technology allows decoding up to five instructions per clock (in an ideal case). The competing processors can decode only three instructions simultaneously. All this indicates that the decoders of Core Microarchitecture based CPUs will be able to keep the execution units better loaded, sustaining up to four instructions per clock under optimal conditions. In that case overall instruction execution would be about 33% faster than on AMD K8 processors (four instructions per clock versus three).
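The 33% figure follows directly from the ratio of peak instruction rates. Here is a minimal sketch of that arithmetic; the helper name is hypothetical, and the numbers are theoretical peaks from the text, not measured performance.

```python
def relative_gain_percent(new_rate: float, old_rate: float) -> float:
    """Percentage by which new_rate exceeds old_rate."""
    return (new_rate / old_rate - 1) * 100


# Sustained execution: 4 instructions per clock (Core) vs. 3 (K8).
execution_gain = relative_gain_percent(4, 3)

# Ideal-case decoding with macrofusion: 5 x86 instructions per clock vs. 3.
decode_gain = relative_gain_percent(5, 3)

print(f"Execution: +{execution_gain:.0f}%, ideal decode: +{decode_gain:.0f}%")
```

Both values are upper bounds: real code rarely keeps all decoders and execution units busy on every clock.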

Here I would also like to mention the more efficient data processing algorithms of CPUs based on Core Microarchitecture. The advantages of this microarchitecture show best in the data caching system. Although the L1 cache of Core based processors is smaller, it has higher associativity. The L2 cache is not only bigger but also has higher bandwidth. Moreover, the shared structure of the L2 cache is beneficial for multi-threaded workloads.
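To make the cache trade-off more concrete, the sketch below derives the number of sets in each cache and the L2 bus throughput from the table figures. The 64-byte cache line is an assumption that does not appear in the table, and the helper names are hypothetical.

```python
KB = 1024
MB = 1024 * KB
LINE_SIZE_BYTES = 64  # assumption: cache line size is not given in the table


def cache_sets(size_bytes: int, ways: int, line_size: int = LINE_SIZE_BYTES) -> int:
    """Number of sets in a set-associative cache: size / (ways * line size)."""
    return size_bytes // (ways * line_size)


def bus_bytes_per_clock(bus_width_bits: int) -> int:
    """Bytes the L2 bus can move in one clock cycle."""
    return bus_width_bits // 8


# L1 data cache: Core is smaller but 8-way, K8 is larger but only 2-way.
core_l1_sets = cache_sets(32 * KB, ways=8)
k8_l1_sets = cache_sets(64 * KB, ways=2)

# L2 bus: 256 bit on Core vs. 128 bit on K8.
core_l2_bytes = bus_bytes_per_clock(256)
k8_l2_bytes = bus_bytes_per_clock(128)

print(core_l1_sets, k8_l1_sets, core_l2_bytes, k8_l2_bytes)
```

With fewer sets but four times as many ways per set, the Core L1 can hold more lines that map to the same set before one has to be evicted, which is what "more associative" buys in practice.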

An important addition to the data prefetch algorithms of the new Core based processors is the unique memory disambiguation technology that has no analogues in competing solutions. It makes the upcoming Intel processors more out-of-order (from the code perspective): loads can be reordered more freely relative to earlier stores.
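The idea behind memory disambiguation can be illustrated with a toy model. This is only a sketch of the general principle (execute a load early, then check it against earlier stores once their addresses are known); it is not Intel's actual predictor, and all names in it are hypothetical.

```python
def speculative_load(memory: dict, pending_stores: list, load_addr: int):
    """Run a load ahead of earlier, not yet committed stores.

    Returns (value_read_early, must_replay). The load has to be replayed
    if any earlier store turns out to target the same address.
    """
    # Step 1: read memory before the earlier stores have been written.
    value_read_early = memory.get(load_addr, 0)
    # Step 2: once store addresses are resolved, look for a conflict.
    must_replay = any(addr == load_addr for addr, _value in pending_stores)
    return value_read_early, must_replay


memory = {0x10: 1, 0x20: 2}
pending_stores = [(0x10, 99)]  # an earlier store to address 0x10, value 99

# Independent addresses: the early load is valid and no time is lost.
independent = speculative_load(memory, pending_stores, 0x20)

# Same address: the early value is stale, so the load must be replayed.
conflicting = speculative_load(memory, pending_stores, 0x10)

print(independent, conflicting)
```

The benefit comes from the common case where the addresses differ, so the load does not have to wait for the earlier stores to resolve.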

In fact, the only indisputable advantage of the AMD K8 microarchitecture that will survive the arrival of Core is the integrated memory controller, which ensures lower memory access latency. Whether the integrated memory controller will be enough for AMD to put up a worthy fight against the Conroe processors is a tough question that we will have to answer later. AMD engineers are not sitting idle, however. The future Athlon 64 cores scheduled to come out in early 2008 will be free from some architectural bottlenecks. But that is a different story and a different article.

 