AMD Athlon 64 Performance Preview

We managed to get our hands on an engineering sample of the AMD Athlon 64 2800+ processor and, of course, couldn’t resist testing it. In this article we discuss the processor’s major features and its benchmark results.

by Ilya Gavrichenkov
04/18/2003 | 05:04 PM

When Athlon processors came out in 1999, competition in the processor market became much fiercer. The progressive architecture of this CPU allowed AMD to prove a worthy competitor to Intel and to give it quite a few causes for concern. AMD’s processors matched Intel’s in performance and were sometimes even faster. Lately, however, it has become hard for AMD to retain this parity. During the lifetime of the Athlon architecture, Intel managed to move from the Pentium III architecture to the entirely new Pentium 4 architecture, and then enhanced it significantly by increasing the L2 cache size and speeding up the system bus. Athlon also underwent certain enhancements, although they were never as drastic: AMD only raised the system bus frequency, enlarged the L2 cache and added SSE support. As a result, Intel is now somewhat ahead of its competitor: Pentium 4 clock frequencies keep growing rapidly, while AMD has reached the limit of the Athlon architecture’s potential.

However, AMD has one very strong move in reserve, which could change the situation in the market dramatically: the company is preparing the new Athlon 64 processor for the desktop market. Even though AMD has delayed this processor several times, pushing the launch almost a year past the initial schedule, the new architecture keeps exciting the public. Take, for instance, the fact that Intel has started paying quite a bit of attention in its confidential documents to the competitiveness of its Pentium 4 family against the upcoming AMD Athlon 64:


AMD Athlon 64 aka ClawHammer
is exactly Intel’s biggest cause for concern

But it is still unclear whether the architectural advantages of the upcoming Athlon 64 will allow it to outperform Pentium 4, which works at very high clock frequencies today. All the information about Athlon 64 performance that has leaked onto the web so far was either too scarce for any definite conclusions or concerned samples working at very low clock frequencies.

Luckily, today we can partially fill this gap.

We managed to get hold of a much newer AMD Athlon 64 pre-production sample, which we are going to test thoroughly today. We would like to remind you that the official launch of AMD Athlon 64 is scheduled for September this year. This article does not go into the details of the Athlon 64 architecture, so we suggest that you read our article “A Glance at the Future: AMD Hammer Processors and x86-64 Technology” before you continue with this review.


Closer Look: CPU

While working on the new processor, AMD took the successful Athlon architecture as a basis, which is why the new CPU has a lot in common with its predecessor. Without going into details, I will list here the major differences between the ClawHammer core of the upcoming Athlon 64 processor and the contemporary Athlon XP.


It’s also an Athlon, but a different one.

Well, please meet the CPU, which we managed to get:


AMD Athlon 64 2800+

As we see, our CPU is marked according to its rating, i.e. as 2800+. The production date in the next line of the marking points to the beginning of this year. An Athlon 64 with this rating is most likely to be the slowest model in the new processor family when it is finally launched.

Evidently, AMD is not going to give up marking its CPUs with a rating instead of the actual core frequency. Well, let’s find out what hides behind this marking.


Many of you may be disappointed:
the actual core frequency of this processor is only 1.6GHz

The actual working frequency of this processor is 1.6GHz. When AMD ran into problems raising the ClawHammer core frequency, it had to make the L2 cache bigger to ensure adequate performance of the new CPUs. That is why our Athlon 64 2800+, working at a fairly low core clock, features a 1MB L2 cache. The first mass-production Athlon 64 processors will have exactly the same L2 cache configuration. A bit later AMD will also launch a less expensive version with a 256KB L2 cache.

By the way, you shouldn’t forget that Athlon 64, just like Athlon XP, features an exclusive L2 cache: data held in L1 is not duplicated in L2. In other words, taking into account the 128KB L1 cache (64KB for instructions and 64KB for data) inherited from Athlon XP, the total cache memory of the new Athlon 64 is 1152KB. The 0.13-micron SOI technology used for Athlon 64 production makes it possible to place this much cache memory on the same die as the CPU core. As you may know, Intel will implement a 1MB L2 cache only in Prescott processors, which will be manufactured with 0.09-micron technology.
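The cache arithmetic above can be sketched in a few lines. This is only an illustration: the sizes come from the text, while the function names and the simplified "inclusive" comparison are our own assumptions, added to show why an exclusive design lets the L1 and L2 sizes add up.

```python
# Sketch: usable on-die cache capacity under exclusive vs. inclusive L2 policies.
# Sizes (in KB) are the ones quoted in the article; function names are illustrative.

L1_INSTRUCTION_KB = 64
L1_DATA_KB = 64
L2_KB = 1024


def exclusive_capacity_kb(l1_kb: int, l2_kb: int) -> int:
    """Exclusive design: a line lives in L1 or L2, never both, so sizes add up."""
    return l1_kb + l2_kb


def inclusive_capacity_kb(l1_kb: int, l2_kb: int) -> int:
    """Inclusive design: L2 duplicates L1, so unique capacity is the larger level."""
    return max(l1_kb, l2_kb)


l1_total_kb = L1_INSTRUCTION_KB + L1_DATA_KB
print(exclusive_capacity_kb(l1_total_kb, L2_KB))  # 1152
print(inclusive_capacity_kb(l1_total_kb, L2_KB))  # 1024
```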


As for the structure of the cache memory, the L2 cache of Athlon 64 (just like the L2 cache of Athlon XP) is 16-way set associative and uses 64-byte lines.


L2 cache of Athlon 64 is built like the L2 cache
of the current Athlon XP

To see which technologies the CPU supports, let’s turn to the results of the SiSoft Sandra 2003 benchmark:


3DNow!, x86-64 and SSE2 are all here

As we expected, the CPU supports the older instruction sets (MMX, 3DNow! and SSE) as well as SSE2 and x86-64, which are completely new for AMD processors. Also note that the CPU features a built-in thermal diode, so hopefully Athlon 64 will be hard to burn out. AMD has also finally listened to users’ complaints and hidden the fragile CPU die under a nickel-plated copper lid.

The CPU we tested is built on the B0 core revision. This is a much newer core than those used in the very first samples, which worked at 800MHz and 1.2GHz. AMD redesigned the core to increase clock frequencies and slightly improved the integrated memory controller. As a result, our processor supported not only DDR266/DDR333 SDRAM, but also DDR400 SDRAM. However, unlike the server Opteron CPU, Athlon 64 will have a single-channel memory controller, so the maximum memory bandwidth of Athlon 64 based systems will be 3.2GB/sec.
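For readers who want to check the 3.2GB/sec figure, here is a minimal sketch of the calculation. It assumes a 64-bit wide memory channel (the usual DDR DIMM width, not stated in the text) and counts 1GB as 1000MB; the function name is hypothetical.

```python
# Sketch: theoretical peak bandwidth of a DDR SDRAM interface.
# Assumption (not from the article): each memory channel is 64 bits wide.

def ddr_peak_mb_s(mega_transfers_per_s: int, channels: int = 1, width_bits: int = 64) -> int:
    """Peak MB/s = millions of transfers per second * bytes per transfer * channels."""
    return mega_transfers_per_s * (width_bits // 8) * channels


print(ddr_peak_mb_s(400))              # 3200, single-channel DDR400 as in Athlon 64
print(ddr_peak_mb_s(400, channels=2))  # 6400, a hypothetical dual-channel DDR400 setup
```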

As for the HyperTransport bus connecting Athlon 64 to the mainboard chipset, our CPU featured one 16bit link working at 800MHz, which provides a bandwidth of 3.2GB/sec in each direction. However, we doubt that Athlon 64 will really need such a fast bus: unlike other CPUs, it does not use this bus for system memory traffic, which goes through the integrated memory controller instead. Anyway, extra bandwidth has never been a drawback.
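The HyperTransport figure can be reproduced the same way. The sketch below assumes that the link transfers data on both clock edges (two transfers per clock), which is our assumption rather than something stated above; the function name is illustrative.

```python
# Sketch: per-direction bandwidth of a HyperTransport link.
# Assumption (not from the article): data is transferred on both clock edges.

def ht_link_mb_s(clock_mhz: int, width_bits: int, transfers_per_clock: int = 2) -> int:
    """MB/s in one direction = clock (MHz) * transfers per clock * link width in bytes."""
    return clock_mhz * transfers_per_clock * width_bits // 8


one_way = ht_link_mb_s(800, 16)
print(one_way)      # 3200 MB/s in each direction
print(one_way * 2)  # 6400 MB/s when both directions are counted together
```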

Summing up everything we have just said, I composed the following table listing the major features of contemporary desktop CPUs:

| CPU | Athlon 64 | Athlon XP | Pentium 4 |
|---|---|---|---|
| Core | ClawHammer | Barton | Northwood |
| Production schedule | Fall 2003 | Early 2003 | Early 2002 |
| Frequencies | 1.6GHz+ | 1.83-2.2GHz | 1.6-3.2GHz |
| Production technology | 0.13micron SOI | 0.13micron | 0.13micron |
| Platform | Socket 754 | Socket A | Socket 478 |
| Bus frequency | 800MHz | 333/400MHz | 400/533/800MHz |
| L1 cache for data | 64KB | 64KB | 8KB |
| L1 cache for instructions | 64KB | 64KB | 12KB |
| L2 cache | 1024KB | 512KB | 512KB |
| L2 cache frequency | Full core frequency | Full core frequency | Full core frequency |
| Additional instruction sets: | | | |
| MMX | + | + | + |
| 3DNow! | + | + | - |
| SSE | + | + | + |
| SSE2 | + | - | + |
| x86-64 | + | - | - |


Closer Look: Platform

Since the new Athlon 64 processors use their own bus, they have to be installed into a special processor socket: Socket 754, which has 754 pins and measures about 4cm x 4cm.


754 pins. This is even more than Xeon has.
The reason is the integrated memory controller.

Therefore, you will need new mainboards based on new chipsets to use the new Athlon 64 processor. For our Athlon 64 2800+ test session we managed to get a mainboard from a well-known Taiwanese manufacturer. Unfortunately, we cannot share a picture of this mainboard with you; however, it has already been mentioned many times in on-line sources, including our web-site.

This mainboard gave the impression of a finished product: it worked stably and didn’t cause us any problems. The mainboard was based on the VIA K8T400M chipset.


VIA K8T400M is one of the major core logic
solutions for the Athlon 64 platform.

This chipset offers a fairly standard feature set: it supports AGP 8x, ATA/133, USB 2.0, etc.


For now the chipset uses the VIA VT8235 South Bridge.
Later it will be replaced with the VT8237
South Bridge with SerialATA support.

In line with the chipset’s features, our mainboard was equipped with two DDR DIMM slots for DDR400/DDR333/DDR266 SDRAM, an AGP 8x slot, 5 PCI slots and 6 USB ports. There was also an additional SerialATA/RAID controller from Promise, which added a couple of SerialATA-150 connectors to the mainboard.

The system built around this mainboard and the Athlon 64 2800+ processor worked very well in our lab. Here, for instance, is the BIOS report we obtained during system boot-up:


Our mainboard used AMI BIOS.

And this is a Windows XP message about the type of the system CPU:


Again the relatively low core clock catches our eye.

By the way, the BIOS Setup of this mainboard offered a couple of very interesting pages. For example, this is the page for HyperTransport bus configuration:


As we have promised you: 16bit, 800MHz.

And this is the page for managing the memory controller integrated into the CPU:


DDR400 SDRAM is supported!

The mainboard’s BIOS Setup also offered some CPU overclocking options: it was possible to change the bus frequency and the clock frequency multiplier. It is hard to say whether mass-production Athlon 64 processors will ship with an unlocked multiplier, but our sample did.


Testbed and Methods

First of all, I should point out that all our tests were run in the 32bit Windows XP operating system and in 32bit applications. Unfortunately, 64bit operating systems and applications supporting x86-64 are not available yet. By the time Athlon 64 is out, we have every reason to expect a 64bit version of Windows XP as well as a few 64bit applications. Among the first 64bit applications will be the Unreal Tournament game, whose upcoming announcement Epic has already disclosed. 64bit applications may boost Athlon 64 performance significantly thanks to the additional and wider registers. However, there will hardly be much software supporting x86-64 this year. Even AMD evaluates the prospects of 64bit applications as follows:


By the beginning of 2004 there will be
around 20% of 64bit software.

In other words, the results our Athlon 64 2800+ demonstrates today do not illustrate the performance these processors will show in 64bit operating systems and applications.

In our test session we will compare Athlon 64 2800+ with Athlon XP 2800+, Athlon XP 1.6GHz, Pentium 4 2.8C and Pentium 4 2.53GHz.

So, we will test the following systems:

 

| CPU | Mainboard |
|---|---|
| Athlon 64 2800+ | Good one (VIA K8T400M) |
| Athlon XP 2800+ | ABIT NF7 (NVIDIA nForce2) |
| Athlon XP 1.6GHz | ABIT NF7 (NVIDIA nForce2) |
| Pentium 4 2.8C | ASUS P4C800 (i875P) |
| Pentium 4 2.53GHz | ASUS P4C800 (i875P) |

Common to all systems:

Memory: 512MB DDR400 SDRAM (Corsair XMS3200 v.1.1)

Graphics card: ATI RADEON 9700 PRO

HDD: Seagate Barracuda ATA IV, 80GB

The tests were run in Microsoft Windows XP SP1, and the system BIOS was set up for maximum performance.

Well, before you go on to the benchmark results, we would like to remind you that Athlon 64 is not ready yet, so all the results in this article are preliminary. AMD may polish its CPU a bit more, the mainboard makers may update their BIOSes, and VIA may introduce some changes in the final chipset version. That is why the performance of production platforms built with Athlon 64 processors may be higher after the official launch.


Performance

First of all, we decided to take a closer look at the performance of the built-in memory controller, especially since we haven’t yet had a chance to play with a DDR SDRAM controller built into a CPU, even though developers of SoC solutions use similar technology.

First let’s check the numbers obtained in the Cachemem benchmark, which we always use when testing a new memory controller:

 

| | Athlon 64 2800+ | Athlon XP 1.6GHz | Pentium 4 2.8C |
|---|---|---|---|
| Memory read speed, MB/s | 2610.2 | 1747.8 | 3193.5 |
| Memory write speed, MB/s | 1099 | 1156.9 | 1320.5 |
| Memory copy speed, MB/s | 1541.7 | 1244.8 | 2678.6 |
| Latency | 96 | 165 | 260 |

This table sums up the data obtained for three different memory controllers: the one integrated into Athlon 64, the one in the nForce2 chipset and the one in the i875P chipset. All controllers were used with DDR400 SDRAM; nForce2 and i875P worked in dual-channel mode.

As we see, the single-channel Athlon 64 controller reads and copies data much faster than the dual-channel controller of the nForce2 core logic, which cannot show its best because of the limited processor bus bandwidth; only the write speed is slightly lower. At the same time, Athlon 64 has a hard time competing with the bandwidth of the i875P dual-channel controller. But when it comes to latency, Athlon 64 is the indisputable leader and leaves all the rivals far behind. Because this processor works with the memory directly, its memory access latency turns out to be very low.
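To put numbers on these comparisons, the Cachemem figures can be turned into relative differences. The values below are copied from the table; the dictionary layout and the helper name are our own, and this is a rough sketch, not part of the benchmark itself. Note that for latency a negative value means the Athlon 64 is better.

```python
# Sketch: relative differences between the Cachemem results quoted in the table.
CACHEMEM = {
    "Athlon 64 2800+":  {"read": 2610.2, "write": 1099.0, "copy": 1541.7, "latency": 96},
    "Athlon XP 1.6GHz": {"read": 1747.8, "write": 1156.9, "copy": 1244.8, "latency": 165},
    "Pentium 4 2.8C":   {"read": 3193.5, "write": 1320.5, "copy": 2678.6, "latency": 260},
}


def relative_change(value: float, baseline: float) -> float:
    """Percentage by which `value` differs from `baseline`."""
    return (value / baseline - 1.0) * 100.0


a64 = CACHEMEM["Athlon 64 2800+"]
for rival in ("Athlon XP 1.6GHz", "Pentium 4 2.8C"):
    other = CACHEMEM[rival]
    for metric in ("read", "write", "copy", "latency"):
        change = relative_change(a64[metric], other[metric])
        print(f"Athlon 64 vs {rival}, {metric}: {change:+.1f}%")
```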

Almost the same conclusions can be drawn from the results of the memory test from ScienceMark 2.0:


If we compare cache-memory performance of Athlon 64
with that of Athlon XP, the results can be really interesting.

The diagrams above show the results obtained in an Athlon 64 system (top) and an Athlon XP system (bottom). Both CPUs work at the same clock frequency: 1.6GHz. Besides the higher bandwidth and lower latency, the screenshots show that Athlon 64 boasts faster cache memory. Some time ago there were rumors that AMD intended to implement a wider bus between the Athlon 64 core and the L2 cache. Maybe the higher cache performance is exactly the outcome of this modification.

One more tool we will use to check the memory subsystem performance is the SiSoft Sandra 2003 benchmark:


The results of Athlon 64 2800+ with DDR400 SDRAM
are simply impressive!

We haven’t yet seen a memory controller as efficient as this one! The practical DDR400 SDRAM bandwidth measured in the Sandra 2003 test equals 96% of the theoretical one. The integrated memory controller is evidently a very efficient solution.
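As a rough illustration of what "96% of the theoretical bandwidth" means, the sketch below relates a measured figure to the 3.2GB/sec single-channel DDR400 peak mentioned earlier. The 3072 value is only what the 96% figure implies, not a number read from the Sandra screenshot; 1GB is counted as 1000MB and the function names are hypothetical.

```python
# Sketch: memory controller efficiency = measured bandwidth / theoretical peak.
PEAK_DDR400_MB_S = 3200.0    # single-channel DDR400 peak quoted in the article (3.2GB/sec)
CACHEMEM_READ_MB_S = 2610.2  # Athlon 64 read speed from the Cachemem table
SANDRA_EFFICIENCY = 0.96     # share of the peak reported for Sandra 2003


def efficiency(measured_mb_s: float, peak_mb_s: float = PEAK_DDR400_MB_S) -> float:
    """Fraction of the theoretical peak that a measurement reaches."""
    return measured_mb_s / peak_mb_s


def implied_bandwidth_mb_s(share: float, peak_mb_s: float = PEAK_DDR400_MB_S) -> float:
    """Bandwidth implied by a given share of the theoretical peak."""
    return share * peak_mb_s


print(round(implied_bandwidth_mb_s(SANDRA_EFFICIENCY)))  # 3072
print(f"{efficiency(CACHEMEM_READ_MB_S):.1%}")           # 81.6%
```

The two benchmarks measure memory throughput differently, so the gap between the two percentages should not be read as a contradiction.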


Now let’s take a look at the results shown by other testing participants:

Well, Athlon 64 doesn’t surpass the dual-channel i875P with its 800MHz bus, but it beats nForce2 despite having only a single memory channel.

Now let’s have a look at the “pure” performance of the different Athlon 64 units. The benchmarks from the SiSoft Sandra 2003 test set allow us to do this: their results do not depend on the L2 cache or memory subsystem performance.

There is something to think about here. Take, for instance, the ALU performance of Athlon 64, which according to this test is 8% higher than that of the Athlon XP ALU. This gain comes from the improved branch prediction and TLB. However, it is far from enough to compete successfully with Intel Pentium 4, where the ALU works at doubled frequency. The FPU remained unchanged, which is why its performance is the same as that of Athlon XP. And this is more than enough to defeat Intel Pentium 4 2.8C, despite its Hyper-Threading support: the FPU of Athlon processors was very powerful from the very beginning. The SSE2 unit of Athlon 64, however, turned out to be a disappointment: our Athlon 64 fell quite far behind even Pentium 4 2.53GHz.

Will Athlon 64 2800+ be able to perform at least as fast as Athlon XP 2800+?

Well, it looks as if it can, although a lot will depend on the particular task. In terms of computational power, the low clock frequency of Athlon 64 2800+ (1.6GHz) pushes it behind Athlon XP 2800+ working at 2.083GHz. However, as soon as it comes to memory operations, Athlon 64 manages to get ahead.

Now let’s check the situation in real applications.

The complex Business Winstone 2002 test set, which illustrates the average performance in typical office applications, showed that the good old Athlon XP 2800+ was about 8.5% faster than the new Athlon 64 2800+. The only reason for this is the 30% higher working frequency of the Athlon XP 2800+ processor. Unfortunately, neither the huge cache nor the fast memory subsystem helps Athlon 64 make up for its lower core clock. At the same time, Athlon 64 2800+ outperforms Athlon XP 1.6GHz by about the same 8%.
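These percentages can be chained to see how well Athlon XP itself scales with clock speed in this test. The sketch below uses only the rounded percentages and clock frequencies quoted in the article, not raw Winstone scores, so the result is an estimate; the variable names are ours.

```python
# Sketch: chaining the quoted Business Winstone 2002 percentages.
XP2800_OVER_A64 = 1.085   # Athlon XP 2800+ is about 8.5% faster than Athlon 64 2800+
A64_OVER_XP1600 = 1.08    # Athlon 64 2800+ is about 8% faster than Athlon XP 1.6GHz
XP2800_CLOCK_MHZ = 2083   # Athlon XP 2800+ core clock
XP1600_CLOCK_MHZ = 1600   # downclocked Athlon XP used for comparison

# Relative speedups multiply: XP 2800+ vs. XP 1.6GHz via the Athlon 64 result.
xp_speedup = XP2800_OVER_A64 * A64_OVER_XP1600
clock_ratio = XP2800_CLOCK_MHZ / XP1600_CLOCK_MHZ

print(f"clock ratio: {clock_ratio:.3f}")       # 1.302
print(f"Athlon XP speedup: {xp_speedup:.3f}")  # 1.172
print(f"share of the clock gain realised: {(xp_speedup - 1) / (clock_ratio - 1):.0%}")
```

In other words, a 30% higher clock buys Athlon XP roughly 17% in this benchmark, which helps explain why Athlon 64 can land about midway between the two Athlon XP results.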


To our great disappointment, we didn’t manage to run the Multimedia Content Creation Winstone 2003 test on the Athlon 64 based system. The same error kept popping up during this test, though luckily it had nothing to do with problems of the tested platform.

AMD is not very enthusiastic about the SYSmark2002 benchmark, believing that its testing algorithms favor the competitor’s products. Maybe this is true. What is worth your attention here, however, is that Athlon 64 continues to lag behind the Athlon XP of the same rating. The lag is not as big as the one we saw in Business Winstone 2002, though: our hero is only 2-3% behind Athlon XP 2800+.

During mp3 encoding with the lame codec the new Athlon 64 fails completely. Audio encoding is a task that requires high “pure” CPU performance, which is a problem for Athlon 64 because of its low core clock frequency. So, as you can see on the diagram, the result of Athlon 64 2800+ is close to that of Athlon XP 1.6GHz.

And during data compression with the WinRAR utility, Athlon 64 shows its best. Thanks to its large cache, which can hold most of the dictionary, and its low memory latency, Athlon 64 2800+ easily outperforms not only Athlon XP 2800+, but even Intel Pentium 4 2.8C with its 800MHz bus and Hyper-Threading technology.

Video encoding into mpeg-4 format changes the whole picture again. Hyper-Threading technology allows Intel Pentium 4 2.8C to get ahead of the other contestants, while Athlon 64 2800+ is again behind Athlon XP 2800+, though only slightly this time.

Windows Media Encoder 9 is not the best application for Athlon 64. Its performance here is close to that of Pentium 4 2.53GHz, which can hardly be called a great achievement.


Well, let’s find out how things look for Athlon 64 in games.

In 3DMark03 Athlon 64 2800+ is finally ahead of Athlon XP 2800+. However, Intel Pentium 4 2.8C is still the definite leader.

CPU Score is a parameter obtained while the system CPU emulates vertex shaders. That is why a fast ALU and high memory bandwidth matter a lot in this test. As a result, Athlon 64 2800+ is again not only behind Pentium 4 2.8C, but also behind Athlon XP 2800+.

In 3DMark2001 SE Athlon 64 2800+ managed to almost catch up with Pentium 4 2.8C and to outperform Athlon XP 2800+ quite significantly.

A similar picture can be seen in Return to Castle Wolfenstein, a game built on the Quake3 engine.

Unreal Tournament 2003 has always run fast with Athlon XP processors. However with Athlon 64, this game is even faster. Athlon 64 2800+ is the indisputable leader here.


Now let’s take a look at the results of the scientific ScienceMark 2.0 test:

Athlon XP used to be rightly considered the best CPU for scientific calculations thanks to its powerful three-pipeline FPU. Athlon 64 retained this FPU, but its clock frequency dropped. As a result, the newcomer cannot work as fast as its predecessor.

And what about 3D rendering?

Rendering in 3ds max 5 is a typical computational task. Just like in the scientific tests, the architectural improvements introduced in Athlon 64 can’t make up for its low working frequency. As a result, Athlon XP 2800+ shows 12% higher rendering speed than Athlon 64 2800+. Nevertheless, the enhancements in Athlon 64 do pay off in this benchmark: the clock frequency difference is over 30%, while the difference in results is much smaller.
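The point about the clock gap being bigger than the result gap can be expressed as work done per clock cycle. This is a back-of-the-envelope sketch built on the rounded 12% figure and the two clock frequencies quoted in the article; the helper name is hypothetical.

```python
# Sketch: per-clock rendering throughput of Athlon 64 relative to Athlon XP.

def per_clock_ratio(speed_ratio: float, clock_a_mhz: float, clock_b_mhz: float) -> float:
    """Work per clock of chip A relative to chip B, where speed_ratio = speed A / speed B."""
    return speed_ratio / (clock_a_mhz / clock_b_mhz)


# Athlon XP 2800+ (2083MHz) renders 12% faster than Athlon 64 2800+ (1600MHz),
# so the Athlon 64 delivers 1/1.12 of the Athlon XP rendering speed.
a64_vs_xp = per_clock_ratio(1 / 1.12, 1600, 2083)
print(f"{a64_vs_xp:.2f}")  # 1.16
```

By this estimate the Athlon 64 does roughly 16% more rendering work per clock cycle than the Athlon XP, which is the payoff of the architectural enhancements mentioned above.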

In Lightwave 7.5 the rendering speed depends a lot on the scene type. In some cases Athlon 64 shows its best thanks to the larger L2 cache, SSE2 support and low memory latency. In other cases its performance gets close to that of Athlon XP 1.6GHz.

Again, this test is dominated by calculations, and Athlon 64 can’t cope with them successfully. By the way, as soon as Intel introduced Hyper-Threading support in its Pentium 4 processors, their performance in rendering applications grew immensely.


Conclusion

The major conclusion we can draw from this test session is as follows. Even though the internal architecture of Athlon 64 processors is very similar to that of Athlon XP, they still differ from their predecessors quite significantly from the practical point of view. We can’t give you a definite answer to the question of whether Athlon 64 has become any faster than Athlon XP. In fact, it is simply a different processor.

Moreover, there is not much we can say about the performance of Athlon 64 in 64bit applications, or at least in 32bit applications under 64bit operating systems. Supposedly, x86-64 will ensure a significant performance improvement, but it is also quite possible that x86-64 will not receive a warm welcome from software developers. AMD has already tried to promote its own instruction set before, and that experience can hardly be regarded as a success: the 3DNow! instruction set failed to become widespread even though it proved very convenient to work with. So, we can only wait for the first signs for or against these suppositions.

Speaking about the performance of Athlon 64 in traditional 32bit applications, we can say that this new CPU boasts a few remarkable strengths: a large L2 cache, a high-performance memory subsystem and SSE2 support. On the other side of the scale is its relatively low core frequency. As a result, we see either a performance boost or a performance drop, depending on the particular application and its critical parameters.

For example, Athlon 64 is not very successful in traditional computational tasks, such as scientific calculations or 3D rendering. But as soon as we get to games or data compression, it is beyond any competition. In general, if we compare the performance of Athlon 64 2800+ with that of Athlon XP 2800+, we have to admit that the latter was slower than today’s hero in quite a few benchmarks.

And in conclusion, I would like to remind you once again that this article is nothing but a first look at AMD Athlon 64 performance. By the time it is launched officially this fall, a lot of things may change. For instance, AMD may raise the core clock frequency of these processors.
