Benchmark results for AMD’s quad-core server processors in the CPU2006 benchmark suite from Standard Performance Evaluation Corp. were recently published at Spec.org, confirming that AMD’s new microprocessors have an overwhelming advantage in floating-point performance but cannot boast exceptional integer performance.
The new test results, obtained on dual-processor machines, contradict those posted by Advanced Micro Devices earlier this year, primarily because Intel has released a new compiler that tangibly boosts the performance of its chips. As a result, many of the advantages that AMD’s quad-core Opteron microprocessors might have had earlier this year have faded away, and AMD is likely to find it rather hard to compete against Intel going forward.
According to the test results available at Spec.org, the quad-core AMD Opteron processor has no advantage over the quad-core Intel Xeon processor in integer computations at the same clock speed. Nevertheless, the new micro-architecture of AMD’s quad-core processors allows the chip to outperform Intel quad-core Xeon central processing units by 26% when it comes to floating point computations.
Even though AMD may feel comfortable about its floating-point performance, as a server with two Opteron 2350 (2.0GHz) chips outperforms a similar server with two Intel Xeon X5365 (3.0GHz) processors by 12% in CFP2006 Rates, the company should definitely worry about the performance of its parts in CINT2006 Rates going forward.

Later this year Intel is on track to release quad-core Xeon processors made using 45nm process technology that feature up to 12MB of level-two cache as well as a 1600MHz processor system bus. While the clock speeds of the newcomers with 12MB of cache will generally remain at the current level, in the range of 3.0GHz, there will also be two models with 6MB of cache that will operate at 3.33GHz and 3.40GHz. Larger caches, higher bus speeds and higher operating frequencies will allow Intel to strengthen its position.
AMD will not sit still either: it has already promised to deliver quad-core AMD Opteron processors clocked at up to 2.50GHz in the fourth quarter of the year. Perhaps that clock speed will allow AMD to maintain a slight lead over Intel’s top-of-the-range offering in SPECfp_rate2006; however, it is unlikely that AMD will manage to outperform Intel’s forthcoming chips in SPECint_rate2006.
Given that there will hardly be a clear winner in terms of performance, performance per watt or any other criterion, the competition between the two leading makers of x86 microprocessors will only heat up.
Comments currently: 16
Discussion started: 09/27/07 02:31:12 AM
Latest comment: 09/30/07 12:10:48 AM
1.
So we have Intel's magical compiler that is not that useful in the real world, and has been known to insert code checking for GenuineIntel rather than CPU features...
Still, AMD use it as well, so it is still the best compiler for their platform, even if it doesn't give the same level of optimisations.
I think that when AMD gets the clock speeds up to 2.5GHz and above we will see the advantages of AMD's platform scalability - the 2.5GHz Barcelona will probably match the 2.66GHz Xeon in SPECint_rate or be quite close to it.
(You could add a clock scaling column to the tables, to show how well the systems are scaling with clock speed increases - for instance for a 50% increase in clock, the Xeon only improves 26% in SPECint. Even odder is that Barcelona gains 5% clock speed, but gets 7% faster in SPECint - must be memory controller quirks)
[Posted by: Syko | Date: 09/27/07 02:31:12 AM]
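The clock-scaling column suggested in comment 1 can be sketched as follows. This is a minimal illustration rather than part of the published results: the function name `scaling_efficiency` is hypothetical, and the inputs are simply the percentages quoted in the comment (a 50% clock increase giving 26% more SPECint for the Xeon, and a 5% clock increase giving 7% for Barcelona).

```python
def scaling_efficiency(clock_gain_pct: float, perf_gain_pct: float) -> float:
    """Return the share of a clock-speed increase that shows up as benchmark gain.

    1.0 means performance scales perfectly with clock speed; values below 1.0
    suggest another bottleneck (for example the bus or memory); values above
    1.0 mean the score rose faster than the clock.
    """
    if clock_gain_pct <= 0:
        raise ValueError("clock_gain_pct must be positive")
    return perf_gain_pct / clock_gain_pct


# Percentages taken from the comment, not from the SPEC result tables.
xeon_scaling = scaling_efficiency(clock_gain_pct=50.0, perf_gain_pct=26.0)
barcelona_scaling = scaling_efficiency(clock_gain_pct=5.0, perf_gain_pct=7.0)

print(f"Xeon scaling efficiency:      {xeon_scaling:.2f}")
print(f"Barcelona scaling efficiency: {barcelona_scaling:.2f}")
```

A single ratio like this makes the two platforms comparable at a glance, although with only two data points per platform it says nothing about where the bottleneck actually is.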
I thought AMD predominantly used a compiler built by Portland Group?
[Posted by: anon | Date: 09/27/07 05:46:32 AM]
2.
I can see a raw pair on integer and a strong win on FP for the new AMD chip. No doubt, these Barcelona are my need for speed!
[Posted by: Giganticus | Date: 09/27/07 03:21:55 AM]
3.
The new Barcelonas are certainly very worthy chips. I can't wait to build a 3.0GHz Phenom X4 system in Q1'08. Heck, I may hold out for a socket AM3 box, and things will certainly get interesting around that time. Hopefully AMD gets that to market before Nehalem.
Also, I hope everybody sees that even the 2.0GHz and slower K10s outperform the 3.0GHz Intel Xeons in FP applications. These slow K10s are not completely decimated in INT either.
[Posted by: Wingless | Date: 09/27/07 08:17:06 AM]
4.
The data compiled here is using Clovertown.... Harpertown has already been shown to pretty much dominate INT and edge out in FP against the 2.5 GHz (comparing top bins), with perf/watt squarely Harpertown -- and this is crippled with FBDIMMs -- see Anand or Techreport -- the only two sites that had barcey samples at launch.
DT is gonna be squarely in Intel's corner, unfortunately ... DT workloads are not hampered by Intel's FSB limitations like server.
[Posted by: JumpingJack | Date: 09/27/07 09:24:21 AM]
> see Anand or Techreport
You could as well say: see Intel's web page. Maybe you did not notice, but the reports on those pages are totally biased toward Intel. For example, in the price/watt article on Anand, he put 7 coolers in the AMD machine and 3 in the Intel one (and Intel's heat dissipation is much higher than AMD's), and also 8GB in the AMD machine and 4GB in the Intel one, in tests where a large amount of RAM is not important (and everyone knows that Intel eats more energy when 4GB more is added).
Conclusion: don't trust sites where 60% of the site is covered with Intel ads and at the bottom of each article there is a link to Intel's site. If you continue to read that **** you will eventually buy Intel and get garbage which costs 30% more than a similar or better AMD platform/chip.
[Posted by: BorgDrone | Date: 09/27/07 09:24:24 PM]
5.
I wonder how the 3GHZ Barcelonas will run on a Cray Super Computer...
[Posted by: huh | Date: 09/27/07 10:19:40 AM]
6.
K10 > Core 2
Bulldozer vs Nehalem ?
[Posted by: Simply | Date: 09/27/07 10:32:11 AM]
7.
Last I checked, SPECfp Rate does not equate to overall fp performance. It is a throughput measurement. Where are the SPECfp comparisons?
Just like AMD has been doing all along; focusing only on throughput of fp because raw computations of fp are not up to the same task.
[Posted by: Venatici | Date: 09/27/07 01:57:26 PM]
8.
"the new micro-architecture of AMD’s quad-core processors allows the chip to outperform Intel quad-core Xeon central processing units by 26% when it comes to floating point computations."
The reference to general "floating point computations" in this statement is a fallacy. The Spec fp Rate test mentioned here is actually a floating point THROUGHPUT test, not a test of floating point computation. This is a common mistake, so I'm not surprised but you need to correct your article.
[Posted by: ar | Date: 09/27/07 02:23:09 PM]
9.
what exactly is the floating point throughput?
[Posted by: 31415 | Date: 09/27/07 06:11:55 PM]
It's how fast the machine can do floating point calculations (ie: floating point performance). Yes, neither "ar" nor "Venatici" know what they're talking about.
Specifically, they run N copies of some benchmark at once (N being the number of cores in the machine), time how long it takes to finish, and then take the reciprocal (and multiply by a constant to give a particular machine a rate of 1). To get a non-rate number, you can more or less just divide by the number of cores. The only way to "cheat" on a rate number is to have more cores than the competitor (two quad core Opterons vs two dual-core Xeons), but that's not the case here (both are quad core).
Now, on to the interesting bits ... it's amazing how poorly Clovertown scales with clock speed. Despite a 50% increase in clock speed, the performance only goes up by 18%. This suggests that it's not the CPUs holding the system back, it's either the FSB or the memory. Given the huge (relative) performance hit the E5320 takes, I'd say it's the FSB. Looking at the green camp, Barcelona scales nearly perfectly going from 1.9 GHz to 2.0 GHz.
So, what about Harpertown? Real-world benchmarks so far show that Harpertown outdoes Clovertown (in a 2S system, running MP-capable benchmarks) by ~10% at 3 GHz. This would give Intel's 3.2 GHz Harpertown a specfp base rate of about 74, which is far different from the 86.3 they are claiming. So Intel has definitely found a tweak to pull a much higher specfp rate score, quite probably SSE4. This 74 number is more or less what a 2 GHz Barcelona gets, which suggests that for non-SSE4 apps (i.e. most fp apps already released and to be released in the next year or two), a 2 GHz Barcelona will still be competitive with a 3.2 GHz Harpertown.
Finally, int performance is quite disappointing, and a real concern. Barcelona market placement is going to be server-centric for a good while, and if it can't compete there then it's a big problem. On the other hand, Clovertown doesn't exhibit poor scaling in int, and Harpertown doesn't really add much in the way of integer performance, so AMD's got a little while to fix things up.
[Posted by: Cynic | Date: 09/27/07 10:48:45 PM]
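The rate calculation described in the comment above can be sketched roughly as below. The sketch assumes the commonly documented SPEC CPU2006 convention that each benchmark's rate ratio is the number of copies multiplied by the reference time divided by the elapsed time, with the overall figure taken as the geometric mean of those ratios. The function names and all timings are hypothetical and chosen only to make the arithmetic easy to follow; they are not published results.

```python
from math import prod


def rate_ratio(copies: int, reference_seconds: float, elapsed_seconds: float) -> float:
    """Throughput ratio for one benchmark: copies * (reference time / elapsed time)."""
    if copies < 1 or elapsed_seconds <= 0:
        raise ValueError("copies must be >= 1 and elapsed_seconds must be > 0")
    return copies * reference_seconds / elapsed_seconds


def overall_rate(ratios: list[float]) -> float:
    """Geometric mean of the per-benchmark rate ratios."""
    if not ratios:
        raise ValueError("at least one ratio is required")
    return prod(ratios) ** (1.0 / len(ratios))


# Hypothetical 2-socket quad-core machine: 8 copies of each benchmark at once.
COPIES = 8
ratios = [
    rate_ratio(COPIES, reference_seconds=1000.0, elapsed_seconds=800.0),  # 10.0
    rate_ratio(COPIES, reference_seconds=2000.0, elapsed_seconds=400.0),  # 40.0
]
score = overall_rate(ratios)

# Dividing by the copy count gives the rough per-core figure the comment mentions.
per_core = score / COPIES

print(f"rate score: {score:.1f}, per-core approximation: {per_core:.2f}")
```

Because every copy runs at the same time, anything the copies share (bus, memory controller, cache) is reflected in the elapsed time, which is why the rate figure behaves as a whole-system throughput number rather than a single-thread speed.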
> What exactly is floating point throughput?
In English, throughput is exactly that - it measures how big your pipe is. Since in this test they are running N copies of the program - it IS measuring throughput. But once the stuff has come in over the pipe, the floating point hardware needed to compute the results is just as good or as fast.
It is the act of running N copies that makes it a throughput test, which admittedly is a bottleneck for intel. But you cannot say that intel's floating-point hardware which is used to do the actual computation is any slower.
[Posted by: ar | Date: 09/28/07 03:31:35 PM]
Like I said, you clearly don't understand what you're talking about. "Throughput" doesn't exclusively refer to memory bandwidth. "But once the stuff has come in over the pipe" - what pipe?
What throughput means in this case is simply the rate at which it can do floating point computations - I'm not sure how else you would want to measure performance. While specfp_rate might not be accurate for single-threaded fp calculations on a multi-core CPU, there's no point having a quad core CPU if you're only ever going to be using a single core.
[Posted by: Cynic | Date: 09/29/07 12:06:47 AM]
I think you're being intentionally obtuse
[Posted by: ar | Date: 09/30/07 12:10:48 AM]
10.
The SPEC benchmark is useless for comparing processors, and it is hard to know from it how well a processor will do in the real world. The speed of the processors matters when they are compiling data for today's programs. Each program is written differently. To me, the numbers are close, so it is hard to figure out which one is better until real-life environment benchmarks are done. Though in every benchmark, I ignore all SPEC results.
Anandtech just benchmarked two systems, AMD Barcelona and Intel Clovertown, in a database environment. AMD processors are worse for database environments because of the small cache and some other factors. AMD is better suited to algorithms which rely on floating point numbers for accuracy.
nVidia's GeForce8 series and AMD R500 or above will be better suited in super computers instead of processors.
[Posted by: linuxnerd | Date: 09/28/07 03:17:17 PM]