It looks like there is a way to boost performance of AMD microprocessors and accelerated processing units in applications that rely on old-school x87 processing. Performance is there, just push the button. But does this additional performance have any value in real life?

Advanced Micro Devices has received a fair amount of criticism after its long-awaited Bulldozer micro-architecture and then its Piledriver successor failed to meet performance expectations. While AMD’s multi-core central processing units do show decent speed with optimized software, they still cannot compete with Intel Corp.’s premium offerings in demanding applications. As it turns out, software makers are not the only ones who need to tweak their products. With some BIOS tuning, Stilt, a performance enthusiast from Finland, has managed to significantly boost the results of an AMD Piledriver-based microprocessor in the SuperPi benchmark.

Apparently, AMD microprocessors with Bulldozer and Piledriver cores come with certain registers disabled and a certain block called NRAC enabled. When the registers are enabled and the block is disabled, performance in the SuperPi benchmark rises by 14% to 18% on the AMD A10-6800K APU.

“I was doing some low level testing for other purposes, I found something that did not make any sense to me. [...] I roughly know what it is and what it does, but still some questions remain: Why does this ‘feature’ exist in the first place and why it is activated on all 15h [AMD Bulldozer and similar] family parts. I would normally assume it is a workaround for some errata, however no bulletin exists for this one either. Also this feature does not exist in any documentation, or it does but only AMD has access to the required level. I find it hard to believe that it would be a design issue as the affected instructions work fine (but slowly) and it existed since early Zambezi revisions and, currently is still present in Richland and probably beyond (within family 15h),” wrote Stilt on the XtremeSystems forum (the message was reposted on

According to performance tests conducted by Ilya Gavrichenkov, CPU performance analyst at X-bit labs, the tweak, which is available for download (not recommended by X-bit labs), does improve performance in SuperPi, which relies on x87 computing as well as on many other things that influence performance in single-threaded applications (e.g., memory, cache size, cache latencies, etc.). However, it looks like the tweak has no effect on performance in modern applications in general, since they rely on SSE/AVX processing, and it does not work in multi-threaded apps in particular. In fact, there are so few programs nowadays that are limited by poor FPU/x87 performance that they simply do not matter.

It is obvious that the performance tweak works in SuperPi, but there is no evidence that it can boost performance in other applications. On the other hand, it is not certain that the latest processors from AMD lack other undocumented features that could improve their performance. Perhaps, with further tweaking, it is possible to speed up many other apps, not only SuperPi. Unfortunately, a way to lift the performance of recent AMD CPUs in all programs with a BIOS tweak has not been found yet.

AMD did not comment on the news story.

There are many issues that result in AMD's processors not doing well in BENCHES. A poor BIOS implementation is only one. The benches have been poorly written for years (by design I might add), so that they show Intel processors in the best light at the expense of AMD processors. That is why in real world applications AMD processors always perform better than they appear in the benches. It's not difficult to understand who benefits from Intel processors looking superior in benches that the naive enthusiasts use to determine performance.

Years ago the GPU makers figured out how to write algorithms to inflate benchmark performance. Anyone in the know is aware of these games and is smart enough to use real applications to determine system performance, instead of faulty benches.
14 7 [Posted by: beenthere  | Date: 06/22/13 09:38:46 AM]
5 10 [Posted by: amdzorz  | Date: 06/22/13 10:43:32 AM]
Basically just guessing here, but I think AMD could fix this if they really tried. They would have to have some pretty smart engineers though. Whether it be all hardware-side or maybe just get Microsoft to make some modifications to work with their hardware changes or whatever, I think it could definitely be done.

Implying it's not already fixed though.
5 2 [Posted by: TEELOT  | Date: 06/22/13 11:12:04 AM]
2 6 [Posted by: Tukee44  | Date: 06/22/13 02:28:52 PM]
There are plenty of smart engineers at AMD.

If you think illegally cornering the market and wasting billions of dollars in brute force design is 'smart' then you aren't fit to comment on this website.
5 3 [Posted by: mmstick  | Date: 06/22/13 05:48:48 PM]
Cornering the market is called "winning".
2 2 [Posted by: AnonymousGuy  | Date: 06/23/13 07:25:22 AM]
Yeah, like AMD did winning all consoles from Intel and Nvidia.
3 0 [Posted by: Atlastiamhere  | Date: 06/24/13 01:43:31 AM]
Winning is up for interpretation. Intel cheating their way to the top is a more appropriate way to define their idea of "cornering the market"
0 0 [Posted by: veli05  | Date: 06/25/13 01:12:22 PM]
Not sure if you're referring to Intel's pay-off to manufacturers not to use AMD, or to AMD's success in winning the contracts to supply chips for all 3 gaming consoles.
1 0 [Posted by: anubis44  | Date: 06/24/13 08:11:05 PM]
Well, my friend, your hypothesis about a flowering garden of bright AMD engineers could even stand on its own, if we didn't know how many people AMD fired recently, and that only managerial c-rap on bonus steroids stays there to call those empty AMD labs a home. They stink on their own. They don't need "smart engineers".
0 0 [Posted by: OmegaHuman  | Date: 06/25/13 03:08:25 AM]
Jim Keller is. He created the Athlon 64.
4 0 [Posted by: TEELOT  | Date: 06/22/13 11:29:24 PM]
@amdzorz - Lisa Su recently spoke at Computex in a 45 minute talk launching Richland about "the competitor's" marketing in using old benchmark software to compare new processors.

Go to the 17:10 minute mark for her discussion on benchmarks.
3 2 [Posted by: linuxlowdown  | Date: 06/22/13 08:45:05 PM]
Lisa Su is just an overpaid spokeswoman. For her it's a stretch to see a blue LED and know that the PC is powered on.
1 2 [Posted by: OmegaHuman  | Date: 06/25/13 03:10:22 AM]
Are you sexist? She is a professor of electrical engineering, to name just one of her qualifications.
1 1 [Posted by: linuxlowdown  | Date: 06/25/13 03:31:12 AM]
This man is absolutely correct.

Benchmarks are often not only written for Intel, but they use Intel's compiler and are sabotaged without the author even knowing it.

Intel's compiler has been found to intentionally run slower on AMD processors. See these tests and the FTC filing on Cinebench:

In a related note, you can disable Intel's compiler from checking if your processor is AMD (no it doesn't break anything, it just makes it think all processors are Intel). Use "Intel Compiler Patcher" to remove it.

It's been found that AMD processors magically run a good 10% faster when Intel's rigged compiler is stopped.
14 3 [Posted by: TEELOT  | Date: 06/22/13 11:09:56 AM]
2 8 [Posted by: AnonymousGuy  | Date: 06/23/13 07:22:02 AM]
There is a difference between optimizing for your own products and purposely seeking out and sabotaging the competition's.

If you had read the links, you would see that a reverse-engineering of the Intel compiler reveals that it specifically checks if the processor is made by AMD. It does not check capabilities, generation, clock speed, or year. It checks the BRAND OF THE CHIP.




4 1 [Posted by: TEELOT  | Date: 06/24/13 04:51:00 PM]
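
The distinction drawn in the comment above, dispatching on the CPU brand rather than on what the CPU can actually do, can be illustrated with a small sketch. This is a simplified model and not code from any real compiler: the `Cpu` class, the two dispatch functions and the path names are hypothetical, and only the vendor strings `GenuineIntel` and `AuthenticAMD` are the real values reported by the CPUID instruction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    """Hypothetical stand-in for what CPUID reports about a processor."""
    vendor: str          # CPUID vendor string
    features: frozenset  # instruction-set extensions the chip supports


def dispatch_by_vendor(cpu: Cpu) -> str:
    # Brand check: the fast path is taken only when the vendor string
    # matches, even if another vendor's chip supports the same feature.
    if cpu.vendor == "GenuineIntel" and "sse2" in cpu.features:
        return "sse2_path"
    return "generic_path"


def dispatch_by_features(cpu: Cpu) -> str:
    # Capability check: any chip that reports the feature gets the fast path.
    if "sse2" in cpu.features:
        return "sse2_path"
    return "generic_path"


intel = Cpu("GenuineIntel", frozenset({"sse2"}))
amd = Cpu("AuthenticAMD", frozenset({"sse2"}))

for cpu in (intel, amd):
    print(cpu.vendor, dispatch_by_vendor(cpu), dispatch_by_features(cpu))
```

With identical feature sets, the two chips get the same code path only under the capability check; under the brand check the non-Intel chip falls back to the generic path.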
1 4 [Posted by: amdzorz  | Date: 06/24/13 07:44:28 AM]
How the threads are assigned in AMD CPUs has nothing to do with the fact that most benches are specifically designed to artificially augment Intel's bench scores by nerfing AMD's. Yes, AMD and Intel CPU architectures are very different. The point TEELOT, beenthere, and many others are trying to make is that Intel's fame is a load of shit in terms of how they perform in the benchmarks, because of how ridiculously biased the benchmarks are. It's a fact that computer enthusiasts who seek out the fastest hardware on the market (read: Intel CPUs in benches) are willfully ignoring, it seems.
0 0 [Posted by: veli05  | Date: 06/25/13 01:19:45 PM]

As the real-app-cum-benchmark for all popular and not so popular CPUs, it seems the x264 devs are continuously optimising for cycle counts as well as quality every single day, for better performance on any given core, including AMD's.

And other than the very latest AMD FX8350 (I'm keeping an eye on that branch in case it's not a one-off improvement, and may buy them in the future if the 200 watts usage comes down a lot...), Intel saves you time and power today.
0 0 [Posted by: sanity  | Date: 06/25/13 02:49:46 PM]
0 1 [Posted by: ViddyOCarrera  | Date: 01/13/14 10:04:22 AM]
My primary system has an FX8320. This is the FIRST AMD CPU I've ever purchased for my own personal use. And it is because AMD has produced a CPU that benchmarks better than most of Intel's lineup. It stomps on anything in Intel's sub-$300 range, beats the $400 range comfortably, beats or matches the $500-$600 range, matches the $800 range and nips at the heels of the $1000 range. And that is before it is overclocked. Once you overclock, forget it. The 8320 & 8350 running at 4.4GHz and above beat Intel's offerings by a respectable margin, and all for 15% to 30% of the cost... I did my homework on this round of upgrades and the numbers said AMD had the winners, both in raw performance and from a "Bang for Buck" perspective. And before any of you call me a cheapskate, I had budgeted up to $550 for a CPU. I spent $170 on eBay [and $110 for the board on Newegg].

So far the actual performance numbers I'm getting match what has been posted on various websites.

Now to address this problem. I have had none of the issues being talked about here. Everything runs smooth and easy. Perhaps it's because I bought a nice board [AsRock 990 Extreme4]. Perhaps it is a chipset issue?
1 1 [Posted by: LexLuthermiester  | Date: 06/24/13 05:35:38 PM]
Liar :

your 8320 loses all 25 benches and draws more power. 315$
2 3 [Posted by: amdzorz  | Date: 06/25/13 07:43:43 AM]
LOL you can't down vote the truth away
1 2 [Posted by: amdzorz  | Date: 06/25/13 01:42:47 PM]
Ok that link points to one benchmark set which shows you can't read and that these sub-$200 CPU's hold their own against $300-$350 CPU's.

Thanks for putting your foot in your mouth for all to see. Good for a laugh, eh?
0 0 [Posted by: LexLuthermiester  | Date: 07/25/13 10:38:49 PM]
Yap, yap. Yaba daba doo. You're right, benches are poorly written, but who needs benches for a real job??

These "hidden features" could only signify that the multi-FPU-coprocessors-per-module idea AMD is pushing for x86 is REALLY working (wow, we discovered fire), but still only for FLOAT crunching and the like, which is heavily obsolete nowadays (since the AMD64 instructions). So why rant about how great a crappy benchmark like SuperPI is and how we discovered "hidden features"? A similar thing, but at the memory level, was shown in the days when Intel pushed Rambus and their infamous Tualatins performed better than the actual P4 Willamettes and the MMX-only T-bird Athlons. But who cared, when for the longevity of the PC itself memory bandwidth is what mattered? The same applies here: the FPU as a pure FLOAT cruncher is only here for compatibility reasons (pre-2004 apps), not for some wannabe geeks to show how big their weepee is.
1 0 [Posted by: OmegaHuman  | Date: 06/25/13 02:50:54 AM]
A BIOS does not relate to CPU performance. The BIOS is the first program that the CPU accesses, to set up the CPU and other components in the right state and run diagnostics. Claiming that benchmark programs are biased toward Intel, so AMD gets screwed, is just a laughable statement. AMD and Intel usually have different microarchitectures. Running the same code on different microarchitectures will give different performance outcomes. The only time AMD and Intel were not apples to oranges was Stars vs. Nehalem. Both have the same microarchitecture, and it shows that Intel can create microcode that makes the Nehalem microarchitecture run more efficiently than the Stars microarchitecture. The microcode is software that tells the microarchitecture how to function. AMD can make good hardware, but their microcode just stinks compared to Intel's.

What Stilt did is change the state of some registers to run a particular benchmark better. These registers could enable some features of the microcode that may make the processor being benchmarked fail a certain 80x86 test. These benchmark enthusiasts go to the extreme, without caring whether the computer can function in production environments.
1 0 [Posted by: tecknurd  | Date: 06/25/13 07:32:37 PM]

So explain, beenthere, why the real-life, highest-visual-quality x264 encoder, which is SIMD assembly integer-optimised for AMD and Intel (each new speed patch routine is benched down to the picosecond with their open tools before being included in the core codebase), even semi-optimised for ARM Cortex SIMD (needs more work yet though, of course), and generically multithreaded, has always run best on Intel so far.
3 1 [Posted by: sanity  | Date: 06/22/13 10:29:53 AM]

The only instance where AMD beats a comparable Intel chip at x264 encoding is if you do 2-pass (average bitrate/ABR) encoding. Most people who use x264 care about video quality, and will opt for CRF or QP encoding instead (both of which are 1-pass).
3 3 [Posted by: trumpet-205  | Date: 06/22/13 02:39:12 PM]
Phoronix would like a word with you:
3 2 [Posted by: mmstick  | Date: 06/22/13 02:47:12 PM]
1 4 [Posted by: trumpet-205  | Date: 06/22/13 02:57:05 PM]
Way to go by ignoring the entire Phoronix Benchmark Suite.... the only reliable set of benchmarking tools on the planet.
4 3 [Posted by: mmstick  | Date: 06/22/13 04:59:24 PM]
And what if you use the version of Handbrake that is optimised for OpenCL? Intel's best can't keep up because its GPUs are THIRD rate.
4 4 [Posted by: linuxlowdown  | Date: 06/22/13 08:49:32 PM]
Software encoding produces better image quality than OpenCL, especially when encoding video to a small size/low bitrate. Personally, I never use OpenCL or Quick Sync for encoding.

Also, Intel graphics are not bad at OpenCL. The HD 4600 is very competitive with the A10 APU.

Here, check some OpenCL benchmarks

1 3 [Posted by: maroon1  | Date: 06/23/13 04:14:55 PM]
"Software encoding produce better image quality than OpenCL"

What do you mean by this? OpenCL is about programmers offloading floating point calculations onto the GPU. AMD worked with Handbrake to get this up as an example of where an APU excels.
4 1 [Posted by: linuxlowdown  | Date: 06/23/13 06:00:35 PM]
OpenCL (or any hardware accelerated encoding) takes a big hit on video quality when compared against CPU on the same bitrate.

If you want speed, you go with OpenCL/CUDA. You want quality, you go with x264 CPU CRF/QP encoding (1-pass).
1 0 [Posted by: trumpet-205  | Date: 06/29/13 05:48:06 PM]
What drugs are you on? Or are you simply ignorant?

OpenCL IS software! Its purpose is to shift the computational tasks from one hardware environment to another, i.e. from the CPU, which is not great at FP, to a GPU, which is. That's it. The task is otherwise identical and the results likewise identical.
0 0 [Posted by: LexLuthermiester  | Date: 07/25/13 10:53:00 PM]
maroon, it depends on the software used for the encoding. CUDA-based encoding has gotten a lot better, as has OpenCL encoding.

It's not the nature of using a GPU or the like to speed up encoding that's the issue, it's just a maturity thing; encoders have come a long way in the last couple of years. (I say this as a guy who's done A LOT of encoding.)

MPEG-2 encoding can be optimized for various CPUs. The biggest thing I see that speeds it up or slows it down, though, isn't CPU speed, it's drive speed and bandwidth. If you use a RAM drive it can help. Intel has more memory bandwidth on desktop though, with tri and quad channel vs. dual.

Oh and btw, OpenCL encoding is still software encoding. It's software running on the CPU and GPU/APU rather than just the CPU, but it's still software.

If you want hardware encoding, you gotta dig around for cards that support H.264/AVC (cost way too bloody much) or use a card with hardware MPEG support..... though why you would pay extra for something even my ancient Athlon XP could do at way faster than real time... I don't know.
2 1 [Posted by: Azure Sky  | Date: 06/24/13 02:24:31 AM]
"Stirring with a red spoon produces worse batter than if it had been stirred with a blue one."

Sorry? They do the same exact job and spit out the same data, so I don't see how OpenCL could produce "worse quality" encodings?
3 0 [Posted by: TEELOT  | Date: 06/24/13 04:53:30 PM]
It's actually very simple really....
Only the decoders are mandated to provide bit-exact output.

The (H264 etc.) encoders are perfectly able to use whatever options they care to invent to process a given input in whatever fashion they like, to get the given visual quality or not, as the case may be..

There's no mandate or rule that states their final output must in fact be compatible with a given codec standard. That's why the so-called GPU encoders produce poorer visual output, as H264 and virtually everything else video related is integer based, not GPU floating point based.

It's really obvious, and yet no one has seen fit to do it to date (let's hope they do ASAP, before we need to encode real ultra HD 7000x4000 + 128 audio channels, or whatever it will be for real by 2020): if some GPU vendor, be it an x86 or ARM SoC vendor, finally puts a full set of integer, video-targeted SAD, SIMD etc. hardware microcode on a future GPU core, then we might finally see some good performance and visual quality x264/ffmpeg/avconv patches being written that the downstream apps can then use properly at speeds better than the CPU alone.
0 1 [Posted by: sanity  | Date: 06/25/13 04:41:59 PM]

Great links with interesting discussions posted by TEELOT
4 1 [Posted by: Tud Bar  | Date: 06/22/13 12:51:05 PM]

The solution: HSA (heterogeneous computing) is a strong opportunity for AMD and others to develop a universal, platform-independent compiler.
3 3 [Posted by: Tud Bar  | Date: 06/22/13 05:28:08 PM]

I think a bios update or a patch from our friend here is in order.. If they won't fix it, we will.. right..
2 1 [Posted by: Derek Smith  | Date: 06/22/13 05:59:10 PM]

AMD should not allow benchmarks with proprietary code that are not OPEN SOURCE.

They should take the big vendors of proprietary benchmark software to court and make them justify the results, the weightings of sub-tests, etc.

It is not right that a few insignificant companies can manipulate the public on a scale of billions of dollars.

Why does the weighting of benchmarks not use statistics on how often the different categories of software are used: website (55%), video playback (20%), photo (15%), Office (5%), etc....

and partition the score based on this kind of statistics?

Why should I take into consideration a benchmark involving video encoding? It is a practice for idiots! And we find that Futuremark, for example, gives a lot of weight to video encoding, image processing in Photoshop, etc.
3 3 [Posted by: Tud Bar  | Date: 06/22/13 06:11:22 PM]
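
The usage-weighted scoring proposed in the comment above amounts to a weighted average. A minimal sketch, assuming the usage shares quoted in the comment (which sum to 95%, the rest being "etc.") and entirely made-up sub-test scores; the function name and the category keys are hypothetical:

```python
def usage_weighted_score(sub_scores, usage_share):
    """Weighted average of sub-test scores, normalised by the shares in use.

    Normalising by the sum of the weights keeps the result on the same
    scale as the sub-scores even though the quoted shares do not add up
    to 100%.
    """
    total_share = sum(usage_share[name] for name in sub_scores)
    weighted_sum = sum(score * usage_share[name]
                       for name, score in sub_scores.items())
    return weighted_sum / total_share


# Usage shares as quoted in the comment (percent).
usage_share = {"website": 55, "video_playback": 20, "photo": 15, "office": 5}

# Hypothetical sub-test scores for one CPU, for illustration only.
sub_scores = {"website": 100, "video_playback": 80, "photo": 60, "office": 40}

print(round(usage_weighted_score(sub_scores, usage_share), 2))
```

Under such a scheme a sub-test like video encoding would only move the total if it were given a usage share of its own.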
Actually, Tud Bar, every single video you have ever watched IS encoded in one form or another, so by definition it can't be "a practice for idiots" as you put it. Google encodes every single video uploaded there with x264 before you, the end user, can view it....

and here's an interesting c&p you might enjoy from the actual developers commit message from the x264 change log as referenced above

"commit 3a5f6c0aeacfcb21e7853ab4879f23ec8ae5e042 r2286 Author: Steve Borho <>

Date: Thu Feb 21 12:48:40 2013 -0600

OpenCL lookahead OpenCL support is compiled in by default, but must be enabled at runtime by an --opencl command line flag. Compiling OpenCL support requires perl. To avoid the perl requirement use: configure --disable-opencl.

When enabled, the lookahead thread is mostly off-loaded to an OpenCL capable GPU device. Lowres intra cost prediction, lowres motion search (including subpel) and bidir cost predictions are all done on the GPU. MB-tree and final slice decisions are still done by the CPU. Presets which do not use a threaded lookahead will not use OpenCL at all (superfast, ultrafast).

Because of data dependencies, the GPU must use an iterative motion search which performs more total work than the CPU would do, so this is not work efficient or power efficient. But if there are spare GPU cycles to spare, it can often speed up the encode.

Output quality when OpenCL lookahead is enabled is often very slightly worse in quality than the CPU quality (because of the same data dependencies).

x264 must compile its OpenCL kernels for your device before running them, and in order to avoid doing this every run it caches the compiled kernel binary in a file named x264_lookahead.clbin (--opencl-clbin FNAME to override).

The cache file will be ignored if the device, driver, or OpenCL source are changed. x264 will use the first GPU device which supports the required cl_image features required by its kernels.

Most modern discrete GPUs and all AMD integrated GPUs will work.

Intel integrated GPUs (up to IvyBridge) do not support those necessary features. Use --opencl-device N to specify a number of capable GPUs to skip during device detection.

Switchable graphics environments (e.g. AMD Enduro) are currently not supported, as some have bugs in their OpenCL drivers that cause output to be silently incorrect.

Developed by MulticoreWare with support from AMD and Telestream."

0 0 [Posted by: sanity  | Date: 06/25/13 07:26:42 PM]
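
The options named in the quoted commit message can be put together as follows. This is a sketch that only assembles a command line, assuming an x264 build with OpenCL support compiled in; the function name, the file names and the default preset are illustrative, while `--opencl`, `--opencl-clbin`, `--opencl-device`, `--preset` and `-o` are x264 command-line options.

```python
# Presets that, according to the commit message, never use OpenCL because
# they do not use a threaded lookahead.
NO_OPENCL_PRESETS = {"superfast", "ultrafast"}


def build_x264_opencl_command(input_path, output_path, preset="medium",
                              clbin_path=None, skip_devices=None):
    """Return an argument list for an x264 run with OpenCL lookahead enabled."""
    if preset in NO_OPENCL_PRESETS:
        raise ValueError(f"preset {preset!r} does not use OpenCL lookahead")
    cmd = ["x264", "--preset", preset, "--opencl"]
    if clbin_path is not None:
        # Override the default kernel cache file (x264_lookahead.clbin).
        cmd += ["--opencl-clbin", clbin_path]
    if skip_devices is not None:
        # Number of capable GPUs to skip during device detection.
        cmd += ["--opencl-device", str(skip_devices)]
    cmd += ["-o", output_path, input_path]
    return cmd


print(build_x264_opencl_command("input.y4m", "output.264", skip_devices=1))
```

The returned list could be handed to `subprocess.run` on a machine where x264 is installed; nothing is executed here.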

If I were AMD, I would not allow suspicious, proprietary benchmarks to be used in dealings with OEMs to evaluate the cost of a CPU.
2 3 [Posted by: Tud Bar  | Date: 06/22/13 06:13:38 PM]

AMD cannot compete with Intel, because many developers compile their software with the Intel compiler and use a single-threaded design. So a lot of software does not use all of AMD's features..
4 1 [Posted by: iTon  | Date: 06/22/13 07:25:34 PM]

@Tukee44, Jim Keller left Apple and went back to AMD last August. In the late '90s, AMD's server CPUs, which Jim Keller was responsible for, were better than Intel's. Jim also designed the x64 instruction set BEFORE Intel, and the HyperTransport technology, which was the reason the Athlons blew away the crappy Pentium 4s. He is regarded as the best CPU architect in the business by everyone at Nvidia, AMD, Qualcomm, Apple and even Intel!!!

Jim has been working on the design of SteamRoller, though not the primary purpose of his focus at AMD now. With the design changes he has made, SteamRoller promises to be extremely fast. Jim put AMD on top once before, and will do it again.

Maybe you should pull your head out of your ass and know what the hell you're talking about before running your mouth. durh
6 1 [Posted by: bigbrave  | Date: 06/22/13 11:46:21 PM]

3 8 [Posted by: AnonymousGuy  | Date: 06/23/13 07:22:52 AM]


I would introduce a rule in Europe that nobody is allowed to have a market share bigger than 75% relative to the other competitors, because that drifts toward monopoly.

Even when a market share over 75% is obtained fairly, that company still becomes dangerous. And Intel is.
2 2 [Posted by: Tud Bar  | Date: 06/23/13 10:08:21 AM]

Also, AMD should support only the open-source Coreboot BIOS and permanently renounce the proprietary UEFI BIOS.

UEFI is also a strange proprietary platform that somehow, in underground ways, got adopted by the majority of OEMs.
3 2 [Posted by: Tud Bar  | Date: 06/23/13 10:34:10 AM]

Tud Bar: UEFI wasn't adopted in some strange underground way. In fact, a lot of Linux developers were involved in features that were decried by many Linux/FOSS fans who didn't do their research.

Secure boot, for example, is something almost everybody wanted because it was a way to prevent virus/hijacking issues. The way it is implemented, on the other hand..... questionable.

I have UEFI and BIOS based boards, and honestly UEFI, despite all the fanfare, is nothing new. UEFI's biggest "oh kool" feature is that it is mouse driven, and AMI had this decades ago; my old AMD 5x86 had a mouse-driven AMI BIOS in fact....

I personally don't care if the BIOS is FOSS or AMI/Award. I don't care for Phoenix or the like, but that's because I like to overclock and.... Phoenix-branded BIOSes are mostly targeted at servers and OEM systems that block overclocking.

2 1 [Posted by: Azure Sky  | Date: 06/24/13 02:19:25 AM]


You should not think of a vendor as just an abstract entity. It is driven by people, and those people can sometimes be subject to corruption.

In 2006 and before, do you think Intel convinced the vendors not to sell AMD products on technical grounds alone? Nooooo! Intel did a simpler thing: they prestidigitated with amounts in some pockets...
3 1 [Posted by: Tud Bar  | Date: 06/23/13 11:51:03 AM]

Within AMD there is a huge mess. The groups responsible for hardware, BIOS and drivers do not work together. I wonder whether the groups that design the CPUs work in tight cooperation.
3 2 [Posted by: Tristan  | Date: 06/24/13 01:27:34 AM]

I'm wondering if Microsoft/Intel have something similar implemented when calculating the Windows Experience Index..
I have a Z77 board with an i5 K-series processor, and using the built-in HD 4000 graphics I obtained a score of 6.5.
I installed an AMD 6950 series graphics card and re-ran the test; it came back with a score of

wait for it

The graphics score was the lowest score so it was based on that, every thing else was over 7.6 out of 7.9 maximum.
4 0 [Posted by: caring1  | Date: 06/24/13 11:33:26 PM]

The guy's nickname is "The Stilt", not just "Stilt", for reference his 8GHz result in
0 0 [Posted by: Spoeghe  | Date: 06/25/13 12:18:29 AM]


Industry is usually full of blacklegs, theory of benchmarking is a game with hiding black and white cards.
0 0 [Posted by: tbaracu  | Date: 06/25/13 07:29:34 AM]


