AMD Trinity for Desktops. Part 2: Socket FM2 Platform and AMD A10-5800K Processor Review

Trinity demonstrated some very inspiring performance in graphics tests last week. However, AMD’s traditional weakness is its x86 cores. Let’s see if the company engineers managed to resolve this problem in the new Piledriver microarchitecture that found its way into promising hybrid processors.

by Ilya Gavrichenkov
10/01/2012 | 09:00 PM

A few days ago we started getting acquainted with the AMD Trinity processors that have finally come to desktops. The first review we posted on our web-site talked exclusively about the graphics component of these promising products. By posting that review before the official launch date, we agreed to certain conditions: we had to carefully avoid any mention of traditional processor performance. The manufacturer’s logic behind this approach to the product launch was quite transparent. Every product has its strengths and its weaknesses, and the company obviously wanted to take advantage of the strengths of their new product, which we did our best to cover in our first article. Compared with the previous generation Llano APU, the graphics core in the new Trinity processors has become about 30% faster, which allowed the newcomers not only to knock down the newest integrated competitor offerings in 3D performance tests, but also to question the need for discrete graphics accelerators in the sub-$60-$70 price range.

 

However, AMD can’t delay revealing the rest of the features of their new products indefinitely. And today, on the official launch day of the first group of desktop AMD Trinity processors, we are ready to discuss their other aspects, such as specifics of the new Socket FM2 platform, performance of the x86 cores and overclocking potential.

Closer Look at Socket FM2 Platform

The first problem resulting from the introduction of AMD Trinity processors into the desktop segment is the appearance of another new platform – Socket FM2 (codenamed Virgo). As a result, AMD now has more currently active sockets than Intel, which has traditionally been criticized for lack of unification. Of course, AMD claims that now they have a very clear and understandable product structure. And this is partially true: enthusiasts using high-performance discrete graphics accelerators get the Socket AM3+ platform and FX-series processors, mainstream solutions (in AMD’s vision) are represented by the new Socket FM2 and A-series processors, while compact systems should use mainboards with integrated E-series processors.

However, the same ranking existed before, and the launch of Socket FM2 didn’t really add any new order to the AMD processor line-up. On the contrary, since Socket FM1 and Socket FM2 processors are compatible neither electrically nor mechanically, and the products for both these sockets are sold within the same series, this could actually cause a lot of confusion. At the same time, it is absolutely unclear why they needed to replace Socket FM1 with Socket FM2 at all. Socket AM3+ has proven that AMD processors with different microarchitectures can remain compatible with the same platform, and the introduction of Trinity didn’t really add any fundamentally new functionality that would require additional pins. Both generations have the same number of memory channels and PCI Express lanes, as well as the same bus connecting the APU with the chipset. Moreover, both processor sockets, the old and the new one, look very similar and have almost the same number of pins: 905 and 904.


Socket FM1 (left) and Socket FM2 (right)

Of course, the old chipsets from Socket FM1 platforms can also be used with the Trinity generation of processors. Therefore, a lot of inexpensive Socket FM2 mainboards will be built around the familiar AMD A75 or AMD A55 chipsets. With the launch of their new processors AMD also released a new A85X Fusion Controller Hub, but in reality it is just another variation of the same thing.

The differences are truly minimal. Unlike the previous chipsets, which came out with Socket FM1, the AMD A85X chipset offers two more SATA 6 Gbps ports (for a total of eight) and supports RAID 5 and CrossFireX technology. Actually, the value of these innovations is quite doubtful, to put it mildly, particularly because we are talking about an inexpensive APU platform, which is primarily intended to use the graphics core integrated into the processor.

In order to somehow make up for the loss of compatibility between Socket FM1 and Socket FM2, AMD assured us that they would not change the platform during the next change of APU generations.

Socket FM2 Processors

Having introduced a new platform with a socket that’s incompatible with the previous products, AMD has to ensure that a variety of Socket FM2 processors is available immediately. All these processors are based on the Trinity design, which means that they combine one or two dual-core modules with Piledriver microarchitecture and a Devastator graphics core.

In fact, it is a “replica” of the mobile Trinity processors launched earlier this year. Unlike the mobile products, limited by strict thermal requirements, the desktop APU modifications could be sped up to much higher frequencies. In other words, the processor part is a slightly improved Bulldozer, while the graphics part is a Cayman-like graphics core with the highly efficient VLIW4 architecture. However, it is a significant step forward compared with the previous AMD APU, because Llano used much earlier versions of the components. At the same time, AMD didn’t change the production process, and Trinity is still manufactured using 32 nm SOI technology. As a result, the new processors have a very similar transistor count to Llano, i.e. the Trinity performance was improved without increasing the transistor budget.

AMD singles out four processor families within Trinity design: A10, A8, A6 and A4. A10 includes quad-core processors with the top graphics core modification, A8 – quad-core APU featuring fewer GPU streaming processors working at lower frequency, A6 and A4 include dual-core models, with only half the graphics resources. In our previous article we presented a table with the formal characteristics of the APU in the Trinity line-up. Today we would like to complete it with pricing information:

As we can see, the prices are very affordable. AMD positions their A10 processors as competitors to Intel Core i3 CPUs, while the products from the A8, A6 and A4 families will obviously be competing against Pentium and maybe even Celeron.

However, it turned out that A10, A8, A6 and A4 were not the only processors coming out within the Trinity series. Besides them, AMD will also offer Socket FM2 Trinity processors with fully disabled graphics cores. They will be marketed under the Athlon brand name. At this point we know about three models like that:

AMD’s idea to offer Athlon X4 in Socket FM2 form-factor is slightly puzzling. In fact, they are analogous to quad-core FX CPUs, but unlike their Socket AM3+ counterparts, they cannot be replaced with anything faster in the future, because of the original platform positioning. It is really hard to figure out what the potential target group for products like that could be. Of course, these Athlon X4 processors currently boast a more appealing price point than the almost identical FX 4000-series processors, but the old Athlon II X4 or Phenom II X4 for Socket AM3 based on Stars microarchitecture are an even more attractive entry-level option. They feature four fully-functional cores instead of two dual-core modules “bundled” together, they are not any slower than the new Trinity Athlon X4 processors, and on top of that they work in exactly the same platform as six-core and eight-core Bulldozer/Piledriver processors, i.e. offer a lot of opportunities for system upgrade.

 

Meet Socket FM2 Platform: Closer Look at Asus F2A85-V Pro

AMD presented the mainboard makers with a pretty tricky task. On the one hand, Socket FM2 mainboards should be quite feature-rich: the functionality of the new AMD A85X and the overclocking potential of the new Trinity processors clearly point at this. On the other hand, Socket FM2 processors are priced below $130, which makes it questionable whether there is a real need for mainboards that do not fall into the entry-level category and offer more than just basic functionality. Of course, different manufacturers will end up finding their own balance between manufacturing quality, functionality and low price, so we should see a great variety of Socket FM2 mainboard modifications.

While we were working on this review, we were offered multiple implementations of the new Virgo platform from various mainboard makers. But for today’s tests we selected Asus F2A85-V Pro.

This mainboard is an excellent illustration of what I have just said above about the balance between functionality and price, because Asus F2A85-V Pro is the top Socket FM2 mainboard in the Asus line-up of products, but here we do not see anything excessive or extremely creative. The features of this product are rather reserved: there are hardly any additional controllers, and the voltage regulator circuitry and cooling system are not too complex.

Just take a look at the official mainboard specifications:

Well, the list is pretty modest. If you are familiar with the structure of the Asus mainboard line-up, then you understand that F2A85-V Pro doesn’t even reach the “Pro” level of the Intel mainboard families. And there is a tremendous gap between the functionality of the Asus F2A85-V Pro and the top LGA 1155 mainboards from the Deluxe or Premium series. Of course, all this results from the positioning of the Socket FM2 platform, which requires Asus F2A85-V Pro to cost no more than $120: a more expensive product would hardly become popular.

Luckily, the hunt for a lower price point didn’t affect the quality of this mainboard. From our experience with Asus F2A85-V Pro we can state that it is a well-built product, which made our testing of the new Socket FM2 go smoothly and problem-free.

Moreover, with Asus F2A85-V Pro you can build a system as powerful as the new AMD Trinity processors allow. The expansion capabilities are as good as they can get: Asus F2A85-V Pro even has three PCI Express x16 slots, which can be occupied by one, two or three graphics cards. However, it is important to remember that the first (blue) slot only works as x16 if the white slot next to it remains empty. In CrossFireX configurations the first two slots will work as x8, and the third (black) slot always works as x4. Moreover, the PCI Express bus works only in 2.0 mode, but this isn’t Asus’ fault: for some reason AMD engineers didn’t implement a PCI Express 3.0 controller in their new Trinity processors.
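
The slot-sharing rules described above can be written down as a small lookup. This is only an illustrative sketch in Python: the function name slot_link_widths and the colour keys are hypothetical labels, and the width of a white slot populated on its own is an assumption, because the text only covers the single-card and CrossFireX cases.

```python
def slot_link_widths(blue: bool, white: bool, black: bool) -> dict:
    """Return the PCI Express 2.0 link width (in lanes) of each occupied x16 slot.

    Hypothetical model of the rules quoted in the review, not vendor code.
    """
    widths = {}
    if blue:
        # The blue slot runs at x16 only while the neighbouring white slot is empty.
        widths["blue"] = 8 if white else 16
    if white:
        # Assumption: the white slot runs at x8 whenever it is populated.
        widths["white"] = 8
    if black:
        # The black slot always runs at x4.
        widths["black"] = 4
    return widths


if __name__ == "__main__":
    print(slot_link_widths(blue=True, white=False, black=False))
    print(slot_link_widths(blue=True, white=True, black=True))
```

Calling the function with a single card in the blue slot yields a x16 link, while populating all three slots yields x8, x8 and x4.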

However, the focus of the Asus F2A85-V Pro design is not only on discrete graphics accelerators. If you are planning to utilize the integrated graphics core of your processor in this mainboard, then check out its back panel: you will simply love it. There you will find all four types of ports for connecting displays: analogue D-Sub supporting up to 1920x1600@60Hz, Dual-Link DVI-D, HDMI, and high-speed DisplayPort supporting resolutions up to 4096x2160@60Hz. The only thing to keep in mind is that Trinity processors support only three simultaneously connected monitors, so you won’t be able to use the DVI-D and HDMI outputs at the same time.
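
The display-output restriction can be expressed as a simple validity check. The sketch below is a hypothetical Python model of the two rules stated above (no more than three displays, and DVI-D and HDMI never together); the names OUTPUTS, MAX_DISPLAYS and is_valid_display_set are invented for the illustration and do not come from any Asus or AMD tool.

```python
# Back-panel display outputs listed in the review.
OUTPUTS = {"D-Sub", "DVI-D", "HDMI", "DisplayPort"}
# Trinity drives at most three displays at once.
MAX_DISPLAYS = 3


def is_valid_display_set(outputs) -> bool:
    """Check a combination of outputs against the two rules from the text."""
    chosen = set(outputs)
    unknown = chosen - OUTPUTS
    if unknown:
        raise ValueError(f"unknown outputs: {sorted(unknown)}")
    if len(chosen) > MAX_DISPLAYS:
        return False
    # DVI-D and HDMI cannot be active at the same time.
    return not {"DVI-D", "HDMI"} <= chosen


if __name__ == "__main__":
    print(is_valid_display_set(["D-Sub", "DVI-D", "DisplayPort"]))
    print(is_valid_display_set(["D-Sub", "DVI-D", "HDMI"]))
```

Under this model the only three-display combinations are D-Sub plus DisplayPort plus either DVI-D or HDMI.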

Among the definite advantages of the Asus F2A85-V Pro mainboard we should point out the proprietary digital design of the processor voltage regulator circuitry. It is capable of setting the voltage for all APU units with the highest precision, and on top of that can control the intensity of Vdroop under heavy operational load. Overall, the voltage regulator consists of six phases for the computing part of the APU and two phases for the graphics core. Of course, it is not very impressive against the background of the multi-phase voltage regulators of LGA 1155 mainboards. However, high-quality electronic components made the circuitry exceptionally efficient and prevented it from heating up a lot under heavy loads. Therefore, we have no complaints about the cooling system implemented on Asus F2A85-V Pro. Yes, it is very simplistic, and the heatsinks are fastened with not very reliable spring-loaded plastic push-pins, but it does its job well.

The memory DIMM slots on Asus F2A85-V Pro are powered via a special dual-phase voltage regulator, and the slots are connected to the processor socket using the proprietary T-topology design that equalizes the trace length, and therefore the data travel time, in different channels. It provides better stability with high-frequency memory modules, which is important for Socket FM2 platforms, because the performance of the graphics integrated into the processor depends heavily on the DDR3 SDRAM speed.
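
To see why memory speed matters so much for the integrated graphics, it helps to look at the theoretical peak bandwidth the GPU and the x86 cores have to share. The Python sketch below assumes a dual-channel configuration with standard 64-bit (8-byte) DDR3 channels; the function name peak_bandwidth_gb_s is hypothetical, and real-world throughput is always lower than these peak figures.

```python
def peak_bandwidth_gb_s(transfers_mt_s: float, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes per second).

    transfers_mt_s: data rate in mega-transfers per second (the number in DDR3-xxxx).
    channels:       number of memory channels (assumed: 2).
    bus_bytes:      width of one channel in bytes (assumed: 8, i.e. 64 bits).
    """
    return transfers_mt_s * 1e6 * bus_bytes * channels / 1e9


if __name__ == "__main__":
    for rate in (1333, 1600, 2133, 2400):
        print(f"DDR3-{rate}: {peak_bandwidth_gb_s(rate):.1f} GB/s peak")
```

Under these assumptions, moving from DDR3-1600 to DDR3-2400 raises the theoretical ceiling from 25.6 GB/s to 38.4 GB/s.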

We have always liked Asus mainboards for their attention to various details. The Asus F2A85-V Pro doesn’t have that many interesting peculiarities, but its overall layout definitely came out well. At least, we didn’t have any problems during system assembly. Yes, enthusiasts will miss the POST controller, convenient Power On, Reset and Clear CMOS buttons, and maybe even voltage check points for manual voltage monitoring with a multimeter. All these features are common in expensive mainboards, and a Socket FM2 product cannot be one of them by definition. However, there was enough room on F2A85-V Pro for a different type of functionality: the board has a MemOK! button for fixing memory settings, DirectKey for automatic BIOS access and BIOS Flashback for automatic BIOS updating.

The graphical UEFI BIOS interface is flawless. It is typical of all Asus mainboards and therefore is multi-functional, well-optimized and very convenient to work with. This is what the basic section with the settings for the key system components – processor and memory – looks like:

The users get so many settings and parameters to play with that we can also regard Asus F2A85-V Pro as an overclocker product. You can adjust the base clock frequency (in 1 MHz increments), as well as the processor multipliers. If you have an unlocked CPU with the “K” index in the model name, you will be able to configure multipliers for the processor’s computing cores, its unified North Bridge and graphics core. The memory, however, may be overclocked with any processor. The available and fully operational memory modes range from DDR3-800 to DDR3-2400 in 266 MHz increments.
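
The list of memory modes implied by that range is easy to enumerate. The following Python sketch assumes that the "266 MHz" step is really 266⅔ MT/s (800/3), as in the standard DDR3 speed grades, and that mode labels are truncated to whole numbers; the helper name memory_modes is hypothetical.

```python
from fractions import Fraction

# Assumed step between adjacent modes: 266 2/3 MT/s, kept exact with Fraction.
STEP = Fraction(800, 3)


def memory_modes(lowest: int = 800, highest: int = 2400) -> list:
    """Return truncated DDR3 mode labels from lowest to highest, inclusive."""
    modes = []
    rate = Fraction(lowest)
    while rate <= highest:
        modes.append(int(rate))  # int() truncates, e.g. 1066 2/3 becomes 1066
        rate += STEP
    return modes


if __name__ == "__main__":
    print(", ".join(f"DDR3-{m}" for m in memory_modes()))
```

With the default arguments this produces seven modes: DDR3-800, 1066, 1333, 1600, 1866, 2133 and 2400.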

You can also configure all memory timings. Moreover, the mainboard supports not only the pretty rare AMD Memory Profile, but also Intel’s widespread XMP profiles.

There are no limitations in the voltage department. The processor Vcore may be pushed as far up as 1.9 V. The memory DIMM voltage may also be raised to 2.1 V. It is also important that you can set the voltage below nominal values, too, as it allows using DDR3L memory modules or manually lowering the power consumption and heat dissipation of the platform.

The board also offers great hardware monitoring options. It allows monitoring two temperatures: general system temperature and processor temperature (and since the situation with the thermal sensors inside the AMD processors is still quite messed up, the board uses its own diode for that purpose). It also monitors the rotation speeds of all five fans that can be connected to it.

You can adjust the rotation speed of these fans in the BIOS, or by using the bundled software utilities. However, it is important to remember that the fans will only be adjustable if they use a four-pin power connector.

Piledriver Microarchitecture: Is It Really Better Than Bulldozer?

In our previous Trinity review we discussed the architecture of the Devastator graphics core and arrived at the conclusion that the change towards VLIW4 architecture was a very positive move. Now it is time to talk about the computing cores. Compared with Llano, they also underwent significant changes. Instead of the x86 Husky cores with Stars microarchitecture, Trinity now uses modules with Piledriver microarchitecture, which is yet another iteration in the Bulldozer evolution. As you know, with the introduction of Bulldozer AMD changed their priorities dramatically. Compared with Stars, this microarchitecture reduced the number of instructions per clock, but allowed reaching higher clock speeds. However, not everyone was happy about this outcome, so three quarters after the first Bulldozer processors hit the streets, AMD prepared a microarchitectural refresh – the “corrected” Piledriver.

Trinity processors use Piledriver cores, which is the first time this microarchitecture goes public. AMD believes that the improvements made should be enough to make Trinity noticeably faster than Llano. Does it mean that the new computational cores will allow AMD to successfully compete against Intel’s products? This matter is particularly acute because in about three to four weeks AMD will release new FX processors with similar Piledriver cores inside. And while with Trinity processors it is still possible to claim that their performance is “quite sufficient” by hiding the actual x86 speed behind the high graphics performance, the same trick will not work with the new FX processors. Therefore, the first thing we would like to investigate is how superior the new Piledriver is to the “classical” Bulldozer microarchitecture.

However, do not pin too many hopes on the new Piledriver. In structural terms, this microarchitecture is exactly the same as Bulldozer, i.e. it consists of dual-core modules with two sets of integer execution units, while some of the resources are shared between the two cores. Among these shared resources are the cache memory, instruction fetcher, instruction decoder and floating-point unit. As a result, the module can process two threads simultaneously, but its peak performance is capped by the throughput of the shared decoder, which can decode no more than four instructions per clock for the two cores combined. For your reference: Intel Core processors have a decoder with comparable performance, but each core in the processor has its own. It means that the number of instructions Piledriver can process per clock couldn’t increase dramatically. The real changes will occur only in the next generation of the microarchitecture, aka Steamroller: supposedly, AMD will provide an individual instruction decoder for each core of their dual-core modules. So far, all improvements in the new Piledriver are based on optimizations of the operational algorithms in individual internal units, but do not affect the design as a whole.
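
The effect of the shared decoder can be illustrated with a bit of arithmetic. The Python sketch below is a deliberately crude upper-bound model: it only divides the decoder width by the number of cores that share it and ignores everything else in the pipeline. The function name decode_slots_per_core is hypothetical, and the four-wide figure for the per-core Intel decoder is an assumption based on the "comparable performance" remark above.

```python
def decode_slots_per_core(decoder_width: int, cores_sharing: int) -> float:
    """Upper bound on instructions decoded per clock for each busy core."""
    if cores_sharing < 1:
        raise ValueError("at least one core must use the decoder")
    return decoder_width / cores_sharing


if __name__ == "__main__":
    # Piledriver module: one 4-wide decoder shared by two cores.
    print("Module, both cores busy:", decode_slots_per_core(4, 2))
    print("Module, one core busy:  ", decode_slots_per_core(4, 1))
    # Assumption: a comparable 4-wide decoder dedicated to a single core.
    print("Dedicated decoder:      ", decode_slots_per_core(4, 1))
```

When both cores of a module are loaded, each gets at most two decode slots per clock in this model, half of what a core with a dedicated decoder of the same width could use.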

According to AMD, the major improvements in the Piledriver design are the following:

All above listed improvements cannot speed up instructions decoding, but nevertheless, they do accelerate things a little bit. In order to estimate how efficient Piledriver is compared with the predecessor, we carried out a short practical test session. We will be comparing the new quad-core A10-5800K processor with Piledriver microarchitecture against a quad-core FX-4170 processor with Bulldozer microarchitecture. For a more illustrative result, both processors were working at 4.0 GHz frequency and their Turbo Core technology was disabled for the time of tests. Note that unlike A10-5800K with two-level cache-memory, FX-4170 has an 8 MB L3 cache, which cannot be disabled. Therefore, we will simply keep in mind that the Bulldozer based processor had a slight advantage. Both systems were equipped with DDR3-1867 SDRAM with 9-11-9-27-1T timings and Nvidia GeForce GTX 680 graphics card.

First let’s check out the memory sub-system performance in Cache & Memory Benchmark from Aida64 suite.


Bulldozer


Trinity

As we can see, A10-5800K processor doesn’t do that well here. Bulldozer provides higher practical bandwidth and lower latencies. However, it is not because of the shortcomings of the Piledriver microarchitecture. In reality, we compare processors working in two different platforms. Trinity has been optimized to ensure that the memory can be shared efficiently between the computing and graphics cores. More complex algorithms of the DDR3 SDRAM controller, which require additional requests priority arbitration, cause certain delays making Trinity yield to Bulldozer in tests. Unfortunately, even if the Socket FM2 system is equipped with a discrete graphics card and the graphics core integrated in the APU is not used, Trinity’s x86 cores still do not work with the memory fast enough.

Now let’s take a look at the computing performance:

As we can see from the obtained results, Piledriver microarchitecture is just a little faster than Bulldozer from the practical perspective. The highest performance advantage is only 7%, and on average the new design is only 1.5% faster in the benchmarks above. However, it is important to keep in mind that the Piledriver model we tested didn’t have an L3 cache and had a slower memory controller. This is exactly why we see its performance drop in some benchmarks that work intensively with large data volumes. However, we do not think that the processors with the new microarchitecture in Socket AM3+ form-factor will change this situation dramatically. The number of instructions they can process per clock cannot really increase, which is why a 5-10% performance boost is probably as good as it can get when the new Vishera processors come out.

Testbed Configuration and Testing Methodology

Our previous Trinity test session was solely dedicated to the graphics core and its performance. I believe there are no unanswered questions left there: the performance of AMD’s integrated graphics accelerator is exceptionally good. However, the rest of the hybrid processor is also very important. This will be the main topic of our today’s test session. Therefore, the majority of benchmarks, which we will discuss here, have been run with an external graphics accelerator, and the integrated GPU of the Trinity processors had no influence whatsoever on the obtained results. In other words, we are going to find out how x86 cores with new Piledriver microarchitecture can work in typical everyday tasks.

In real world we will deal with mass production processors, which are not only based on different microarchitectures, but also work at different frequencies, in different platforms and use different automatic overclocking technologies. That is why we selected testing participants for this round of tests based not only on the processor features, but mostly on their market positioning.

AMD provided us with an A10-5800K processor – the top desktop Trinity model. Moreover, AMD presents the entire A10 family as an alternative to Intel’s Core i3, as you can clearly see from their recommended pricing. Therefore, our today’s hero will be primarily competing against Intel’s dual-core CPUs from the still popular Sandy Bridge as well as the newer Ivy Bridge generations. However, we have also included the results of the junior Core i5 models, which, just like A10-5800K, are quad-core and not dual-core processors. On top of that there are AMD products for other platforms as well. Of course, Trinity was also compared against its predecessor – Socket FM1 A8-3870K processor with Llano design, as well as against two Socket AM3+ products. The first one is a quad-core Bulldozer FX-4170, which is equivalent to A10-5800K in price. The second one is a six-core Bulldozer FX-6200, which is also priced comparably with A10-5800K, although it may seem pretty unusual.

As a result, we used the following hardware and software components for our today’s test session:

For our tests of the AMD A10-5800K platform we installed KB2645594 and KB2646060 OS patches, which adapt the scheduler operation for Bulldozer and Piledriver microarchitectures.

I have to say that Trinity processors are no longer hybrid once a discrete graphics accelerator is installed (the same is true for the Intel CPUs). In this configuration the GPU integrated into the processor gets disabled, so it becomes impossible to utilize its resources via OpenCL or DirectCompute. However, the applications that support these interfaces can always take advantage of the resources of the discrete graphics accelerator.

Performance

General Performance

As usual, we use Bapco SYSmark 2012 suite to estimate the processor performance in general-purpose tasks. It emulates the usage models in popular office and digital content creation and processing applications. The idea behind this test is fairly simple: it produces a single score characterizing the average computer performance.

If we compare the new Trinity processor against its predecessor, we can’t help noticing a tremendous improvement. The newcomer is 25% faster, according to SYSmark 2012. In other words, the transition of hybrid processors to Piledriver microarchitecture is clearly a good thing. As a result, A10-5800K outperforms FX-4170 with almost the same characteristics, but based on the older Bulldozer microarchitecture. However, it is still unable to catch up with Intel Core i3 CPUs. Even Core i3-2125 with the previous Sandy Bridge design works 10% faster than A10-5800K.

Let’s take a closer look at the performance scores SYSmark 2012 generates in different usage scenarios. The Office Productivity scenario emulates typical office tasks, such as text editing, spreadsheet processing, email and Internet surfing. This scenario uses the following applications: ABBYY FineReader Pro 10.0, Adobe Acrobat Pro 9, Adobe Flash Player 10.1, Microsoft Excel 2010, Microsoft Internet Explorer 9, Microsoft Outlook 2010, Microsoft PowerPoint 2010, Microsoft Word 2010 and WinZip Pro 14.5.

Media Creation scenario emulates the creation of a video clip using previously taken digital images and videos. Here they use popular Adobe suites: Photoshop CS5 Extended, Premiere Pro CS5 and After Effects CS5.

Web Development is a scenario emulating web-site designing. It uses the following applications: Adobe Photoshop CS5 Extended, Adobe Premiere Pro CS5, Adobe Dreamweaver CS5, Mozilla Firefox 3.6.8 and Microsoft Internet Explorer 9.

Data/Financial Analysis scenario is devoted to statistical analysis and prediction of market trends performed in Microsoft Excel 2010.

3D Modeling scenario is fully dedicated to 3D objects and rendering of static and dynamic scenes using Adobe Photoshop CS5 Extended, Autodesk 3ds Max 2011, Autodesk AutoCAD 2011 and Google SketchUp Pro 8.

The last scenario called System Management creates backups and installs software and updates. It involves several different versions of Mozilla Firefox Installer and WinZip Pro 14.5.

Note that A10-5800K doesn’t always lose to Core i3. It does much better in the 3D Modeling and System Management scenarios than in the others. However, do not forget that although we compare these processors directly, Core i3 is dual-core, while A10-5800K is considered a quad-core product. The fact that two contemporary AMD cores perform as fast as one Intel core (which is not even always the case) makes us once again wish AMD increased the number of instructions per clock in their contemporary microarchitectures.

Gaming Performance

As you know, in the majority of contemporary games it is the graphics subsystem that determines the performance of an entire platform equipped with a reasonably fast processor. Therefore, we do our best to make sure that the graphics card is not loaded too heavily during the test session: we select the most CPU-dependent tests, and all tests are performed without antialiasing and at resolutions well below the maximum. In other words, the obtained results show not so much the fps rate that can be achieved in systems equipped with contemporary graphics accelerators, but rather how well contemporary processors can cope with gaming workload. Therefore, the results help us determine how the tested CPUs will behave in the near future, when new faster graphics card models become widely available.
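
The reasoning behind this methodology can be reduced to a simple bottleneck model: the frame rate you actually see is limited by whichever component is slower. The Python sketch below is a rough simplification rather than a description of how any game engine works, the function names are hypothetical, and the numbers in the usage example are made-up placeholders, not measurements from this review.

```python
def delivered_fps(cpu_fps_limit: float, gpu_fps_limit: float) -> float:
    """Frame rate under a simple 'slowest stage wins' model."""
    return min(cpu_fps_limit, gpu_fps_limit)


def is_cpu_bound(cpu_fps_limit: float, gpu_fps_limit: float) -> bool:
    """True when the processor, not the graphics card, caps the frame rate."""
    return cpu_fps_limit < gpu_fps_limit


if __name__ == "__main__":
    # Hypothetical placeholder limits, chosen only to show the effect.
    cpu_limit = 60.0
    for label, gpu_limit in (("high resolution with AA", 45.0),
                             ("low resolution, no AA", 200.0)):
        fps = delivered_fps(cpu_limit, gpu_limit)
        print(f"{label}: {fps:.0f} fps, CPU-bound: {is_cpu_bound(cpu_limit, gpu_limit)}")
```

Lowering the graphics load raises the GPU limit above the CPU limit, so the measured frame rate starts to reflect the processor rather than the graphics card.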

The picture here is dramatically different from what we saw in our previous test session, when we tested processors with their integrated graphics cores. With a discrete graphics card that doesn’t cap the power of the x86 cores, Trinity doesn’t look so rosy any more. Of course, it does outperform Llano, just the way it should. However, that doesn’t bring it any closer to Intel Core i3, which runs much faster under gaming load. You could object that we are testing Trinity in a non-typical usage scenario, and that it will primarily be used with the integrated graphics activated, for which the x86 cores performance would be quite sufficient. But we disagree. AMD obviously anticipates that their new APU will also be used with discrete graphics. Otherwise, why would they roll out the A85X chipset with CrossFireX support and add processor models with the GPU disabled on the hardware level to their product line-up?

All in all, as we can see, a Socket FM2 platform with a pretty fast discrete graphics accelerator is not such a good fit for a gaming system. And the main reason for that is the low performance of the Piledriver cores in games. However, I would also like to remind you that Trinity processors do not support PCI Express 3.0 bus.

In addition to our gaming tests we would also like to offer you the results of the Futuremark 3DMark11 benchmark (Performance profile):

Here the quad-core Trinity APU is very close to Llano, because 3DMark 11 uses floating-point calculations, and each Piledriver module has one shared block for calculations of this type. Therefore, we can’t expect it to be very fast: this defect was first present in Bulldozer, and Piledriver now has inherited it, too.

Performance in Applications

To test the processors performance during data archiving we resort to WinRAR archiving utility. Using maximum compression rate we archive a folder with multiple files with 1.1 GB total size.

WinRAR is sensitive to the processor speed when working with the memory sub-system, and as we have seen, Socket FM2 platform does have issues here. Besides, Trinity doesn’t have L3 cache memory, therefore it is not surprising that A10-5800K falls behind all competitors except Llano.

The processor performance in cryptographic tasks is measured using a built-in benchmark of the popular TrueCrypt utility that uses AES-Twofish-Serpent “triple” encryption. I have to say that this utility not only loads any number of cores with work in a very efficient manner, but also supports special AES instructions.

Encryption is an example of a task where Core i3 processors reveal their weaknesses. Sufficient number of computational cores and AES instructions support are necessary to ensure that encryption algorithms will work fast. Core i3 with Ivy Bridge microarchitecture doesn’t have either of these. However, A10-5800K looks great here. And that is because AMD adds support for all new instructions sets to their new processors irrespective of their market positioning.

Now that the eighth version of the popular scientific Mathematica suite is available, we decided to bring it back as one of our regular benchmarks. We use MathematicaMark8 integrated into this suite to test the systems performance:

Wolfram Mathematica is traditionally one of those applications that work well on Intel processors. Therefore, we are not surprised to see that dual-core Core i3 processors are way ahead of the quad-core and six-core AMD processors of any microarchitecture. Here Trinity catches up with Socket AM3+ Bulldozer processors and is 6% faster than A8-3870K.

We measured the performance in Adobe Photoshop CS6 using our own benchmark, a creatively modified version of the Retouch Artists Photoshop Speed Test. It includes typical editing of four 24-megapixel images from a digital camera.

Trinity’s result in Photoshop is not that good. Although it outperforms Llano by 6%, it can’t compete against Intel CPUs. The reason is obvious: the low floating-point performance of the Piledriver microarchitecture, inherited from Bulldozer. As a result, the A10-5800K is 27% behind the Intel Core i3-3220.

The performance in Adobe Premiere Pro CS6 is determined by the time it takes to render a Blu-ray project with an HDV 1080p25 video into H.264 format and apply different special effects to it.

We have finally reached the applications where the A10-5800K is about as fast as the Core i3 CPUs.

In order to measure how fast our testing participants can transcode a video into H.264 format we used x264 HD Benchmark 5.0. It works with an original MPEG-2 video recorded in 1080p resolution with 20 Mbps bitrate. I have to say that the results of this test are of great practical value, because the x264 codec is also part of numerous popular transcoding utilities, such as HandBrake, MeGUI, VirtualDub, etc.

The new AMD microarchitectures are perfectly suited to video processing. And even though the A10-5800K is not that much faster than the A8-3870K, this advantage is more than enough to significantly outperform any Core i3 model.

Following our readers’ requests, we’ve added a new HD video benchmark to our tests. SVPmark3 shows the computer performance in the SmoothVideo Project application which makes videos smoother by adding new intermediary frames. The numbers in the diagram reflect the speed of processing Full HD videos without the graphics card’s help.

This is another confirmation of everything we have just said above. Trinity copes very well with video processing. And by the way, there are a lot of utilities that can use graphics core resources via OpenCL. So, the A10-5800K is an excellent choice for an inexpensive system that will mostly be used for processing HD video content.

We will test computational performance and rendering speeds in Autodesk 3ds max 2011 using the special SPECapc for 3ds max 2011 benchmark:

The processors usually perform the same way in rendering tasks as they do during video transcoding. It means that A10-5800K outperforms Intel Core i3, although the advantage is not very convincing. However, when it comes to the computing performance score, Trinity has a problem: this APU loses to all other participants except Llano.

We use the Cinebench 11.5 benchmark to test final rendering speed in the Maxon Cinema 4D suite.

Back in the day, the performance of Bulldozer processors in Cinebench was almost a joke. And it is really sad that the new Piledriver microarchitecture hasn’t really changed a thing. However, keeping in mind that the A10-5800K costs just as little as the junior Core i3 CPUs, there is no reason to complain any more: the price totally justifies the performance.

Power Consumption

In our previous Trinity test session we didn’t find any improvements in the energy-efficiency of the new Socket FM2 platform compared with Socket FM1. However, AMD claims that such improvements do exist. The explanation could be that last time we only tested the system power consumption with the load falling on the graphics core. Let’s see what happens once we install a discrete graphics card into our test system and leave all the work to the x86 Trinity cores.

To get a better idea of how much the processor’s energy-efficiency actually improved, we performed a round of special tests. The new digital power supply unit from Corsair, the AX1200i, allows monitoring consumed and produced electrical power, which we use actively during our power consumption tests. The graphs below (unless specified otherwise) show the full power draw of the computer (without the monitor) measured at the output of the power supply. It is the total power consumption of all the system components. The PSU's efficiency is not taken into account. The CPUs are loaded by running the 64-bit version of the LinX 0.6.4-AVX utility. Moreover, to correctly measure the computer's power draw in idle mode, we enabled Turbo mode and all power-saving technologies: C1E, C6, Enhanced Intel SpeedStep and AMD Cool’n’Quiet.

In idle mode any contemporary processor will switch to special power-saving states, in which its power consumption will be very low, only a few watts. In this case the power appetites of other system components and the efficiency of the voltage regulator circuitry on the mainboard start to matter more. Therefore, the AMD A10-5800K is at the top of the diagram. The Asus F2A85-V Pro mainboard we used for our tests has one of the most efficient voltage regulators under low operational loads. However, this circuitry design is partially inspired by the peculiarities of the processors themselves. Trinity features fine-grain power gating, which powers off a lot of blocks inside the processor in idle mode.

You can clearly see how it works during single-threaded load:

Trinity consumes more power than Llano. However, it should really be compared against the quad-core Bulldozer processor, which has a similar microarchitecture. This is where it becomes clear that the energy-efficiency improvements are definitely there.

There is no doubt that Core i3 3000-series processors allow building the most energy-efficient systems. However, among all of today’s testing participants from AMD, the A10-5800K demonstrates the best peak power consumption readings under heavy computing load. This progress is particularly impressive considering that AMD didn’t switch to a finer manufacturing process.

Overclocking

The Socket FM1 platform wasn’t particularly overclocking-friendly. Llano processors had low overclocking potential, increasing the frequency of the base clock generator would often cause instability, and there were few supported dividers for the memory frequency. Overall, AMD didn’t really encourage overclocking of their previous generation APUs, so even their proprietary Overdrive utility wasn’t really optimized for them. But their attitude to overclocking has changed a lot with the launch of the new Socket FM2 platform and the APUs for it. Now overclocking functionality is one of the official strengths of the hybrid processors. And in this respect, Trinity processors are indisputably better than the competing Core i3, which cannot be overclocked at all.

The overclocking-friendliness of the Socket FM2 platform manifests itself in many aspects. First, there are a lot of processor models with unlocked frequency multipliers, which are marked with the letter “K” in the model name. However, even overclocking by simply raising the base clock generator frequency should be fairly easy. In most cases, it can be increased by up to 80% above the nominal without losing system stability. There are only two things to keep in mind in this case: first, a significant increase in the clock generator frequency causes the mainboard’s D-Sub output to stop working; and second, to avoid disk subsystem issues in AHCI mode, you have to use the AMD driver instead of the default Microsoft driver supplied with the OS. We also have to add that there are a lot of memory dividers available, and that the poor Overdrive utility has finally started to work with hybrid processors.

All of the above-mentioned overclocking improvements are purely theoretical. In practice, computer enthusiasts obviously expect Trinity processors to demonstrate increased overclocking potential, because they are based on the Piledriver microarchitecture, Bulldozer’s successor, which is capable of working at pretty high frequencies.

However, we failed to achieve any remarkable results during our overclocking experiments. Yes, Trinity, just like Bulldozer, is very responsive to increases in the computing core voltage: each voltage bump pushes the stability threshold another step higher. However, the heat dissipation also increases dramatically, which requires extraordinary cooling solutions. At the same time, it is almost impossible to monitor the thermal conditions properly. The thermal sensors integrated into previous AMD processor cores have never been particularly precise, but in the case of Trinity they are nothing short of a catastrophe. For example, when idle they can easily report temperatures close to 0, and under light operational loads they would often show readings below the actual room temperature. Therefore, we had to use the temperature readings taken off the mainboard diode, which responds very slowly.

Anyway, we managed to overclock our A10-5800K processor with the NZXT Havik 140 cooler to 4.5 GHz in the testbed built on the ASUS F2A85-V Pro mainboard. However, this is not the maximum overclock, but the frequency at which the system remains operational in 24/7 mode. To ensure stability in this mode we increased the processor Vcore by 0.15 V above the nominal, reaching 1.5 V.

So, in terms of resulting clock speeds, Trinity overclocking is very similar to Bulldozer overclocking. Similar microarchitecture provides similar overclocking potential. The use of a fundamentally different platform doesn’t affect the results in any way.

The primary distinguishing feature of the Trinity processors is the ability to overclock not only their computing part, but also the graphics core. And I have to say that in practical terms this is a pretty logical scenario for integrated systems. The graphics accelerator integrated into the Trinity processors works almost as fast as a discrete Radeon HD 6570 and overall is not a dream come true just yet. Therefore, many advanced users who decide to go with a Socket FM2 processor will most likely want to overclock their graphics. And it will pay off to some extent. However, it is also important to remember that overclocking the system memory could also help improve the 3D performance of the graphics core in Trinity processors, so it can definitely complement the straightforward increase in the graphics core frequency.

During our experiments we managed to get the Radeon HD 7660D graphics core in our A10-5800K processor to work stably at 1085 MHz, which is 285 MHz higher than the nominal frequency. Note that to ensure successful overclocking like that, you should increase the corresponding voltage accordingly. In the mainboard BIOS it is usually referred to as the voltage of the North Bridge integrated into the processor, but in reality it also applies to the graphics core. In our specific case it was increased from 1.175 V to 1.4 V.

Together with the graphics core overclocking, we increased the memory frequency to DDR3-2400. This is the maximum memory mode for Trinity processors working at the nominal base clock generator frequency. To ensure stability the only thing we had to do was to set the Command Rate to 2T.
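To put the graphics core overclock into relative terms, here is a minimal Python sketch that uses only the figures quoted above. The function and variable names are our own and purely illustrative, and the nominal graphics clock is derived from the quoted numbers rather than measured.

```python
# Figures quoted in the text above. The nominal GPU clock is derived
# (overclocked frequency minus the quoted gain), not measured separately.
GPU_OC_MHZ = 1085       # stable overclocked graphics core frequency
GPU_DELTA_MHZ = 285     # quoted gain over the nominal frequency
NB_V_NOMINAL = 1.175    # nominal "North Bridge" (graphics core) voltage, V
NB_V_OC = 1.4           # voltage used for the overclock, V


def relative_increase_pct(nominal, overclocked):
    """Return the increase of `overclocked` over `nominal` in percent."""
    return 100.0 * (overclocked - nominal) / nominal


gpu_nominal_mhz = GPU_OC_MHZ - GPU_DELTA_MHZ
gpu_gain_pct = relative_increase_pct(gpu_nominal_mhz, GPU_OC_MHZ)
nb_v_gain_pct = relative_increase_pct(NB_V_NOMINAL, NB_V_OC)

print(f"Nominal GPU clock: {gpu_nominal_mhz} MHz")
print(f"GPU clock increase: {gpu_gain_pct:.1f}%")
print(f"Voltage increase: {nb_v_gain_pct:.1f}%")
```

With these figures, the graphics core frequency rises by roughly 36% for about 19% more voltage.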

Let’s take a look at the following screenshot showing our overclocking success:

By overclocking the computing cores, the graphics core and the memory we managed to score over 2000 points in 3DMark (with the Performance profile). This way, the graphics performance of our A10-5800K based system increased by about 25% compared with the nominal graphics performance of the same processor and reached the level of a discrete Radeon HD 6670. Not bad, don’t you think?

However, overclocking the computing part alone doesn’t seem to offer the same obvious practical value. Comparing the Physics Score results shows that increasing the clock frequency of our A10-5800K to 4.5 GHz produces only an 11% boost in computing power. AMD has set the clock frequency of their new Trinity processors high enough right from the beginning, so only extreme cooling methods will allow boosting the overclocked performance significantly. For example, according to the manufacturer, the use of liquid nitrogen should enable Trinity processors to reach frequencies as high as 6.5 GHz, but there is evidence that the AMD A10-5800K has already conquered 7.3 GHz.

Conclusion

Back in the day we were not very optimistic about AMD’s attempt to push their first hybrid Llano processors into the desktop segment. Of course, they were unique and interesting in their own way, but mostly from a theoretical perspective. In reality, it was fairly difficult to picture what type of desktop systems they could dominate.

However, Trinity is a completely different story. The graphics performance of these processors is not just high by the standards of contemporary integrated solutions: they allow running 3D games in Full HD resolution! And it is a true qualitative leap forward making Trinity a worthy option for an entry-level gaming system. The x86 performance of Trinity’s computational cores is also quite decent, as we have just seen. The top members of this new family are almost as fast as Intel Core i3 CPUs. In other words, they will offer sufficient speed for contemporary general purpose systems.

Summing up these two aspects, we see that the Socket FM2 platform and Trinity processors have a good chance of taking over a good share of home systems. Of course, hardcore enthusiasts and dedicated gaming fans will hardly fall in love with this product, but average mainstream users who enjoy occasional gaming, surf the Web, work with some multimedia content and maybe some relatively simple specific applications may find Trinity an excellent product for their needs. In this case, the Socket FM2 platform will not only save you some money, but will also allow building a compact, quiet and energy-efficient system (including SFF or HTPC). However, to make this possible AMD shouldn’t limit the supplies of their processors with 65 W TDP, as they did with Llano for some reason, and the mainboard makers should come up with a variety of miniature mainboards with an attractive price tag.

At the same time, it is important to keep in mind that according to our tests, Trinity processors are only attractive while there is no discrete graphics card in the picture. The thing is that x86 Piledriver cores are still not that fast in games compared with the competitors. This is not very important when the integrated graphics core is involved, because the gaming performance is then limited by the 3D graphics accelerator and doesn’t hit the bottleneck of the computing resources. With a faster external graphics card, however, this disappointing issue may surface pretty quickly.

As for the idea of heterogeneous calculations implemented in the hybrid AMD processors, we doubt that it will help Trinity’s market success. Overall, the software ecosystem is not quite ready yet for accelerating calculations via OpenCL or DirectCompute. We cannot deny that there is some progress in this direction. For example, the WinZip archiving utility is now able to utilize the graphics core resources, Photoshop CS6 and GIMP have a few corresponding filters, and there are a few utilities for image and video processing (such as Musemage or vReveal). However, it is still too early to take the whole APU ideology for a new standard. Most resource-hungry applications that we work with on an everyday basis continue to utilize primarily the x86 cores. Besides, AMD hasn’t yet come up with a solution that would allow using the GPU resources when a discrete graphics card is installed in the system, and third-party technologies, such as LucidLogix Virtu MVP, are not that easy to work with and can boast neither stability nor flexibility.