AMD Trinity for Desktops. Part 1: Graphics Core

New hybrid AMD Trinity processors for desktop systems haven’t been officially launched yet. However, we have prepared a detailed review of their graphics component.

by Ilya Gavrichenkov
09/26/2012 | 09:00 PM

I am sure no one doubts that today’s fastest PC processors do not come from AMD. And this didn’t happen overnight. Ever since Intel switched from Pentium 4 to the various Core processors, AMD products have been relegated to second place. In fact, all current AMD processors are either entry-level or special niche products, which are not particularly interesting to the majority of users, who value performance the most. However, modest performance and a shrinking market share are not reason enough to write off the achievements of AMD’s processor division. This company’s engineers are known for their ability to produce unique solutions from time to time, which allow AMD not only to preserve its market position, but also to influence the entire industry. And the examples are, in fact, quite recent: 64-bit extensions of the x86 architecture, multi-core processor design, integration of the memory controller and chipset North Bridge into the processor. All these things were developed and first implemented by AMD and not by the current processor market leader.

 

This is exactly why we continue to closely monitor all the innovations cooked up in the heart of this company. And it looks like they have found a new goldmine, which should not only give them a positive boost, but also inspire the entire processor market. This goldmine is the APU (Accelerated Processing Unit): a concept in which the traditional computational cores are combined with a high-performance graphics core on a single semiconductor die. And I am not talking about merely placing them next to each other, but about a complete symbiosis, i.e. the pooling of their resources for the execution of joint tasks.

The APU category includes several different AMD products released back in 2011. The most interesting of them is the A-series of processors codenamed Llano, which are used in the Lynx and Sabine platforms and targeted at a wide variety of desktop and mobile systems. Although these processors and platforms are a sort of “trial run” used to polish the APU principles, the market gave them a really warm welcome. Llano turned out to be in particularly high demand in the mobile segment, which immediately increased AMD’s presence in the notebook market. While a few years ago mobile AMD platforms were a pretty rare occurrence, today any computer store carries numerous notebooks with an AMD APU inside.

However, increased interest of the mobile market in AMD processors is not caused entirely by their hybrid design. In fact, it is more of a side effect. In reality, pretty powerful graphics core combined with decently fast computational cores is exactly what Intel lacks in their current product line-up. And taking into account extremely affordable prices of AMD APUs, it is not surprising that they became a perfect fit for inexpensive notebooks. Namely, they allow assembling sufficiently fast and contemporary systems without external graphics accelerators and respective additional costs. As a result, the whole APU concept was broadly popularized. Its advocates from the AMD camp worked closely with software developers and at least had at their disposal real applications benefitting fully from the resources and potential of the hybrid processors. And the May refresh of the AMD’s mobile A-series processor line-up with the new Trinity design, which offered higher computational as well as graphics performance, became an extra argument in favor of this attractive concept. So, the share of notebooks with an AMD Vision logotype will continue growing.

However, the story of the desktop AMD APUs is completely different. Desktop users’ demands are very different from what notebook users want and need, and they were not too thrilled about the whole APU thing right from the start. Very powerful graphics was the driving force behind the growing popularity of the first generation of hybrid processors in the notebook segment, but it didn’t come across as “oh so powerful” in desktop PCs. Desktops use much higher screen resolutions, at which the AMD A-series processors failed to reach acceptable performance. In other words, desktop users do not see much difference between the graphics core in AMD’s Llano processors and the integrated graphics from Intel: neither of them is a very good fit for an entry-level gaming rig. At the same time, the computational cores in the hybrid AMD processors are significantly slower than those from Intel, which kept Llano out of a number of home and office systems. Even as the heart of a media center, an AMD APU will hardly withstand the competition. This is where it suffers from higher heat dissipation and the lack of technologies that could accelerate HD video transcoding.

However, the toughest obstacle yet preventing Llano from succeeding in the desktop segment was the specially designed Socket FM1 platform with very uncertain future. It is only compatible with Llano, which makes it a “thing-in-itself”. On the one hand, it doesn’t encourage future upgrade, and on the other – it has limited life span. Of course, this combination of features is not particularly appealing to desktop users, because the market is flooded with competitor LGA 1155 solutions with a much longer shelf life for any user with any budget.

However, AMD has absolutely no intention of handing the integrated desktop processor market to the competition, especially since Intel clearly sees the indisputable potential of the APU concept and is hurrying to pack more power into its own graphics cores. Therefore, about a year after the Llano launch AMD is finally ready to introduce the second generation of A-series processors, which has been refreshed and enhanced in many aspects. The new desktop APUs are not a special desktop-only design. They use the same Trinity design that has been proving itself in the mobile segment since early summer. However, in the desktop modifications AMD has significantly increased the frequencies of the computational and graphics components, and is therefore confident that many desktop users, including enthusiasts, should really like them this time.

Overall, we are inclined to believe AMD on this one, as the Trinity design is undoubtedly better than Llano in many aspects. As we have already seen in the mobile APUs, the Trinity computational cores with Piledriver microarchitecture work somewhat faster than the Husky cores from Llano, whose microarchitecture is already a thing of the past. The performance of the graphics core has also improved a lot, and its internal structure has been significantly modified. And most importantly, desktop Trinity processors are now compatible with the new Socket FM2 platform, which should be free from all the old issues. AMD is ready to guarantee its stable presence over the next several years, and the line-up of compatible processors will include a large variety of models in different categories.

In other words, if we compare Trinity and Llano, the new processors are undoubtedly better. However, are they good enough to encourage the adoption of the APU concept in the desktop segment, where users are still pretty skeptical about solutions like that? Today we will try to partially answer this question by taking a closer look at the functionality and performance of the graphics core in the new generation of AMD’s desktop hybrid processors, and see if they are capable of powering entry-level gaming systems.

Unfortunately, the second part of our Trinity discussion dedicated to its computing component needs to be postponed for now. However, it isn’t our fault. The thing is that the new desktop A-series processors haven’t been officially launched yet. Therefore, we are still under the NDA on that side. However, we are free to talk about Trinity microarchitecture, so first let’s take a look at what AMD engineers have done to make new APUs a reality.

Trinity Design

According to the original concept, any APU consists of three major components. And here Trinity doesn’t change anything: new generation of hybrid processors consists of processor cores, integrated graphics accelerator and a small, but very important component – unified North Bridge. This is exactly what links a bunch of versatile cores into a balanced system and, together with the DDR3 SDRAM controller, ensures that computing and graphics cores communicate with each other and with the system memory seamlessly and are capable of working jointly with the same data.

 

Overall, the Trinity structure remained the same as in Llano, but all the individual components have been modified. Moreover, all changes have been made without increasing the size of the semiconductor die dramatically: AMD didn’t change the production process and continued using the 32 nm GlobalFoundries SOI technology, and raising the price of APUs positioned as affordable products didn’t seem like a good idea either. As a result, the Trinity die got only 8% larger and is now 246 mm² in size. The transistor count also increased only slightly and now reaches 1.303 billion (it used to be 1.178 billion). Moreover, the distribution of the transistor budget between the computing and graphics components hasn’t really changed much either: they occupy about the same share of the die area in both cases.
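The growth figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers from this paragraph. The Llano die area is not stated here, so it is back-calculated from the 8% figure and should be treated as an estimate rather than a specification.

```python
# Figures quoted in the paragraph above.
TRINITY_DIE_MM2 = 246.0
DIE_GROWTH = 0.08                 # "only 8% larger"
TRINITY_TRANSISTORS_BLN = 1.303
LLANO_TRANSISTORS_BLN = 1.178


def relative_growth(new: float, old: float) -> float:
    """Return the fractional increase from old to new."""
    return (new - old) / old


# Estimate only: implied by the 8% figure, not a quoted specification.
llano_die_estimate_mm2 = TRINITY_DIE_MM2 / (1.0 + DIE_GROWTH)
transistor_growth = relative_growth(TRINITY_TRANSISTORS_BLN, LLANO_TRANSISTORS_BLN)

print(f"Implied Llano die area: {llano_die_estimate_mm2:.0f} mm2")
print(f"Transistor count growth: {transistor_growth:.1%}")
```

By this estimate the transistor count grew slightly faster than the die area, which fits the idea that the same production process was simply used a little more densely.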

Nevertheless, this is where the discussion of similarities between Llano and Trinity ends. For example, computing cores in the new APU generations have been changed a lot. From now on hybrid processors will use Bulldozer microarchitecture, and to be more exact its second generation called Piledriver. Dual- and quad-core Trinity processors contain one or two quasi dual-core modules, which as you remember, contain two sets of execution units and can process two threads simultaneously, but at the same time share cache memory, instruction fetcher, instruction decoder and floating point unit. However, unlike FX processors on Bulldozer microarchitecture, which do not have integrated graphics, Trinity not only has fewer cores, but also has no L3 cache.

However, the second generation Bulldozer microarchitecture used in the new APUs and nowhere else, boasts a number of minor improvements boosting performance, reducing leakage currents and ensuring stability at high clock speeds. The front-end now features more precise branch predictor and a larger instruction window. Execution units acquired an improved scheduler and are now able to execute certain instructions faster, particularly such as integer and floating-point division. Moreover, the developers mention having increased the L1 TLB and having improved arbitration and data prefetch algorithms in the L2 cache. All this provides Trinity processors with about 25% computing performance boost compared to Llano (according to the manufacturer).

The unified North Bridge has also undergone significant modifications. First of all, the engineers revised the access priorities for the shared memory, giving the top spot to the computing cores, which in reality generate a relatively small portion of the requests. Besides, AMD added support for faster memory, including DDR3-1866 in the nominal mode and DDR3-2400 in overclocked mode. The internal data buses were widened. Now the graphics core can communicate with the memory controller over the 256-bit Radeon Memory Bus, while all communication outside the chip uses the PCI Express protocol, which replaces HyperTransport.

However, the changes made to the graphics core are the most interesting. The thing is that AMD managed to boost its performance quite substantially without really increasing the transistor budget or dramatically modifying the architecture. In other words, they managed to increase the density of the effective GPU units by cutting out some excess. In my opinion, this deserves special attention, especially since Trinity’s integrated graphics core is our primary focus today anyway.

Devastator Graphics Core

The most intriguing part about the design of the GPU integrated into Trinity processors, codenamed Devastator, is the fact that it is based on the VLIW4 architecture. Since the Llano graphics core was based on the VLIW5 architecture, AMD’s decision may seem somewhat strange, and frankly we would have expected Trinity to use the GCN architecture typical of the latest graphics accelerators. However, it is in fact VLIW4 that makes it possible to improve the specific efficiency of a graphics core whose transistor count is artificially limited. AMD has already resorted to this “trick” with their Radeon HD 6900 graphics cards and it worked pretty well back then.

VLIW5 arranges ALUs in groups of five per streaming VLIW processor. However, this turns out to be not quite efficient, and one ALU sits idle in most cases. Therefore, the VLIW4 structure of the Devastator, which uses four ALUs per streaming VLIW processor, allows for better utilization of the resources. Of course, the flip side is a smaller total number of execution units and a lower theoretical peak throughput of the core, but the practical performance per square millimeter increases. And this is the best optimization approach for a hybrid processor die that accommodates not only the graphics core, but also the computing cores.
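The utilization argument can be illustrated with a deliberately simple toy model. It is only a sketch: the function name and the `avg_independent_ops` parameter are hypothetical, and real shader compilers pack instructions in far more complicated ways. The only input taken from the text is the observation that one of the five ALUs is idle in most cases.

```python
def slot_utilization(avg_independent_ops: float, vliw_width: int) -> float:
    """Fraction of ALU slots doing useful work in one VLIW bundle.

    avg_independent_ops is a hypothetical workload parameter: how many
    operations the compiler manages to pack into one bundle on average.
    """
    filled = min(avg_independent_ops, vliw_width)
    return filled / vliw_width


# Assumption derived from the text: roughly four operations per bundle,
# so that one of five ALUs idles in a VLIW5 unit.
AVG_OPS = 4.0

vliw5 = slot_utilization(AVG_OPS, 5)
vliw4 = slot_utilization(AVG_OPS, 4)

print(f"VLIW5 slot utilization: {vliw5:.0%}")
print(f"VLIW4 slot utilization: {vliw4:.0%}")
```

Under this assumption the narrower unit wastes no slots, which is the intuition behind trading a little peak throughput for better performance per unit of die area.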

Overall, the Trinity graphics core has six SIMD engines, each of which consists of four texturing units and sixteen streaming VLIW processors. This adds up to 384 ALUs, which is 16 fewer than in the Sumo graphics core of the Llano processors. However, simple arithmetic doesn’t quite apply here, because Devastator’s ALUs are usually more heavily utilized than their predecessors. Moreover, the relative simplicity of the streaming VLIW processors allows setting higher clock frequencies for the graphics core. For example, while the graphics core in the top Llano processor worked at 600 MHz, the graphics core in Trinity may reach speeds of up to 800 MHz.
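The unit counts and clock speeds above translate into a rough peak-throughput comparison. The sketch below assumes each ALU can retire one multiply-add (counted as 2 floating-point operations) per clock. That factor is an assumption for illustration, not something stated in this article, and the helper names are hypothetical.

```python
def alu_count(simd_engines: int, vliw_per_simd: int, alus_per_vliw: int) -> int:
    """Total ALUs in a graphics core built from identical SIMD engines."""
    return simd_engines * vliw_per_simd * alus_per_vliw


def peak_gflops(alus: int, clock_mhz: float, flops_per_alu_per_clock: int = 2) -> float:
    """Theoretical peak; 2 flops per clock (multiply-add) is an assumption."""
    return alus * flops_per_alu_per_clock * clock_mhz / 1000.0


# Devastator: six SIMD engines, sixteen VLIW4 processors each (from the text).
devastator_alus = alu_count(simd_engines=6, vliw_per_simd=16, alus_per_vliw=4)
# Sumo: the text says Devastator has 16 ALUs fewer.
sumo_alus = devastator_alus + 16

devastator_peak = peak_gflops(devastator_alus, clock_mhz=800)
sumo_peak = peak_gflops(sumo_alus, clock_mhz=600)

print(f"Devastator: {devastator_alus} ALUs, {devastator_peak:.1f} GFLOPS peak")
print(f"Sumo:       {sumo_alus} ALUs, {sumo_peak:.1f} GFLOPS peak")
print(f"Peak ratio: {devastator_peak / sumo_peak:.2f}x")
```

Even with fewer ALUs, the higher clock gives the new core a higher theoretical peak in this model, before any utilization gains are counted.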

Since Devastator has twenty-four texturing units (four TMUs per SIMD engine) and eight raster operation units (ROPs), we can conclude that this graphics core is equivalent to one fourth of the Radeon HD 6970 GPU. And this is really good, even taking into account that its operating frequency is somewhat lower and there is no dedicated high-bandwidth memory bus. In other words, AMD doesn’t mislead us by saying that their Trinity processors have an integrated graphics core of “discrete” quality. We can really expect the new generation of hybrid processors to demonstrate very good 3D speed.
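The “one fourth” comparison is easy to verify unit by unit. In the sketch below the Devastator figures come from this article, while the Radeon HD 6970 figures are reference specifications that are not quoted here and are supplied as an outside assumption.

```python
# Devastator figures as given in this article.
devastator = {"alus": 384, "tmus": 24, "rops": 8}

# Radeon HD 6970 reference figures; NOT quoted in this article and
# included here as an outside assumption.
hd6970 = {"alus": 1536, "tmus": 96, "rops": 32}

ratios = {unit: devastator[unit] / hd6970[unit] for unit in devastator}

for unit, ratio in ratios.items():
    print(f"{unit}: {ratio:.2f} of the Radeon HD 6970")
```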

I am sure you won’t be surprised to hear that the Trinity graphics core is compatible with the DirectX 11, OpenCL and DirectCompute software interfaces. The Radeon HD 6900, which uses the same architecture, as well as Trinity’s predecessors, the Llano processors, also supported them. However, the new integrated graphics also inherited some features of the latest solutions with GCN architecture. Namely, Devastator has an improved tessellation unit and supports all popular antialiasing modes, such as SSAA, EQAA and MLAA.

Trinity developers paid special attention to multi-media functionality of the graphics core in their new hybrid processors. The new core has the same AMD HD Media Accelerator unit as the latest GPUs. This unit includes hardware video decoding engine (UVD3) and H.264 hardware video encoding engine (VCE). The latter is particularly important for Trinity’s success in competition against Intel’s hybrid processors that have long been featuring Quick Sync for high-speed HD video transcoding. Now AMD processors also boast something similar, but unfortunately, we couldn’t yet test the VCE engine in action, because of driver support and software compatibility issues.

When AMD was working on their new hybrid processor for the desktop market, they wanted to make sure that the users wouldn’t feel deprived of the extensive monitor connectivity, which discrete graphics cards could provide. Namely, an integrated system with a Trinity processor allows connecting up to four independent monitors at the same time and supports all connection types including analogue VGA and digital DVI, HDMI and DisplayPort 1.2. It also supports four independent audio streams. However, there are only three actual outputs available and you will need to use the DisplayPort chain to connect the fourth monitor.

Most impressively, however, Trinity graphics supports Eyefinity. Of course, it will be a challenge to find a game capable of running at an acceptable fps rate on three or four monitors connected to the Devastator, but the mere fact that this feature is available indicates that AMD took the features and functionality of the second-generation APU very seriously and made sure it was loaded to the maximum before rolling it out to the market.

Trinity Model Line-Up

Speaking about the graphics core of the desktop Trinity processors, we should also touch upon the model line-up within the new family. The thing is that different Trinity processors in the A-series may have different modifications of the Devastator core inside. Their distinguishing features are pretty standard, though: trying to differentiate their products based on price, they disabled one or more SIMD-engines in the junior versions of the core. As a result, the above described functionality, including 384 execution units, will only exist in the top APU modifications.

The model line-up of the new desktop Trinity processors looks as follows: the fastest models with the fully-functional Devastator core bearing Radeon HD 7660D marketing name belong exclusively to the new flagship A10 series. All other modifications with graphics cores featuring fewer streaming processors and working at lower frequencies belong to the “simpler” A8, A6 and A4 series, where they replace Llano processors.

The complete model line-up of the new Trinity based processors is summed up in the following table:

 

Even the graphics core in A8 processors is theoretically more than 35% slower than the fully-functional Devastator core, not to mention the even slower A6 and A4. This means that the best candidates for gaming systems will primarily be the A10-5800K and A10-5700 models. They are the best fit for an entry-level gaming system without a discrete graphics card. The processors from the more junior series will hardly be an option for universal gaming PCs, which is why they should be considered for multi-media centers and home entertainment systems that aren’t supposed to run resource-demanding 3D games.

Therefore, today we are focusing on the top hybrid processor, the A10-5800K, with the integrated Radeon HD 7660D graphics core. This processor has two Piledriver modules at its disposal, which is why diagnostic utilities and the operating system see it as a quad-core processor. However, we have to point out that there is an alternative interpretation, which describes this processor as a dual-core one with the ability to process up to four threads simultaneously. Although it contradicts AMD’s statements, it seems to describe the market positioning of the new A10-5800K most precisely. In terms of price, this processor falls into the same range as Intel’s Core i3 processors, which are also dual-core CPUs, but with Hyper-Threading support.

Taking into account that this processor supports Turbo Core 3.0, its clock frequency should vary between 3.8 and 4.2 GHz. However, during our test session we saw that under heavy operational load it spends most of the time in the middle of this range – at 4.0 GHz.

Radeon HD 7660D graphics core integrated into A10-5800K works at 800 MHz and this frequency drops to 300 MHz in idle mode. Although AMD promised that turbo mode would also work for the graphics core, the graphics core frequency never rises beyond the declared 800 MHz.

Testbed Configuration and Testing Methodology

Today we are going to investigate the performance of the graphics core in the AMD’s new hybrid processors. We will use the obtained results to find out if the newest processors with integrated graphics will become a good option for entry-level gaming systems without the discrete graphics accelerators inside.

During our test session the AMD A10-5800K processor with the Radeon HD 7660D graphics core will compete against other currently available integrated chips with 3D graphics and acceptable performance. The first competitor is the AMD Llano family, which will eventually become outdated once Trinity is out, but is still pretty current. It will be represented by the AMD A8-3870K with the Radeon HD 6550D graphics core. The second competitor is the Intel Ivy Bridge family, whose top graphics core modification, HD Graphics 4000, offers very promising 3D performance, according to the manufacturer. The honor of this family will be defended by the dual-core Core i3-3225. We chose this particular CPU over the quad-core Core i5, because AMD positions their APU as an alternative specifically to Intel’s dual-core products. According to the preliminary information, the AMD A10-5800K will cost about the same as the junior Core i3 CPU models.

Moreover, do not forget about the conclusions we drew in our recent articles, which revealed the higher specific efficiency of Intel’s computing cores. Quad-core processors with Sandy Bridge microarchitecture competed quite successfully against eight-core Bulldozer processors, and I doubt that things have changed seriously with the release of the new-generation Ivy Bridge and Piledriver microarchitectures.

Although A10-5800K is much faster than AMD A8-3870K, it falls behind Core i3-3225 and Core i3-2125 processors, not to mention a serious beating it takes in the computing performance test from the quad-core Core i5-3330. So, the comparison of the new AMD APU against the dual-core Core i3 in graphics tests is totally justified.

In order to estimate the performance of the integrated graphics core in contemporary processors in respect to the discrete graphics accelerators, we also added the results of a system with a discrete graphics card. We chose Radeon HD 6570, which is currently priced at about $70 for the modification with GDDR5 memory, like the one we had on hand. We tested it in an A10-5800K based system.

As a result, we put together test platforms with the following hardware and software components:

For our tests of the AMD A10-5800K platform we installed KB2645594 and KB2646060 OS patches, which adapt the scheduler operation for Bulldozer microarchitecture.

The primary focus of today’s test session will obviously be on the gaming performance of the processor-integrated graphics. Therefore, most of the benchmarks we use today are gaming tests. With this goal in mind, we will primarily concentrate on the performance of graphics solutions in the de facto standard Full HD resolution of 1920x1080. Accordingly, most tests were performed in this particular resolution with low or medium image quality settings.

3D Performance

3DMark Vantage

3DMark scores are a very popular way of estimating average gaming performance of the graphics cards. Therefore we decided to start with 3DMark. Let’s check out the performance of our testing participants in Vantage version, which supports DirectX 10.

We immediately see the huge progress AMD APU made upon transitioning from Sumo to Devastator graphics core. The advantage of the Trinity processor over the flagship Llano solution is about 40%. As a result, the graphics performance of the A10-5800K based system is almost as high as that of the system with the discrete AMD Radeon HD 6570.

3DMark 11

The newer version of the 3DMark benchmark measures DirectX 11 performance. Previously, Intel processors couldn’t participate in tests like that and the AMD APU would have had to fly solo. However, the new Intel HD Graphics 4000 core integrated into Ivy Bridge processors finally supports all contemporary software interfaces, so the Core i3-3225 processor is also present on the diagrams.

3DMark 11 produced extremely interesting results. According to this benchmark, the graphics core in the A10-5800K managed to outperform the discrete Radeon HD 6570 graphics card. This is an excellent illustration of how efficient the VLIW4 architecture in Devastator actually is. I would like to remind you that the Radeon HD 6570 graphics card uses an 800 MHz Turks graphics processor with VLIW5 architecture and has 480 streaming processors, while the Devastator core has only 384 of them. However, as we can see, more execution units don’t always translate into better practical performance, which means that choosing the VLIW4 design for Trinity was a very smart decision.

Aliens vs. Predator (2010)

Although the graphics core of the A10-5800K processor managed to outperform the discrete Radeon HD 6570 in the synthetic 3DMark 11, things turn out completely different in real games. Here the discrete graphics accelerator is far ahead of any integrated graphics core, including the Radeon HD 7660D. The memory bus with insufficient bandwidth obviously remains the bottleneck of any integrated graphics core. However, it is important to point out that we are comparing the Radeon HD 7660D against a Radeon HD 6570 graphics card featuring high-bandwidth GDDR5 memory. Had we used a “simpler” graphics card with DDR3 SDRAM in our tests, the new Devastator core would have undoubtedly demolished it.

Batman: Arkham City

The performance difference between the old and the new graphics core in AMD processors is about 30%. So in terms of graphics performance, the transition from Llano to Trinity is a justified move with a great outcome. And this move hasn’t been forced by the growing competition with Intel: even the newest and fastest Intel GPU doesn’t stand a chance against AMD. Rather, AMD intends to eliminate all entry-level graphics accelerators with DDR3 memory, such as the Radeon HD 6570 or GeForce GT 630.

Battlefield 3

Of course, Radeon HD 7660D is not the same as a high-end or mainstream discrete graphics card. This solution is much slower. However, as we can see, the new integrated graphics core from AMD allows us to play contemporary games in FullHD resolution quite comfortably. Yes, we often have to use low image quality settings, but in return we get decent average fps rate. Radeon HD 7660D also doesn’t show any unexpected performance drops. For example, in Battlefield 3 the minimal momentary performance with low image quality settings is at a very acceptable level of 18 fps.

Borderlands 2

Even the newest first-person 3D shooter, Borderlands 2, runs without any serious problems on the new A10-5800K based system. Of course, you will not be able to enjoy all the graphics beauty, but unlike Intel processors with integrated graphics, the new AMD APU will let you play this game in 1920x1080 resolution without an external graphics accelerator.

F1 2012

Racing simulators are usually not very demanding on graphics, and F1 2012 is quite typical in this respect. This game will run on integrated systems quite well even if we use high image quality settings and Full HD resolution. And although the Radeon HD 7660D is almost 35% faster than the Radeon HD 6550D from the Llano processor, the discrete Radeon HD 6570 is still a little faster. However, any integrated product from AMD looks way better than the top graphics core of the competition, Intel HD Graphics 4000. As you can see, the A10-5800K processor is about 60% faster than the Core i3-3225 in F1 2012.

Far Cry 2

We kept Far Cry 2 in our testing suite on purpose. This four-year-old shooter shows that the contemporary Trinity APU works remarkably well in previous-generation games. For example, we managed to play in 1920x1080 resolution with maximum available graphics quality settings and an average framerate of over 30 fps (without FSAA, of course). The minimum recorded framerate was 23 frames per second.

Sleeping Dogs

Unfortunately, the graphics core of the A10-5800K processor is again unable to defeat the discrete Radeon HD 6570 graphics card: it falls about 10-15% behind in this game. The reason for this is quite obvious: it could use higher bandwidth memory. Therefore, the growing popularity of solutions like Trinity may actually revive the DDR3 SDRAM market. The performance doesn’t depend that much on memory in general purpose applications, but the systems with integrated graphics may really benefit from fast memory sub-system. However, we are going to dwell on this particular aspect later in this review.

Sniper Elite V2

Of all existing integrated GPUs, the Radeon HD 7660D version of the Devastator core is the fastest. The results in the Sniper Elite V2 benchmark once again confirm this. The new modification of the integrated graphics core from AMD is 26% or 43% faster than the previous Sumo modification, depending on the image quality settings. As a result, the Radeon HD 7660D turns out twice as fast as Intel HD Graphics 4000. In other words, AMD remains way ahead of their competitor when it comes to integrated GPUs. Moreover, AMD Trinity completely overshadowed the progress Intel made with their Ivy Bridge launch. So, in the end the APUs from the two makers rank totally differently.

Cinebench R11.5

All games above use DirectX software interface. However, we also wanted to see how the graphics accelerators would cope with OpenGL tasks. Therefore, we also ran a few tests in professional graphics suite – Cinema 4D.

The situation is quite typical. Trinity performance in an OpenGL application doesn’t really differ from its performance in DirectX gaming tasks. The Radeon HD 7660D graphics accelerator integrated into the new AMD A10-5800K processor is ahead of its predecessor and the Intel competitor, but falls behind the discrete Radeon HD 6570 graphics card. At the same time, the OpenGL performance we see here makes the idea of using integrated graphics for professional applications quite feasible. Moreover, AMD even offers special “professional” Trinity processors, which will be available under the FirePro name.

Performance Dependence on the Memory Frequency

When we tested the Radeon HD 7660D graphics core performance in games, we suspected that it sometimes lacked memory bandwidth. It is quite easy to check, since the Trinity memory controller works perfectly fine with high-speed DDR3 SDRAM. So we decided to compare the performance of the AMD A10-5800K based system with different memory types from DDR3-1333 to DDR3-2400. Note that when you select DDR3-2133 or higher memory mode, Trinity memory controller requires Command rate to be set to 2T. Nevertheless, the system remained totally stable. As a result, we used the following memory timings for different test modes:

And here are the obtained results:

Graphics core performance proved amazingly scalable as the memory frequency and bandwidth increased. By simply raising the memory frequency by 266 MHz, we could boost the fps rate by 10-15%. Of course, as the memory frequency increases, this dependence becomes less prominent, but nevertheless, if you are building a Trinity based system and intend to use its graphics core for 3D applications, you must pay special attention to finding high-speed DDR3 SDRAM. This is excellent news for overclocker memory makers, because it creates a potentially larger market for them. It is a very convincing argument that you can easily boost the gaming performance of your AMD A10-5800K processor by as much as 15-20% by simply replacing the common DDR3-1600 with DDR3-2400 in your Socket FM2 system.
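To put these memory speeds into perspective, the theoretical peak bandwidth of each mode can be estimated. The sketch below assumes a dual-channel configuration with 64-bit (8-byte) channels. The channel count is an assumption made for illustration and is not stated in this article.

```python
BYTES_PER_TRANSFER = 8   # one 64-bit DDR3 channel
CHANNELS = 2             # assumption: dual-channel configuration


def peak_bandwidth_gbs(mega_transfers_per_s: int) -> float:
    """Theoretical peak bandwidth in GB/s for a DDR3-<rate> mode."""
    return mega_transfers_per_s * BYTES_PER_TRANSFER * CHANNELS / 1000.0


baseline = peak_bandwidth_gbs(1600)

# Memory modes covered by the test range in this section.
for rate in (1333, 1600, 1866, 2133, 2400):
    bandwidth = peak_bandwidth_gbs(rate)
    change = bandwidth / baseline - 1.0
    print(f"DDR3-{rate}: {bandwidth:5.1f} GB/s ({change:+.0%} vs DDR3-1600)")
```

In this model DDR3-2400 offers 50% more theoretical bandwidth than DDR3-1600, while the measured gain quoted above is 15-20%, so the graphics core is clearly not limited by memory bandwidth alone.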

AMD Dual Graphics Technology

Just like Llano, Trinity supports Dual Graphics technology, which allows building an asymmetrical CrossFireX configuration from the graphics core integrated into the processor and an external graphics card. However, this technology is not quite polished on the new platform yet, and the integrated Radeon HD 7660D can currently only be paired with a discrete graphics card from the HD 6000 series. It is also important to remember that for the best result the performance of the discrete graphics card should be about the same as the performance of the processor graphics core. In other words, the best choice for pairing with the A10-5800K would be a Radeon HD 6670 or Radeon HD 6570.

We tested this technology with a Radeon HD 6570 graphics accelerator. It is very easy to activate: once you install the add-on graphics card and enable Multi-Monitor mode in the BIOS, the driver offers to enable the additional resources.

This is how performance changes in this case:

Ideally, Dual Graphics technology should produce a positive and very noticeable effect. In our specific case, when we had two GPUs with almost the same potential, the tandem’s performance improved by about 50%. Unfortunately, this impressive improvement also revealed how greatly the technology depends on driver optimizations. For example, the performance may actually drop in new games for which the driver has not yet been optimized. So, even though Dual Graphics remains a very interesting way of improving the performance of Trinity systems, we wouldn’t encourage resorting to it just yet, unless you are an experienced user and are ready to check the Dual Graphics efficiency every time you launch a new game.
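To put the figure of about 50% in perspective, it helps to compare it with the ideal case in which the frame rates of the two GPUs simply add up. The sketch below is purely illustrative: the function name and the frame rates are hypothetical and are not measured results.

```python
def scaling_efficiency(fps_igpu, fps_dgpu, fps_dual):
    """Fraction of the ideal combined frame rate (sum of both GPUs) achieved."""
    return fps_dual / (fps_igpu + fps_dgpu)


# Hypothetical numbers: two equally fast GPUs at 30 fps each,
# with the pair delivering 50% more than one GPU alone.
single = 30.0
dual = single * 1.5
print(f"{scaling_efficiency(single, single, dual):.0%} of ideal")
```

With two equally fast GPUs, a 50% gain corresponds to three quarters of the ideal doubled frame rate, and an unoptimized game that runs slower than on a single GPU would fall below one half.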

GPGPU Performance

AMD constantly stresses that their Llano and now Trinity processors are in fact APUs. It means that their architecture has been optimized for different types of tasks, which can be executed by the joint effort of the traditional x86 cores and the streaming processors of the integrated graphics core. Of course, special software is necessary for these fundamentally different computing resources to operate in harmony. A year ago the lack of such software was a death sentence for the APU concept, but now things have started to change dramatically. The developers of multiple popular applications have started to take advantage of hybrid solutions. As of today, the current or upcoming versions of several applications will be able to benefit from the computing potential of the integrated graphics cores. These applications are: Adobe Flash 11.2, Adobe Photoshop CS6, GIMP, ArcSoft MediaConverter 7.5, CyberLink MediaEspresso 6.5, Handbrake and WinZip 16.5.

Today we cannot test the new Trinity processor in any of these applications, but we can definitely try to estimate the practical performance of the Devastator core under GPGPU load created via the OpenCL and Microsoft DirectCompute interfaces. For this purpose we used the SiSoftware Sandra 2012.10.18.74 suite.

The computing performance of the Devastator graphics core looks really good. The VLIW4 architecture provides superb efficiency during general-purpose calculations, allowing Radeon HD 7660D to get way ahead not only of the previous version of the graphics core in Llano processors and the Intel HD Graphics 4000 core, but also of the discrete Radeon HD 6570 graphics card. As a result, we can expect Trinity to do really well in applications supporting OpenCL.

The situation is quite similar in the encryption test. By using the VLIW4 architecture in their new hybrid processors, AMD was pursuing a specific goal: to demonstrate the advantages of combining x86 computing cores with a streaming graphics core. Since software developers continue to explore the potential of hybrid processors, it is a very timely move. At this point AMD should not only demonstrate that this approach is possible, but also prove that it is efficient and advantageous.

Power Consumption

The Trinity graphics core indisputably outperforms the graphics in contemporary Intel processors. However, what is the price of this success? As we know, AMD processors have never been particularly energy-efficient. And the APU could really use low power consumption: these hybrid processors often end up in compact systems, such as HTPCs. AMD doesn’t talk much about energy efficiency with respect to Trinity. It is manufactured using the same production process as Llano, and the official TDP has also remained unchanged. Does it mean that the power consumption and heat dissipation of the new Trinity processors are the same as those of Llano?

To answer this question we performed a round of special tests. The new digital power supply unit from Corsair, the AX1200i, allows monitoring consumed and produced electrical power, which we use actively during our power consumption tests. The graphs below (unless specified otherwise) show the full power draw of the computer (without the monitor) measured at the output of the power supply. It is the total power consumption of all the system components. The PSU's efficiency is not taken into account. The graphics cores of the tested processors were loaded by the FurMark 1.9.1 utility. Moreover, we enabled Turbo mode and all power-saving technologies to correctly measure the computer's power draw in idle mode: C1E, C6 and AMD Cool’n’Quiet.
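Since the readings are taken after the power supply, the draw at the wall socket would be higher by the conversion losses. Here is a minimal sketch of that relationship, with a hypothetical reading and an assumed efficiency figure; neither number comes from our measurements, and the function name is illustrative.

```python
def wall_power(dc_output_w, psu_efficiency):
    """Estimate AC power drawn from the wall for a given DC output."""
    if not 0 < psu_efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return dc_output_w / psu_efficiency


# Hypothetical example: 90 W delivered to the components by a PSU
# that is assumed to be 90% efficient at this load.
print(f"{wall_power(90.0, 0.90):.0f} W at the wall")
```

Because the same power supply is used for every platform, leaving the efficiency out still gives a fair comparison of the systems themselves.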

In idle mode the Socket FM2 system with AMD A10-5800K processor consumes about the same power as the system with AMD A8-3870K, i.e. a little less than an Intel system with Core i3-3225.

Under full graphics load Trinity starts to lose ground a little. The graphics in this APU is faster than that in Llano, but it is also more power-hungry. The power consumption difference is very noticeable and reaches as much as 20 W. Intel’s processor with HD Graphics 4000 becomes an example of energy efficiency here. No wonder, since it is not only manufactured using a finer process technology, but also has half the TDP of AMD processors.

Another type of operational load – HD video playback. Here all platforms consume about the same amount of power as they do in idle mode. As a result, Trinity scores really high here: the power consumption of the A10-5800K based platform is very close to that of a platform with Core i3-3225. In other words, the graphics core of the new Trinity processors is only hungry for power under serious operational load.

Conclusion

The times when integrated graphics just had to work are long gone. Since the graphics cores settled inside the CPUs, AMD and Intel started increasing their potential very aggressively, thus ousting entry-level graphics accelerators from the market and opening new usage models for their CPUs. AMD is currently at the head of this race of integrated GPUs: the fastest graphics cores from Ivy Bridge processors are still unable to surpass the Llano graphics, not to mention the new Trinity. However, this situation didn’t slow AMD down en route to innovation. This company is not fighting against a specific product from their primary competitor, but is trying to reshape the attitude towards hybrid processors in general. This requires not just higher scores over the competition in specific benchmarks, but a completely different level of product quality.

It looks like the desktop Trinity processors, which we introduced to you today, are exactly this qualitative leap forward. AMD A10-5800K is not just a hybrid processor with today’s fastest integrated graphics core: this core is fast enough to deliver acceptable performance in almost any contemporary 3D game in FullHD resolution. Of course, you can’t use the highest image quality settings in this case, but the fact is undeniable: Trinity looks very good against the background of entry-level discrete 3D graphics accelerators in the $60-$70 price range, which the new hybrid processor can easily replace. In fact, it would be fair to state that graphics cards like Radeon HD 6570 and GeForce GT 630 will reach their EOL once Trinity is out; at least this is true for their DDR3 modifications.

Today we have discussed only the graphics component of the new highly promising AMD initiative. And this component is its undeniable strength. In terms of general performance, Trinity most likely won’t be as impressive. Even the 25% performance boost promised by AMD may not be enough to let A10-5800K and other members of this family compete successfully against the Intel Ivy Bridge generation. Of course, we can expect that AMD will succeed in popularizing the APU concept globally and that their hybrid processors will speed up at the expense of graphics core resources. However, even if it happens, it won’t be soon. Therefore, we will have to keep in mind that Trinity also has a weakness.

So, what does it mean? Think: most users buying desktop Intel processors do not really care about their graphics performance. They are ready to put up with any graphics speed, because they value high speed of the x86 cores. Trinity, on the other hand, could approach this matter from the other end. If this APU offers attractive 3D performance, does it really make sense to worry so much about lower speed of its x86 cores? The answer to this question can easily be “no”: the current Trinity performance will most likely be more than sufficient for the majority of common tasks.

However, let’s not rush to conclusions and wait for the official embargo on the complete performance analysis to be lifted. While you are reading this article, we continue working on our next review.