

Advanced Micro Devices' approach to building many-core microprocessors on the Bulldozer micro-architecture is fresh and innovative. In some cases, however, the "dual-core modules" that form the chips will not work at maximum efficiency. To ensure the highest achievable performance at all times, AMD will work with software vendors, the company said.

A software developer recently asked AMD whether it made sense to partition a multi-threaded algorithm into pairs of closely interacting threads and to schedule each pair on an AMD Bulldozer module of two closely interacting integer cores. Apparently, AMD is already working with Microsoft and other makers of operating systems to ensure that Bulldozer core pairs operate efficiently.

"For the majority of software, the OS will work in concert with the processor to manage the thread to core relationships. We are collaborating with Microsoft and the open source software community to ensure that future [...] operating systems will understand how to enumerate and effectively schedule the Bulldozer core pairs. The OS will understand if your machine is setup for maximum performance or for maximum performance/watt which takes advantage of core performance boost," said John Fruehe, the director of product marketing for server/workstation products at AMD.

In Bulldozer micro-architecture-based designs, cores will be able to dynamically share fetch and decode blocks, caches and other units. At least in initial designs, multi-core chips will consist of several modules. Each module will have two independent integer cores with dedicated schedulers (the cores will share fetch, decode and L2 functionality) as well as two 128-bit FMAC pipes with one FP scheduler. This means that each module is, according to AMD, essentially a tightly linked dual-core microprocessor with shared fetch, decode and floating-point units.

At this point AMD claims that the two integer cores in a Bulldozer module will deliver roughly 80% of the throughput of a dual-core processor with a similar architecture. Understandably, some workloads will see lower performance and some will see higher performance. But according to AMD, it is possible to optimize software for Bulldozer's modules.
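To put the 80% figure in perspective, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: it assumes that two fully independent cores would deliver exactly twice the throughput of one core, and the function names and the normalized throughput units are hypothetical, not anything AMD has published.

```python
def module_throughput(single_core: float, module_efficiency: float = 0.8) -> float:
    """Estimate the throughput of one two-core Bulldozer module.

    Assumption: an ideal dual-core chip of similar architecture delivers
    exactly 2x the throughput of a single core. The 0.8 default is the
    "roughly 80%" figure quoted by AMD.
    """
    ideal_dual_core = 2.0 * single_core
    return module_efficiency * ideal_dual_core


def chip_throughput(modules: int, single_core: float,
                    module_efficiency: float = 0.8) -> float:
    """Estimate whole-chip throughput, assuming modules scale linearly."""
    return modules * module_throughput(single_core, module_efficiency)
```

Under these assumptions, a module built from cores that each deliver 1.0 unit of work on their own would deliver about 1.6 units, and a hypothetical four-module chip about 6.4 units, compared with 8.0 for eight fully independent cores.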

For example, if a workload focuses mainly on querying data and two threads share a data set that fits in Bulldozer's L2 cache, then having them execute in the same module could have some advantages. On the other hand, if a multi-threaded application is not optimized to target the L2 (or possibly the L3) cache, or if distinctly separate applications need to run, then a developer will need to schedule them on separate modules to get better performance.
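The placement rule described above can be sketched as a small Python helper. It is a simplified illustration, not AMD's or any operating system's actual scheduler: the core numbering (cores 2m and 2m+1 forming module m), the function names and the thread names are all assumptions made for the example.

```python
from typing import Dict, List

CORES_PER_MODULE = 2  # per the article: two integer cores in each module


def cores_of_module(module: int) -> List[int]:
    """Return the core IDs of a module.

    Assumption: cores 2m and 2m+1 belong to module m. Real core
    numbering is decided by the firmware and the operating system.
    """
    first = module * CORES_PER_MODULE
    return [first + i for i in range(CORES_PER_MODULE)]


def place_threads(groups: List[List[str]], modules: int) -> Dict[str, int]:
    """Map thread names to core IDs.

    Each group holds threads that share a data set small enough for the
    module's L2, so the group stays in one module. Unrelated groups are
    given separate modules so they do not compete for shared units.
    """
    if len(groups) > modules:
        raise ValueError("more thread groups than modules")
    placement: Dict[str, int] = {}
    for module, group in enumerate(groups):
        if len(group) > CORES_PER_MODULE:
            raise ValueError("a group cannot exceed the cores in one module")
        for core, thread in zip(cores_of_module(module), group):
            placement[thread] = core
    return placement
```

For instance, `place_threads([["query_a", "query_b"], ["encoder"]], modules=4)` keeps the two query threads on cores 0 and 1 of the first module and moves the unrelated encoder thread to core 2 in the second module. On Linux, the resulting core IDs could then be applied with a call such as `os.sched_setaffinity`, although in most cases the operating system is expected to make this decision itself.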

Tags: AMD, Bulldozer, Zambezi, Orochi, Interlagos, Valencia, Phenom, Athlon

