AMD Discloses Peculiarities of Next-Generation Jaguar Micro-Architecture

AMD Jaguar: Multi-Core Architecture for Low-Power Devices

by Anton Shilov
09/04/2012 | 08:15 PM

Advanced Micro Devices last week unveiled the first details of its next-generation Jaguar micro-architecture for low-power and ultra-low-power (ULP) devices. The new architecture promises to enable feature-rich multi-core x86 microprocessors with extremely low energy consumption but better performance than today's x86 micro-architectures. AMD hopes to enter new markets with ULP chips powered by Jaguar.


To significantly improve the performance of Jaguar-based chips over Bobcat-based ones, AMD decided to push in virtually every logical direction: increase the number of cores, raise clock-speed, add support for modern instructions, and increase the number of instructions executed per clock (IPC). AMD also decided to improve power efficiency through clock gating and unit redesign in a bid to ensure lower idle power consumption than existing low-power designs. The first chips to use Jaguar are code-named Kabini and Temash and will be made using a 28nm fabrication process; going forward, future Jaguar-based products are likely to be produced at 20nm nodes.


A Jaguar-powered accelerated processing unit can have up to four cores by design. In a four-core configuration, the 2MB second-level (L2) cache will be shared among the cores but divided into four 512KB data banks, each of which can be switched off independently to save energy.
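The banked cache arrangement can be illustrated with a small Python sketch. This is only a toy model: the class name `SharedL2` and its methods are hypothetical, and the assumption that a powered-down bank simply reduces usable capacity is ours, not something AMD has disclosed. Only the 2MB total and the four-bank split come from the article.

```python
from dataclasses import dataclass, field

KB = 1024
L2_TOTAL_BYTES = 2 * 1024 * KB  # 2MB shared L2, per the article
L2_BANKS = 4                    # four data banks, per the article


@dataclass
class SharedL2:
    """Toy model (hypothetical) that tracks which L2 data banks are powered."""
    powered: list = field(default_factory=lambda: [True] * L2_BANKS)

    @property
    def bank_bytes(self) -> int:
        # Capacity of a single bank: total size split evenly across banks.
        return L2_TOTAL_BYTES // L2_BANKS

    def power_down(self, bank: int) -> None:
        # Assumption: each bank can be switched off independently.
        self.powered[bank] = False

    def active_bytes(self) -> int:
        # Assumption: usable capacity is the sum of the powered banks.
        return self.bank_bytes * sum(self.powered)


l2 = SharedL2()
l2.power_down(3)
print(l2.bank_bytes // KB)      # size of one bank in KB
print(l2.active_bytes() // KB)  # capacity left with one bank off, in KB
```

With one of the four banks switched off, three quarters of the 2MB cache would remain available under these assumptions.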

To boost clock-speed by 10% compared to today's Bobcat-powered chips, the Jaguar micro-architecture features a longer pipeline.

The third major improvement incorporated into the architecture is an up-to-date set of supported instructions: Jaguar reportedly features SSE4.1, SSE4.2, AES, PCLMUL, AVX, BMI, F16C as well as MOVBE.
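For readers who want to check a given x86 machine against this list, here is a minimal Python sketch that looks for the corresponding flag names in Linux's /proc/cpuinfo. The helper names are hypothetical, and the mapping of the article's "PCLMUL" and "BMI" to the Linux flags `pclmulqdq` and `bmi1` is our assumption. On non-Linux systems the file does not exist and every extension is reported as missing.

```python
# Linux /proc/cpuinfo flag names for the extensions listed in the article.
# Assumption: "PCLMUL" maps to "pclmulqdq" and "BMI" maps to "bmi1".
JAGUAR_EXTENSIONS = {
    "SSE4.1": "sse4_1",
    "SSE4.2": "sse4_2",
    "AES": "aes",
    "PCLMUL": "pclmulqdq",
    "AVX": "avx",
    "BMI": "bmi1",
    "F16C": "f16c",
    "MOVBE": "movbe",
}


def missing_extensions(flags_line: str) -> list:
    """Return the article's extension names absent from a cpuinfo flags line."""
    # Drop the "flags :" prefix if present, then split into exact tokens
    # so that, for example, "avx2" is not mistaken for "avx".
    present = set(flags_line.split(":", 1)[-1].split())
    return [name for name, flag in JAGUAR_EXTENSIONS.items()
            if flag not in present]


def read_cpu_flags(path: str = "/proc/cpuinfo") -> str:
    """Return the first 'flags' line, or an empty string if unavailable."""
    try:
        with open(path) as f:
            for line in f:
                if line.startswith("flags"):
                    return line
    except OSError:
        pass
    return ""


if __name__ == "__main__":
    print("Missing:", missing_extensions(read_cpu_flags()))
```

An empty list means the processor reports every extension named above.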

In a bid to boost IPC by 15%, Jaguar introduces an enhanced 128-bit floating point unit (FPU) that is double-pumped to support 256-bit AVX instructions, as well as an innovative integer unit with a new hardware divider, larger schedulers and more out-of-order resources.
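As a rough illustration of how the 10% clock-speed and 15% IPC targets combine, the Python sketch below multiplies the two gains. It assumes a first-order model in which single-thread performance scales with clock-speed times IPC; that ignores memory-bound workloads and other effects, so the result is an estimate rather than an AMD figure. The function name is hypothetical.

```python
CLOCK_GAIN = 0.10  # clock-speed increase over Bobcat, per the article
IPC_GAIN = 0.15    # IPC increase over Bobcat, per the article


def combined_speedup(clock_gain: float, ipc_gain: float) -> float:
    """First-order per-core speedup, assuming performance ~ clock * IPC."""
    return (1.0 + clock_gain) * (1.0 + ipc_gain)


speedup = combined_speedup(CLOCK_GAIN, IPC_GAIN)
print(f"Estimated per-core gain: {(speedup - 1.0) * 100:.1f}%")
```

Under this simplified model the two targets together work out to roughly a 26.5% per-core improvement, before any gain from additional cores.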

Finally, AMD added new technologies to trim power consumption; for example, it implemented a new CC6 state with even deeper power savings, which each core can enter independently.

Each Jaguar core implemented in 28nm process technology will measure 3.1mm², smaller than the 4.9mm² Bobcat core in 40nm. To improve performance, AMD will try to increase the number of cores as well as clock-speed, whereas to offer something competitive for tablets (in Temash form), AMD will keep clock-speeds at current levels and put only two cores into the system-on-chip.
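The core-area figures can be put in perspective with a short Python sketch. The function names are hypothetical, and the totals cover the cores alone: the L2 cache, graphics and the rest of the system-on-chip are not included because the article gives no figures for them.

```python
JAGUAR_CORE_MM2 = 3.1  # per-core area at 28nm, per the article
BOBCAT_CORE_MM2 = 4.9  # per-core area at 40nm, per the article


def area_reduction(new_mm2: float, old_mm2: float) -> float:
    """Fraction of area saved by the new core relative to the old one."""
    return 1.0 - new_mm2 / old_mm2


def cores_area(core_mm2: float, cores: int) -> float:
    """Area of the cores alone; excludes L2, GPU and other SoC blocks."""
    return core_mm2 * cores


saved = area_reduction(JAGUAR_CORE_MM2, BOBCAT_CORE_MM2)
print(f"Jaguar core is about {saved * 100:.0f}% smaller than Bobcat")
print(f"Four Jaguar cores: {cores_area(JAGUAR_CORE_MM2, 4):.1f} mm2")
print(f"Two Jaguar cores:  {cores_area(JAGUAR_CORE_MM2, 2):.1f} mm2")
```

By this arithmetic a Jaguar core takes up roughly 37% less area than a Bobcat core, so four Jaguar cores occupy about 12.4mm² and the two cores of a Temash-style chip about 6.2mm².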