Brief Overview of Ivy Bridge Microarchitecture
Although we’ve said that the Ivy Bridge microarchitecture has some significant differences from its predecessor Sandy Bridge, it’s very easy to see that they are closely related to each other. There are in fact no differences at the very top level of the overall CPU structure. The improvements are in small details. You can refer to our special report for a detailed description of the innovations. We’ll just glance over the key points here.
First of all, the Ivy Bridge series doesn’t change the platform. These CPUs are installed into the same LGA1155 socket as their predecessors and are fully compatible with existing mainboards. Intel has recently rolled out its 7 series chipsets, spearheaded by the Z77 model, but you don’t really need a mainboard with the new chipset to use an Ivy Bridge CPU. Like the Sandy Bridge series, the Ivy Bridge employs a 20Gbps DMI 2.0 bus. So again, the new CPUs will work problem-free on any LGA1155 mainboard.
Ivy Bridge processors have the same functional subunits as their Sandy Bridge predecessors: two or four cores equipped with individual 256KB L2 caches; an integrated graphics core; up to 8 megabytes of shared L3 cache; a dual-channel DDR3 SDRAM controller; a PCI Express controller for graphics cards; and a system agent responsible for the Turbo technology and auxiliary interfaces. Every component of an Ivy Bridge CPU is linked to a ring bus, just like in a Sandy Bridge.
The key difference of Ivy Bridge CPUs from their predecessors is the new 22nm manufacturing process. Besides being “thinner”, the transistors have a different internal design. Intel calls it tri-gate, which implies the use of a tall silicon fin cutting through the gate and coated with High-K dielectric.
As a result, semiconductor devices with tri-gate transistors can work at lower voltage and dissipate less heat. According to the official datasheet, the Ivy Bridge enjoys a 50% advantage over the Sandy Bridge in terms of performance per watt.
The increased energy efficiency is most welcome considering that one of the main purposes of the Ivy Bridge series is its widespread use in ultra-mobile computers. To reinforce the effect, Intel engineers have introduced new power-saving technologies: deeper sleep states, the option of powering down the memory controller, support for low-voltage DDR3L SDRAM, and the so-called configurable TDP. As a result, the numerous Ivy Bridge modifications include a whole class of ULV products with a TDP of 17 watts that can be further lowered to 14 watts.
The new manufacturing technology means smaller semiconductor dies. The die of a quad-core Ivy Bridge is 160 sq. mm, which is about 26% smaller than the 216 sq. mm die of a quad-core Sandy Bridge.
Despite that, the new CPU incorporates 1.4 billion transistors as opposed to the predecessor’s 995 million.
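To put these numbers in perspective, here is a minimal sketch that works out the transistor density of both dies. The Ivy Bridge figures and the Sandy Bridge transistor count come from the text above; the 216 sq. mm area of the quad-core Sandy Bridge die is the commonly cited figure and should be treated as an assumption here. The function and constant names are made up for illustration.

```python
# Transistor density from the figures quoted in the article.
# SANDY_AREA_MM2 is an assumption (commonly cited die size), not a measured value.

def density_millions_per_mm2(transistors: float, area_mm2: float) -> float:
    """Return transistor density in millions of transistors per sq. mm."""
    return transistors / area_mm2 / 1e6

IVY_TRANSISTORS = 1.4e9     # quad-core Ivy Bridge, from the text
IVY_AREA_MM2 = 160.0        # quad-core Ivy Bridge, from the text
SANDY_TRANSISTORS = 995e6   # quad-core Sandy Bridge, from the text
SANDY_AREA_MM2 = 216.0      # assumption: commonly cited die area

ivy = density_millions_per_mm2(IVY_TRANSISTORS, IVY_AREA_MM2)
sandy = density_millions_per_mm2(SANDY_TRANSISTORS, SANDY_AREA_MM2)

print(f"Ivy Bridge:    {ivy:.2f} M transistors per sq. mm")
print(f"Sandy Bridge:  {sandy:.2f} M transistors per sq. mm")
print(f"Density ratio: {ivy / sandy:.2f}x")
```

Under these assumptions the 22nm die packs nearly twice as many transistors into each square millimeter, which is why the die can shrink even though the transistor count grows.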
Additional transistors are usually spent on cache memory, but not in the Ivy Bridge CPUs, which have the same L1, L2 and L3 caches as the Sandy Bridge. So, this time around, the extra transistors can mostly be found in the integrated graphics core. It is almost a totally different thing from the previous-generation Intel HD Graphics 3000/2000.
The HD Graphics 4000 core is up to today’s requirements and standards. It complies with DirectX 11 (DirectCompute and Shader Model 5.0) and supports GPGPU via OpenCL 1.1. The HD Graphics 4000 supports up to three displays, while its performance has been improved by increasing the number of execution units from 12 to 16. That’s why Intel expects more computers with Intel CPUs to be used without a discrete graphics card, mostly in the mobile market segment.
The integrated graphics isn’t so important for desktop users, though. They want higher performance from the new CPUs as regards pure computing. Unfortunately, the Ivy Bridge series can’t offer much in this respect. Clocked at the same frequency, an Ivy Bridge is only expected to be some 5% faster than a Sandy Bridge. The execution cores have remained largely intact in the new microarchitecture with but a few minor improvements. To be specific, the Ivy Bridge is somewhat faster in terms of integer and floating-point division. The speed of data transfers between registers has been improved, too. The static distribution of internal buffer resources between different instruction threads for Hyper-Threading is now replaced with dynamic distribution.
To check out the practical benefits of these changes, we ran the synthetic tests from the SiSoft Sandra benchmark. They implement simple algorithms and make it possible to evaluate the performance of a CPU at different types of operations. As a preliminary test, we compared quad-core Sandy Bridge and Ivy Bridge CPUs clocked at 4.0 GHz with Hyper-Threading disabled.
Indeed, the minor improvements in the execution core design can hardly be observed in the performance tests.
A different kind of improvement, concerning the memory and PCI Express bus, may turn out to be more interesting, though. The PCI Express controller integrated into the Ivy Bridge supports the third version of the bus, which raises the transfer rate from 5 GT/s to 8 GT/s and, thanks to more efficient encoding, nearly doubles the bandwidth compared to PCI Express 2.0.
The 16 PCI Express lanes supported by the Ivy Bridge can be split into two or three groups: 8x+8x or 8x+4x+4x. The latter variant may be interesting for triple-GPU configurations, especially as PCI Express 3.0 is quite capable of providing the required bandwidth even via 4 lanes.
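The bandwidth claims above can be checked with some simple arithmetic. The sketch below is a rough estimate only: it assumes the standard transfer rates and line encodings (5 GT/s with 8b/10b for PCI Express 2.0, 8 GT/s with 128b/130b for PCI Express 3.0) and ignores packet and protocol overhead, so real-world throughput will be lower. The function name and table are made up for illustration.

```python
# Rough per-direction PCI Express bandwidth, counting only line-encoding overhead.
# Packet headers and flow control are ignored, so these are upper bounds.

# generation -> (transfer rate in GT/s, payload bits, bits on the wire)
PCIE_GENERATIONS = {
    "2.0": (5.0, 8, 10),      # 8b/10b encoding
    "3.0": (8.0, 128, 130),   # 128b/130b encoding
}

def link_bandwidth_gb_s(generation: str, lanes: int) -> float:
    """Return usable bandwidth in GB/s (10^9 bytes/s) for one direction."""
    rate_gt_s, payload_bits, wire_bits = PCIE_GENERATIONS[generation]
    gbit_s = rate_gt_s * payload_bits / wire_bits * lanes
    return gbit_s / 8  # bits to bytes

for gen, lanes in [("2.0", 16), ("3.0", 16), ("2.0", 8), ("3.0", 4)]:
    print(f"PCIe {gen} x{lanes}: {link_bandwidth_gb_s(gen, lanes):.2f} GB/s")
```

By this estimate, four PCI Express 3.0 lanes deliver about 3.9 GB/s in each direction, nearly the same as the 4.0 GB/s of an eight-lane PCI Express 2.0 link, which is what makes the 8x+4x+4x split practical.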
As for the memory controller, its key specs haven’t changed since the Sandy Bridge series. It still supports dual-channel DDR3 SDRAM but has been enhanced in terms of frequencies. The top supported memory speed is now DDR3-2800, and you can change the memory clock rate in steps of 200 or 266 MHz.
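To get a feel for what these memory speeds mean, here is a small sketch that computes the theoretical peak bandwidth of a dual-channel DDR3 setup. It assumes the standard 64-bit channel width and gives a ceiling only, as measured bandwidth is always lower. The function name is made up for illustration.

```python
def ddr3_peak_bandwidth_gb_s(data_rate_mt_s: float, channels: int = 2) -> float:
    """Theoretical peak bandwidth in GB/s (10^9 bytes/s) for 64-bit DDR3 channels."""
    bytes_per_transfer = 8  # each DDR3 channel is 64 bits wide
    return data_rate_mt_s * 1e6 * bytes_per_transfer * channels / 1e9

# DDR3-1867 is the speed used in our tests; DDR3-2800 is the new upper limit.
for rate in (1333, 1600, 1867, 2800):
    print(f"DDR3-{rate}: {ddr3_peak_bandwidth_gb_s(rate):.1f} GB/s")
```

For dual-channel DDR3-1867 this works out to roughly 29.9 GB/s, while DDR3-2800 would raise the ceiling to 44.8 GB/s.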
The actual performance of the memory controller has changed somewhat, too. We can see that in benchmarks. Take a look at the results of AIDA64 Cache & Memory Benchmark on Sandy Bridge and Ivy Bridge CPUs clocked at 4.0 GHz.
Sandy Bridge 4.0 GHz, DDR3-1867 (9-11-9-30-1T)
Ivy Bridge 4.0 GHz, DDR3-1867 (9-11-9-30-1T)
The Ivy Bridge is somewhat better in terms of practical memory latency, but its advantage is negligible. The benchmark reveals another fact: the new CPUs seem to have a faster L3 cache. Well, this is not really so. The difference is due to the speed of execution of the benchmark instructions. As a matter of fact, the L3 cache latency is 24 cycles with the Ivy Bridge, which is 1 cycle more than with the Sandy Bridge. In other words, the L3 cache has got somewhat slower in the new CPUs but you can hardly observe this in practical applications.
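The one-cycle difference is easier to judge when converted to time. The sketch below does the conversion, assuming the L3 latency is counted in core clock cycles and both CPUs run at the 4.0 GHz used in our tests. The function name is made up for illustration.

```python
def cycles_to_ns(cycles: int, clock_ghz: float) -> float:
    """Convert a latency in clock cycles to nanoseconds."""
    return cycles / clock_ghz  # one cycle lasts 1 / clock_ghz nanoseconds

CLOCK_GHZ = 4.0  # both CPUs were clocked at 4.0 GHz in the test

for name, cycles in [("Sandy Bridge", 23), ("Ivy Bridge", 24)]:
    print(f"{name}: {cycles} cycles = {cycles_to_ns(cycles, CLOCK_GHZ):.2f} ns")
```

At 4.0 GHz the extra cycle adds only a quarter of a nanosecond (5.75 ns versus 6.00 ns), which explains why it goes unnoticed in practical applications.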