Haswell’s Memory Controller
At first glance, the memory controller of today’s LGA1150 processors, codenamed Haswell, doesn’t seem to differ much from its predecessors in the Sandy Bridge and Ivy Bridge CPUs. Intel’s memory access algorithms have been evolving for a long time and seem to have reached maturity in the latest CPU designs. What puts Intel’s modern memory controllers head and shoulders above competing solutions is the common Ring Bus, introduced in the Sandy Bridge, which connects all CPU subunits. Thanks to the Ring Bus, all of a CPU's computing and graphics processing resources have the same fast access to both the L3 cache and the memory controller. As a result, practical memory subsystem bandwidth went up while latency went down.
This Ring Bus architecture has been adjusted in the Haswell design, though. In the earlier CPU designs, the Ring Bus and the L3 cache were clocked in sync with the CPU's computing cores, which caused problems when switching to power-saving states. The L3 cache and Ring Bus could drop their clock rate along with the x86 cores even though they were still required by the graphics core. To avoid this problem, the Ring Bus and the L3 cache are implemented as a separate and independently clocked domain in the Haswell.
With the Ring Bus now capable of being clocked asynchronously, some additional latency in L3 cache and memory controller access became unavoidable, but Intel tried to make up for it with various microarchitecture improvements. For example, the L3 cache acquired two parallel queues for processing requests of different types, while the memory controller got longer queues and an optimized scheduler.
In fact, the Ring Bus, L3 cache and memory controller rarely run asynchronously. Apart from power-saving states, their clock rate is almost always identical to that of the x86 cores. It is only when the CPU switches to Turbo mode or is overclocked that there is a difference. Yet even then the L3 cache and the Ring Bus stay close to the x86 cores in clock rate, as the difference amounts to only 300-500 MHz. Practice suggests that this has no great effect on the resulting performance.
If we directly compare the Haswell’s and the Ivy Bridge’s memory controllers, we will find that their bandwidth and latency are comparable at the same settings. This can be illustrated by the AIDA64 test results for the following two configurations:
Ivy Bridge, 4 cores, 4.0 GHz, DDR3-1600 9-9-9-24-1N
Haswell, 4 cores, 4.0 GHz, DDR3-1600 9-9-9-24-1N
Yet we can notice that, despite Intel’s efforts, the Haswell’s memory controller is a little slower than the one we had in the previous-generation LGA1155 configurations with Ivy Bridge CPUs. The practical memory bandwidth is almost the same but the Haswell’s memory latency is 9% higher. That’s the tradeoff of the asynchronous design.
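For context, the theoretical ceiling against which such practical bandwidth results are judged can be worked out from the memory settings alone. The sketch below assumes the standard 64-bit (8-byte) width of a DDR3 channel; the function name is purely illustrative, and the result is a theoretical peak, not a number taken from the AIDA64 runs.

```python
def peak_bandwidth_gbs(data_rate_mts, channels=2, bus_bytes=8):
    """Theoretical peak bandwidth in GB/s (10^9 bytes per second).

    data_rate_mts: effective data rate in megatransfers per second,
                   e.g. 1600 for DDR3-1600.
    bus_bytes:     bytes moved per transfer per channel (assumed: 8,
                   i.e. a standard 64-bit DDR3 channel).
    """
    return data_rate_mts * 1_000_000 * bus_bytes * channels / 1e9

# Dual-channel DDR3-1600, the memory setting used in both test configurations.
print(peak_bandwidth_gbs(1600))
```

For dual-channel DDR3-1600 this comes to 25.6 GB/s, so any measured bandwidth for either CPU necessarily sits below that mark.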
The second important innovation in the memory subsystem of the LGA1150 platform concerns mainboard design. Intel's reference design for DIMM wiring is now based on the T topology, which puts the DIMM slots connected to each memory channel on an equal footing. This makes the memory controller more stable and ensures broader compatibility with memory modules and their configurations. In particular, the Haswell's memory controller supports high-speed operating modes even if you install four dual-sided modules into all the available DIMM slots. Considering that the maximum capacity of DDR3 modules currently on sale is 8 GB, the LGA1150 platform can run up to 32 gigabytes of overclocker-friendly system memory at high clock rates and with low timings.
Otherwise, the controller has remained the same. Being dual-channel, it can work in both symmetrical dual-channel and single-channel modes. It supports Flex Memory technology to ensure dual-channel access with asymmetric memory module configurations (when the capacity and specifications of modules on different memory channels differ).
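To make the asymmetric case more concrete, here is a minimal sketch of how capacity might be divided between access modes. It assumes the commonly described Flex Memory behavior, in which the capacity matched on both channels is interleaved in dual-channel mode and the remainder on the larger channel is accessed in single-channel mode. That mechanism is not detailed in this article, and the function name is hypothetical.

```python
def flex_memory_split(channel_a_gb, channel_b_gb):
    """Return (dual_channel_gb, single_channel_gb) for a two-channel setup.

    Assumption: capacity present on both channels is interleaved
    (dual-channel); whatever is left over on the larger channel is
    accessed in single-channel mode.
    """
    matched = min(channel_a_gb, channel_b_gb)
    dual = 2 * matched
    single = abs(channel_a_gb - channel_b_gb)
    return dual, single

# Example: 8 GB on channel A and 4 GB on channel B.
dual, single = flex_memory_split(8, 4)
print(f"dual-channel: {dual} GB, single-channel: {single} GB")
```

Under this assumption, a 8 GB + 4 GB configuration would get 8 GB of dual-channel memory and 4 GB of single-channel memory, while a symmetric 8 GB + 8 GB setup would run entirely in dual-channel mode.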
Like the Ivy Bridge series, the Haswell sets its DDR3 SDRAM clock rate in steps of 266 or 200 MHz, offering some flexibility in configuring your memory and expanding the number of clock rates supported by the controller. Although formally compatible with DDR3-1333 and DDR3-1600 SDRAM only, it allows using system memory at much higher frequencies on the LGA1150 platform. With the available frequency multipliers, you can enable even DDR3-2933 mode without running into stability issues.
We can also add that the Haswell's base clock rate can be increased from 100 to 125 MHz, so the top memory frequency is as high as 3666 MHz. And there's a lot of evidence on the internet that select overclocker-friendly memory can indeed be clocked at such a high frequency on the LGA1150 platform.
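The arithmetic behind these figures can be sketched as follows. The step sizes and base clocks come from the text; the multiplier of 11 is an assumption inferred from 2933 / 266.67, and the function and constant names are illustrative only.

```python
from fractions import Fraction

# Memory clock steps at the nominal 100 MHz base clock, as named in the text.
STEP_200 = Fraction(200)
STEP_266 = Fraction(800, 3)  # 266.67 MHz, kept exact to avoid rounding drift

def ddr3_rate(step, multiplier, bclk_mhz=100):
    """Effective DDR3 frequency (MHz) for a step, multiplier, and base clock.

    The step values are defined at a 100 MHz base clock and scale
    linearly when the base clock is raised.
    """
    return step * multiplier * Fraction(bclk_mhz) / 100

# Assumed multiplier of 11 on the 266 MHz step, at both base clocks.
for bclk in (100, 125):
    rate = ddr3_rate(STEP_266, 11, bclk)
    print(f"BCLK {bclk} MHz: DDR3-{int(rate)}")
```

With these inputs the sketch reproduces the DDR3-2933 mode at the default 100 MHz base clock and the 3666 MHz ceiling at 125 MHz. The same function covers the officially supported modes too: eight steps of 200 MHz give DDR3-1600.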
As you probably know, the Haswell introduced dramatic innovations into the power subsystem design. It has an integrated regulator that produces all the voltages necessary for the CPU. The mainboard now only has to supply two voltages: the processor's input voltage Vccin and the Vddq voltage for memory modules. All of the CPU's internal voltages, including those of the Ring Bus, L3 cache and memory controller, are produced by the CPU’s own integrated regulator. Because the memory voltage is now separate from the CPU's internal voltages, it is no longer limited by the CPU, so the Haswell lets you set it above 1.65 volts safely. In other words, the LGA1150 platform allows you to overclock your memory at high voltages without fearing that the CPU's voltage regulator might fail.
All of these innovations make the Haswell’s DDR3 SDRAM controller highly efficient and well suited to overclocker-friendly memory modules, which gives enthusiasts a lot of flexibility in choosing system memory for higher performance on LGA1150 platforms.