It is commonly believed that memory frequency and timings have little influence on performance in contemporary systems. So, it only makes sense to invest in high-speed memory modules if the other system components, such as the CPU, graphics card and hard drive, are already running at the top of their ability. This reasoning didn’t emerge out of thin air: tests show that improving the memory sub-system settings in Phenom II, Core i7 and Core i5 systems provides a speed gain of only about 3-7%, which is a minor improvement.
However, these conclusions were made a while back, so they are primarily true for previous-generation platforms. We haven’t yet discussed how memory sub-system speed influences the overall performance of contemporary LGA1155 systems, and it obviously makes little sense to carry the old results over to the new Sandy Bridge based platform. In our first discussion of this new microarchitecture, we pointed out that the memory controller in Sandy Bridge is implemented very differently from the way it was in the older Westmere and Nehalem processors. Namely, the memory controller is now located in a different functional unit than the L3 cache and uses a new ring bus to connect to the computational cores of the processor. All this could change how much the memory sub-system contributes to overall system performance. So, we decided to set up a special test session and find out which memory would be the best choice for LGA1155 processors.
Closer Look at the Memory Controller
On paper, the memory controller in Sandy Bridge processors has the same characteristics as in their predecessors. According to the manufacturer, it is compatible with up to 32 GB of DDR3-1066/1333 SDRAM. And just as before, you can use one or two unbuffered memory modules per channel. ECC technology is not supported in desktop processors. The single-channel and asymmetrical modes also remain unchanged: you do not have to have an identical number of modules with identical specs, but you can only achieve maximum performance if you have an even number of identical DDR3 SDRAM modules in your system.
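To put the officially supported speeds into perspective, the theoretical peak bandwidth of a dual-channel configuration can be estimated with simple arithmetic. The sketch below is ours, not part of the test suite; it assumes the standard 64-bit data bus per DDR3 channel and uses the nominal transfer rates from the module names, and the function name is purely illustrative.

```python
def peak_bandwidth_gbs(transfer_rate_mts, channels=2, bus_width_bits=64):
    """Theoretical peak bandwidth in GB/s (1 GB = 10**9 bytes).

    transfer_rate_mts: nominal rate in megatransfers per second,
                       e.g. 1333 for DDR3-1333.
    channels:          number of populated memory channels (assumed 2).
    bus_width_bits:    data width of one channel (assumed 64 bits).
    """
    bytes_per_transfer = bus_width_bits // 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer * channels / 1e9


if __name__ == "__main__":
    for rate in (1066, 1333):
        print(f"DDR3-{rate}, dual-channel: {peak_bandwidth_gbs(rate):.1f} GB/s")
```

For DDR3-1333 this gives roughly 21.3 GB/s. This is an upper bound only; the bandwidth that benchmarks actually measure is always lower.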
Although the specifications of the Sandy Bridge memory controller look exactly the same as those of the earlier memory controllers used in LGA1156 processors, its internal structure is totally different. I would like to remind you that the first processors on the Nehalem microarchitecture had their memory controller and L3 cache located within the same Uncore functional unit, which worked at its own frequency and voltage settings and used a crossbar bus to connect to the computational cores. Later, when Westmere processors came out, the memory controller was combined with the processor graphics core and placed on a separate semiconductor die, apart from the computational cores. In Sandy Bridge CPUs the internal processor structure changed drastically again. Intel introduced a ring bus to connect all processor units with one another, and the L3 cache became an individual functional entity. As a result, the memory controller stopped wandering around different parts of the processor, turned into an independent unit and took its place next to the System Agent.
This internal reorganization had an exceptionally positive effect on the memory sub-system performance. The memory controller got closer to the computational cores, so to speak. Now it is not only on the same semiconductor die as the L3 cache and the CPU cores, but also isn’t logically separated from them, because it is connected to the same ring bus that links all the components inside the processor with one another. This connection to the ring bus, which works at the processor frequency, ensures a significant increase in bandwidth between the processor cores, L3 cache and memory controller.
Processors from the Sandy Bridge generation clearly work with memory much faster now. To illustrate this, we used popular synthetic benchmarks, AIDA64 Cachemem and MaxxMem2, to measure the practical bandwidth and latency of the memory sub-system in LGA1156 and LGA1155 platforms. We used Lynnfield CPUs (on Nehalem microarchitecture), Clarkdale CPUs (on Westmere microarchitecture) and Sandy Bridge CPUs, all working at the same frequency of 3.2 GHz and equipped with the same dual-channel DDR3-1333 SDRAM with 7-7-7-21 timings.
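Memory timings are specified in clock cycles, so it helps to convert them into nanoseconds when comparing against measured latency. The following sketch is our own illustration. It assumes the usual order of the four values (CL, tRCD, tRP, tRAS) and relies on the fact that DDR memory makes two transfers per clock, so the memory clock is half the nominal transfer rate.

```python
def timing_ns(cycles, transfer_rate_mts):
    """Convert a timing given in memory clock cycles into nanoseconds.

    DDR memory makes two transfers per clock, so the memory clock (MHz)
    is half the nominal transfer rate (MT/s).
    """
    clock_mhz = transfer_rate_mts / 2
    return cycles * 1000.0 / clock_mhz


# Timings of the test modules; the labels assume the conventional
# CL-tRCD-tRP-tRAS order, which the timing string itself does not spell out.
TIMINGS = {"CL": 7, "tRCD": 7, "tRP": 7, "tRAS": 21}

if __name__ == "__main__":
    for name, cycles in TIMINGS.items():
        print(f"{name:>4}: {cycles:2d} cycles = {timing_ns(cycles, 1333):.1f} ns")
```

With DDR3-1333, a CAS latency of 7 cycles works out to roughly 10.5 ns. The latency reported by benchmarks is considerably higher, because it also includes the time a request spends inside the processor and the memory controller.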
As we see, Sandy Bridge not only fixes the issues of the Westmere microarchitecture, where the memory controller performance was perhaps the biggest disappointment, but goes even beyond that. In terms of measured results, the Sandy Bridge memory controller is even faster than the one in Lynnfield processors, although the latter had been considered the industry’s most efficient memory controller until recently. Its advantage is most significant when it comes to memory bandwidth, which is quite logical, considering that the memory controller is connected directly to the Sandy Bridge ring bus. So, it looks like the memory sub-system speed in LGA1155 systems may actually have a much more serious effect on performance than in previous-generation platforms, because there are now fewer bottlenecks along the path between the computational cores and the actual memory.
Discussion started: 05/17/11 12:16:10 AM
Latest comment: 09/03/15 01:48:54 PM