by Ilya Gavrichenkov
01/07/2012 | 04:50 AM
The arrival of the LGA 2011 platform put computer enthusiasts in a pretty difficult situation. On the one hand, this platform is the best-performing solution Intel offers desktop users today. On the other hand, as we have already seen in our practical tests, it has a number of drawbacks, and one can argue whether LGA 2011 performance makes up for them. Another factor that may encourage you to hold off on transitioning to LGA 2011 is the “isolation” of this platform, which requires not only its own processors and mainboards, but also special cooling systems and quad-channel memory. Of course, in the end LGA 2011 systems look very impressive, but are they worth the substantial financial investment? Objective arguments won’t really help you answer this question definitively, because the appeal of the new platform right now is more of an emotional matter. And if the desire to have the best-of-the-best computer system is stronger than the voice of reason telling you that LGA 2011 systems are far from ideal in terms of price-to-performance, inefficient in terms of power consumption and simply not that practical, then all we have to do is help you find the best LGA 2011 mainboard, the most efficient cooler and an optimal memory kit. We are going to post all these recommendations in the corresponding sections of our web-site very shortly, and today we are going to focus on choosing the optimal memory for the new Sandy Bridge-E processors.
And this is not a trivial task after all. LGA 2011 is the first desktop platform to support a quad-channel memory architecture. And while the LGA 1155 and LGA 1366 platforms didn’t seem to depend seriously on memory frequency and timings, things can now change dramatically. It must have been for a good reason that Intel enabled its new platform to officially support higher memory frequencies and even allowed memory overclocking. On the other hand, the memory controller in Sandy Bridge-E processors is also a very interesting device. For example, at first we saw that four memory channels provided almost no performance gain over the dual-channel mode and therefore concluded that it would be possible to reuse old two- and three-module DDR3 SDRAM kits from older systems in new LGA 2011 platforms. However, during our continuous communication with Intel we uncovered a few peculiarities of the Sandy Bridge-E memory sub-system, which allow us to explain why the quad-channel memory controller didn’t perform that well in the practical bandwidth tests after all. But let’s not get ahead of ourselves; first, let’s talk a little about the advantages and drawbacks of the memory sub-system structure in LGA 2011 systems.
This time we tested the performance of different memory sub-system configurations in LGA 2011 and LGA 1155 systems. We used the following two testbeds:
LGA 2011 platform:
LGA 1155 platform:
The following hardware and software components were identical on both platforms:
Sandy Bridge-E LGA 2011 processors are blood relatives of the conventional Sandy Bridge CPUs that we are very well familiar with. Nevertheless, there are certain differences between them that originate from the fact that Sandy Bridge-E processors were initially designed for servers and high-performance workstations and were then adapted for high-end desktop systems, too. This is exactly why Sandy Bridge-E processors do not have an integrated graphics core, but their semiconductor die may have up to 8 computational cores, a QPI bus controller, a PCI Express bus controller supporting up to 40 lanes, and today’s hero – a quad-channel memory controller. Those LGA 2011 Core i7 processors that may be of interest to desktop users also belong to the Sandy Bridge-E class, but have been slightly simplified and modified from their original versions. They have only 6 cores, a disabled QPI bus controller, and a memory controller modified to meet the needs of desktop users.
For the memory controller in Sandy Bridge-E this means that it was originally designed to support Registered and Load-Reduced memory modules, allowing up to three DIMMs per memory channel. This way Intel engineers were trying to please those server makers who need to install large amounts of memory without resorting to solutions like FB-DIMM, which never really took off. However, configurations like that would hardly be acceptable for computer enthusiasts, because Registered memory modules are more expensive than regular ones and their performance is lower. Besides, the ability to support up to 12 DDR3 DIMM modules can hardly be considered an advantage for a desktop platform. There are a lot of modules with up to 8 GB capacity available on the market today, so even four DIMM slots are usually more than enough for a desktop mainboard.
Therefore, Sandy Bridge-E Core i7 processors use a memory controller that has been specifically modified to meet the needs of computer enthusiasts. It kept a few server features, but most importantly became compatible with conventional unbuffered DDR3 DIMMs. At the same time, it retained its most significant distinguishing feature – the quad-channel architecture. However, now each memory channel officially supports only one dual-rank module. Note that even though many mainboard makers equip their products with eight DIMM slots, Intel doesn’t officially guarantee fault-free operation with more than one memory module per channel. Theoretically, this may produce certain stability problems in systems using eight double-sided DIMMs at the same time.
According to the official specification, Core i7 processors for LGA 2011 systems support DDR3-1066/1333/1600 SDRAM and can work with even faster modules during overclocking. As we can see, Intel raised the official frequency of the compatible memory modules, which, combined with the increase in the number of supported memory channels, generated a gigantic gain in theoretical bandwidth. However, it will be fairly difficult to see this impressive memory sub-system performance boost in real life. Increasing memory performance wasn’t the primary reason for implementing quad-channel memory access. Contemporary Sandy Bridge-E processors have a very large cache, which smooths out the effect of the memory sub-system on overall system speed. Besides, there are very few actual applications that require streaming access to large amounts of data and could therefore benefit from increased memory bandwidth. Therefore, the primary goal of the quad-channel memory structure is to support more DIMM slots, which enables server and high-performance workstation makers to equip their systems with enormous amounts of memory, especially since the price of contemporary DDR3 SDRAM has dropped dramatically. Optimization of the quad-channel memory controller’s performance has become a secondary task in this case and was addressed with a focus on multi-threaded server loads.
This is very sad news for desktop users. Since most desktop applications do not use multi-threaded memory access, you may not even notice any actual performance improvement when switching from the dual-channel memory controller in Sandy Bridge to the quad-channel memory controller in Sandy Bridge-E. This is what we already pointed out during our first review of the new LGA 2011 platform. We measured the practical bandwidth of the quad-channel DDR3 SDRAM in LGA 2011 systems using common memory sub-system test utilities and hardly saw any difference from the results demonstrated by the dual-channel memory in LGA 1155 platforms. To illustrate this, allow me to provide the results of our practical bandwidth and latency tests performed on the Sandy Bridge-E memory sub-system working in dual-, triple- and quad-channel mode, and compare them with the results obtained on a Sandy Bridge based system with dual-channel memory inside. The CPUs in both platforms were working at 4.4 GHz and the memory was configured as DDR3-1867 with 9-11-9-27-1T timings. We used the AIDA64 Cache & Memory Benchmark utility.
The results on the diagram seem like a paradox. Not only does the dual-channel memory of the LGA 1155 system keep up with the quad-channel configuration in LGA 2011, the Sandy Bridge-E memory sub-system actually falls slightly behind it in this single-threaded test.
We can also show you the results of a memory test from SiSoftware Sandra 2011 suite, which uses a different algorithm generating multi-threaded requests.
The situation here is completely the opposite. Using quad-channel memory access almost doubles the practical bandwidth compared with the dual-channel mode. And here we see very clearly that Sandy Bridge-E is superior to the regular Sandy Bridge.
We will get a much more interesting and informative picture if we resort to a canonical multi-platform benchmark called Stream, which allows measuring actual memory bandwidth with an arbitrary number of parallel request threads.
As you can see from the obtained results, the old dual-channel configuration keeps up with the quad-channel one as long as memory is accessed in a single thread; the additional channels only pay off once several threads read and write memory in parallel.
There is hardly anything that could better reveal the peculiarities of the memory controller in LGA 2011 processors. Its original optimization for servers and high-performance workstations obviously determines in which cases its strengths show their best. The theoretical bandwidth of the quad-channel memory does look very impressive, but the only way to take advantage of it in a desktop LGA 2011 system is if the data exchange is performed in multiple parallel threads. As for standard single-threaded algorithms, the memory sub-system in Sandy Bridge-E processors will be even slower than in LGA 1155 platforms.
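For reference, the kernels Stream runs are only a few lines each. Below is a minimal, purely illustrative sketch of the “triad” kernel in Python with NumPy; the real benchmark is a multi-threaded C program, so this snippet only shows what is being measured, not an accurate way to measure it:

```python
import time

import numpy as np

N = 10_000_000   # array length; three 8-byte arrays -> ~240 MB of traffic per pass
scalar = 3.0
a = np.zeros(N)
b = np.random.rand(N)
c = np.random.rand(N)

t0 = time.perf_counter()
a[:] = b + scalar * c          # the Stream "triad" kernel: 2 reads + 1 write per element
elapsed = time.perf_counter() - t0

# Three arrays of 8-byte doubles move through memory once per pass
bandwidth_gbs = 3 * N * 8 / elapsed / 1e9
```

The other Stream kernels (copy, scale, add) differ only in the arithmetic performed per element; the bandwidth figure always counts the bytes read and written per pass (two arrays for copy, three for triad).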
As we have just seen, installing memory modules into all four channels of an LGA 2011 system may generate no performance improvement compared with the dual- or triple-channel memory configurations. The memory load should be multi-threaded, and only then will the multiple channels of the Sandy Bridge-E memory controller translate into performance growth. It is all pretty straight-forward in theory, but what does it look like in real applications? What types of requests are generated by typical desktop tasks? In order to answer this question we once again tested the LGA 2011 system with DDR3-1867 SDRAM working in dual-, triple- and quad-channel modes in popular general-purpose benchmarks. To get a good idea of the relative performance drop upon reduction in the number of channels we also added the results for the same LGA 2011 platform equipped with slower quad-channel DDR3-1600 SDRAM working with the same CL9 timings.
However, I think we should start with the Stream results: this test shows really well what a given memory configuration is capable of in ideal testing conditions.
As you can see from the obtained results, losing one memory channel under multi-threaded load hurts about twice as much as reducing the memory frequency by 266 MHz. The same can be concluded from the theoretical bandwidth figures: quad-channel DDR3-1867 SDRAM provides 59.7 GB/s of bandwidth, while triple-channel memory offers 44.8 GB/s, and quad-channel DDR3-1600 SDRAM can ensure up to 51.2 GB/s. However, it is important to remember that DDR3-1867 also has an advantage in latency, independent of the number of utilized memory channels.
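These three figures follow directly from the standard peak-bandwidth formula: transfer rate in MT/s × 8 bytes per 64-bit transfer × number of channels. A quick sketch:

```python
def peak_gbs(transfer_rate_mts: int, channels: int) -> float:
    """Theoretical peak DDR3 bandwidth in GB/s: 8 bytes per transfer
    over each 64-bit channel; channels add up linearly."""
    return transfer_rate_mts * 8 * channels / 1000

quad_ddr3_1867 = peak_gbs(1867, 4)    # ~59.7 GB/s
triple_ddr3_1867 = peak_gbs(1867, 3)  # ~44.8 GB/s
quad_ddr3_1600 = peak_gbs(1600, 4)    # 51.2 GB/s
```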
Now let’s check out the results of complex benchmarks:
The most important conclusion from the graphs above is that reducing the number of memory channels in LGA 2011 platforms doesn’t lead to any catastrophic consequences. Yes, the performance drops, but only by 1-2%, which you will barely feel in real applications. Moreover, strange as it might seem, a reduction in memory frequency almost always affects performance more seriously than the use of fewer memory channels. In other words, if you already have a high-speed dual- or triple-channel DDR3 SDRAM kit, it only makes sense to replace it with a quad-channel one for your LGA 2011 system if the memory frequency doesn’t get any lower than it used to be. Otherwise, high-speed DDR3 SDRAM will deliver higher performance in most applications even though it will be using fewer channels.
In fact, the only benchmark where we saw the true advantages of the quad-channel access was the synthetic eight-thread Stream. And it indicates clearly that real desktop applications do not use enough parallel threads when working with the memory sub-system to allow Sandy Bridge-E memory controller optimized for server workloads to show its hidden potential.
Regular Sandy Bridge processors officially support DDR3-1066 and DDR3-1333 SDRAM. Sandy Bridge-E processors also acquired official support for DDR3-1600 SDRAM. However, the processor memory controller in both LGA 1155 and LGA 2011 offers many more dividers, which allow clocking the memory at even higher frequencies. Speaking of Sandy Bridge-E, this processor allows configuring the memory as DDR3-1867, DDR3-2133 or even DDR3-2400 SDRAM. Almost the same was true for LGA 1155 systems, but the new LGA 2011 platforms boast yet another feature – they allow changing the base clock generator frequency, too.
Unlike LGA 1155 systems, the new LGA 2011 platform uses an enhanced clocking algorithm for the CPU and all the units connected to it, which allows the user to set the BCLK frequency not only to the standard 100 MHz, but also to 125 or even 166 MHz. The frequencies of the system busses and controllers do not change in this case, but the CPU and all the units in it get proportionally overclocked. The same applies to the memory controller: by setting a higher base clock generator frequency, you get a wider range of supported memory frequencies to choose from.
This way the LGA 2011 platform offers more flexible memory overclocking than the LGA 1155 platform. For example, if you use the additional supported BCLK frequencies, you will be able to get memory frequencies with an increment below 266 MHz. Moreover, you can set even higher frequencies than before. And the use of high-speed DDR3 SDRAM doesn’t pose any particular problems. If your system is equipped with four DDR3 SDRAM modules, operational modes up to DDR3-1867 will be available to you without any additional effort. With eight modules in the system this maximum may be lowered to DDR3-1600.
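The resulting frequency grid is easy to visualize. In the sketch below the effective memory speed is simply BCLK × memory ratio; the exact ratio list is our assumption for illustration, chosen to reproduce the familiar 266 MHz steps (DDR3-800 through DDR3-2400) at the stock 100 MHz base clock:

```python
# Hypothetical illustration: effective DDR3 speed = BCLK x memory ratio.
# The ratio list is an assumption that yields the usual 266 MHz steps
# (DDR3-800 ... DDR3-2400) at the stock 100 MHz base clock.
BCLKS = (100, 125, 166)
RATIOS = (8, 32 / 3, 40 / 3, 16, 56 / 3, 64 / 3, 24)

# Every reachable (rounded) DDR3 speed across the three base clocks
grid = sorted({round(bclk * r) for bclk in BCLKS for r in RATIOS})
```

With a 125 MHz base clock the same ratios land on intermediate speeds such as DDR3-1667 or DDR3-2000, which is exactly where the sub-266 MHz increments come from.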
However, even if you get to the point when you need to make some “effort”, it doesn’t imply anything extraordinary. To ensure system stability at higher memory frequencies, you may need to increase the VTT voltage of the memory controller or the VCCSA voltage of the processor system agent, or even both. Makers of overclocker memory recommend pretty high settings for these voltages, and sometimes they even record them in the modules’ XMP profiles. They really do help a lot during overclocking. However, during our conversations with Intel we managed to find out that only increases up to 1.1 V for VCCSA and up to 1.2 V for VTT may be considered “safe” for everyday use in 24/7 mode. Moreover, you should also keep in mind the recommendation not to exceed 1.65 V for the memory voltage; otherwise, it may stimulate degradation of the memory controller and the untimely death of your processor. However, you can almost always reach DDR3-2133 even without resorting to the “risky” settings.
In this respect I can’t help mentioning that the manufacturers of overclocker memory were very excited about the new LGA 2011 platform. Many companies started offering not only quad-channel DDR3-2133, but even DDR3-2400 memory. However, purchasing high-speed modules like that only makes sense for benchmarking. For example, with DDR3-2400 you will most likely have to use very unsafe voltage settings. Moreover, I am sure that not all mainboards can handle memory working at such a high frequency. However, there shouldn’t be any problems with DDR3-2133, and this memory will be a perfect fit for contemporary high-performance desktop systems.
We didn’t have to think long about which memory to choose for our LGA 2011 performance tests with different frequency and timing settings. G.Skill was one of the first companies to release high-frequency quad-channel DDR3 SDRAM kits for Sandy Bridge-E, so when the company representatives offered us a test-drive of their new product, we gladly agreed. We received a quad-channel DDR3-2400 SDRAM kit with default timings of 10-11-10-31 and a total capacity of 16 GB. As you may have already guessed, this kit consists of four 4 GB memory modules.
The complete official specifications of the kit with F3-192000CL10Q-16GBZHD part number look as follows:
The modules are covered with unique aluminum Ripjaws heat-spreaders in stylish black. The kit also includes two DIMM frames with fans that should cool the modules during work. A complex cooling system like that was most likely included for marketing reasons rather than out of practical necessity. In reality there is no need for such powerful airflow directed at the memory modules, because they don’t get too warm. However, we would like to give G.Skill special credit for their unique heat-spreader configuration: unlike those of many other memory makers, G.Skill’s shaped aluminum plates are pretty short, so they don’t interfere with the large processor air-coolers that are so popular these days. The total height of the G.Skill memory modules is only 41 mm:
To ensure a simple installation procedure, G.Skill RipjawsZ F3-192000CL10Q-16GBZHD modules support XMP 1.3 technology. The only preset XMP profile contains all the nominal timings and information about the need to increase the VCCSA voltage to 1.2 V. I have to say that to ensure that these modules work stably at 2400 MHz, you will also need to increase the VTT voltage, but unfortunately, it is impossible to record this setting into the XMP profile for automatic adjustment. Therefore, after installing this memory into your system you will need to additionally adjust the VTT voltage before you can start using it in this mode.
Most overclocker RipjawsZ modules, including the ones we got for review this time, are built with specifically selected Hynix H5TQ2G83BFR-H9C memory chips. At this time it is a pretty exotic solution, but it looks like these chips will become more popular, because as we can see, they are capable of working at very high frequencies, proving just as good as the well-known Powerchip and Elpida Hyper chips.
G.Skill can currently confirm that their memory works as DDR3-2400 on four popular mainboards: ASUS Rampage IV Extreme, ASUS P9X79 DELUXE, ASRock X79 Extreme 4 and MSI X79A-GD45. It turns out this list is not just a formality: on other mainboards problems are more than likely to occur. For example, we couldn’t get these modules to work stably at the desired frequency on our ASUS P9X79 Pro mainboard: they refused to work at anything above DDR3-2133. However, there were absolutely no problems with the ASUS Rampage IV Formula, and our G.Skill RipjawsZ F3-192000CL10Q-16GBZHD kit immediately took off as DDR3-2400 without any special effort.
In the meantime we have reached the most exciting part of our today’s discussion: finding out what matters more for the memory sub-system in the new LGA 2011 platform – low timings or high frequency. This question inevitably comes up with the launch of every new platform, and this time is no exception. All previous Intel platforms with DDR3 SDRAM benefited more from an increase in memory frequency than from lower timings, but things could be different with Sandy Bridge-E. This CPU uses quad-channel memory, which by itself provides superior bandwidth, so an additional increase in the frequency of the DDR3 SDRAM modules may turn out to be unnecessary. Therefore, we were particularly excited about this part of our test session.
We started with synthetic benchmarks.
A higher DDR3 SDRAM operating frequency increases its practical bandwidth and lowers practical latencies. Lower timings produce a similar effect, but according to the AIDA64 Cache & Memory Benchmark, this effect is significantly smaller than that of raising the frequency.
However, as we have already found out earlier, AIDA benchmark doesn’t measure the characteristics of Sandy Bridge-E memory controller correctly. Therefore, we also ran the tests in Stream benchmark in two modes – with one and eight threads:
This new test only confirms everything that we have just said. Practical memory bandwidth continues to grow as its frequency increases, and the gain in this case exceeds the effect from lowering the memory timings. So, it looks like we won’t be able to uncover any new connections between the performance and the parameters of the memory sub-system in the LGA 2011 platform.
And here are the results of a few benchmarks measuring complex system performance:
At the end of the day, the frequency of DDR3 modules is a more important parameter for system performance than its latencies. Now we have every reason to state this with all certainty. Timings lowered by a factor of 1.5 (as we can see from the DDR3-1333 example) only improved the performance by 1.6%, while a single 266 MHz increase in memory frequency delivered a 1.9% average performance boost. However, both of these numbers are rather modest, which indicates that memory sub-system settings have a very small overall effect on system performance. Even a seemingly significant upgrade from quad-channel DDR3-1333 to quad-channel DDR3-2133 improves the performance by only 4% on average. Of course, there are applications that could benefit much more from faster memory, but even the memory-sensitive WinRAR archiving tool speeds up by only 19% after an 80% increase in memory frequency.
The study of the memory controller in the new Sandy Bridge-E processors produced unexpected and at the same time very interesting results. It turned out that this time Intel used a totally different approach to optimizing the memory sub-system. The main idea of this approach was to optimize the quad-channel DDR3 SDRAM controller not for single-threaded but for multi-threaded load, which is more typical of servers and high-performance workstations. Therefore, the Sandy Bridge-E memory controller doesn’t look too good in traditional desktop benchmarks. However, the new system simply blows you away with its unprecedented practical bandwidth in special tests such as Stream.
Unfortunately, this is bad news for desktop users. Most typical desktop applications do not address memory in multiple parallel threads. As a result we get quite a paradox: in reality, quad-channel memory access provides minimal or no benefits. Even though it may seem unbelievable, you will get practically the same performance if you use a dual- or triple-channel DDR3 SDRAM kit in your LGA 2011 system instead of a special quad-channel one.
This may be a pretty useful piece of information for those users who do not feel like upgrading their memory. They will experience truly minimal performance loss if they decide to give up quad-channel memory access.
The most important parameter for the memory sub-system in the new LGA 2011 platforms is certainly DDR3 SDRAM frequency. This parameter has a greater effect on overall system performance than the number of channels, and a much greater effect than the latencies. Therefore, you should pay special attention to this particular parameter while shopping around for memory for your LGA 2011 system. In fact, this is no news at this point: the same priority was in place for other platforms with DDR3 SDRAM, too. Overclocker memory makers are well aware of this, which is why they have given up the hunt for lower timings and are focusing mostly on hitting higher frequencies.
LGA 2011 platform favors memory overclocking and allows achieving stability at pretty high DDR3 speeds. Now that the prices on DIMM modules are crashing down, it is a good chance for the manufacturers to make some money off their more expensive elite DDR3 kits. However, don’t be misled by aggressive marketing. The importance of high memory frequency, which we stated earlier, is a relative factor. In fact, the Sandy Bridge-E platform performance only depend so much on the memory frequency and even if you put in the fastest memory out there, you gain may not exceed a few percent. So, we suggest investing into expensive high-speed memory in the end, when all other system components have been upgraded to the ideal choices.