 


Memory Controller Features and Peculiarities

Sandy Bridge-E LGA 2011 processors are close relatives of the conventional Sandy Bridge CPUs that we know very well. Nevertheless, there are certain differences between them, which stem from the fact that Sandy Bridge-E processors were initially designed for servers and high-performance workstations and only then adapted for high-end desktop systems. This is exactly why Sandy Bridge-E processors do not have an integrated graphics core, but their semiconductor die may have up to 8 computational cores, a QPI bus controller, a PCI Express bus controller supporting up to 40 lanes, and today’s hero: a quad-channel memory controller. The LGA 2011 Core i7 processors that may be of interest to desktop users also belong to the Sandy Bridge-E class, but they have been slightly simplified and modified compared with the original versions. They have only 6 cores, a disabled QPI bus controller, and a memory controller modified to meet the needs of desktop users.

For the Sandy Bridge-E memory controller this means that it was originally designed to support Registered and Load-Reduced memory modules, which allow installing up to three DIMMs per memory channel. This way Intel engineers were trying to please server makers who need to install large amounts of memory without resorting to solutions like FB-DIMM, which never really took off. However, configurations like that would hardly be acceptable for computer enthusiasts, because Registered memory modules are more expensive than regular ones and their performance is lower. Besides, the ability to support up to 12 DDR3 DIMMs (three modules in each of the four channels) can hardly be considered an advantage for a desktop platform. Modules with up to 8 GB capacity are widely available in today’s market, so even four DIMM slots are usually more than enough for a desktop mainboard.

Therefore, Sandy Bridge-E Core i7 processors use a memory controller that has been specifically modified to meet the needs of computer enthusiasts. It kept a few server features, but most importantly became compatible with conventional unbuffered DDR3 DIMMs. At the same time, it retained its most significant distinguishing feature: the quad-channel architecture. However, each memory channel now officially supports only one dual-rank (double-sided) module. Note that even though many mainboard makers equip their products with eight DIMM slots, Intel doesn’t officially guarantee fault-free operation of more than one memory module per channel. In theory, this may cause stability problems in systems using eight double-sided DIMMs at the same time.

According to the official specification, Core i7 processors for LGA 2011 systems support DDR3-1067/1333/1600 SDRAM and can work with even faster modules during overclocking. As we can see, Intel raised the official frequency of the compatible memory modules, which, combined with the increase in the number of supported memory channels, produces a gigantic gain in theoretical bandwidth. However, it will be fairly difficult to see this impressive memory sub-system performance boost in real life. Higher memory performance wasn’t the primary reason for implementing quad-channel memory access. Contemporary Sandy Bridge-E processors have a very large cache, which smooths out the effect of memory sub-system performance on the overall system speed. Besides, there are very few actual applications that require streaming access to large amounts of data and could therefore benefit from increased memory bandwidth. So the primary goal of the quad-channel memory structure is to support more DIMM slots, which enables server and high-performance workstation makers to equip their systems with enormous amounts of memory, especially since the price of contemporary DDR3 SDRAM has dropped dramatically. Optimizing the performance of the quad-channel memory controller was a secondary task in this case and was addressed with a focus on multi-threaded server loads.
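The size of that theoretical gain is easy to check with a little arithmetic. The sketch below assumes the standard 64-bit (8-byte) data path of a DDR3 channel and treats the number in the DDR3 speed grade as the transfer rate in MT/s. The function name is made up for illustration, and the results are peak values that no real system sustains.

```python
# Theoretical peak DDR3 bandwidth: transfers per second x bytes per transfer x channels.
# Assumption: each DDR3 channel is 64 bits (8 bytes) wide, as on standard DIMMs.
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(mt_per_s, channels):
    """Peak bandwidth in GB/s (1 GB = 10**9 bytes) for a DDR3-<mt_per_s> configuration."""
    return mt_per_s * 1_000_000 * BYTES_PER_TRANSFER * channels / 1e9


# Official speed grades listed in the text, in dual- and quad-channel mode.
for rate in (1067, 1333, 1600):
    dual = peak_bandwidth_gbs(rate, 2)
    quad = peak_bandwidth_gbs(rate, 4)
    print(f"DDR3-{rate}: dual-channel {dual:.1f} GB/s, quad-channel {quad:.1f} GB/s")
```

Under these assumptions, quad-channel DDR3-1600 peaks at 51.2 GB/s, while dual-channel DDR3-1333 peaks at about 21.3 GB/s.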

This is very sad news for desktop users. Since most desktop applications do not use multi-threaded memory access, you may not notice any actual performance improvement when switching from the dual-channel memory controller in Sandy Bridge to the quad-channel memory controller in Sandy Bridge-E. This is what we already pointed out in our first review of the new LGA 2011 platform. We measured the practical bandwidth of quad-channel DDR3 SDRAM in LGA 2011 systems using common memory sub-system test utilities and saw hardly any difference from the results demonstrated by dual-channel memory in LGA 1155 platforms. To illustrate what I have just said, allow me to provide the results of our practical bandwidth and latency tests performed on the Sandy Bridge-E memory sub-system working in dual-, triple- and quad-channel mode and compare them with the results obtained from a Sandy Bridge based system with dual-channel memory. The CPUs in both platforms were working at 4.4 GHz and the memory was configured as DDR3-1867 with 9-11-9-27-1T timings. We used the AIDA64 Cache & Memory Benchmark utility.
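To put the test configuration into absolute terms, the timings can be converted from clock cycles to nanoseconds. This is a minimal sketch that assumes the usual CL-tRCD-tRP-tRAS order for "9-11-9-27" and takes DDR3-1867 to mean roughly 1867 MT/s, which is half that figure in MHz of memory clock. The helper name is hypothetical.

```python
# Convert DDR3 timings from memory clock cycles to nanoseconds.
# Assumptions: "9-11-9-27" is in the usual CL-tRCD-tRP-tRAS order, and
# "DDR3-1867" means about 1867 MT/s, so the memory clock is half of that in MHz.


def cycles_to_ns(cycles, mt_per_s):
    """Duration of `cycles` memory clock cycles in ns at the given DDR data rate."""
    clock_mhz = mt_per_s / 2          # DDR moves two transfers per clock cycle
    return cycles * 1000 / clock_mhz  # one cycle lasts 1000 / MHz nanoseconds


timings = {"CL": 9, "tRCD": 11, "tRP": 9, "tRAS": 27}
for name, cycles in timings.items():
    print(f"{name:>4} = {cycles:2d} cycles = {cycles_to_ns(cycles, 1867):.2f} ns")
```

CAS latency alone comes out just under 10 ns here. A latency benchmark measures the whole path through the caches and the memory controller, so this figure is only one component of the number it reports.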

The results on the diagram seem like a paradox. Not only does the dual-channel Sandy Bridge memory controller show higher practical bandwidth and lower actual latency, but the Sandy Bridge-E platform doesn’t gain any memory performance as the number of memory channels increases. However, there is no mistake here and the selected benchmark works perfectly fine. The thing is that this benchmark uses a single-threaded algorithm, which doesn’t let the advantages of the Sandy Bridge-E memory controller come through. With simple single-threaded memory requests, a quad-channel memory controller will not be any better than a dual-channel one.

We can also show you the results of a memory test from the SiSoftware Sandra 2011 suite, which uses a different algorithm that generates multi-threaded requests.

The situation here is completely the opposite. Quad-channel memory access almost doubles the practical bandwidth compared with dual-channel mode. And here we very clearly see that Sandy Bridge-E is superior to the regular Sandy Bridge in practical memory bandwidth. However, when it comes to practical latency, we again see no advantage whatsoever, which is, in fact, quite logical.

We will get a much more interesting and informative picture if we resort to a canonical multi-platform benchmark called Stream, which measures actual memory bandwidth with an arbitrary number of parallel requests.

As you can see from the results, the old dual-channel Sandy Bridge controller provides maximum bandwidth even on single-threaded memory requests, and a further increase in the number of threads working with the memory sub-system doesn’t affect the practical bandwidth in any way. Things are totally different when it comes to Sandy Bridge-E. The memory controller in this processor becomes much more efficient as the workload increases. Even if it works in dual-channel mode, we notice a significant performance improvement when we switch from single- and dual-threaded requests to quad-threaded ones. If all four memory channels are utilized, the performance may triple as the level of load parallelism increases. As a result, the performance of the memory sub-system in an LGA 2011 system may be almost twice as high as that in a system with an LGA 1155 processor.

There is hardly anything that could better reveal the peculiarities of the memory controller in LGA 2011 processors. Its original optimization for servers and high-performance workstations obviously determines where its strengths show best. The theoretical bandwidth of quad-channel memory does look very impressive, but the only way to take advantage of it in a desktop LGA 2011 system is to perform the data exchange in multiple parallel threads. With standard single-threaded algorithms, the memory sub-system in Sandy Bridge-E processors will be even slower than in LGA 1155 platforms.
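The dependence on parallelism can be pictured with a toy model in which the achieved bandwidth is the smaller of two limits: what the active threads can request and what the populated channels can deliver. This is only a sketch of the reasoning. The function and both constants are hypothetical placeholders, not measurements. The model does not try to reproduce the test results, nor the fact that Sandy Bridge reaches its maximum with a single thread.

```python
# Toy model: achieved bandwidth is capped both by what the active threads can
# request and by what the populated memory channels can deliver.
# All numbers below are hypothetical placeholders, not measurements.


def achieved_bandwidth(threads, channels, per_thread_gbs, per_channel_gbs):
    """Return the smaller of the request-side and the channel-side limit, in GB/s."""
    request_limit = threads * per_thread_gbs
    channel_limit = channels * per_channel_gbs
    return min(request_limit, channel_limit)


PER_CHANNEL = 12.0  # hypothetical sustained GB/s per memory channel
PER_THREAD = 14.0   # hypothetical GB/s that a single thread can request

for channels in (2, 4):
    row = [achieved_bandwidth(t, channels, PER_THREAD, PER_CHANNEL) for t in (1, 2, 4, 8)]
    print(f"{channels} channels, 1/2/4/8 threads:", row)
```

With a single thread the model returns the same value for two and four channels, because the request side is the bottleneck. The extra channels only start to matter once enough threads are issuing requests in parallel.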

 
