by Ilya Gavrichenkov
11/03/2008 | 08:20 PM
Yesterday Intel lifted the NDA on reviews of desktop Core i7 processors based on the new Nehalem microarchitecture. Of course, we couldn’t miss this important event and tested the new CPU. The benchmark results add to the information we have already revealed in our review of the new microarchitecture. Nevertheless, we have to stress that this review doesn’t signify the official launch of the new processor. The Core i7 family will be officially released in the middle of this month, when the new solutions appear in retail.
The first processors from the new family will belong to the upper price range. Until the middle of next year they will hardly affect the mass market, where the very successful Core 2 Quad and Core 2 Duo processors will still be available.

Nevertheless, the release of Core i7 processors is a very significant event for the computer market, and not only because Intel have once again raised desktop performance. From now on Intel will be increasing the level of integration of their processors by moving the chipset North Bridge functions into them. The new CPUs that we are going to talk about today have an integrated memory controller and a monolithic quad-core design. However, this is just the beginning. Their upcoming successors will have an integrated graphics core and a PCI Express bus controller.
Together with Core i7 CPUs, Intel launch their new chipset: Intel X58 Express. Even though it has no innovative features of its own, computer enthusiasts hunting for maximum performance will have to upgrade their entire platform. This is the platform we are going to talk about in today’s review.
Since Core i7 belongs to the new processor generation using Nehalem microarchitecture, we should start with listing all the innovations it will boast. Among the key peculiarities of the new CPU we absolutely have to point out the following:

We should also keep in mind the microarchitectural improvements we have already discussed in our special dedicated article. Here we would like to say that there is nothing revolutionary about all these microarchitectural innovations and they mostly result from optimizations of the existing Core microarchitecture for work with SMT technology. All other innovations that come to desktop platforms together with Core i7 processors deal with the platform as a whole.
It is no wonder, then, that Core i7 processors differ from their predecessors not only on the inside, but also on the outside. The new CPUs use the LGA1366 processor socket, which is larger and has more pins than the common LGA775 socket.
The processor has also become bigger. And unlike its predecessors, it is rectangular and not square.
The triple-channel memory controller must be the reason for adding more pins, because before that it used to be in the chipset North Bridge of Intel systems.
There will be three desktop Core i7 models available:

As you can see from the table above, the clock speeds of the new processors are not much different from the clock frequencies of their predecessors from Core 2 Quad family. And it means that the new generation processors will get their performance advantage from architectural solutions and new technologies.
As for the typical heat dissipation, Core i7 has higher TDP than Core 2 Quad processors. However, the top quad-core models on Core microarchitecture that belong to the Extreme Edition family have 136W TDP. It is quite logical that Core i7 doesn’t have any qualitative changes in the heat dissipation aspect: Nehalem microarchitecture didn’t get that far away from Core, and the manufacturing process used for Core i7 production remained completely the same.
Nevertheless, Intel decided that the old cooling solutions may not be used with the new processors and changed the distance between the cooler retention holes. They may have done it to encourage users to go for more efficient cooling solutions. You can tell by the cooler that arrived into our lab together with the test CPU sample. Before, the coolers for Core 2 processors used to feature aluminum heatsinks with copper core. Now half of the heatsink fins are also made of solid copper. The fins have become thinner and there are more of them; besides, the heatsink diameter is considerably larger now. However, to be fair we have to say that the fan on this cooler rotates with much lower speed ensuring comfortable acoustics.
The operating system sees Core i7 processors with enabled SMT technology as 8-core CPUs. Half of these cores are “virtual”, but Windows Vista doesn’t recognize this fact in any way.

The latest versions of diagnostic utilities detect Core i7 processor characteristics just fine.

Here I have to make an important point regarding the bus frequency of 133MHz detected by CPU-Z. The thing is that Intel decided to give up front side bus in its classical form in their new Core i7 processors, just like AMD did. In this case 133MHz frequency is the clock generator frequency used to form all other frequencies. For example, the CPU frequency is derived from this frequency multiplied by clock multiplier. The memory bus frequency is calculated the same way, only a different list of multipliers is used there. QPI interface connecting the CPU with the chipset North Bridge also uses this frequency and its own multiplier.
Just like in the previous CPU models, the processor clock frequency multiplier will be locked. The only exception will be the Core i7-965 Extreme Edition model with an unlocked multiplier.
Core i7 processors will offer several multipliers for the memory bus frequency. For example, the Core i7-965 we tested in our lab offered a choice of 6x, 8x, 10x and 12x, which means that it supports DDR3-800/1067/1333/1600 SDRAM.
The situation with QPI interface frequency is similar. It works at 3.2GHz in Core i7-965 Extreme Edition, while in Core i7-940 and i7-920 its frequency has been lowered to 2.4GHz.
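The way these frequencies are derived from the clock generator can be illustrated with a short sketch. It assumes that the "133MHz" generator runs at exactly 400/3 MHz, and the 24x CPU multiplier is our own inference from the 3.2GHz nominal clock of the Core i7-965 rather than a figure quoted above. The function names are purely illustrative.

```python
from fractions import Fraction

# Assumption: the "133MHz" clock generator is exactly 400/3 MHz.
BASE_CLOCK_MHZ = Fraction(400, 3)

# Memory multipliers offered by the Core i7-965 sample, as listed in the text.
MEMORY_MULTIPLIERS = (6, 8, 10, 12)


def derived_mhz(multiplier: int) -> float:
    """Frequency obtained by applying a multiplier to the clock generator."""
    return float(BASE_CLOCK_MHZ * multiplier)


def ddr3_rating(multiplier: int) -> str:
    """Marketing name of the DDR3 speed grade for a memory multiplier."""
    return f"DDR3-{round(derived_mhz(multiplier))}"


ratings = [ddr3_rating(m) for m in MEMORY_MULTIPLIERS]
print(ratings)          # ['DDR3-800', 'DDR3-1067', 'DDR3-1333', 'DDR3-1600']

# Inferred, not quoted: 3.2GHz / 133.33MHz gives a 24x CPU multiplier.
print(derived_mhz(24))  # 3200.0
```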
Our Core i7-965 Extreme Edition sample is of C0 stepping, as you can see from the screenshot above. It is the final number that will be used for mass production CPUs. Our CPU had 1.2V Vcore, which is quite normal for a processor manufactured with 45nm process.
Since Core i7 processors use absolutely new interface to connect to the North Bridge, they require a completely new chipset, to begin with. Today there is only one single chipset for the new generation processors – the new Intel X58 Express. This chipset belongs to high-performance solutions targeted for computer enthusiasts, which is actually quite logical, because Core i7 processors are also from the upper price segment.

However, despite that, Intel X58 Express is a much simpler chipset than its X-series predecessors X38 and X48. Since the memory controller has been moved to the CPU in the new platforms, the X58 North Bridge should only provide support for the PCI Express 2.0 graphics bus. The chipset retained the traditional dual-chip structure. That is why the chipset North Bridge features a QPI interface controller which connects it to the processor and supports DMI bus which is traditionally used in Intel’s core logic sets to connect two bridges.

I have to point out that Intel worked real hard on the implementation of the PCI Express 2.0 bus in the chipset North Bridge to ensure that graphics subsystems using more than one accelerator will get the best of it. Overall, chipset North Bridge has 36 PCI Express lanes that may be distributed between four graphics card slots. As a result, PCI Express x16 slots on X58 based mainboards can work in 1x16, 2x16 or even 4x8 configurations. Moreover, Core i7 mainboards may support not only ATI Crossfire technology, but also Nvidia SLI. Although SLI technology will only be supported in several X58 based solutions that have successfully passed Nvidia certification. These mainboards will cost more because of the corresponding payoffs to Nvidia for SLI certification.
South Bridge of the X58 Express chipset is the ICH10 that we already know from the Intel P45 core logic. This chip supports 12 USB 2.0 ports, 6 SATA ports with RAID, built-in Gigabit network MAC and High Definition Audio interface. ICH10 also supports 6 additional PCI Express lanes and regular PCI bus.
Overall, Intel X58 Express has no revolutionary technologies and simply supports the new LGA1366 Core i7 processors. However, despite the fact that the chipset North Bridge has become much “lighter” due to the removed memory controller, its heat dissipation didn’t get any lower than that of the LGA775 chipsets. And even though Intel top it with a passive aluminum heatsink on their own mainboard for Core i7 processors, it may get dangerously hot during intensive work. That is why the manufacturers of mainboards for computer enthusiasts will definitely use the opportunity to equip their solutions with sophisticated and impressive-looking cooling systems with heatpipes and fans that may come in very handy during QPI overclocking and when multiple graphics card configurations are employed.
For our tests of the Core i7 processors Intel sent us their own new LGA1366 mainboard, the DX58SO, also known as Smackover. I have to say that we used to prefer other vendors’ mainboards for testing new processors, however things have recently changed. Intel started working harder on their own mainboard designs, and as a result, the consumer qualities of Intel mainboards have significantly improved. They became as fast as the competitors’ solutions and even started to offer extensive overclocking related functionality. Intel DX58SO proved absolutely up to our expectations. It is a totally appropriate enthusiast platform, although it is not completely problem-free, unfortunately.
The first thing that catches your eye when you first look at Intel DX58SO is its slightly unusual layout. The memory slots have been moved above the processor socket. We have seen a similar location of the memory slots only on a few mainboards for AMD processors, but never on Intel solutions. However, Intel is now using a memory controller built into their CPUs, so they can also locate the DIMM slots like that, especially since this placement has certain advantages. It allows better cooling of the DDR3 SDRAM modules because they are now aligned with the typical airflow inside the system case. Moreover, the memory slots above the processor socket are now very close to the CPU, which lowers harmful EMI.
By moving the DIMM slots away from their traditional spot, Intel engineers could put the chipset North Bridge also closer to the CPU. So, QPI interface connections are also not very long.
I have to say that Intel laid out only four memory slots out of six possible on their new mainboard. So, the first memory channel allows two DDR3 SDRAM modules connected to it, while the remaining two channels can only accommodate one module per channel. Therefore, Intel DX58SO, just like LGA775 mainboards, supports only 8GB of RAM, while the majority of LGA1366 mainboards from other manufacturers will be able to work with up to 12GB of DDR3 SDRAM.
The chipset North Bridge is cooled with a relatively small aluminum heatsink. The board will also come bundled with a fan and a retention frame for it, which we strongly advise to install right away on top of the North Bridge heatsink, because we are somewhat concerned about its thermals with passive cooling.
The chipset South Bridge is also equipped with a small aluminum heatsink that is efficient enough in this case.
The processor voltage regulator module has a six-phase design. It uses solid-state capacitors with polymer electrolyte, which have already become common, and its MOSFETs are topped with aluminum heatsinks. All this indicates that the mainboard developers decided not to pay too much attention to cooling the mainboard components and used the simplest solutions for them.
However, since there is no bulky cooling system, there is enough room around the processor socket to conveniently accommodate the most efficient CPU coolers, which will for the most part be the same as those for LGA775 CPUs. At least, cooling solution makers are announcing not new cooler models, but modified retention kits that allow using their existing cooling solutions on LGA1366 platforms.
I have to say that Smackover layout has a few hard to notice peculiarities. For example, it uses 8-layer PCB instead of a 6-layer one. According to the manufacturer, it ensures more stable processor power supply and improves the layout eliminating harmful EMI effects.
However, not all the developers’ solutions are as pleasing as the above mentioned ones. For example, the electronic components used on this board showed that the developers were a little overly economical. There are quite a few capacitors with liquid electrolyte, which are known to be less reliable than solid-state capacitors used everywhere now.
Speaking of Intel DX58SO features we have to say that it has two fully-functional PCI Express x16 slots supporting 2.0 protocol with twice the bandwidth. The mainboard can work with two graphics cards in ATI Crossfire configuration, but not Nvidia SLI.
The mainboard also has a PCI Express x4 slot implemented via the chipset North Bridge. Thanks to its tricky design you can install the third graphics card into it that will be responsible for physics effects, for instance.
Overall, when Intel designed the DX58SO, they tried to make it simple and affordable. Almost all external interfaces work through the controllers built into the chipset South Bridge. However, there are two additional chips on this board, too. They are a Texas Instruments Firewire controller and a Marvell SATA II controller providing support for eSATA ports. So, on the mainboard rear panel there are 8 USB 2.0 ports, an IEEE1394 port, a Gigabit network port, two eSATA ports and audio jacks: five analogue ones and an optical SPDIF Out.
Other ports on the PCB are laid out as pin-connectors. There can be four more USB 2.0 ports, Firewire port and 6 SATA-300 devices connected to this board. The funny thing is that Intel seems to be encouraging the community to give up legacy interfaces by setting an example. At least, Smackover board has no serial or parallel ports, and more importantly, no FDD connector and no connectors for devices with PATA interface.
However, Smackover developers decided to take care of testers and installed a Power On button and a HDD activity LED.
Since new Core i7 processors change the platform architecture we need to talk a little more about the way these processors get configured in the mainboard BIOS Setup, especially since this CPU may seem not so easy to configure to some users. We are going to discuss the configuration specifics using the BIOS of the above described Intel DX58SO Smackover mainboard as an example.
The main system parameter affecting the frequencies of almost all system components is the frequency of the clock generator called Host Clock Frequency. It can be set in the first screen of the Performance section. By default it is set at 133MHz, however, the board allows increasing it up to 240MHz.
Processor parameters can be configured on Processor Overrides page of the Performance section.
It starts with three options for CPU voltage management. You can set absolute and relative voltages and enable special mode reducing negative Vdroop effect (the drop of the voltage from the processor regulator occurring during current increase).
Then you can adjust the default clock frequency multiplier applied to clock generator frequency to obtain the CPU frequency.
The next big group of parameters is devoted to management of Turbo Boost Technology or Intel Dynamic Speed Technology in the BIOS terms. Since it is implemented via a special PCU microcontroller built into the processor, there is a lot to play with here. You can “edit” the processor TDP and maximum allowed current values used by the PCU, enable or disable automatic Vcore increase and set maximum clock frequency multipliers when the processor works with different number of active cores. However, you will be able to play with the multipliers only if you have an Extreme Edition CPU.
According to the turbo-mode specification, if the CPU utilization at a given moment of time allows increasing its frequency without getting beyond the set heat dissipation and power consumption limits, then Core i7 processors can increase their multiplier over the nominal. One step over the nominal if there are 2, 3 or 4 active cores, and two steps over the nominal if there is only one core active. However, as you can see from the available settings, this technology can do much more than that, as all of its key parameters can in fact be adjusted. Namely, overclockers who have a Core i7 Extreme Edition processor with an unlocked multiplier will be able to adapt Turbo Boost Technology for their needs. You can configure the PCU for very aggressive management of the CPU frequency, when its power consumption even goes beyond 130W.
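The default turbo-mode rule described above can be sketched as follows. This is only an illustration of the specification as we understand it, not Intel's PCU logic: the `within_limits` flag stands in for the heat dissipation and power consumption checks, the 400/3 MHz clock generator value is an assumption, and the 24x nominal multiplier of the Core i7-965 is inferred from its 3.2GHz clock.

```python
def turbo_steps(active_cores: int) -> int:
    """Multiplier steps above nominal allowed by the default turbo rule."""
    if not 1 <= active_cores <= 4:
        raise ValueError("Core i7 has between 1 and 4 active cores")
    # Two steps with a single active core, one step with 2, 3 or 4.
    return 2 if active_cores == 1 else 1


def turbo_frequency_mhz(nominal_multiplier: int, active_cores: int,
                        within_limits: bool = True) -> float:
    """CPU frequency in turbo-mode.

    within_limits is a hypothetical stand-in for the PCU's check that
    power consumption and heat dissipation stay inside the set limits.
    """
    steps = turbo_steps(active_cores) if within_limits else 0
    # Assumption: the clock generator runs at exactly 400/3 MHz.
    return (nominal_multiplier + steps) * 400 / 3


for cores in (1, 2, 4):
    print(cores, round(turbo_frequency_mhz(24, cores)))
# 1 3467
# 2 3333
# 4 3333
```

On an Extreme Edition CPU the per-core-count multipliers can be overridden in the BIOS, so the fixed one-step and two-step values here describe only the default behavior.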
Memory Configuration page is devoted to memory subsystem settings.
First of all you should pay attention to the way the memory frequency is set. It has a multiplier of its own applied to the clock generator frequency to obtain the DDR3 SDRAM frequency. With the supported multipliers ranging from 6 to 12 you can use memory working at 800-1600MHz. There is also a UCLK multiplier setting the operational frequency for the "un-core" interface components, such as built-in memory controller, L3 cache and QPI bus controller. This multiplier has to be at least twice as big as the memory frequency multiplier. Its further increase does improve the performance even more, but has a negative effect on the memory subsystem stability.
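The two constraints mentioned here are easy to check before committing settings in the BIOS. The sketch below is a hypothetical helper of our own, not part of any Intel tool; it simply encodes the 6 to 12 memory multiplier range and the rule that the UCLK multiplier must be at least twice the memory multiplier.

```python
from typing import List


def check_memory_config(memory_multiplier: int, uclk_multiplier: int) -> List[str]:
    """Return a list of problems with the chosen multipliers (empty if none)."""
    problems = []
    # Supported memory multipliers range from 6 to 12 according to the text.
    if not 6 <= memory_multiplier <= 12:
        problems.append("memory multiplier must be between 6 and 12")
    # The UCLK multiplier has to be at least twice the memory multiplier.
    if uclk_multiplier < 2 * memory_multiplier:
        problems.append("UCLK multiplier must be at least 2x the memory multiplier")
    return problems


print(check_memory_config(8, 16))   # []
print(check_memory_config(8, 12))   # one problem: UCLK multiplier too low
```

The check says nothing about stability: a configuration can satisfy both rules and still be unstable if the UCLK multiplier is pushed too high.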
Here you can also adjust the memory voltage. Note that Intel strongly recommends not to push this parameter beyond 1.65V, because it may damage the memory controller built into the CPU. As a result, the systems based on Core i7 processors will have very limited ability to use previous-generation high-speed DDR3 memory. This memory required higher voltage than the 1.5V assigned by the standard in order to work at its nominal speed. Luckily, this problem can be solved in the new DDR3 SDRAM modules that are built with the chips working at high frequencies with their voltage setting close to 1.5V. Most makers of enthusiast memory kits have already released their solutions of the kind.
The most interesting thing on the Bus Overrides page is the QPI bus settings.
BIOS allows changing its frequency and voltage. Both these options may be very useful when you overclock the processor by raising the host generator frequency. Moreover, you may need to increase QPI voltage if you set a high multiplier for processor interface components.
Besides the above mentioned settings and parameters, you may also need to enable or disable SMT technology in the BIOS. It is available on the first BIOS page.
There is one more setting next to it that allows disabling two or three processor cores turning Core i7 into a dual-core or single-core CPU respectively.
You can configure power-saving technologies and Intel Enhanced SpeedStep in the Power section.
All other options that you can find in the Intel DX58SO mainboard BIOS are pretty common so we will not dwell on them in our today’s article.
We are going to compare the performance of the new Core i7 processors against that of the top quad-core CPUs from the previous Core 2 Quad generation. Therefore, we built two test platforms:
LGA1366 platform:
LGA775 platform:
The memory was configured differently because of the different number of supported channels and also because the memory kits we had at our disposal at the time of the tests couldn’t work at high speeds without increasing their voltage over 1.65V, which Intel does not recommend.
Other testbed components were the same in both cases:
First we decided to focus on evident advantages of the new Core i7 processors. One of their major trumps is new three-level cache memory with shared L3 cache and memory controller built into the CPU. I would like to remind you that despite the similarities between CPUs on Nehalem and Core microarchitectures, only L1 cache of Core i7 is similar to the L1 cache in Core 2 Quad. L2 cache of the new processor is organized differently: it has become much smaller, but instead each core has its own individual L2 cache.

Core 2 Extreme QX9770

Core i7-965 Extreme Edition
Note that 6MB L2 cache of the Core 2 processor family has 24-way set associativity. It means that to accelerate the search this cache is split into 256KB areas. Core i7 processor has an entire L2 cache of 256KB, however it has 8-way set associativity. It means that processor on Nehalem microarchitecture should spend considerably less time on L2 cache data search.
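The arithmetic behind these figures is shown below. The cache sizes and associativity come from the text; the 64-byte cache line used to count the sets is our assumption and is not stated in this review.

```python
CACHE_LINE_BYTES = 64  # assumption: line size is not given in the article


def way_size_kb(cache_kb: int, ways: int) -> float:
    """Size of one way (the "area" the cache is split into), in KB."""
    return cache_kb / ways


def set_count(cache_kb: int, ways: int) -> int:
    """Number of sets, each holding `ways` candidate lines for an address."""
    return cache_kb * 1024 // (ways * CACHE_LINE_BYTES)


core2_l2 = {"cache_kb": 6 * 1024, "ways": 24}   # 6MB, 24-way
corei7_l2 = {"cache_kb": 256, "ways": 8}        # 256KB, 8-way

print(way_size_kb(**core2_l2), set_count(**core2_l2))     # 256.0 4096
print(way_size_kb(**corei7_l2), set_count(**corei7_l2))   # 32.0 512
```

In other words, a lookup in the Core 2 L2 cache has 24 candidate lines to compare per set, while the Core i7 L2 cache has only 8.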
To estimate the performance of the entire subsystem including the cache and the memory, we resorted to a synthetic bandwidth and latency test built into Everest 4.60 suite.

Core 2 Extreme QX9770

Core i7-965 Extreme Edition
First of all look at the difference in L1 cache latency. Although Core i7 processors inherited L1 cache from their predecessors, Intel gave it a little higher latency for the sake of power-saving modes support. This is what you can see from our obtained practical results.
However, L2 cache memory of the new processors does work much faster. Its practical latency equals half the latency of L2 cache in CPUs on Core microarchitecture. L2 cache of Core i7 also has higher bandwidth during reading, writing and copying. It is L3 cache of Core i7 processors that works as fast as L2 cache in Core 2 CPUs.
In other words, triple-level cache-memory of the new CPUs should be at least as efficient as that of the predecessors. Its only bottleneck is higher L1 cache latency. However, faster L2 cache should make up for it, as it actually serves as an intermediate buffer between L1 and L3 caches, which work at similar speeds as L1 and L2 caches of the Core 2 Quad processors.
As for the memory performance, Nehalem processors are simply beyond all competition here. The bandwidth of triple-channel DDR3-1067 SDRAM is 45% higher than the memory bandwidth in an LGA775 system working with dual-channel DDR3-1600 SDRAM. And the latency of the memory subsystem in Core i7 platform is about 30% lower.
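For reference, the theoretical peak bandwidth of the memory channels alone can be computed as below. These are not measured values: the sketch only multiplies the channel count by the 64-bit channel width and the transfer rate, and treats DDR3-1067 as exactly 3200/3 MT/s, which is an assumption.

```python
CHANNEL_WIDTH_BYTES = 8  # one DDR3 channel is 64 bits wide


def peak_bandwidth_gb_s(channels: int, transfers_mt_s: float) -> float:
    """Theoretical peak bandwidth of the memory channels in GB/s."""
    return channels * CHANNEL_WIDTH_BYTES * transfers_mt_s / 1000


core_i7_triple = peak_bandwidth_gb_s(3, 3200 / 3)   # triple-channel DDR3-1067
core_i7_dual = peak_bandwidth_gb_s(2, 3200 / 3)     # dual-channel DDR3-1067
lga775_dual = peak_bandwidth_gb_s(2, 1600)          # dual-channel DDR3-1600

print(round(core_i7_triple, 1), round(core_i7_dual, 1), round(lga775_dual, 1))
# 25.6 17.1 25.6
```

On paper the triple-channel DDR3-1067 and dual-channel DDR3-1600 configurations have the same peak, so the measured advantage of Core i7 cannot come from raw channel bandwidth alone. The LGA775 platform presumably cannot deliver the full memory bandwidth to the CPU over its front side bus, while the integrated memory controller has no such bottleneck.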
The Core i7 platform remains an indisputable leader even when we switch the memory controller into dual-channel mode. Although our LGA775 system uses faster memory modules, it still loses in access time and bandwidth tests.

Dual-channel mode of the Core i7-965 Extreme Edition memory controller
By the way, as you can see, when we switched from triple-channel to dual-channel mode in a Core i7 platform, the memory subsystem performance didn’t drop too significantly. And the latency not only didn’t increase, but got even lower. It means that there is nothing wrong with using dual-channel memory in LGA1366 platforms. The processor can employ two memory channels efficiently, too. In some cases you can even expect triple-channel memory to turn out not as fast as dual-channel memory because of higher latency that will not be compensated by insignificant advantage in bandwidth.
In conclusion to our short test session of the Core i7 memory subsystem I have to mention one more parameter that may speed it up. I am talking about the frequency of L3 cache and memory controller that may be adjusted in the mainboard BIOS Setup, as we have already said above. The results we have just discussed were obtained with the processor interface blocks working at twice the frequency of the memory, namely at 2133MHz. If we use a higher un-core multiplier for processor interface blocks, for example 20x, L3 cache and memory controller frequencies will increase to 2667MHz and the benchmark results will be higher as well.
Here are the numbers we got in this case in triple-channel memory mode:

Interface blocks work at 2.66GHz
L3 cache and memory controller frequencies increased by 25%. As a result, we can see about 24% improvement of the memory subsystem bandwidth during writes and a little less significant improvement of only 10% during copying. The latency of L3 cache and the memory also dropped 8-9%. But unfortunately, this highly efficient way of boosting performance has very limited application. The thing is that the increase of the processor interface blocks frequency may often affect system stability. In our case, for example, further increase of this multiplier made the system less reliable.
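The numbers above can be cross-checked with a few lines of arithmetic. The 16x default multiplier is our inference from the 2133MHz figure, the clock generator is assumed to run at exactly 400/3 MHz, and the 24% write bandwidth gain is the rounded value quoted in the text.

```python
def uncore_mhz(multiplier: int) -> float:
    """Frequency of the L3 cache and memory controller for a UCLK multiplier."""
    # Assumption: the clock generator runs at exactly 400/3 MHz.
    return multiplier * 400 / 3


default = uncore_mhz(16)   # inferred default: twice the 8x memory multiplier
raised = uncore_mhz(20)

clock_gain = raised / default - 1
write_scaling = 0.24 / clock_gain   # reported write gain vs. clock gain

print(round(default), round(raised))      # 2133 2667
print(round(clock_gain * 100))            # 25
print(round(write_scaling, 2))            # 0.96
```

Write bandwidth therefore scales almost linearly with the frequency of the interface blocks, while copy bandwidth and latency benefit noticeably less.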
Therefore, all further tests were performed with the L3 cache and memory controller working at 2667MHz.

When we talked about the architectural peculiarities of Core i7 processors we stressed that one of their key features that may affect the overall performance was SMT (Simultaneous Multi-Threading) technology support. Thanks to this technology each processor core may process two computational threads at the same time, thus loading the execution units more effectively.
However, as we remember from our experience with Pentium 4 processors that supported a similar technology called Hyper-Threading, it may also have a negative effect in some cases. It usually happens for two reasons. The first reason why the performance may drop is the naive behavior of the operating system scheduler, which may not distinguish between the physical and virtual CPU cores and may assign a pair of threads to the same physical core despite the fact that other cores are not utilized at the time. The second reason has to do with the fact that some of the internal processor buffers are shared equally between the threads when SMT is enabled. Therefore, the performance of a core processing only one thread may sometimes be lower with enabled SMT.
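The first problem is easy to demonstrate with a toy model. The sketch below is not how any real operating system scheduler works: it assumes a hypothetical enumeration in which the two hardware threads of each physical core are listed next to each other, and compares a placement that ignores the core topology with one that fills distinct physical cores first.

```python
from typing import List, Tuple

LogicalCpu = Tuple[int, int]  # (physical core index, hardware thread index)


def logical_cpus(cores: int = 4, threads_per_core: int = 2) -> List[LogicalCpu]:
    """Hypothetical enumeration: sibling threads of a core are adjacent."""
    return [(c, t) for c in range(cores) for t in range(threads_per_core)]


def naive_placement(n_threads: int, cpus: List[LogicalCpu]) -> List[LogicalCpu]:
    """Take logical CPUs in enumeration order, ignoring the topology."""
    return cpus[:n_threads]


def smt_aware_placement(n_threads: int, cpus: List[LogicalCpu]) -> List[LogicalCpu]:
    """Use the first hardware thread of every core before any second one."""
    ordered = sorted(cpus, key=lambda cpu: (cpu[1], cpu[0]))
    return ordered[:n_threads]


def busy_cores(placement: List[LogicalCpu]) -> int:
    """Number of distinct physical cores used by a placement."""
    return len({core for core, _ in placement})


cpus = logical_cpus()
print(busy_cores(naive_placement(2, cpus)))       # 1: both threads share a core
print(busy_cores(smt_aware_placement(2, cpus)))   # 2: one thread per core
```

With two software threads the naive placement leaves three physical cores idle while the threads compete for the execution units of a single core, which is exactly the case where enabling SMT hurts.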
To evaluate how enabling SMT in Core i7 affects the performance we tested Core i7-965 Extreme Edition in popular applications with SMT technology enabled and disabled. (Turbo Boost was disabled).

We can’t make any definite conclusions about SMT technology here. Quite often it may really have a negative effect on performance. I believe that Intel put the corresponding option of their Intel DX58SO Smackover BIOS Setup onto the very first screen for a good reason…
It is very easy to figure out when you can actually benefit from SMT support. Applications with an easily parallelized workload will work faster. In this case the performance may improve by an impressive 25-35%. However, when applications, such as games for instance, create a limited number of threads, Core i7 will work slower with enabled SMT, although this will be a fairly small performance drop of no more than 4-5%.
SMT technology should increase the processor performance under multi-threaded load. If the running processes do not fully load the CPU, Turbo Boost Technology comes forward. It boosts the performance by raising the CPU clock frequency by 133MHz or 266MHz above the nominal. Of course, this is a pretty small frequency increase, but it is better than nothing anyway. When we investigated the way Turbo Boost operates we found out that this frequency increase is not some rare occasion. In fact, it may take place even under pretty serious multi-threaded workload.
To illustrate these statements we would like to offer you some numbers illustrating the effect from Turbo Boost activation in Core i7-965 Extreme Edition processor. The tests were performed with enabled SMT.

Activation of Turbo Boost technology raises the CPU performance by a maximum of 7%. It is a pretty logical result considering that the CPU clock frequency in turbo-mode may only increase by 8%. Some applications get a minimal, barely noticeable performance boost. These are the applications that create a parallelized multi-threaded load. In other words, SMT and Turbo Boost Technology are a perfect match: together they are efficient under practically any type of CPU workload, and when one of them cannot do much, the other comes to the rescue.
It is interesting that Turbo Boost Technology will be most efficient in systems with the junior Core i7 processors. In systems like that the frequency increase will be more noticeable in relative terms and besides, this mode will get activated much more frequently. The CPU decides if the frequency can be increased based on the current power consumption compared against the TDP set for the entire Core i7 lineup (130W). At the same time it is evident that junior processor models with lower clock speeds will have lower power consumption than their elder brothers and hence will have more headroom for turbo-mode.
It is evident that SMT and Turbo Boost Technology are enough for Core i7 processors to be faster than their quad-core predecessors from Core 2 family. That is why it is especially interesting to see the difference between the performance of the old and new generation processors without these new technologies.
So we compared the results obtained on Core i7-965 Extreme Edition processor from Nehalem generation against those of Core 2 Extreme QX9770 from Penryn generation. Both these CPUs work at 3.2 GHz frequency, therefore the obtained results show how much more progressive the new Nehalem microarchitecture is without SMT and Turbo Boost in the picture.

Overall, Core i7 processor is faster than the previous generation CPU even if it doesn’t have its main trumps – SMT and Turbo Boost Technology support. Frankly speaking, we didn’t expect any other result, because new Intel processors have a very strong memory controller and efficient cache-memory subsystem. It is a different thing that gives us some cause for concern. It turns out there are applications where “sterilized” Core i7 may fall behind the previous generation CPU working at the same clock speed. And it was especially surprising that among these applications there are a lot of games, which are very sensitive to the memory subsystem performance. It must be the differences in cache-memory organization of the new and old CPUs that affect the results like that.
L1 cache of the new processors is slower than that of the Core 2 CPUs, but faster L2 cache cannot make up for this drawback because of its very small size. Moreover, quad-core Core 2 processors can also boast large overall cache-memory capacity.
So, the obtained results let us conclude that Nehalem microarchitecture has no revolutionary innovations that could put these CPUs way ahead of their predecessors.







The results on the diagrams above will hardly surprise you if you read the previous part of our review attentively enough. Yes, Core i7 processors are overall faster than their predecessors. We can even say that on average the previous generation Core 2 Extreme QX9770 processor performs comparably to the middle model of the new series, Core i7-940, while Core 2 Quad Q9650 competes successfully against the junior Core i7-920. However, there are exceptions to this rule, too. In the Communications test that emulates the user’s networking activities, older CPUs do better than the new ones. In the Gaming test the situation is the complete opposite: thanks to the fast memory controller, Core i7 processors are far ahead of their predecessors.


The 3DMark Vantage benchmark measures processor performance by modeling gaming artificial intelligence and environmental physics. Both algorithms used in this benchmark are well optimized for multi-threading, which is why Core i7 processors supporting SMT technology perform brilliantly here.





We have already mentioned several times that Core i7 is not very well suited for gaming. Nevertheless, things are not as bad as you may think. CPU on Nehalem microarchitecture falls behind quad-core processors from Core 2 family only in several games. Overall, Core i7 and Core 2 Quad perform quite comparably.
By the way, I have to say that developers are not rushing to optimize their games for multi-core processor architectures. We are constantly updating our list of gaming benchmarks looking for optimizations like that, however, little has changed. Even the newest 3D shooters, such as Crysis Warhead or Far Cry 2, work best of all on dual-core processors. Therefore, it still doesn’t make sense to put a quad-core CPU into a gaming system.




The new CPUs do great in video encoding applications. SMT technology and the fast memory controller are two major advantages of Core i7 that make these CPUs the leaders in this group of benchmarks.



The new Nehalem processors are far ahead of their predecessors on the Core microarchitecture in Adobe applications, and it doesn’t matter whether the task is image editing or non-linear video editing. Professional users will obviously make up a large share of the new CPUs’ owners after the retail debut.



Everything we have just said is once again confirmed by our final rendering tests in different applications. Core i7 processors again look very attractive next to their predecessors.



Most contemporary resource-hungry applications have long been optimized for multi-threading, which is currently the main way of increasing system performance. Therefore, it is not surprising that CPUs with SMT support won this one, especially since their success is backed by a memory controller with phenomenally high bandwidth and impressively low latency.
Overclocking Core i7 processors is yet another “hot topic”. Two questions emerge here: how much the overclocking potential of Core i7 processors differs from that of the previous-generation CPUs, and how well the new platform architecture lets us take advantage of this potential.
To answer these questions, we performed a series of overclocking experiments with our Core i7-965 Extreme Edition processor. Unfortunately, at the time of the tests we didn’t have an alternative cooling solution available, so we had to stick to Intel’s default cooler. However, we hope it didn’t prevent us from revealing the frequency potential of the new processor properly.
Since the CPU we are going to experiment with belongs to the Extreme Edition series, its multiplier is unlocked. It means that we may resort to the simplest overclocking technique: raising the multiplier.
So, without raising the processor voltage above its nominal 1.2V, we could get it to run stably at 3.6GHz.
The CPU passed 1-hour OCCT Perestroika 2.0.1 and Prime95 25.7 tests at this speed. The core temperatures remained within the acceptable range and never exceeded 78°C.
By the way, overclocking Core i7 by raising its multiplier is a little tricky, at least on the Intel DX58SO mainboard. This mainboard doesn’t allow increasing the multiplier above its nominal setting directly. So, if you need to increase it, you have to use Turbo Boost technology: set high multipliers for the turbo modes and at the same time raise the limits for current and power consumption. In other words, during overclocking we force the CPU to work in turbo mode constantly by setting multipliers far beyond the nominal values.
Just like with the previous generation 45nm processors, increasing Vcore for Core i7 ensured its stability at higher frequencies. For example, by raising the core voltage of our CPU sample to 1.45V we could overclock it to 3.87GHz.
Unfortunately, our CPU lost stability at 4GHz, so we had to settle for the last stable frequency. Hopefully, when we check out new LGA1366 mainboards and high-performance coolers with the corresponding retention mechanism, we will be able to improve on today’s achievement. At 3.87GHz our processor heated up to over 90°C while passing the stability tests, which means that it was the cooling system that wouldn’t let us overclock any further. So, for now we have to admit that the new Core i7 processors overclock a little worse than their predecessors.
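To put these numbers into perspective, here is a minimal Python sketch that expresses both results as a gain over the processor's nominal 3.2GHz. The function name and the labels are ours and serve purely as an illustration.

```python
NOMINAL_GHZ = 3.2  # stock clock of the Core i7-965 Extreme Edition


def headroom(achieved_ghz, nominal_ghz=NOMINAL_GHZ):
    """Fractional frequency gain over the nominal clock."""
    return achieved_ghz / nominal_ghz - 1.0


# Stable frequencies reported in the text (illustrative labels).
results = {
    "nominal voltage (1.2V)": 3.6,
    "raised voltage (1.45V)": 3.87,
}

for label, ghz in results.items():
    print(f"{label}: {ghz}GHz, +{headroom(ghz):.1%} over nominal")
```

In other words, the stock cooler allowed roughly a one-eighth gain at nominal voltage and about one fifth with raised Vcore.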
Now let’s address the second important question: will the owners of more affordable non-Extreme Edition Core i7 processors be able to overclock their systems too, even though they do not have the luxury of an unlocked clock multiplier? To keep the experiment clean, we overclocked our Core i7-965 Extreme Edition with its multiplier lowered to 20x, because this is exactly the multiplier that the junior model in the lineup, Core i7-920, will have.
I am sure many of you will be pleased to hear that overclocking the processor by raising the host clock frequency went smoothly and problem-free. I can even say that this processor is much easier to overclock than the previous-generation quad-core CPUs, if only because Core i7 doesn’t use the FSB and therefore doesn’t require balancing the FSB voltage and the CPU and chipset GTL levels. The most important thing to watch out for when you raise the clock generator frequency above the nominal 133MHz on an LGA1366 platform is to lower, in good time, all the multipliers that set the frequencies of the different buses and internal units of the processor.
For example, when we lowered the multiplier for the memory frequency to 6x, for the integrated memory controller and L3 cache – to 12x, and for QPI bus – to 18x, we easily achieved system stability at 190MHz clock generator frequency.
The CPU frequency with the 20x clock multiplier was 3.8GHz. Unfortunately, we kept losing stability at higher clock generator frequencies and we couldn’t reach the same maximum as in the previous experiment when we used a higher clock multiplier. However, we tend to blame the early mainboard revision for this, and not some platform issues.
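The arithmetic behind this experiment is simple: every clock domain runs at its own multiplier times the clock generator frequency. Below is a minimal Python sketch using the multipliers quoted above. The dictionary keys and the function name are our own labels, and the printed values are raw products only; how they map to effective memory or QPI transfer rates is outside the scope of this sketch.

```python
# Multipliers used in the 190MHz experiment described in the text.
MULTIPLIERS = {
    "cpu": 20,     # core clock multiplier (same as Core i7-920)
    "memory": 6,   # memory frequency multiplier
    "uncore": 12,  # integrated memory controller and L3 cache
    "qpi": 18,     # QPI bus
}


def derived_clocks(bclk_mhz, multipliers=MULTIPLIERS):
    """Return each domain's frequency in MHz as multiplier x base clock."""
    return {name: mult * bclk_mhz for name, mult in multipliers.items()}


overclocked = derived_clocks(190)
for name, mhz in overclocked.items():
    print(f"{name:>6}: {mhz} MHz")
```

The sketch shows why the multipliers have to come down as the clock generator frequency goes up: all four domains scale together, so any multiplier left at a high setting pushes its unit's frequency up just as quickly as the CPU's.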
To complete today’s discussion, we also tested the power consumption of systems (without the monitor) built on CPUs from two different generations: Penryn and Nehalem. For this test we chose the top processors from the Extreme Edition series working at the same frequency of 3.2GHz: Core 2 Extreme QX9770 and Core i7-965 Extreme Edition. All power-saving technologies, including Enhanced Intel SpeedStep and Turbo Boost Technology, were enabled. We used the Prime95 utility to load the CPUs to their maximum:

Power consumption of platforms based on CPUs from different generations is close not only in the official specifications, but also in real tests. The delta is no more than 4% for two CPUs working at the same clock frequencies. Core i7 consumes less power in idle mode due to more aggressive power-efficient modes and its ability to disconnect cores from the power supply line. Core 2 Extreme turned out slightly more economical under workload.
However, do not forget that Core i7 works faster when the load is multi-threaded even though its heat dissipation is similar to that of its predecessor. It means that the new CPU on Nehalem microarchitecture can boast better performance-per-watt.
To prove this statement, we measured the electric energy consumed by both systems while they worked on the same tasks in the PCMark Vantage benchmark, which emulates real work of different sorts. This value illustrates very well how much energy these systems need to solve identical tasks. And again the Core i7-965 Extreme Edition based platform showed its best, having spent only 140W*h, while the system with Core 2 Extreme QX9770 inside required 159W*h of electricity.
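The difference between the two energy readings can be expressed as a relative saving. Here is a minimal Python sketch of that calculation; the function and variable names are ours, and the only inputs are the two measurements quoted above.

```python
def relative_saving(baseline_wh, candidate_wh):
    """Fraction of energy saved by the candidate relative to the baseline."""
    return (baseline_wh - candidate_wh) / baseline_wh


QX9770_WH = 159  # Core 2 Extreme QX9770 system, PCMark Vantage run
I7_965_WH = 140  # Core i7-965 Extreme Edition system, same run

saving = relative_saving(QX9770_WH, I7_965_WH)
print(f"Core i7-965 system used {saving:.1%} less energy for the same tasks")
```

So the Core i7 platform needed roughly 12% less energy to complete the same set of tasks.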
We have finally got acquainted with the new Core i7 processors, the first solutions on the Nehalem microarchitecture targeted at desktop systems. Summing up the results of today’s test session, we have to admit that this experience left a rather ambiguous impression.
Now, we are not trying to say that Core i7 is not a success. On the contrary, this CPU is brilliant from multiple standpoints. It supports interesting new technologies, such as SMT and Turbo Boost, and boasts an integrated memory controller with unprecedented performance. In most applications, except a few gaming titles, the new processors turned out faster than Core 2 CPUs priced identically or working at the same clock speed. However, honestly, we expected a little more from Core i7. The reason for our disappointment is actually Intel, who have been stressing their adherence to the “Tick-Tock” strategy and claiming that Nehalem would be a new microarchitecture. In fact, today we saw a new stage in the development of the Core microarchitecture, not a revolutionary product like Core 2 Duo was when it came out to replace Pentium 4. So, as I have already said, we ended up slightly disappointed with the results.
At the same time, we can’t help mentioning that Intel engineers did a great and very important job of reworking the entire platform. The design of Core i7 processors is better suited to eliminating bottlenecks and to further evolutionary development. The monolithic modular design, the inter-processor interface with point-to-point topology and the built-in memory controller will definitely serve Intel well in the future. As for today, mainstream users will hardly feel the benefits of all these innovations; it is mostly the users of multi-socket server platforms who will really enjoy the changes.
Therefore, we believe that Intel didn’t choose the right strategy for introducing the new Nehalem microarchitecture into the market. If this review had been discussing server processors and not desktop ones, the conclusions could have been not just more optimistic, but almost ecstatic. However, we first met Nehalem in its desktop incarnation, so its most important advantages cannot really show their best.
However, we don’t want you to think that we didn’t like the new Core i7 processor we have just tested. The new CPU and the new platform based on the Intel X58 Express chipset are undoubtedly excellent products. The new Core i7 processors are indisputably better in most aspects than Core 2 Quad CPUs of comparable price. Their performance is almost always higher, which is especially evident under multi-threaded load, and their power consumption is comparable to that of their predecessors. The new platform offers broader functionality for configuring a multi-GPU video subsystem. Overclocking the new processors also seems to be easier at first glance.
Of course, we are not going to stop here and will continue posting new articles that will help us better understand the advantages and drawbacks of the new Core i7 processors with the Nehalem microarchitecture. In the meantime, we have to accept the fact that the transition to the new LGA1366 platform will require not just a new processor, but also a new mainboard and most likely new-generation DDR3 SDRAM. So, even though the junior Core i7 seems to be priced at a very affordable $284, upgrading the system to fit the new processor will require a serious financial investment.