by Ilya Gavrichenkov
06/01/2013 | 08:20 AM
Today is the first day of summer and also the day when Intel lifts the embargo on publications about performance of the Haswell, the new generation of Intel CPUs. The whole lot of Haswell-based products will be announced officially in three days, but there can hardly be any changes in such a short time. Therefore we offer you our review of one particular model of the long-anticipated CPU right now.
It is long-anticipated indeed as we have had to wait for the Haswell longer than usual. According to Intel’s tick-tock strategy of releasing new CPU designs, there should be about a year between each tick and each tock. That tick-tock model is almost forgotten now, though, as it takes Intel longer and longer to upgrade its microarchitecture. This can be expected since Intel has no real opponents on the market of x86 CPUs. Its long-time rival AMD has left the market of high-performance CPUs, switching to inexpensive and niche products. And now Intel, while owning most of the x86 CPU market, has to deal with global problems instead, such as the declining popularity of the x86 architecture and x86-based solutions at large.
It means that Intel now has to prove its viability not as a developer and manufacturer of high-performance CPU designs but as a company that can transform its vast experience into the emerging classes of mobile devices which have been gaining in popularity at the expense of classic PCs. It is not AMD that Intel has to face now as its main opponent. It is the manufacturers of CPUs with ARM architecture, which have been following the latest market trends and are riding the wave now.
We dedicated a special review to studying the Haswell microarchitecture, noting that higher performance is not the Haswell’s main goal. On the contrary, the point of the new design is in lowering thermal and power requirements to make new CPUs appropriate not only for traditional PCs but also for compact, light and thin gadgets like ultrabooks, transformer notebooks and tablet PCs.
But even though the developer’s priorities have changed dramatically, we, PC enthusiasts, have not changed so much as to give up the classic PC, especially as newfangled gadgets can’t really match its versatility and performance. That’s why we want to start our exploration of the fourth-generation Core processors with Haswell microarchitecture by looking at classic desktop CPUs first. Considering the current circumstances, we do not even expect higher performance, clock rates or overclocking potential. It is far more important to check whether the developer has introduced changes into the CPU design that improve its energy efficiency but make it less suitable for desktop PCs compared to the time-tested Sandy Bridge and Ivy Bridge series.
There are quite a few things that raise our apprehensions. We’ve heard about a slowed-down L3 cache, about moving a significant part of the voltage regulator from the mainboard into the CPU so it is now less flexible in terms of settings, about the fundamentally new LGA1150 platform, and about compatibility issues with older PSUs. This list of potential downsides may get longer as we explore the Haswell more. Perhaps that’s the most interesting thing about today’s review in which we will check out the Core i7-4770K, the senior desktop fourth-generation Core processor from Intel.
Although the Haswell microarchitecture is supposed to represent the “tock” cycle, i.e. to introduce considerable innovations while retaining the same manufacturing technology, it doesn’t follow Intel’s established pattern. The Haswell is meant to optimize Core processors in terms of energy efficiency rather than high performance. Intel is going to use the new CPU design for mobile gadgets in the first place, traditional PCs and servers being of secondary importance.
The Haswell is versatile, of course. It can be used to make CPUs with conventional performance/power consumption ratios but they are only byproducts since the focus is on the new low-wattage Core CPUs. That’s why we are not really interested in the majority of innovations implemented in the Haswell microarchitecture. We can only tell you that the new CPU design allows Intel to aggressively market CPUs with a specified power of 6 to 15 watts. Such a low level of power consumption is achieved through various means, from optimized manufacturing process to the introduction of additional power-saving states that may disable certain CPU subunits even when the system is active.
There are but few innovations for reaching higher performance. The larger part of the Haswell is inherited from the Ivy Bridge. There are no changes in terms of basic CPU structure or execution pipeline whose length remains the same as before (14-19 stages).
The key improvement in the x86 core design is about the microinstruction execution stage. There are now a few new execution devices and two new execution ports which have additional subunits for processing integer instructions, branches and addresses. The higher parallelism helps solve two problems. First, the CPU has got four integer ALUs, so classic code can be executed at the rate of its decoding. Second, the microarchitecture is better suited for floating-point and FMA instructions which do not interfere with ordinary code and may make Hyper-Threading more efficient.
The second group of positive changes in the Haswell microarchitecture refers to the cache memory subsystem. The L1 and L2 caches have doubled their bandwidth while retaining the same latency. The L1 cache can execute two 32-byte reads and one 32-byte write per clock cycle. The L2 cache can receive and issue 64 bytes of data per clock cycle.
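To put the per-cycle figures above into more familiar units, here is a rough back-of-the-envelope sketch. It assumes the caches run at the core clock and takes 3.9 GHz (the Core i7-4770K's peak Turbo frequency) as an illustrative value; the function and variable names are ours, and the results are theoretical per-core peaks rather than measured numbers.

```python
# Theoretical per-core peak cache bandwidth derived from the per-cycle
# figures quoted in the text. Assumption: caches are clocked with the core.

L1_READ_BYTES_PER_CYCLE = 2 * 32   # two 32-byte reads per clock
L1_WRITE_BYTES_PER_CYCLE = 1 * 32  # one 32-byte write per clock
L2_BYTES_PER_CYCLE = 64            # 64 bytes per clock


def peak_bandwidth_gb_s(bytes_per_cycle: int, clock_ghz: float) -> float:
    """Bytes per cycle times cycles per nanosecond equals GB/s (10^9 bytes/s)."""
    return bytes_per_cycle * clock_ghz


CLOCK_GHZ = 3.9  # assumed clock: the i7-4770K's maximum Turbo frequency

l1_read_gb_s = peak_bandwidth_gb_s(L1_READ_BYTES_PER_CYCLE, CLOCK_GHZ)
l1_write_gb_s = peak_bandwidth_gb_s(L1_WRITE_BYTES_PER_CYCLE, CLOCK_GHZ)
l2_gb_s = peak_bandwidth_gb_s(L2_BYTES_PER_CYCLE, CLOCK_GHZ)

print(f"L1 read:  {l1_read_gb_s:.1f} GB/s per core")
print(f"L1 write: {l1_write_gb_s:.1f} GB/s per core")
print(f"L2:       {l2_gb_s:.1f} GB/s per core")
```

Real benchmarks will report lower values, since these numbers ignore latency, bank conflicts and everything else that keeps a cache from being saturated every cycle.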
Furthermore, Haswell-based CPUs support AVX2/FMA3 instruction sets which expand the existing SIMD sets by introducing 256-bit instructions for processing integer vectors, full-width element permutes, gather and floating-point fused multiply-add operations (which include both multiplication and addition concurrently). AVX2/FMA3 instructions can be utilized for high-performance computing, gaming computations, video and audio processing, etc, to ensure higher speed of popular algorithms.
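The practical difference between a fused multiply-add and a separate multiplication followed by an addition is that the fused form rounds the result only once. The sketch below emulates that behavior in software with exact rational arithmetic; it does not execute the hardware FMA3 instructions, and the helper names are our own.

```python
from fractions import Fraction


def mul_then_add(a: float, b: float, c: float) -> float:
    """Ordinary evaluation: the product is rounded, then the sum is rounded."""
    return a * b + c


def fma_emulated(a: float, b: float, c: float) -> float:
    """Software emulation of a fused multiply-add: a*b + c is computed
    exactly with rationals and rounded to a double only once at the end."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product, 1 - 2**-60, cannot be
# represented as a double and is rounded to 1.0 in the unfused case.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

unfused = mul_then_add(a, b, c)
fused = fma_emulated(a, b, c)

print("multiply, then add:", unfused)
print("fused (emulated):  ", fused)
```

With these inputs the unfused version loses the small term entirely, while the single-rounding version keeps it. In hardware, the fused instruction additionally does the work of two operations in one, which is where the throughput gain comes from.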
A few smaller innovations should also be mentioned. The Haswell features improved branch prediction, a larger out-of-order execution buffer, and a larger L2 TLB.
The resulting performance benefits are estimated at 5 to 10%. The clock rates of the new CPUs do not promise much in terms of speed, either. Considering that the manufacturing technology has not changed, Haswell CPUs are likely to have the same frequency potential as the Ivy Bridge series.
We use SiSoftware Sandra 2013 SP3, a synthetic benchmarking suite, to check out various performance-related aspects of different CPU designs. And we are going to compare the Core i7-4770K (Haswell) with the quad-core Core i7-3770K (Ivy Bridge) as both have the same clock rates: 3.5 GHz by default and up to 3.9 GHz in Turbo mode.
The results are not encouraging. Unless the application uses the new AVX2/FMA3 instructions (and today’s software doesn’t yet support them, of course), the new microarchitecture doesn’t offer any performance benefits. It is a mere 2-3% faster with the simple algorithms Sandra 2013 uses to benchmark performance. Well, even this improvement should be appreciated considering Intel’s current priorities and the lack of competition among top-performance x86 CPUs. And if the new instructions do indeed get implemented everywhere, the Haswell may become much better than its predecessor, ensuring a performance boost of 30-40%. It’s up to application developers now to make use of that advantage.
Well, we can still hope that the Haswell will be somewhat faster in real-life applications in any case. And this hope is based not on any improvements in its microarchitecture but on the higher bandwidth of the L1 and L2 cache. This can be easily seen in any specialized benchmark like our Sandra 2013 SP3. The test was performed on platforms equipped with DDR3-1866 SDRAM (9-11-9-27-1T timings).
So indeed, the L1 and L2 cache memory works much faster in the Haswell than in its predecessor. This is the key advantage of the new microarchitecture which may make it faster than the Ivy Bridge in real-life applications. On the other hand, the L3 cache and memory controller are somewhat slower, which may have a negative effect on performance. To enable individual control over power-saving states of the uncore part of the CPU, the clock rates of the L3 cache and memory controller are not linked to the clock rate of the x86 cores. And even though these subunits work at frequencies similar to that of the x86 cores, their performance is lower as a tradeoff for asynchronous operation.
Summing everything up, we can say that the Haswell microarchitecture can hardly lift the performance of Core CPUs to a new level. The improvements it brings about ensure but a small increase in speed, derived mostly from the increased cache memory bandwidth rather than from any changes in the execution pipeline. Theoretically, the Haswell can show its best with AVX2/FMA3-using code, but software developers don’t seem eager to write such code even though some of those instructions are already supported by AMD processors as well.
What is introduced under the name of Haswell is the latest upgrade of the Core microarchitecture. It is going to be used for all modern desktop platforms manufactured in the next couple of years, except for the LGA2011 infrastructure designed for Sandy Bridge CPUs. The Haswell doesn’t have much to offer when it comes to desktop PCs, actually. Such CPUs will be still manufactured on 22nm facilities using 3D transistors. As for new features, they support FMA3/AVX2 instructions, have faster L1 and L2 cache, and offer certain optimizations in terms of parallel execution.
The lack of fundamental improvements can be easily noticed in the Haswell semiconductor die which not only looks like an Ivy Bridge die but is also similar in size and configuration.
The quad-core die of a desktop Haswell (one that features the GT2 integrated graphics core) incorporates 1.4 billion transistors and measures 177 sq. mm. Its Ivy Bridge counterpart was only 15% simpler, incorporating 1.2 billion transistors (we mean the overall design complexity that doesn’t account for duplication of certain elements in the die). Half the added transistor budget is responsible for the graphics core which now amounts to 30% of the whole CPU die, so there is little left for any changes in the microarchitecture of the x86 cores.
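The transistor figures above are easy to sanity-check. The short sketch below only redoes the arithmetic with the numbers quoted in the text; the variable names are ours, and the density value is a simple average over the whole die.

```python
# Sanity check of the die figures quoted in the text.
HASWELL_TRANSISTORS = 1.4e9
IVY_BRIDGE_TRANSISTORS = 1.2e9
HASWELL_DIE_AREA_MM2 = 177.0

added = HASWELL_TRANSISTORS - IVY_BRIDGE_TRANSISTORS

# How much "simpler" Ivy Bridge is, relative to the Haswell die.
ivy_bridge_fewer = added / HASWELL_TRANSISTORS

# Per the text, about half of the added budget went into the graphics core.
added_to_graphics = 0.5 * added
added_to_everything_else = added - added_to_graphics

# Average transistor density of the Haswell die, in millions per sq. mm.
density_m_per_mm2 = HASWELL_TRANSISTORS / HASWELL_DIE_AREA_MM2 / 1e6

print(f"Ivy Bridge has {ivy_bridge_fewer:.1%} fewer transistors")
print(f"Added to graphics: {added_to_graphics / 1e6:.0f} million")
print(f"Added elsewhere:   {added_to_everything_else / 1e6:.0f} million")
print(f"Average density:   {density_m_per_mm2:.1f} million per sq. mm")
```

So roughly 100 million transistors are left for everything outside the graphics core, which supports the point that the x86 cores could not change much.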
As a result, there are no dramatic changes in the specifications of the new CPUs. The CPU families still have the same number of x86 cores and the same technologies. The clock rates and cache memory amounts haven’t changed much, either. You can see this in the CPU specs Intel is going to announce in three days.
So the senior desktop Haswell CPUs have almost the same specifications as the flagship products of the previous generation. That’s why we can’t expect any performance benefits thanks to higher clock rates or larger cache. There are only two points of difference. First, the desktop Haswell’s graphics core is obviously faster than the Ivy Bridge’s just because there are more execution devices in it. And second, the TDP is increased from 77 to 84 watts because some voltage regulator components have been moved into the CPU die.
The Core i7 series differs from the Core i5 in the same way as before. Both series include quad-core CPUs but the Core i7 supports Hyper-Threading. And the topmost model in each series is still an overclockable K-indexed CPU with unlocked multiplier. However, Intel doesn’t offer different graphics cores in its new quad-core CPUs now, therefore all of them use the mainstream GT2 core with 20 execution devices. The junior graphics core GT1 is likely to be limited to junior CPU series.
The price policy doesn’t seem to be any different, either. The price gap between similar CPUs of different generations will be no larger than $10.
We should also add that we’re not describing the entire desktop Haswell model lineup here. Intel is actually preparing an unusually massive release to introduce, besides Core i7 and Core i5 CPUs for ordinary PCs, specialized CPU versions with S index (with a TDP of 65 watts), T index (with a TDP of 45 or 35 watts) and R index (in BGA packaging with GT3 Iris graphics core). We’ll discuss them separately in our upcoming reviews.
Introduced to back up the fourth-generation Core CPUs, the new desktop platform Lynx Point brings about more important innovations. It has become a rule for Intel to roll out a new generation of chipsets along with each new CPU microarchitecture. Some chipsets, such as the Z68 and Z77, are backwards compatible with previous CPU generations, and others are not, but in general Intel tries to upgrade the platform and entire infrastructure together with the release of each fundamentally new microarchitecture (the “tock” cycle), and that’s what we see here.
Desktop Haswell CPUs are designed in LGA1150 packaging which requires a mainboard with a corresponding CPU socket. Such mainboards may be based on Intel’s new 8-series chipsets.
This chipset family traditionally includes several modifications targeted for different market segments, but the Z87 is the ultimate version in terms of features and functionality. It offers the full selection of controllers and interfaces, can divide PCIe 3.0 lanes for multi-GPU configurations and doesn’t prevent you from overclocking the CPU. We’ll use the Z87 to learn what the new platform can do.
The differences between the Z87 and Z77 seem to be petty at first sight, both even using the same DMI 2.0 bus with PCI Express protocol to connect to the CPU. In fact, Intel has only improved the chipset’s connectivity capabilities, so the Z87 supports six rather than four USB 3.0 ports and all of its six SATA ports are 6 Gbit/s.
By the way, take note that Thunderbolt is not mentioned on the Z87 flowchart anymore. It doesn’t mean that Thunderbolt controllers are incompatible with the LGA1150 platform. We will surely see Thunderbolt-enabled LGA1150 mainboards in the near future, but Intel seems to have lost its interest in this technology which hasn’t gained much recognition in the last two years.
At first glance, replacing the socket with a new LGA 1150 may seem to be a little artificial measure. However, this is not entirely so. As a matter of fact, the LGA1150 platform has some more substantial differences justifying this change, but they are not so conspicuous.
Intel has changed the way that monitors are connected to the CPU-integrated graphics core. The CPU itself is now responsible for digital interfaces (DisplayPort, HDMI and DVI), the chipset only supporting analog VGA connections. This solution lowers the load on the graphics FDI link between the CPU and the chipset so that the Lynx Point platform can support up to three digital monitors with 4K resolutions concurrently.
Well, this is still not the main point of the whole affair with the new CPU socket and the new generation of mainboards. The most important thing is that the platform design has been revised to once again increase the overall level of CPU integration. Core series CPUs have assimilated all North Bridge components of chipsets quite a long time ago, and the Haswell gets down to absorbing the voltage regulator, which is one of the key mainboard components.
CPUs of the previous generation required six different voltages to be supplied by the mainboard for their various subunits: x86 cores, cache memory, system agent and graphics core. The Haswell takes on that responsibility, requiring only two voltages from the mainboard: the basic input voltage of 1.8 volts and the memory voltage. Other voltage conversions and voltage regulation inside the CPU are now performed without the mainboard’s intervention.
This integration makes the CPU more flexible in terms of dynamic power supply and energy saving and also simplifies the design of LGA1150 mainboards. It’s a win-win scenario. Mainboard makers can make their products simpler. CPUs can use power in a more optimal way. And users will have a unified, reliable, stable and accurate system of voltage regulation which doesn’t vary between specific implementations. Intel promises the integrated regulator to be very precise, allowing for voltage fluctuations no larger than a few millivolts. Users will have full access to the CPU-integrated voltage regulator’s parameters, just as before.
We will use a Core i7-4770K processor for our practical exploration of the Haswell series and LGA1150 platform. It is the senior model in the whole series which is meant to replace the Core i7-3770K (Ivy Bridge) after the latter’s one-year tenure as Intel’s flagship.
Interestingly, the mentioned quad-core CPUs have a lot in common despite belonging to different generations: Turbo Boost, Hyper-Threading and caches (64KB L1 per core, 256KB L2 per core, and shared 8MB L3 cache). That’s why we are curious to compare the Core i7-4770K against the Core i7-3770K since any difference in performance is likely to be due to improvements in microarchitecture.
The peak clock rate the Core i7-4770K can work at thanks to Turbo Boost is 3.9 GHz. The CPU is clocked at 3.9 GHz when only one or two of its cores are in use. When three cores are in use, the clock rate is up to 3.8 GHz. At full load the CPU is clocked at 3.7 GHz. When idle, the Core i7-4770K drops its clock rate to 800 MHz, which is half the idle frequency of the previous-generation CPUs.
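The Turbo Boost behavior described above boils down to a small lookup table. Here is a minimal sketch of it, using only the frequencies stated in the text; the function name and the choice to treat zero active cores as "idle" are our simplifications, and actual clocks also depend on temperature and power limits.

```python
# Maximum Core i7-4770K clock by number of loaded cores, as described above.
TURBO_GHZ_BY_ACTIVE_CORES = {1: 3.9, 2: 3.9, 3: 3.8, 4: 3.7}
IDLE_GHZ = 0.8  # 800 MHz when the CPU is idle


def max_clock_ghz(active_cores: int) -> float:
    """Return the highest clock rate the CPU may reach for a given load."""
    if active_cores == 0:
        return IDLE_GHZ
    if active_cores not in TURBO_GHZ_BY_ACTIVE_CORES:
        raise ValueError("the Core i7-4770K has four cores")
    return TURBO_GHZ_BY_ACTIVE_CORES[active_cores]


for cores in range(5):
    print(f"{cores} active core(s): up to {max_clock_ghz(cores)} GHz")
```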
As promised, the clock rate of the L3 cache changes independently of the x86 cores but often coincides in practice. The uncore part of the CPU only works asynchronously either in power-saving states or in Turbo mode.
The voltage of the x86 cores of our Core i7-4770K is 1.06 volts. That’s typical for 22nm CPUs.
Intel has developed new colorful packaging for its fourth-generation Core processors.
The boxed CPU comes in two versions: with or without a cooler. We guess the latter version is better since older LGA1155/1156 coolers are perfectly compatible with the LGA1150 platform.
Obviously, the most exciting topic for discussion is the performance comparison between the new Core i7-4770K on Haswell microarchitecture and Core i7-3770K based on the previous generation Ivy Bridge microarchitecture. However, we didn’t stop at just these two processors, but also added two LGA 2011 CPUs based on the long-living Sandy Bridge microarchitecture. They are the fastest six-core Core i7-3970X Extreme Edition processor and quad-core Core i7-3820. Besides, we also tested the top AMD Piledriver processor – FX-8350.
A separate topic for discussion was the integrated graphics core in Core i7-4770K processor. When we measured its performance, the obtained results were compared with the data from Core i7-3770K and with those of the top AMD APU at the time – A10-5800K.
As a result, our testbeds were configured using the following software and hardware components:
As usual, we use Bapco SYSmark 2012 suite to estimate the processor performance in general-purpose tasks. It emulates the usage models in popular office and digital content creation and processing applications. The idea behind this test is fairly simple: it produces a single score characterizing the average computer performance. After the launch of Windows 8 SYSmark 2012 got updated to version 1.5, and this is exactly the version we are using in our test session.
As we have expected, the transition of new Core processors to new generation microarchitecture doesn’t have any impressive effect. SYSmark 2012 estimates Core i7-4770K performance only 10% higher than that of its predecessor with Ivy Bridge microarchitecture, Core i7-3770K. However, even this increase is enough to place Core i7-4770K higher in the ranks than Core i7-3970X in terms of average general performance. As for the Haswell’s advantage over the quad-core Sandy Bridge Core i7-3820, it is quite obvious and reaches 18%. However, do not forget that Core i7-3820 and Core i7-4770K are two generations of processor design apart.
Let’s take a closer look at the performance scores SYSmark 2012 generates in different usage scenarios. Office Productivity scenario emulates typical office tasks, such as text editing, electronic tables processing, email and Internet surfing. This scenario uses the following applications: ABBYY FineReader Pro 10.0, Adobe Acrobat Pro 9, Adobe Flash Player 10.1, Microsoft Excel 2010, Microsoft Internet Explorer 9, Microsoft Outlook 2010, Microsoft PowerPoint 2010, Microsoft Word 2010 and WinZip Pro 14.5.
Media Creation scenario emulates the creation of a video clip using previously taken digital images and videos. Here they use popular Adobe suites: Photoshop CS5 Extended, Premiere Pro CS5 and After Effects CS5.
Web Development is a scenario emulating web-site designing. It uses the following applications: Adobe Photoshop CS5 Extended, Adobe Premiere Pro CS5, Adobe Dreamweaver CS5, Mozilla Firefox 3.6.8 and Microsoft Internet Explorer 9.
Data/Financial Analysis scenario is devoted to statistical analysis and prediction of market trends performed in Microsoft Excel 2010.
3D Modeling scenario is fully dedicated to 3D objects and rendering of static and dynamic scenes using Adobe Photoshop CS5 Extended, Autodesk 3ds Max 2011, Autodesk AutoCAD 2011 and Google SketchUp Pro 8.
The last scenario called System Management creates backups and installs software and updates. It involves several different versions of Mozilla Firefox Installer and WinZip Pro 14.5.
The new Haswell microarchitecture demonstrates the highest performance advantage under well-parallelized multi-threaded loads. This is yet another illustration of the changes introduced in it, which helped improve the efficiency of Hyper-Threading technology. However, the Core i7-4770K becomes an absolute leader in completely different situations: in typical office and home use scenarios such as System Management, Media Creation and Office Productivity.
As you know, in the majority of contemporary games it is the graphics subsystem that determines the performance of an entire platform equipped with a reasonably fast processor. Therefore, we select the most CPU-dependent games and take the fps readings twice. The first test run is performed without antialiasing and at resolutions well below the maximum. These settings allow us to determine how well the processors cope with gaming loads in general and how the tested CPUs will behave in the near future, when faster graphics cards become widely available. The second pass is performed with more real-life settings: Full HD resolution and maximum FSAA. In our opinion, these results are less interesting, but they clearly demonstrate the level of performance we can expect from contemporary processors today.
If we look at the gaming performance with realistic graphics settings, it is quite logical that there are very few differences between the flagship processors. Both the Core i7-4770K and the Core i7-3770K show about the same fps rate because the main bottleneck in both cases is the graphics card. If we view these games as synthetic benchmarks measuring computing performance and lower the resolution so that the CPU becomes the primary factor in the overall scores, we notice some very positive changes. The top Haswell processor is about 5% faster on average than the top Ivy Bridge. Moreover, in most cases the LGA 1150 platform turns out to be a better gaming platform than the elite LGA 2011.
To test the processors performance during data archiving we resort to WinRAR archiving utility. Using maximum compression rate we archive a folder with multiple files with 1.7 GB total size.
The performance differences between Core i7-4770K and Core i7-3770K, which represent two different microarchitectures, are hardly noticeable during data compression tests. We would expect faster L1 and L2 caches to have a positive effect for Haswell, but in this case it must be offset by the increased latency of the L3 cache.
The processor performance in cryptographic tasks is measured using a built-in benchmark of the popular TrueCrypt utility that uses AES-Twofish-Serpent “triple” encryption. I have to say that this utility not only loads any number of cores with work in a very efficient manner, but also supports special AES instructions.
The 5% advantage of Core i7-4770K over Core i7-3770K shows the benefits of all Haswell’s improvements (both these processors work at identical clock speeds), but is not sufficient to let the LGA 1150 flagship CPU catch up with AMD Piledriver. However, Intel fans shouldn’t get discouraged. Encryption in TrueCrypt is the only situation when the eight-core AMD FX-8350 outperforms Core i7-4770K.
We use the Xilisoft Audio Converter 6.4 utility to test the speed of audio transcoding into MP3. During this test we transcode an audio album saved in the lossless FLAC format.
Xilisoft Audio Converter encodes MP3 files very quickly. Even so, the Haswell-based processor completes this task 6% faster than Core i7-3770K.
Now that the ninth version of the popular scientific Wolfram Mathematica suite is available, we decided to bring it back as one of our regular benchmarks. We use MathematicaMark9 integrated into this suite to test the systems performance:
Not all calculations in Mathematica can be split in parallel threads efficiently. Nevertheless, Core i7-4770K demonstrates 11% advantage over the Ivy Bridge processor working at the same clock frequency and therefore outperforms even a 1000-dollar heavy-weight – Core i7-3970X.
We measured the performance in Adobe Photoshop CS6 using our own benchmark made from Retouch Artists Photoshop Speed Test that has been creatively modified. It includes typical editing of four 24-megapixel images from a digital photo camera.
Here new microarchitecture improves the performance by only 6%. And after two generations of Intel processors the performance in Photoshop increased by only 12%: this is the difference between the test script run on Core i7-4770K and Core i7-3820. Quite a pity, and therefore this can hardly be considered a worthy argument in favor of a PC upgrade.
We have also performed some tests in Adobe Photoshop Lightroom 4.4 program. The test scenario includes post-processing and export into JPEG format of two hundred 12-megapixel images in RAW format.
The results here are almost the same as in Photoshop. The performance of Intel’s quad-core processors didn’t increase much since 2011, when Sandy Bridge microarchitecture came out.
The performance in Adobe Premiere Pro CS6 is determined by the time it takes to render a Blu-ray project with a HDV 1080p25 video into H.264 format and apply different special effects to it.
High definition video content processing is one of the best types of load for multi-core processors. Therefore, Haswell’s advantage here is slightly higher than in other tests: more efficient Hyper-Threading technology starts to matter more. However, the performance difference between Core i7-4770K and Core i7-3770K is still measured in single-digit percentages.
In order to measure how fast our testing participants can transcode a video into H.264 format we used x264 HD Benchmark 5.0. It works with an original MPEG-2 video recorded in 1080p resolution with 20 Mbps bitrate. I have to say that the results of this test are of great practical value, because the x264 codec is also part of numerous popular transcoding utilities, such as HandBrake, MeGUI, VirtualDub, etc.
During H.264/AVC video transcoding Haswell receives the highest performance boost compared with the Ivy Bridge CPU on the same clock frequency. It equals 13% making Core i7-4770K faster than AMD FX-8350, which used to do better with transcoding tasks. Therefore, it looks like Haswell’s launch did manage to push back the AMD processors after all, making them even less appealing for the consumers. Obviously, now AMD just has to lower the prices of their Socket AM3+ processors yet again.
We will test computational performance and rendering speeds in Autodesk 3ds max 2011 using the special SPECapc for 3ds max 2011 benchmark:
Rendering is yet another example of a multi-threaded task, which is why we see a 15% advantage of the new processor over its predecessor. The advantage of Core i7-4770K over Core i7-3820 even exceeds 20%. However, during the entire test session we didn’t see the kind of impressive performance boost that we observed in synthetic tests. Unfortunately, contemporary software doesn’t yet support AVX2/FMA3 even though the FMA3 set has already been supported by AMD processors for over a year now.
The desktop Haswell doesn’t offer any breakthroughs in terms of conventional computing. Its developer didn’t actually promise anything like that, focusing instead on creating energy-efficient modifications of the Core microarchitecture and perfecting 3D graphics performance. And while the Haswell CPUs with a TDP of only a few watts can hardly excite desktop PC users, the improvements in the graphics department are considerable. AMD has managed to build a market for desktop APUs and Intel’s solutions with better computing and comparable graphics performance would surely find their customer.
Well, it looks like Intel doesn’t really bother about the integrated graphics capabilities of its desktop CPUs. At least they do not have the fastest version of that core. Even the top-of-the-line Core i7-4770K is only endowed with the midrange GT2 version which is officially referred to as Intel HD Graphics 4600.
Anyway, even this graphics core is going to deliver improved performance in comparison with the HD Graphics 4000 we could see in top-end Ivy Bridge CPUs. First of all, the number of execution devices is increased from 16 to 20. The performance of most of fixed-function pixel processing units is doubled whereas the texture samplers are four times as fast as before.
The clock rate has been increased as well. The HD Graphics 4000 worked at 1.15 GHz in the Core i7-3770K while the new HD Graphics 4600 is clocked at 1.25 GHz in the Core i7-4770K. All of this implies considerable performance benefits which can be estimated with 3DMark 11 and the newest 3DMark Cloud Gate benchmark. By the way, the HD Graphics 4600 has no problems running modern games and benchmarks as it supports all modern APIs: DirectX 11.1, OpenGL 4.0 and OpenCL 1.2.
Although the number of execution devices has been increased in the HD Graphics 4600 by only a fourth in comparison with its predecessor HD Graphics 4000, they differ in speed much more. The Haswell’s graphics core is almost twice as fast as the previous version in 3DMark 11 whereas 3DMark Fire Strike thinks that the Core i7-4770K’s graphics is almost 40% better than its predecessor. In either case it is enough to make the desktop Haswell comparable to the top-end AMD Trinity APU in 3D performance.
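For comparison, here is what a naive estimate would predict from the raw specifications alone. The sketch assumes that shader throughput scales linearly with the number of execution devices and with the clock rate, which is our simplification and not something Intel states; the variable names are ours as well.

```python
# Naive scaling estimate from the specifications quoted in the text.
HD4000_EXECUTION_UNITS = 16
HD4600_EXECUTION_UNITS = 20
HD4000_CLOCK_GHZ = 1.15  # in the Core i7-3770K
HD4600_CLOCK_GHZ = 1.25  # in the Core i7-4770K

eu_ratio = HD4600_EXECUTION_UNITS / HD4000_EXECUTION_UNITS
clock_ratio = HD4600_CLOCK_GHZ / HD4000_CLOCK_GHZ

# Assumption: throughput is proportional to (execution units x clock).
naive_speedup = eu_ratio * clock_ratio

print(f"Execution unit ratio: {eu_ratio:.2f}x")
print(f"Clock ratio:          {clock_ratio:.3f}x")
print(f"Naive speedup:        {naive_speedup:.2f}x")
```

The naive figure is close to the smaller of the two measured gains. The near-doubling in 3DMark 11 goes well beyond it, which suggests that the faster fixed-function units and texture samplers matter more there than the extra execution devices.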
Fortunately for AMD, Intel doesn’t have plans for an aggressive promotion of the faster graphics core GT3 in the desktop environment. The only series of Intel desktop CPUs to feature better graphics capabilities is the BGA-packaged R series. So, there is no reason for AMD to worry about its share of the APU market yet, even though Intel has been progressing rapidly in this field.
The results of the synthetic tests from Futuremark should be complemented with what the integrated graphics cores can do in actual games. There were two test modes: Full-HD resolution (1920x1080) with low visual quality settings and 1366x768 resolution with medium visual quality settings.
The gap between the HD Graphics 4600 and the HD Graphics 4000 isn’t as large as in the synthetic benchmarks. On average, the Core i7-4770K is 25 to 30% ahead of the top-end Ivy Bridge, which is not enough to let the user play latest games in Full-HD resolution even with low visual quality settings. In other words, the HD Graphics 4600 can hardly be viewed as an entry-level gaming solution. The fastest graphics core GT3 must be capable of that, therefore Intel refers to it not only by its number (5100) but also by its pretty marketing name of Iris.
Besides the 3D functionality, Intel’s graphics cores incorporate a dedicated multimedia engine known as Quick Sync technology. In the Haswell it supports new formats (SVC and Motion JPEG), new image enhancement techniques (such as hardware image stabilization and frame rate conversion), decoding of video with resolution up to 4096x2304 pixels, etc. It is promised to be faster at transcoding, and we can easily check this out.
We use the CyberLink Media Espresso 6.7 utility for that purpose as it is optimized for Intel Quick Sync as well as for other transcoding capabilities of modern CPUs and GPUs. Although Intel has published its SDK for accessing the hardware coder/decoder of Core CPUs, developers of free software have been slow to implement Quick Sync support in their solutions, so we have to use the paid utility from CyberLink.
For the transcoding tests we took a 40-minute 1080p video in H.264 format with a bitrate of about 10 Mbit/s and converted it to a lower resolution for viewing on an iPhone 4S. The target format was H.264 at 1280x720 with a bitrate of about 6 Mbit/s.
The Haswell’s upgraded media engine is about 40% faster. The quality of transcoding has improved as well, which can be easily noticed even if you convert videos for mobile gadgets. The screenshots below show you videos compressed with the Ivy Bridge’s and Haswell’s media engines using the same settings (6 Mbps bitrate for iPhone 4S).
It is easy to see that the Haswell’s Intel Quick Sync ensures a better level of detail on small objects and more natural colors.
The Haswell graphics core copes very well with hardware-accelerated video playback in 4K resolution. As an example, we checked how high the CPU utilization would rise in a Core i7-4770K based system playing a specially prepared video in H.264/AVC format at 3840x2160 resolution with a 103 Mbit/s bitrate.
The Intel HD Graphics 4600 core didn’t experience any problems. There were no dropped frames and the CPU utilization didn’t exceed 10%. Moreover, the processor’s computational cores even remained in one of their power-saving modes, with their frequency far below the nominal 3.5 GHz. In other words, Haswell is fully prepared for work with 4K video.
Since the Haswell microarchitecture features a lot of optimizations for lower power consumption, many users expect Haswell-based CPUs to make desktop PCs more economical. Such expectations are not well-grounded. The Haswell can indeed be used to create mobile CPUs that dissipate less heat than Ivy Bridge products. The ultra-low-voltage U and Y series with a TDP of 15 watts and lower are an excellent example. However, a CPU design optimized for low heat dissipation and power consumption doesn’t necessarily translate into an economical desktop CPU. As a matter of fact, the TDP of desktop Haswell-based CPUs is 84 watts, which is 7 watts higher than that of the Ivy Bridge series. It means that making the Haswell work at clock rates typical of desktop CPUs requires high voltage on certain CPU subunits. Moreover, the Haswell features an integrated voltage regulator, a power-hungry circuit which needs a large reserve of capacity for overclocking. That’s why we have some apprehensions about how economical Intel’s new flagship CPUs are.
To find out more about the practical power consumption of the Intel Core i7-4770K processor we performed a round of special tests. The new digital power supply unit from Corsair, the AX760i, allows monitoring consumed and produced electrical power, and we make active use of this in our power consumption tests. The graphs below (unless specified otherwise) show the full power draw of the computer (without the monitor) measured after the power supply. It is the total power consumption of all the system components. The PSU's efficiency is not taken into account. The CPUs are loaded by running the 64-bit version of the LinX 0.6.4 utility with AVX instructions support. Moreover, we enabled Turbo mode and all power-saving technologies to correctly measure the computer's power draw in idle mode: C1E, C6, Enhanced Intel SpeedStep and AMD Cool’n’Quiet.
The Core i7-4770K system is very economical in idle mode. The new power-saving states can be applied to the desktop infrastructure, lowering the minimum level of power consumption.
The Core i7-4770K is less efficient at single-threaded load, becoming comparable to the Core i7-3770K. And here’s the next diagram.
That’s a shock really! At full load the Haswell-based system needs almost 30 watts more power than the Core i7-3770K configuration. So, even though low-voltage Haswell-based CPUs for mobile gadgets are very economical, their desktop cousins are a completely different story. Intel has optimized the new microarchitecture for ultra-mobile applications but the desktop Haswell is no good in terms of performance per watt. Not all Haswell-based CPUs are energy efficient, as we can see.
There is one thing that should be taken into account, though. The Haswell’s power consumption and heat dissipation go up rapidly at full load which is generated by our LinX-AVX utility. In real-life applications the Core i7-4770K isn’t as voracious as it seems in comparison with its predecessor. Here’s the power consumption of the CPUs while encoding HD video with the x264 codec.
The Core i7-4770K demonstrates higher power consumption than the Core i7-3770K. However, we are no longer talking about a shocking difference of 30 W: the gap has shrunk to just a few watts.
The Haswell and the LGA1150 platform change the overclocking procedure for two reasons. There are new divisors for the PCIe/DMI bus and there is now a voltage regulator right inside the CPU.
The first thing was expected because the fixed correlation between the base clock rate and the PCIe/DMI clock rate used to make it impossible to overclock CPUs by raising the former. The PCIe/DMI bus doesn’t work well when its frequency deviates from the default 100 MHz, so increasing the base clock rate to 105-107 MHz used to render the LGA1155 platform nonfunctional.
This problem is solved to some extent in the new Haswell processors. There are now a few dividers that let you set the ratio of the base clock rate to the PCIe/DMI clock rate not only to 1:1 but also to 5:4 and 5:3. Thus, the LGA1150 platform can be stable at a base clock rate of 100, 125 and 166 MHz. All crucial internal frequencies remain at their defaults in this case, but the x86 cores, the uncore part, the integrated graphics core and system memory get overclocked proportionally. It also means that even LGA1150 CPUs with a locked multiplier can be overclocked, but only by 25% or 66% above the default frequency.
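The relationship between these ratios and the resulting base clocks can be sketched in a few lines of Python. This is only an illustration, assuming the PCIe/DMI clock stays at exactly 100 MHz; the function names and the 35x multiplier (derived from the 3.5 GHz nominal frequency) are our own and do not correspond to anything exposed by Intel or by motherboard firmware.

```python
from fractions import Fraction

PCIE_DMI_MHZ = 100  # default PCIe/DMI clock the bus expects

# Base clock : PCIe/DMI ratios mentioned in the text
RATIOS = [Fraction(1, 1), Fraction(5, 4), Fraction(5, 3)]


def base_clock_mhz(ratio):
    """Base clock that keeps PCIe/DMI at its default 100 MHz."""
    return float(PCIE_DMI_MHZ * ratio)


def core_clock_ghz(ratio, multiplier):
    """Core frequency for a given ratio and CPU multiplier."""
    return base_clock_mhz(ratio) * multiplier / 1000


for ratio in RATIOS:
    gain_percent = float((ratio - 1) * 100)
    print(f"{ratio.numerator}:{ratio.denominator} -> "
          f"{base_clock_mhz(ratio):.1f} MHz base clock, "
          f"+{gain_percent:.1f}% over default")

# Hypothetical example: a CPU locked at a 35x multiplier on the 5:4 ratio
print(f"{core_clock_ghz(Fraction(5, 4), 35):.3f} GHz")
```

The 5:3 ratio works out to roughly 166.7 MHz, which is the "166 MHz" and "66%" quoted above after rounding down.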
Only K series CPUs with an unlocked frequency multiplier offer you complete overclocking freedom. By the way, during the transition to Haswell Intel added yet another bit to the processor multiplier register, so the maximum multiplier setting during overclocking has now reached 80x.
The integration of the voltage regulator into the CPU affects overclocking, too. The integrated regulator behaves in a peculiar way. The CPU voltage used to drop at high loads, but the integrated regulator, on the contrary, automatically increases it in this case – by over 0.1 volts even at default settings. This effect is more conspicuous at overclocking.
Unfortunately, this behavior is typical of the Haswell. You can only eliminate it by locking the voltage at a certain level, which disables all power-saving technologies. So, Haswell offers a difficult choice: you either put up with the high temperature and heat dissipation under heavy loads caused by the automatic increase in the processor core voltage, or give up power savings in idle mode.
But this is just part of the problem. Haswell turned out to be much hotter in real life than its predecessor. The maximum permissible temperature of its CPU cores is 100°C, but the Core i7-4770K gets as hot as 75-80°C in its nominal operating mode, even with a high-performance air cooler.
To illustrate Haswell’s thermal performance we performed a quick comparison between Core i7-4770K and Core i7-3770K working in their nominal mode and tested with the same NZXT Havik 140 cooler:
The Haswell CPU core temperatures are seriously higher than those of the previous generation processors. And although most every-day tasks do not cause the CPU to heat up so dramatically, we should base our conclusions primarily on specialized stability tests, which create heavy but nevertheless quite realistic load.
So, it turns out that overclocking the new CPUs calls for much better coolers than those we could use for Ivy Bridge processors. In other words, it is harder to reach the same results when overclocking Core i7-4770K as we did with the overclocker-friendly Sandy Bridge and Ivy Bridge products in LGA1155 form-factor.
For example, we could only make our Core i7-4770K CPU stable at 4.4 GHz. The temperature of the CPU cores was alarmingly high while running the stability-testing LinX-AVX utility, even though we used a very good cooler, the NZXT Havik 140.
To achieve the result shown in the screenshot, we had to increase the CPU voltage to 1.2 volts. This was only 0.14 V higher than the nominal Vcore for our particular processor, but nevertheless, the temperatures were through the roof.
Thus, even a small increase in voltage leads to a dramatic increase in the temperature of the computing cores, which means that the Haswell microarchitecture is energy efficient at low clock rates and low voltages only. Haswell-based desktop and overclocker-friendly CPUs are not energy-efficient at all. As a result, Haswell’s overall overclocking potential doesn’t inspire much optimism. In other words, another iteration of Intel’s microarchitecture is in fact a step back in terms of frequency potential, even though Intel did everything possible to compensate for it by adding extra overclocking-friendly features to its new processors.
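A rough feel for why such a modest voltage bump hurts so much can be had from the common textbook approximation that dynamic power scales with frequency times voltage squared. The sketch below is not a measurement and not part of our test methodology: it assumes that approximation, ignores leakage current (which itself grows with temperature), and takes the nominal 3.5 GHz as the baseline even though Turbo mode raises the actual stock frequency under load. The function name is our own.

```python
def relative_dynamic_power(f_new, v_new, f_base, v_base):
    """Ratio of dynamic power under the P ~ C * V^2 * f approximation."""
    return (f_new / f_base) * (v_new / v_base) ** 2


V_OVERCLOCK = 1.20                # volts, as set for 4.4 GHz in our test
V_NOMINAL = V_OVERCLOCK - 0.14    # volts, nominal Vcore of our sample
F_OVERCLOCK = 4.4                 # GHz
F_NOMINAL = 3.5                   # GHz, assumed baseline (Turbo ignored)

ratio = relative_dynamic_power(F_OVERCLOCK, V_OVERCLOCK, F_NOMINAL, V_NOMINAL)
print(f"Estimated dynamic power vs. nominal: x{ratio:.2f}")
```

Under these assumptions the estimate comes out at roughly 1.6 times the nominal dynamic power, with the voltage term contributing about as much as the frequency term.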
Intel has obviously refocused all its efforts on resolving the “ARM problem”. The microprocessor giant doesn’t want to hand over the compact mobile devices market to the competition, so all its engineering forces have been directed towards designing x86 processors with lower power consumption and excellent performance: CPUs that could successfully compete against high-performance ARM products. At the very least, processors with this microarchitecture will be able to settle in ultra-compact transformer notebooks and high-performance tablets. And there are a few very inspiring models in the new Haswell family targeted at these specific segments. Take, for example, the low-voltage dual-core Core i5 Y-series with an average power consumption of only 6 W, which supports Hyper-Threading technology and can work at 1.4-1.9 GHz. Moreover, there will be at least a few dozen different SoC U-series processors on the Haswell microarchitecture with a 15 W TDP, which will target ultrabooks.
The variety of products that are currently being prepared for release after the Haswell microarchitecture launch indicates clearly that Intel’s priorities have changed dramatically. Therefore, there is a good reason why we didn’t see much progress in the new Haswell CPUs. Intel currently has no need or real interest in advancing the existing desktop platforms. The product we got today is what the company managed to put together with minimal modifications to a microarchitecture that was originally designed for mobile devices.
And frankly speaking, this product is not that impressive, especially in the eyes of computer enthusiasts. We tested the top-of-the-line desktop Haswell, the Core i7-4770K, and drew a number of bitter conclusions. First, the Core i7-4770K is just a little bit faster than the flagship Ivy Bridge processor. Microarchitectural improvements only provide a 5-15% performance boost, and the clock frequency hasn’t changed at all. Second, the Core i7-4770K turned out to be significantly hotter than the CPUs based on the previous microarchitecture. Even though Haswell allows engineering energy-efficient processors with impressively low heat dissipation, its performance per watt worsened a lot when its characteristics were adjusted to meet desktop requirements. This results in the third item on this list: without extreme cooling the Core i7-4770K overclocks less effectively than the previous-generation overclocker processors. The specific CPU sample we tested this time allows us to conclude that these processors may overheat at 4.4-4.5 GHz clock speeds even with high-performance air coolers. And fourth, Haswell processors require the new LGA 1150 platform, which doesn’t boast any unique advantages, but merely offers more USB 3.0 and SATA 6 Gbps ports. Moreover, this platform currently seems quite raw and awaits a new chipset stepping, which will fix some issues with the USB 3.0 controller.
To make up for all the above-mentioned shortcomings, desktop Haswell offers the following: support for the new AVX2/FMA3 instructions, which are not yet utilized by existing software; a 30% faster graphics core; and limited overclocking of processors with a locked clock frequency multiplier. Unfortunately, all these advantages will most likely be useless for enthusiasts. Junior Haswell CPU models, from the Core i3 series, for example, may turn out quite appealing precisely because of these features, but they are scheduled to be released a little later.
And at this time we have to wrap up our Core i7-4770K and LGA 1150 platform review with a slight feeling of disappointment. The arrival of the new Haswell microarchitecture into the desktop segment seems to be very similar to the Windows 8 launch. Intel seems to be offering something very new and progressive, but each advantage the newcomer has to offer is counterbalanced by at least two shortcomings, ruining the overall impression and taking away the desire to migrate to fourth generation Core processors. Obviously, Haswell feels out of place in the desktop segment, but we hope that the new revisions of the processor die and chipset will encourage further improvement of the LGA 1150 platform and desktop processors in particular.