Intel Pentium 4 3.06GHz CPU with Hyper-Threading Technology: Killing Two Birds with a Stone...

With today's launch of the new Pentium 4 3.06GHz processor, Intel has not only raised the performance bar for its CPUs and taken the performance lead, but has also introduced the extremely interesting Hyper-Threading technology to the world. Our detailed review covers all the ins and outs of the new technology and includes a full set of tests showing what it is really worth.

by Cetera labs support account
11/14/2002 | 12:43 AM

In the eternal competition between the two microprocessor giants, Intel and AMD, the laurels now belong entirely to the former. AMD did manage to respond to Intel's Pentium 4 2.8GHz announcement by launching the new Athlon XP 2800+, but that processor is hardly available on the market today. Moreover, AMD is not going to release anything else this year that could put up worthy opposition to Intel. Intel, on the contrary, has saved its most interesting product for the very end of the year: the new Pentium 4 3.06GHz CPU. Firstly, this move allows Intel to leave the Athlon XP family hopelessly far behind in terms of performance. Secondly, with the new solution Intel has brought Simultaneous Multi-Threading technology to desktop processors, where it has never been used before. This way the company has settled the unofficial competition between Intel and AMD before the Christmas sales season has even opened. As a result, most users will consider Intel the leader of this year, and AMD can only change the situation next year, when it gets the new Barton core with 512KB L2 cache and the 8th generation Hammer processors at its disposal.

I would like to stress here that getting past the 3GHz mark has turned out to be a much more significant event than initially planned. The primary reason for the stir is the support of Simultaneous Multi-Threading technology, which Intel calls Hyper-Threading (hereinafter we will use Intel's term). Intel already uses Hyper-Threading technology in its Xeon processor family, and it was expected to appear in desktop solutions with the launch of the 0.09micron Prescott based processor. However, cut-throat competition with AMD, as well as the coming announcement of the 8th generation AMD Hammer processors, pushed Intel to change its plans. As a result, Hyper-Threading technology has appeared in Pentium 4 CPUs about a year earlier than initially planned.

Hyper-Threading technology is a relatively low-cost way of increasing CPU performance at the price of a very small increase in die size, which is why we are going to dwell on its peculiarities in this article. We will also pay due attention to the performance of the new Pentium 4 3.06GHz CPU and evaluate the "pure" performance gain provided by Hyper-Threading in each particular case.

Before we move on to the technology and its features, we would like to draw your attention to one thing. There are many well-known ways to improve processor architecture and increase performance: pipelining, superscalar execution, out-of-order execution, larger caches, and so on. However, all these general methods lead to a noticeable increase in die size, which in turn results in higher production costs and greater heat dissipation. Hyper-Threading technology is based on a somewhat different idea. It is not expensive in terms of additional transistors, but it has to be supported by the operating system and by specially written software, i.e. it requires extra effort from software developers.

Hyper-Threading Technology: Buy One CPU, Get Another One Free!

As is known, CPU performance is determined by two components: the core clock frequency and the number of instructions processed per clock. The Pentium 4 architecture was designed from the start to reach high clock rates, which is why this CPU uses an extremely long 20-stage pipeline. This lets Pentium 4 clock rates grow by leaps and bounds, although the performance of these processors remains comparable with that of AMD Athlon XP CPUs working at considerably lower core frequencies. The explanation is that, firstly, Athlon XP features more execution units working in parallel, and secondly, it refills its 10-stage pipeline much faster after a branch misprediction. As a result, Athlon XP performs more instructions per clock, although it is also far from ideal. Anyway, the hero of today's story is the new Intel Pentium 4. Still, keep in mind that everything we are going to say also applies to the AMD Athlon XP architecture (with the corresponding corrections, of course).
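The relationship between clock frequency and instructions per clock can be illustrated with a little arithmetic. The sketch below is only an illustration: the 2.25GHz actual clock used for the Athlon XP 2800+ is our assumption and is not stated in the text above, and the function name is made up for this example.

```python
def required_ipc_ratio(clock_a_ghz: float, clock_b_ghz: float) -> float:
    """How many times more instructions per clock CPU B must execute
    to match CPU A, if performance = clock frequency * IPC."""
    return clock_a_ghz / clock_b_ghz

# Assumed clocks: Pentium 4 at 3.06GHz, Athlon XP 2800+ at 2.25GHz (assumption).
ratio = required_ipc_ratio(3.06, 2.25)
print(f"The slower-clocked CPU needs {ratio:.2f}x the IPC for equal performance")
```

Under these assumed clocks, the lower-clocked processor has to do roughly a third more work per clock to keep up, which is the trade-off the paragraph above describes.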

The major problem with increasing the performance of contemporary processors is that the number of instructions performed per clock grows not in proportion to the number of execution units, but much more slowly. In particular, although Pentium 4 features 3 parallel integer units, 2 floating point units and 2 memory units, these resources are never all in use at the same time. Most of the time the majority of them stay idle, either waiting for data or simply being of no use for the operation at hand. Idling of the first kind can be combated to some extent, for instance by increasing the cache size. But the existing concept of sequential calculations will never keep the entire CPU busy. For example, if a program adds integers, the FPUs will not be involved at all. As a result, we get a really sad picture: most existing x86 programs load no more than 35% of the Pentium 4 execution units at a time.

This particular problem gave birth to Hyper-Threading technology. Its major concept was first proposed in 1993 by a respected Intel employee, Mr. Glenn Hinton, who noticed that CPU resources were never utilized to the full extent. In 1996 Intel engineers started working on the integration of this technology into the next generation CPU architectures, namely Willamette/Foster. On August 28, 2001 Hyper-Threading technology was finally introduced, and on February 6, 2002 the first Intel Xeon processors with Hyper-Threading support were officially announced. Today, on November 14, 2002, Hyper-Threading has arrived in the Pentium 4 family.

A lot has changed since 1993. In particular, multi-threaded operating systems have conquered the market. They are built around the simultaneous work of several threads belonging to one or several active applications, or to the OS itself. Multi-processor systems have no difficulty processing these threads simultaneously (each processor in the system gets one thread to process), whereas in uni-processor systems the CPU has to switch constantly between threads, dividing the available time between them.

So if we enable the CPU to process more than one thread at a time, its resources will be loaded much more efficiently. This is the major idea of Hyper-Threading. With this technology, the operating system and applications see one physical CPU as two logical CPUs. As a result, they assume that a CPU supporting Hyper-Threading can process two threads simultaneously, and load it with correspondingly more work.
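From the software side, the "two logical CPUs" are simply what the operating system reports. A minimal sketch of how an application might query that number is given below; it uses only the Python standard library, which reports logical processors and cannot by itself tell them apart from physical ones.

```python
import os

# Number of logical CPUs visible to the operating system (None if unknown).
logical_cpus = os.cpu_count()
print("Logical CPUs reported by the OS:", logical_cpus)
```

On a single-processor system with Hyper-Threading enabled one would expect this call to report two CPUs, and one with Hyper-Threading disabled.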

This is how Hyper-Threading technology actually works:

The CPU itself undergoes only minor modifications and uses its idle resources to process the second thread. In other words, Hyper-Threading is a technology that raises CPU efficiency, though it pays off only in multi-tasking and multi-threaded environments.


On the left you can see a CPU with Hyper-Threading, on the right a regular dual-processor system.

Let's say a few words about the modifications made to processors with Hyper-Threading support. Since a physical processor with Hyper-Threading presents itself as two logical CPUs, some of its units have been duplicated. However, only certain control units were duplicated; the execution units remained the same and are simply loaded more heavily and more efficiently. As a result, CPUs with Hyper-Threading have two sets of registers, including general-purpose registers and control registers, two Advanced Programmable Interrupt Controllers (APIC), and duplicated internal registers such as the Next Instruction Pointer. All other resources, including caches, execution units, the branch prediction unit, the bus controller, etc., are shared by the two logical CPUs. That is why implementing Hyper-Threading cost the developers very little: the processor die grew by only 5%.
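The split between duplicated and shared resources can be pictured as a simple data structure. The sketch below is purely illustrative: the class and field names are ours, and it models only the resources named in the paragraph above.

```python
from dataclasses import dataclass, field

@dataclass
class LogicalCPU:
    """State that is duplicated for each logical processor."""
    apic_id: int
    general_registers: dict = field(default_factory=dict)
    control_registers: dict = field(default_factory=dict)
    next_instruction_pointer: int = 0

@dataclass
class PhysicalCPU:
    """One physical processor: shared resources plus two logical CPUs."""
    shared: tuple = ("caches", "execution units",
                     "branch prediction unit", "bus controller")
    logical: list = field(
        default_factory=lambda: [LogicalCPU(apic_id=0), LogicalCPU(apic_id=1)]
    )

cpu = PhysicalCPU()
# Each logical CPU keeps its own instruction pointer and register state.
cpu.logical[0].next_instruction_pointer = 0x1000
cpu.logical[1].next_instruction_pointer = 0x2000
print(len(cpu.logical), "logical CPUs share:", ", ".join(cpu.shared))
```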


New core components

Hyper-Threading in Action

Now let's figure out how the processor actually works with Hyper-Threading technology (if you need to refresh your memory of the Pentium 4 architecture before we go into details, please see our Review).

The first part of the Pentium 4 pipeline is responsible for submitting micro-operations (uops, the decoded x86 instructions) to the execution part of the pipeline. This is exactly the place where all units duplicated for two logical CPUs are located. The picture below shows the beginning of the processor pipeline in two cases: with an instruction in the Trace Cache (a) and without it (b).

The Trace Cache contains already decoded instructions, called uops. Most instructions have been decoded earlier during normal processor operation and are already sitting in the Trace Cache. This cache is not duplicated but is shared by the two logical processors. Nevertheless, each logical CPU has its own Instruction Pointer indicating the next instruction it is going to execute. Instructions are fetched from the Trace Cache for the two logical CPUs in turns and are lined up in the so-called uop queue, which is also separate for each logical processor.

If an instruction is not in the Trace Cache, which serves as the level one instruction cache in the Pentium 4 hierarchy, the CPU has to fetch the x86 instruction from the level two cache and decode it. Fetching involves the Instruction Translation Lookaside Buffer (ITLB), which translates the address stored in the Instruction Pointer into a physical address. The ITLB is also separate for each logical CPU, while the L2 cache is shared between them. There is only one x86 decoder in CPUs with Hyper-Threading; it is rarely under heavy load, since most decoded instructions are stored in the Trace Cache. If both logical processors address the decoder simultaneously, it serves them in turns, switching to the other logical processor only after it has completed a full decoding cycle for the first one. The decoded instructions are saved in the Trace Cache.
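The "in turns" fetching from the shared Trace Cache into per-CPU uop queues can be sketched as a toy model. This is a simplification under our own assumptions: one uop is fetched per cycle, and if the logical CPU whose turn it is has nothing to fetch, the slot goes to the other one. The function name and data layout are invented for the example.

```python
from collections import deque

def fetch_in_turns(streams, cycles):
    """Alternate fetch between logical CPUs, one uop per cycle (toy model)."""
    pointers = [0] * len(streams)          # per-logical-CPU instruction pointer
    queues = [deque() for _ in streams]    # per-logical-CPU uop queue
    order = []                             # global fetch order, for inspection
    turn = 0
    for _ in range(cycles):
        # Try the CPU whose turn it is; fall back to the other if it is done.
        for offset in range(len(streams)):
            cpu = (turn + offset) % len(streams)
            if pointers[cpu] < len(streams[cpu]):
                uop = streams[cpu][pointers[cpu]]
                queues[cpu].append(uop)
                order.append(uop)
                pointers[cpu] += 1
                break
        turn = (turn + 1) % len(streams)
    return queues, order

queues, order = fetch_in_turns([["a0", "a1", "a2", "a3"], ["b0", "b1"]], cycles=6)
print(order)  # ['a0', 'b0', 'a1', 'b1', 'a2', 'a3']
```

Each logical CPU ends up with its own uops in program order in its own queue, while the shared fetch slot alternates between the two as long as both have work.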

The execution part of the pipeline receives the decoded instruction sequences in two queues, one for each of the two logical CPUs. We will follow them through the rest of the pipeline after a look at the system requirements.

System Requirements

Of course, Hyper-Threading has to be supported not only in software, that is by the operating system and applications. Hardware support is also required, because a CPU with Hyper-Threading technology still differs from regular processors. To activate both logical processors, the mainboard and its BIOS must at least support two APICs and the specific algorithms for putting the logical CPUs and the physical processor into power-saving mode.

As a result, to get Hyper-Threading working you need not only a CPU that implements the technology, but also a mainboard based on a chipset that supports it. As for today's chipsets for Socket478 mainboards, we can state the following. All Intel chipsets supporting the 533MHz system bus support Hyper-Threading, with one exception: i845G supports it only starting with the B revision, while older i845G chipsets (A revision) do not. As for chipsets from other manufacturers, the situation is less clear. VIA claims that its chipsets do support Hyper-Threading, and SiS is about to start producing updated chipset revisions in the near future. It is important to understand that Hyper-Threading is a fully open technology: chipset makers do not have to pay Intel any license fees to implement Hyper-Threading support in their products.

Besides the support implemented in the mainboard chipset, Hyper-Threading technology must also be recognized and initialized by the mainboard BIOS. Only in this case can both logical processors be initialized successfully and recognized by the operating system. Otherwise, if either the chipset or the BIOS does not support Hyper-Threading technology, the CPU with Hyper-Threading will be recognized by the system as one regular CPU.

If the hardware support is implemented correctly, the operating system will be absolutely sure that there are two processors installed.

It is also evident that full utilization of the processor resources in systems with Hyper-Threading is possible only with a multi-tasking operating system that supports dual-processor configurations. However, to really increase system performance, the operating system should also be specifically optimized for Hyper-Threading technology. Namely, system threads should not spin in empty cycles, because such idle loops take up execution resources that are shared with the other logical CPU.

At present there are two operating systems optimized for Hyper-Threading technology: Linux 2.4.x and Microsoft Windows XP (both Professional and Home Edition). The widely spread Windows 98 and Windows ME do not support Hyper-Threading because they lack support for multi-processor configurations. As for Windows 2000, even though this system works in multi-processor configurations and correctly recognizes a processor with Hyper-Threading (that is, as two processors), performance under it will in most cases still be lower than that of analogous CPUs without Hyper-Threading support. The reason is that system threads in Windows 2000 often spin in empty cycles, which are a real threat to Hyper-Threading.

Back to the pipeline: the instructions from the two incoming uop queues first pass through the Allocator and Register Rename units, where the CPU assigns the resources needed to execute them. The registers and buffers are split between the logical CPUs; however, if one of the logical CPUs does not use some of the resources assigned to it, they are automatically put at the disposal of the other logical processor.

As soon as this stage is complete, the instructions go into two sorted queues, one for memory operations and one for all other operations, each of which is also split into two parts, one per logical CPU.

Then the micro-operations sorted this way get to the Scheduling stage, where they are arranged in the order in which they will be sent to the execution units. The operations are passed to the schedulers on a first-in-first-out basis. If necessary, the schedulers can switch from the queue of one logical CPU to that of the other. At this stage the micro-operations coming from the two logical CPUs get completely mixed, so that they can be executed simultaneously. Since the registers of the physical processor are rigidly tied to the registers of the two logical CPUs, instructions can be executed without knowing which logical CPU each of them belongs to.

After the execution stage, where the processor doesn't distinguish between the logical CPUs, comes the Retirement unit. Here the original instruction order is restored and each instruction is again attributed to the logical CPU it belongs to. The Re-Order Buffer is divided into two halves, one for each of the two logical CPUs.
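The retirement step can be sketched in a few lines. This is a toy model with invented names: every executed uop carries the number of its logical CPU and its position in that CPU's program order, and retirement simply restores that order within each half of the Re-Order Buffer.

```python
def retire(executed):
    """Restore per-logical-CPU program order from a mixed execution order.

    `executed` holds (logical_cpu, sequence_number, uop) tuples in the
    order in which the execution units finished them.
    """
    rob = {0: [], 1: []}  # Re-Order Buffer: one half per logical CPU
    for cpu, seq, uop in executed:
        rob[cpu].append((seq, uop))
    # Sorting by sequence number puts each half back into program order.
    return {cpu: [uop for _, uop in sorted(half)] for cpu, half in rob.items()}

mixed = [(1, 0, "b0"), (0, 1, "a1"), (0, 0, "a0"), (1, 1, "b1")]
print(retire(mixed))  # {0: ['a0', 'a1'], 1: ['b0', 'b1']}
```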

Also note that the L1 and L2 caches are shared between the two logical processors. The Data Translation Lookaside Buffer (DTLB), which translates the addresses of the processed data into physical addresses, is shared as well, but every entry stored in it is tagged with a logical CPU identifier, so you can always tell which logical processor a given entry belongs to.

So Hyper-Threading technology really does allow the CPU execution units to be loaded much more heavily by processing two threads simultaneously. However, you should keep in mind that the effect of this approach is not always positive. Firstly, if the two threads are similar in terms of instruction types, there may be no performance increase at all, because one thread will occupy all the resources required by the other, while the remaining execution units of the CPU will still stay idle. Secondly, the situation may turn into a complete disaster. Imagine, for example, that one thread is waiting for data to arrive while keeping busy all the resources that the other thread urgently needs. The operating system, which believes there are two processors in the system, will not do anything to solve the problem, and the processor will be simply paralyzed. This is one of the reasons why Intel encourages software developers to optimize their applications for Hyper-Threading. One of the major principles of this optimization is the use of the new PAUSE instruction in wait loops, which keeps a waiting thread from monopolizing the physical CPU and thus avoids empty wait clocks.
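The first effect, two similar threads gaining nothing while two different threads gain a lot, can be reproduced with a deliberately crude model. The unit counts are those quoted earlier in the article (3 integer, 2 floating point, 2 memory units); everything else is our assumption: each thread may issue up to three uops per cycle in program order, the first thread always has priority, and dependencies, caches and latencies are ignored.

```python
UNITS = {"int": 3, "fp": 2, "mem": 2}  # execution units per cycle

def cycles_needed(threads, width=3):
    """Cycles required to drain all threads sharing one set of units."""
    pending = [list(t) for t in threads]
    cycles = 0
    while any(pending):
        free = dict(UNITS)  # all units become free again every cycle
        for uops in pending:
            issued = 0
            # Issue in order while a unit of the required kind is free.
            while uops and issued < width and free[uops[0]] > 0:
                free[uops[0]] -= 1
                uops.pop(0)
                issued += 1
        cycles += 1
    return cycles

int_thread = ["int"] * 300
fp_thread = ["fp"] * 300

one_by_one = cycles_needed([int_thread]) + cycles_needed([fp_thread])
together = cycles_needed([int_thread, fp_thread])
same_kind = cycles_needed([int_thread, int_thread])
print(one_by_one, together, same_kind)  # 250 150 200
```

In this model an integer thread and a floating point thread finish in 150 cycles together instead of 250 one after another, while two integer threads take 200 cycles either way, because the first thread occupies every integer unit the second one needs.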

Closer Look: Intel Pentium 4 3.06GHz

So, today, on November 14, 2002, Intel officially announced its new Pentium 4 processor, the Pentium 4 3.06GHz. This is the first processor in the family to support Hyper-Threading technology, and it boasts the following features:

As you can see, the data listed above indicate that the core of the Pentium 4 3.06GHz is of the same size and consists of the same number of transistors as the previous Pentium 4 2.8GHz. Strange, isn't it? Especially since we have already mentioned that implementing Hyper-Threading required about a 5% bigger die. However, this is very easy to explain. Hyper-Threading technology was in fact integrated into Intel Pentium 4 processors a long time ago, and now it is simply activated. Strange as it may seem, all Northwood based processors feature everything necessary for a proper Hyper-Threading implementation. Moreover, the duplicated units required for Hyper-Threading were already present in the good old Pentium 4 on the 0.18micron Willamette core, starting from the very first models of the family. Until recently, however, Intel disabled Hyper-Threading support in its CPUs in hardware (at the die assembly stage). Therefore, if you are a happy owner of an older Pentium 4 CPU, you will never manage to enable Hyper-Threading, even though your processor contains those additional 5% of transistors.

After all that it seems quite logical that Pentium 4 3.06GHz CPU features the same C1 core stepping as its predecessor, Pentium 4 2.8GHz, and is manufactured from 300mm wafers.

Here we have to point out that even though semiconductor dies used in Pentium 4 3.06GHz and in CPUs with lower working frequency are hardly any different from one another, Intel is not going to add Hyper-Threading into slower processors. This way, Hyper-Threading technology will remain the advantage of Pentium 4 CPUs with the core clock frequency over 3GHz.

The second desktop CPU model with Hyper-Threading support is due in Q2 2003. It will be the Pentium 4 3.2GHz, also based on the 0.13micron Northwood core. After that, Hyper-Threading will appear in all Pentium 4 processors based on the new 90nm Prescott core, which is due in H2 2003.

Now I would like to say a few words about the weakest point of the new Pentium 4 3.06GHz processor: high heat dissipation. Unfortunately, the introduction of Hyper-Threading technology automatically led to a pretty significant increase in the amount of dissipated heat. This is quite natural, since the CPU execution units are now used more actively and hence the CPU with Hyper-Threading warms up more than a similar CPU without this technology. As a result, Intel had to change the thermal and electrical requirements for the systems, which are intended to work with Pentium 4 processors supporting Hyper-Threading technology.

The initial version of Intel's requirements for mainboard makers assumed that the CPU would dissipate 77W of heat at most. Now Intel has revised the requirements and released a new version, known as FMB2. According to this document, Pentium 4 processors can now dissipate up to 82W of heat, so mainboard makers should revise and modify their designs accordingly, if necessary. The maximum current that Pentium 4 CPUs can draw has also been increased: it is now 70A, against 60A in the initial requirements. So the manufacturers of up-to-date mainboards intended for the new Pentium 4 processors with clock rates over 3GHz should make sure that their products meet the updated power and thermal requirements.

Besides, the Pentium 4 3.06GHz also needs better cooling. In particular, Intel now recommends using new, more efficient coolers with copper parts. Intel will also modify the design of the cooler shipped with boxed processors: the new model will have a copper base, more fins and a more powerful 5-blade fan with adjustable rotation speed.

  

However, this is not all. Intel has also changed the case thermal requirements that apply to cases intended for systems with the new Pentium 4 at frequencies over 3GHz. One of the major changes is that from now on the temperature inside the case shouldn't exceed 42°C, although the previous limit was 45°C. Moreover, Intel will definitely approve of cooling solutions that take the air for the processor cooler directly from outside the case.

Testbed and Methods

The major goal of this test session was to find out the performance of the new Pentium 4 3.06GHz with Hyper-Threading technology. We compare it with the performance of the same CPU with Hyper-Threading disabled (you can enable or disable Hyper-Threading in the mainboard BIOS) and with that of its predecessor, the Pentium 4 2.8GHz. Since contemporary Pentium 4 systems can be built with two completely different memory types, RDRAM and DDR SDRAM, we ran all the tests on two platforms, based on the i850E and i845PE chipsets. Both chipsets support Hyper-Threading technology and allow using the fastest memory of each type available today: PC1066 RDRAM and DDR333 SDRAM respectively.

We compared the performance of the Pentium 4 systems with that of competing systems built on today's fastest processors from AMD, namely Athlon XP 2700+ and 2800+. The Athlon XP based systems used today's fastest Socket A solution, the NVIDIA nForce2 chipset with a dual-channel DDR333 SDRAM interface.

So, as a result, our testbeds were configured as follows:

Platforms: Intel Pentium 4 on i850E; Intel Pentium 4 on i845PE; AMD Athlon XP on NVIDIA nForce2
CPU (Intel platforms): Intel Pentium 4 3.06GHz with Hyper-Threading technology; Intel Pentium 4 3.06GHz, Hyper-Threading technology disabled; Intel Pentium 4 2.8GHz
CPU (AMD platform): AMD Athlon XP 2800+; AMD Athlon XP 2700+
Mainboard: ASUS P4T533-C (i850E); ASUS P4PE (i845PE); ASUS A7N8X (nForce2)
Memory: 512MB PC1066 RDRAM by Samsung (i850E); 512MB DDR333 CL2 SDRAM by Crucial (i845PE and nForce2)
Graphics Card: ATI RADEON 9700 Pro
HDD: Seagate Barracuda ATA IV, 80GB

All tests were run under the MS Windows XP Professional operating system, and the BIOS Setup of each mainboard was configured for maximum performance.


Performance

Office and Content Creation Applications

So, first of all we decided to take a look at the performance of the new Pentium 4 3.06GHz with Hyper-Threading technology in classical tests.

SYSmark2002 models the work of an ordinary user in office and content creation applications. As we see, even here Hyper-Threading has a certain effect on performance: the gain amounts to about 3-5%. This gain is possible because a CPU with Hyper-Threading can process two threads simultaneously, and most contemporary applications are multi-threaded. Moreover, Hyper-Threading can also prove efficient with single-threaded applications, because the background threads of OS services can be executed in parallel with the main task.

In digital content creation applications Hyper-Threading has a much greater positive effect in the system with PC1066 RDRAM. The explanation is that processing two threads at a time places higher demands on memory bandwidth, which becomes an evident bottleneck in the system with DDR333 SDRAM.

As the SYSmark2002 results show, the competitors from AMD, the Athlon XP CPUs, fall behind the Intel Pentium 4. However, AMD has serious complaints against the benchmark developer, BAPCo, accusing it of having optimized the benchmark for Intel Pentium 4 processors. Therefore, we have also used alternative tests from E-Testing Labs, which model typical user work in office and content creation applications.

It is true that Athlon XP is not nearly as slow in Business Winstone 2001. Nevertheless, the new Pentium 4 3.06GHz is faster than the top Athlon XP 2800+, whether Hyper-Threading is enabled or not. The performance gain that Intel owes to the new technology is not impressive here, though: it is less than 1%.

Content Creation Winstone 2002 is the first benchmark where Hyper-Threading technology doesn't speed things up but, on the contrary, slows them down. We have already discussed the possible reasons for this earlier in the article.

Streaming Data Processing

Ordinary data compression with WinRAR shows that Hyper-Threading is really efficient here: the performance improves by 3-5%. However, neither this gain nor the core frequency increase to 3.06GHz allows Pentium 4 processors to outpace their rival from AMD, the Athlon XP 2800+. This is nothing to be surprised at: data compression speed is very closely tied to memory subsystem bandwidth, and since the launch of the NVIDIA nForce2 Socket A chipset with two DDR SDRAM channels it has become all but hopeless to compete with Athlon XP systems in this respect.

Sound encoding into mp3 format clearly demonstrates the advantages of Hyper-Threading technology and of the NetBurst architecture of Intel Pentium 4 processors. Enabling Hyper-Threading speeds up the sound processing by 8%. Since the LAME codec used here supports multi-threaded operation, this result does not surprise us at all.

Video encoding is another type of task where Hyper-Threading technology is in its element, providing a 10% performance improvement. This is explained in particular by the fact that the application used supports multi-threading.

Gaming Applications

As we can see, the performance of Pentium 4 processors in 3DMark2001 is the same with and without Hyper-Threading (the difference lies within the measurement error). Like most gaming applications, 3DMark2001 doesn't support multi-threading, so the new technology is of no use here.

A similar situation can be observed in Return to Castle Wolfenstein. However, in this game, based on the Quake3 engine, Pentium 4 manages to outperform Athlon XP quite significantly even without the much-praised Hyper-Threading technology.

However, in the newest Unreal Tournament 2003 the situation is just the opposite. Athlon XP proves much faster than Pentium 4 despite the much higher actual working frequencies of the latter. Even the Hyper-Threading technology cannot help here, as the tests proved its absolute inefficiency in games.

3D Rendering

The performance gain provided by Hyper-Threading during final rendering in 3ds max 5.0 is really high: over 15%. This shows very clearly that with proper optimization Hyper-Threading technology can be a real gem.


The tests in Lightwave 7.4, however, demonstrate just the opposite. Despite the fact that this application supports multi-threaded operation, Hyper-Threading is almost completely ineffective here. It looks as if Lightwave creates several threads of similar instructions during final rendering, so that they cannot be processed simultaneously by one processor.

We would also like to point out one more fact. Final rendering is a purely computational task, which has always been a trump card of Athlon XP processors, but things have changed. The developers have little by little optimized the algorithms used in their test sets for SSE2 instructions, which Athlon XP CPUs do not support. As a result, the AMD processor has lost its leadership here.

Scientific Applications

To test the performance of the new CPUs in scientific tasks we used the ScienceMark 2.0 test. This benchmark supports multi-threading and all SIMD instruction sets, including MMX, 3DNow!, SSE and SSE2. The ScienceMark diagrams show the time each CPU required to complete the tasks, so a smaller value stands for higher performance.



It has long been known that Athlon XP processors perform very well in physical modeling and cryptographic tasks, and here we see another proof of this. Hyper-Threading also proves quite helpful in this case, improving the performance of the Pentium 4 processor by a good 17% in Molecular Dynamic Benchmark, where different threads perform calculations of different types. In the other two cases the threads consist of similar instructions, so the performance gain is not that impressive.

Professional OpenGL Applications






Well, we have already pointed out the reasons for these results several times: the benchmark algorithms are pretty outdated by now and do not use SSE2 instructions. And in intensive calculations Athlon XP remains the indisputable leader.

Moreover, we can observe the same "harmful" tendency in the tests of this set: Hyper-Threading slows down the processor. To tell the truth, we didn't expect anything else. The threads created by SPECviewperf 7.0 are very similar and compete with one another for the same resource: the OpenGL context.

Multi-Threading Tasks

As we have seen, Hyper-Threading technology improves the performance in some multi-threaded tasks. However, it is evident that the maximum positive effect is achieved in multi-threaded environments only when the applications utilize different processor resources. Such a workload lets the execution units of the physical CPU be used more efficiently. That is why we decided to test the Pentium 4 3.06GHz with Hyper-Threading technology in exactly these conditions.

We used the following methodology. On the system with the CPU under test we started one of five tasks that load the processor quite heavily: WinRAR 3.0, FlasK 0.78.39/DiVX 5.02, 3ds max 5, Lightwave 7.5 or ScienceMark. At the same time we ran the dm-antalus demo from Unreal Tournament 2003 and measured its performance. As a result, we got the following numbers characterizing the performance of Unreal Tournament 2003 running in parallel with other applications:

Background task | Pentium 4 3.06 with Hyper-Threading | Pentium 4 3.06, Hyper-Threading Disabled | Performance gain provided by Hyper-Threading
Idle | 59.5 | 59.3 | 0.30%
Data Compression, WinRAR 3.0 | 35.41 | 28.36 | 24.90%
MPEG-4 Encoding, FlasK 0.78.39/DivX 5.02 | 33.88 | 27.01 | 25.40%
3ds max 5, Final Rendering | 29.59 | 29.73 | -0.50%
LightWave 7.5, Final Rendering | 43.3 | 29.71 | 45.70%
ScienceMark, Primordia | 39.75 | 29.19 | 36.20%

As we see, when two different applications run simultaneously, Hyper-Threading can improve the system performance quite a lot. The maximum performance gain we managed to obtain was over 45%. However, as practice showed, there are situations when the performance doesn't get any better even with Hyper-Threading enabled. As we have already mentioned several times, everything depends on the type of applications running at the same time and on the way they use the system resources. Still, we can state that the average performance growth provided by Hyper-Threading in multi-threaded tasks is a solid 20-30%.
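The gain column and the 20-30% average can be checked with a few lines of arithmetic. The sketch below is only an illustration in Python: the numbers are copied from the table, while the names `results` and `gain_percent` and the choice to leave the idle row out of the average are our own assumptions, not part of the original test suite.

```python
# Unreal Tournament 2003 results from the table:
# (with Hyper-Threading, Hyper-Threading disabled)
results = {
    "Idle": (59.5, 59.3),
    "WinRAR 3.0": (35.41, 28.36),
    "FlasK 0.78.39/DivX 5.02": (33.88, 27.01),
    "3ds max 5": (29.59, 29.73),
    "LightWave 7.5": (43.3, 29.71),
    "ScienceMark, Primordia": (39.75, 29.19),
}


def gain_percent(with_ht, without_ht):
    """Relative gain of the Hyper-Threading result over the baseline, in percent."""
    return (with_ht / without_ht - 1.0) * 100.0


gains = {task: gain_percent(*pair) for task, pair in results.items()}

# Assumption: the average covers only the five loaded scenarios,
# since the idle row is not a multi-tasking case.
loaded = [g for task, g in gains.items() if task != "Idle"]
average_gain = sum(loaded) / len(loaded)
best_task = max(gains, key=gains.get)

for task, g in gains.items():
    print(f"{task:26s} {g:6.1f}%")
print(f"Average under load: {average_gain:.1f}% (best case: {best_task})")
```

Computed this way, the per-task gains match the table to one decimal place, and the average over the five loaded scenarios falls inside the 20-30% range quoted above.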

Also note that with Hyper-Threading enabled, Unreal Tournament 2003 behaves differently when running in parallel with other applications. Those of you who devote quite a lot of time to computer games know that it is usually almost impossible to play while other applications are running on the computer. The reason is that the game runs very unevenly: the system has to switch between the tasks very often, so even though the average fps rate remains acceptable, it is almost impossible to enjoy the game. Hyper-Threading technology eliminates this unpleasant effect almost completely. We didn't notice any "slowing down" in Unreal Tournament 2003 running in parallel with other active applications. So now you can play games while other applications are working, provided your system has a CPU supporting Hyper-Threading technology.
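Readers who want to try a similar foreground/background experiment on their own machine could start from something like the rough Python sketch below. It is not the methodology used in this review: the foreground "game" is replaced by a hypothetical arithmetic loop, the background task is a hypothetical busy loop in a second process, and Hyper-Threading itself still has to be switched on and off in the BIOS between runs.

```python
import subprocess
import sys
import time


def foreground_rate(duration=0.2):
    """Iterations per second of a small arithmetic loop (a stand-in for the game)."""
    start = time.perf_counter()
    deadline = start + duration
    count = 0
    acc = 0.0
    while time.perf_counter() < deadline:
        acc += (count % 7) * 0.5
        count += 1
    elapsed = time.perf_counter() - start
    return count / elapsed


def rate_under_load(duration=0.2):
    """Same measurement while a second CPU-bound process is running."""
    # Hypothetical background task: a busy loop in a separate Python process.
    load = subprocess.Popen([sys.executable, "-c", "while True: pass"])
    try:
        time.sleep(0.05)  # give the background process a moment to start
        return foreground_rate(duration)
    finally:
        load.kill()  # always stop the background load
        load.wait()


alone = foreground_rate()
loaded = rate_under_load()
print(f"alone: {alone:.0f} it/s, under load: {loaded:.0f} it/s, "
      f"retained: {loaded / alone * 100:.0f}%")
```

Comparing the "retained" figure with Hyper-Threading enabled and disabled gives a crude analogue of the table above, although two identical busy loops are exactly the kind of similar workload that, as noted earlier, benefits least from the technology.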

Conclusion

Having launched the new Pentium 4 3.06GHz processor today, Intel not only raised the performance bar for its CPUs, thereby becoming the performance leader in the race with AMD, but also introduced the extremely interesting Hyper-Threading technology to the world.

In fact, it is still impossible to evaluate the pros and cons of Hyper-Threading to the full extent today. On the one hand, this technology gives the green light to virtual dual-processor systems entering the market of high-performance home and office systems. The advantages of the technology are evident: both performance and response time with existing applications improve in most cases. However, there is always another side to the picture. Many contemporary tasks optimized for actual rather than virtual multi-processor configurations can be slowed down notably by Hyper-Threading technology. Besides, there are quite a few tasks, games for instance, whose performance does not depend on Hyper-Threading at all. Anyway, so far the advantages dominate, so the use of Hyper-Threading appears justified in most cases, unless the system is intended for some specific needs.

Another significant advantage of Hyper-Threading is that it increases the die size only very slightly. It means that producing CPUs that support this technology will hardly be much more expensive than producing CPUs without Hyper-Threading. This way, Intel managed to increase the performance of its processors at a very small price.

Unfortunately, the retail price of the new Intel Pentium 4 3.06GHz is extremely high now and reaches $650. So we wouldn't claim that the launch of the new CPU will influence the situation in the processor market in any way. At the same time, there should be many more Pentium 4 CPUs with Hyper-Threading next year, so this technology has every chance of finally reaching the mass market then.

The introduction of Hyper-Threading technology into mass-market processors also implies that software will now be developed and optimized for it. And this in turn will definitely make the Intel Pentium 4 a much more attractive product very soon.
