At the last Intel Developer Forum, Intel officially introduced its new PC processor to the public (see this news story for more information on the new solution). It is a next-generation processor manufactured on a 90nm process, which should allow clock frequencies of up to 4-5GHz. The new manufacturing process must also have made it economically justifiable to increase the L2 cache to 1MB and to double the size of the L1 cache. The FSB frequency has grown to 800MHz. Overall, nearly every unit of the CPU has been improved in some way. But what does this polished product bring to software developers? A larger cache is a good thing: you worry less about the speed of memory reads and writes, which often becomes the limiting factor. But it doesn’t eliminate all problems: when the data set is large, even a doubled cache won’t help much.
The higher front-side bus speed suggests that Intel’s new processor will be rather well balanced and free from evident bottlenecks, unlike some previous models, whose performance did not grow in proportion to their frequency.
But things like the new seven-layer CPU design are hardly of interest to software developers. It is much more important for them to know which new processor instructions have become available and which optimization techniques a program should use to reach maximum performance, or at least to run no slower than on previous processor models. Intel’s previous processor, the Pentium 4, required significant software optimization to achieve higher performance. Across a wide range of tasks, the Pentium 4 would lose to a Pentium III running at the same frequency, or even at half the frequency. We will discuss this phenomenon in detail later in this article. For now, we will only point out that the main reason was the radical redesign of the processor core needed to support higher frequencies.
There is nothing revolutionary about Intel’s new CPU core. Everything the Pentium 4 dislikes (especially branching) has been inherited by the 90nm newcomer. It has even grown worse! To raise the clock rate, Intel lengthened Prescott’s pipeline, so we may expect quite significant performance losses whenever a mispredicted branch forces the pipeline to be flushed.
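To make the branching problem more concrete, here is a minimal sketch in C of one common way to sidestep it: replacing a data-dependent branch with plain arithmetic. The function names and the threshold-counting task are our own illustration, not taken from Intel’s documentation, and an optimizing compiler may well generate branch-free code for both versions on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy version: one conditional jump per element. On unpredictable
   data the branch predictor guesses wrong often, and every wrong guess
   costs a pipeline flush. */
size_t count_above_branchy(const int32_t *data, size_t n, int32_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] > threshold)
            count++;
    }
    return count;
}

/* Branch-free version: in C a comparison evaluates to 0 or 1, so the
   result can be added directly and the loop body needs no
   data-dependent jump. */
size_t count_above_branchless(const int32_t *data, size_t n, int32_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (size_t)(data[i] > threshold);
    return count;
}
```

Both functions return the same result for any input; they differ only in whether the loop body depends on a hard-to-predict jump. Whether the second form is actually faster depends on the data and the compiler, so it is worth measuring before rewriting real code this way.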
But there is also good news: the processor instruction set has been extended. Software developers were “very pleased” with the introduction of MMX, SSE and SSE2, as they had to do extra work to optimize their programs for these instructions; otherwise, the programs would never run fast. But the 13 new instructions introduced in Prescott really do make the developer’s lot much easier.





