Like the previous Willamette and Northwood cores, the new Prescott core is based on the NetBurst micro-architecture introduced in the first Intel Pentium 4 processors. The main idea behind this architecture is to achieve high CPU performance by raising the clock frequency. It is no secret that the Pentium 4 core clock frequency looks really impressive next to the other processors available in the market today. The new Prescott CPUs take this idea further. Intel made a few changes to the core, which allowed another significant increase in its clock frequency potential. Besides the semiconductor technologies, “strained” silicon and special automated core design techniques, the changes also touched upon the micro-architecture itself. Intel even mentioned that “Prescott is based on enhanced NetBurst architecture”.
There is no doubt that the major key to higher working frequencies is a longer execution pipeline. Instruction execution is split into a larger number of simpler stages, each of which completes faster, so instructions can be fed into the pipeline at a higher rate. When Intel announced the Pentium 4 processor family, the execution pipeline grew to 20 stages (it used to be 10 stages in Pentium III). We are still witnessing the effect of this change: while the maximum clock frequency of the Pentium III processor never exceeded 1.5GHz, today’s Pentium 4 CPUs easily work at frequencies beyond 3GHz. Intel continued this successful trend and made the execution pipeline of the new Prescott processor even longer.
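The link between pipeline depth and attainable clock frequency can be illustrated with a simple textbook model. This is only a sketch under assumed numbers, not Intel data: the total logic delay and the per-stage latch overhead below are hypothetical values chosen to show the trend, and the model assumes the work can be split evenly between stages.

```python
def max_clock_ghz(stages, logic_delay_ns=5.0, latch_overhead_ns=0.05):
    """Idealized upper bound on clock frequency for a pipeline with `stages` stages.

    Assumptions (hypothetical, for illustration only):
      - the total logic delay is divided evenly between the stages;
      - every stage adds a fixed latch/register overhead.
    """
    if stages <= 0:
        raise ValueError("stages must be positive")
    cycle_ns = logic_delay_ns / stages + latch_overhead_ns  # one clock period
    return 1.0 / cycle_ns  # 1 / ns = GHz


# Compare a few pipeline depths under the same assumed delays.
for stages in (10, 20, 30):
    print(f"{stages} stages: up to {max_clock_ghz(stages):.2f} GHz")
```

Under these assumptions the attainable frequency grows with the number of stages, but less than proportionally, because the fixed per-stage overhead takes up a growing share of each clock period.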
As a result, Intel hopes that its new Prescott-based CPUs will be able to reach a 4.5GHz clock frequency. Of course, the pipeline has to be lengthened quite significantly to achieve this goal. However, Intel doesn’t disclose the actual length of Prescott’s execution pipeline, claiming only that it now has at least 30 stages.
We also undertook some empirical calculations of our own to figure out the length of Prescott’s pipeline. Our estimate was based on the time it takes the CPU to refill the pipeline after a wrong branch prediction. It showed that Prescott’s pipeline should be around 35-36 stages long!
At the same time, we shouldn’t forget that “longer pipeline and higher clock frequency” has a downside. Firstly, the higher the CPU core clock frequency, the more noticeable the core idling time when the data the CPU needs is not in the cache. We all know that the memory subsystem of contemporary platforms is very slow compared with the processors’ computational units. Moreover, two of the three ALUs in CPUs based on the NetBurst architecture work at double the core frequency. As a result, the CPU wastes a lot of time waiting for new data to arrive, and the idling can become catastrophic. Secondly, a longer pipeline causes a lot of trouble when a branch is predicted wrongly. In this case the execution units stall while the CPU clears the entire pipeline and refills it anew, which takes more time the longer the pipeline is.
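Both effects can be put into rough numbers with a back-of-the-envelope model. The sketch below is not a measurement: the memory latency, branch share, misprediction rate and penalties are hypothetical values, and the misprediction penalty is simply approximated by the pipeline depth.

```python
def memory_stall_cycles(latency_ns, clock_ghz):
    """Core clock cycles that pass while the CPU waits for one memory access."""
    return latency_ns * clock_ghz  # ns * (cycles per ns)


def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Average cycles per instruction including branch misprediction stalls."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles


# Assumed: the same 100 ns memory latency at two different core clocks.
for clock_ghz in (1.5, 3.0, 4.5):
    cycles = memory_stall_cycles(100.0, clock_ghz)
    print(f"{clock_ghz} GHz: {cycles:.0f} cycles lost per cache miss")

# Assumed: 20% of instructions are branches, 5% of them are mispredicted,
# and the penalty roughly equals the pipeline depth (20 vs. 30 stages).
for depth in (20, 30):
    cpi = effective_cpi(1.0, 0.20, 0.05, depth)
    print(f"{depth}-stage pipeline: effective CPI of about {cpi:.2f}")
```

With the memory latency fixed in nanoseconds, the number of wasted core cycles grows in direct proportion to the clock frequency, and the cost of each wrong prediction grows with the pipeline depth. This is why a deeper, faster pipeline depends so heavily on a good cache hit rate and accurate branch prediction.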
The two problems of a long processor pipeline described above set two major tasks for Intel engineers. They had to do their best to offset the negative effects of the longer Prescott pipeline, so that the overall performance of the new processor didn’t end up being a failure. This is especially important now, when higher working frequencies are still out of reach because the production technology is not yet mature and needs further polishing first.
We will not discuss the Intel NetBurst architecture in detail today. If you are looking for more detailed materials on it, please check our article called “Intel Pentium 4 1.4GHz Review. Part 1: Processor Architecture and Platform Overview”. And now let’s find out what has changed in the new Prescott compared with the previous Northwood core.