Raising the frequency is no simple task either. The design has to be balanced so that every pipeline stage performs roughly the same amount of work. If this rule isn’t followed, the most heavily loaded stage turns into a brake that prevents further frequency growth. Moreover, higher working frequencies (other things being equal) automatically imply higher heat dissipation. A transition to a finer production process can help here, but this resource also gets exhausted sooner or later (this is especially evident right now, against the background of the production difficulties with the 90nm process that all large chip manufacturers are going through). The thing is that as the geometry of the production technology keeps shrinking, new problems arise, and they turn out to be quite hard to solve.
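To illustrate why an unbalanced design holds the frequency back, here is a minimal sketch in Python. It assumes a simplified model in which the clock period must be at least as long as the slowest stage and ignores latch overhead and wiring delays. The function name and the stage delays are hypothetical and chosen only for illustration; they do not describe any real processor.

```python
def max_frequency_mhz(stage_delays_ns):
    """Return the highest clock frequency (MHz) a pipeline can run at.

    Simplified assumption: the clock period equals the delay of the
    slowest stage, so f_max = 1 / max(stage delay).
    """
    slowest_ns = max(stage_delays_ns)
    return 1000.0 / slowest_ns  # 1/ns is GHz, so multiply by 1000 for MHz


# Hypothetical delays: both pipelines do 4 ns of work in total.
balanced = [1.0, 1.0, 1.0, 1.0]
unbalanced = [0.5, 0.5, 0.5, 2.5]

print(max_frequency_mhz(balanced))    # 1000.0
print(max_frequency_mhz(unbalanced))  # 400.0
```

Both pipelines perform the same total amount of work, yet in this model the unbalanced one is limited to 400MHz by its single 2.5 ns stage.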
Throughout the development of the x86 processor architecture, Intel worked on increasing the number of commands performed per clock cycle. Every new CPU generation (Pentium, Pentium Pro) could process more commands per clock than the previous one. In other words, both multipliers were growing, which resulted in a rapid performance increase. This continued until the frequency potential of the P6 micro-architecture was almost completely exhausted at the 1400MHz limit. The swan song of this micro-architecture was the Pentium III-S processor family equipped with 512KB L2 cache (I am not taking Pentium M processors into account here). Although their performance was still pretty good, they were already yielding to competing solutions in many respects. By the way, we have to give credit to this micro-architecture: it started at 150MHz (Pentium Pro 150MHz, 0.5 micron production technology) and went all the way up to 1400MHz on the 0.13 micron production process (the above-mentioned Pentium III-S with 1400MHz core clock). In other words, CPUs with this micro-architecture had their frequency increased about 9 times while the geometrical elements became about 4 times smaller. In fact, we cannot name any other micro-architecture that could boast anything similar.
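The arithmetic behind these figures can be checked with a short Python sketch. It assumes that the two multipliers mentioned above are commands per clock and clock frequency, so that performance is modeled as their product. The function name is hypothetical, and the model deliberately ignores memory, cache and other effects.

```python
def relative_performance(commands_per_clock, frequency_mhz):
    """Simplified model: performance = commands per clock x frequency."""
    return commands_per_clock * frequency_mhz


# Figures quoted in the text for the P6 micro-architecture.
frequency_gain = 1400 / 150   # Pentium III-S vs. Pentium Pro 150MHz, about 9.33
feature_shrink = 0.5 / 0.13   # 0.5 micron vs. 0.13 micron, about 3.85

print(round(frequency_gain, 2))
print(round(feature_shrink, 2))
```

In this model, doubling either multiplier doubles the result, which is why growth in both at once produced such a rapid performance increase.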
Of course, the developers started working on a successor to the P6 micro-architecture long before its potential was exhausted. This micro-architecture was called NetBurst, and what distinguished it most from the previous generation was a totally different set of priorities. The developers’ efforts were directed not so much at increasing the number of commands performed per clock cycle as at reaching the highest frequency possible on the same production process as that used for the P6 architecture. We would like to stress that this was not a strict either-or choice between higher performance per clock cycle and higher frequency, but the development priority clearly lay with higher frequency.
Certainly, it was a smart marketing move. Generations of users who had been taught to believe that “bigger” is “better” proved the marketing point with their wallets. But today we will not focus on the marketing that much: we are going to dwell on the features of the NetBurst micro-architecture that made it possible to reach all those super-high frequencies. And of course, on the consequences of these features, too.
So we will take a closer look at Pentium 4, keeping in mind that all the innovations and enhancements ever made to it were aimed at reaching higher working frequencies. In other words, every innovation was evaluated first of all for how well it would support further growth of the working frequency.