Chapter III: Introducing New Terms in the Micro-Architecture Field
Before we go any further, it makes sense to recall the layout of the Pentium 4 processor, namely the things we already know about the Pentium 4 micro-architecture that are the same for all generations of these processor cores. Later in our investigation we will specify the differences that distinguish the Willamette and Northwood cores from Prescott. We would like to point out right away that the “default” core in our discussion will be Northwood: as we have already mentioned in the previous chapters, Northwood and Prescott differ significantly in a few important aspects. In particular, the time it takes the Prescott core to execute certain instructions has changed a lot (see Appendix 1 for the numeric data). This means that the core has been modified dramatically. That in itself is definitely no news to you. The news is how greatly the core has actually been modified, even though the major traits of the NetBurst micro-architecture (longer pipeline, Trace cache, etc.) are still clearly recognizable.
So, in November 2000 Intel introduced a new processor to the public: Pentium 4. Together with the new CPU they introduced a new paradigm, NetBurst, which was intended to help keep CPU performance growing. We will return to a more detailed discussion of the NetBurst architecture in the following chapters; for now, please take a look at the block diagram of the Willamette processor core. We are going to outline the major working principles of the individual processor units. For the time being we will neglect the differences between the processor cores (such as the production technology and the L2 cache size, which differ between the newer Northwood core and the older Willamette). This core is therefore perfectly suited to illustrate the general principles of the Pentium 4 micro-architecture.
The numeric data on this chart, such as bus bandwidth (3.2GB/s) and L2 cache (256KB), are given for the very first Pentium 4 processor models, but it doesn’t matter for the general processor micro-architecture discussion.
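For readers wondering where the 3.2GB/s figure comes from, here is a small sketch of the arithmetic. The text itself does not spell out the bus parameters, so the values below (a 100MHz base clock, four transfers per clock, a 64-bit data bus) are assumptions about the first Pentium 4 system bus, and the function name is purely illustrative.

```python
# Assumed bus parameters (not stated in the text above):
BASE_CLOCK_HZ = 100_000_000   # 100 MHz base bus clock
TRANSFERS_PER_CLOCK = 4       # "quad-pumped": four data transfers per clock
BUS_WIDTH_BYTES = 8           # 64-bit data bus


def peak_bus_bandwidth(base_clock_hz, transfers_per_clock, bus_width_bytes):
    """Return the theoretical peak bandwidth in bytes per second."""
    return base_clock_hz * transfers_per_clock * bus_width_bytes


bandwidth = peak_bus_bandwidth(BASE_CLOCK_HZ, TRANSFERS_PER_CLOCK, BUS_WIDTH_BYTES)
bandwidth_gb_s = bandwidth / 1e9  # decimal gigabytes, as in the chart

print(f"{bandwidth_gb_s:.1f} GB/s")
```

This is a theoretical peak only; sustained throughput in a real system would be lower.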
You can see that the CPU is composed of a few functional blocks:
- Execution units and their auxiliary units (Back End block);
- Units responsible for decoding instructions and delivering them to the first block in time (Front End block). A few units responsible for certain specific features also belong here: the Prefetch unit and the Branch Prediction Unit. These units are not strictly necessary for the other units to function properly; they simply increase their efficiency. Let’s call them the Special block.
- Units managing the loading and transfer of the data to execution units (Memory Subsystem).
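The grouping above can be written down as a small lookup table, which may help keep the terminology straight in the following chapters. This is only an illustrative sketch: apart from the Prefetch unit and the Branch Prediction Unit, the unit names are generic placeholders taken from the descriptions in the list, and treating the Special block as a label of its own (rather than as a part of the Front End) is an assumption made for simplicity.

```python
from enum import Enum


class Block(Enum):
    BACK_END = "Back End"
    FRONT_END = "Front End"
    SPECIAL = "Special"
    MEMORY_SUBSYSTEM = "Memory Subsystem"


# Placeholder unit names; only the two Special-block units are named in the text.
UNIT_TO_BLOCK = {
    "execution units": Block.BACK_END,
    "instruction decoding units": Block.FRONT_END,
    "prefetch unit": Block.SPECIAL,
    "branch prediction unit": Block.SPECIAL,
    "data loading and transfer units": Block.MEMORY_SUBSYSTEM,
}


def units_in(block):
    """Return the names of all units assigned to the given block, sorted."""
    return sorted(name for name, b in UNIT_TO_BLOCK.items() if b is block)


def is_optional(unit):
    """Special-block units only improve efficiency; all other units are required."""
    return UNIT_TO_BLOCK[unit] is Block.SPECIAL


for block in Block:
    print(f"{block.value}: {', '.join(units_in(block))}")
```

The `is_optional` helper simply encodes the remark that the Special block units raise the efficiency of the other units without being essential to their operation.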