Microarchitecture of Low-Power Processors
Traditionally, Intel tries to use the same microarchitecture across different market segments. This helps lower research and development expenses as well as production costs. There used to be only one exception to this rule: the high-performance server processors on the IA-64 architecture, aka Itanium. However, when Intel decided to expand its influence once again, it discovered that the Core microarchitecture, just like its predecessors, was far from universal and did not suit every occasion. For example, the Core microarchitecture turned out not to be the best choice for ultra-portable computing devices, which require highly power-efficient components. And even though the latest generation of Core 2 processors offers a pretty good combination of performance and power consumption, their derivatives cannot be widely used in small mobile devices.
This inspired Intel engineers to develop fundamentally new x86 processors on the Atom microarchitecture that would combine low power consumption, small size and low production cost. It is the Atom processor family that Intel has been actively promoting for nettop systems. There is a special modification designed specifically for this market: Diamondville.
Speaking of the peculiarities of Atom processors, I would first of all like to point out that they have nothing in common with the Core microarchitecture, not to mention NetBurst. That is why the usual characterization methods based on the number of cores, clock frequency or cache size cannot be applied to them. Atom stands apart from the rest of Intel’s products: the new microarchitecture cannot be directly compared with any existing or earlier solution.
One fact illustrates this statement very well. Atom is a CPU with in-order instruction execution, while every Intel x86 microarchitecture since the Pentium Pro, launched in 1995, has supported out-of-order execution. Since Atom developers tried to maximize performance per watt, they had to go “back to the roots” like that. That is why they eliminated from the new CPU all architectural solutions that increase power consumption more than they increase performance. And even though this did affect the performance of the final product, Intel doesn’t consider it a big problem: portable devices do not need as much computational power as regular notebooks and PCs based on traditional CPUs.
Atom’s pipeline has 16 stages, which is more than the Core pipeline has. However, you shouldn’t draw analogies between the Atom and NetBurst microarchitectures, because a longer pipeline doesn’t always mean higher heat dissipation. In this case, the CPU needs a long pipeline in order to reach the high clock frequencies at which a CPU with a simplified microarchitecture can perform at an acceptable level. And this is a pretty complicated task for Intel Atom, because this CPU can execute only two instructions at a time, while contemporary x86 processors can execute three or four instructions simultaneously.
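To put the two-wide limit in numbers, here is a rough back-of-the-envelope sketch that treats peak issue rate as issue width times clock frequency. The function name is hypothetical, the 1.6GHz clock is the Atom 230 frequency, and running a four-wide CPU at that same clock is purely an assumption made for comparison. Real throughput is always lower than this peak, and x86 instructions are not equal units of work across microarchitectures.

```python
def peak_issue_rate(issue_width, clock_ghz):
    """Upper bound on issued instructions, in billions per second."""
    return issue_width * clock_ghz


atom_peak = peak_issue_rate(2, 1.6)  # two-wide Atom 230 at 1.6GHz
wide_peak = peak_issue_rate(4, 1.6)  # hypothetical four-wide CPU, same clock

# Clock a two-wide design would need to match the four-wide peak.
clock_to_match = wide_peak / 2

print(f"Atom peak: {atom_peak} billion instructions/s")
print(f"Four-wide peak: {wide_peak} billion instructions/s")
print(f"Two-wide clock needed to match: {clock_to_match}GHz")
```

The only point here is that a narrow core has to lean on clock frequency, which is where the longer pipeline comes in.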
However, in-order execution allows the Atom microarchitecture to avoid decomposing many x86 instructions into micro-ops, as CPUs with out-of-order execution do. This way, Atom may actually process the equivalent of more than two instructions per cycle in Core’s terms. In other words, we can say that Atom has something analogous to macro-fusion technology.
Atom processors have two decoders, two integer execution units and two execution units for SSE and FP instructions. However, some integer operations, such as multiplication and division, are actually performed by the FP units, so they are not processed very fast. As a result, Atom is not the best choice for heavy calculations: it was developed primarily for running the operating system, a browser and similar simple tasks.
Given the limited number of execution units and the availability of only two decoders, using them efficiently is one of the main problems Atom developers had to solve, especially since in-order execution may leave the pipeline idle for a long time while it waits for requested data to arrive from memory. Therefore, Atom has certain mechanisms that partially avoid this problem. The microarchitecture has a Safe Instruction Recognition mechanism that lets instructions that do not need to wait for any data get ahead in the queue and be processed first. However, Atom’s main trump card here is support for Hyper-Threading technology, which has come back from the Pentium 4 processors. As a result, the operating system sees a single-core Atom processor as a dual-core CPU that can process two threads at a time. This keeps its execution units busy much more effectively and improves performance greatly.
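The benefit of a second hardware thread on an in-order core can be illustrated with a toy model. This is a minimal sketch under loudly stated assumptions: a single issue slot per cycle (the real Atom is dual-issue), a fixed thread priority, and made-up stall lengths. The `run_cycles` function and the workload are hypothetical and do not reflect Atom’s actual scheduling logic.

```python
def run_cycles(threads):
    """Cycles needed to issue every instruction of every thread, in order.

    threads: list of instruction streams. Each entry is the number of
    cycles the owning thread stays blocked after issuing that instruction
    (1 = no stall, larger values model a wait for memory).
    Assumption: one issue slot per cycle, lowest-numbered ready thread wins.
    """
    next_index = [0] * len(threads)  # next instruction to issue per thread
    ready_at = [0] * len(threads)    # first cycle each thread may issue again
    cycle = 0
    while any(next_index[t] < len(threads[t]) for t in range(len(threads))):
        for t in range(len(threads)):
            if next_index[t] < len(threads[t]) and ready_at[t] <= cycle:
                ready_at[t] = cycle + threads[t][next_index[t]]
                next_index[t] += 1
                break  # the single issue slot is used up for this cycle
        cycle += 1
    return cycle


# Hypothetical workload: the second instruction is a load that stalls 10 cycles.
workload = [1, 10, 1, 1]

one_thread = run_cycles([workload])
two_threads_smt = run_cycles([workload, workload])

print("one thread:", one_thread)
print("two threads, one after another:", 2 * one_thread)
print("two threads, interleaved:", two_threads_smt)
```

In this toy workload one thread alone needs 13 cycles, so two threads run one after another need 26, while two interleaved hardware threads finish in 15, because the second thread issues its instructions while the first one waits on memory.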
Atom 230 processors, which are targeted at nettop systems, work at 1.6GHz and have a 512KB L2 cache. Their typical heat dissipation, however, is only 4W. I would like to remind you that the TDP of today’s most power-efficient single-core mobile ULV Core 2 Solo processors is almost 40% higher: 5.5W. In other words, Atom does indeed lower the minimal TDP and power consumption levels for Intel x86 processors. As for the performance, we are going to talk about it a little later.
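The “almost 40%” figure follows directly from the two TDP values quoted above. A quick check, with variable names chosen here for illustration:

```python
atom_230_tdp_w = 4.0        # Atom 230 TDP from the text
core2_solo_ulv_tdp_w = 5.5  # ULV Core 2 Solo TDP from the text

# How much higher the Core 2 Solo TDP is, relative to Atom 230.
relative_increase = (core2_solo_ulv_tdp_w - atom_230_tdp_w) / atom_230_tdp_w

print(f"ULV Core 2 Solo TDP is {relative_increase:.1%} higher than Atom 230")
```

The result is 37.5%, which the text rounds up to “almost 40%”.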
Atom processors are manufactured with today’s most advanced 45nm process. Atom’s die size is only 25sq.mm, which means that the processors are pretty inexpensive to make. The official Atom 230 price is set at just $29, which is way lower than what any other contemporary ULV CPU costs. All of this is the result of a simple and very energy-efficient microarchitecture. Intel reduced the number of processor functional units and as a result got a processor made of only 47 million transistors. So, no wonder that the results of this “optimization” are visible everywhere. Even the L1 cache of Atom processors is a little smaller than that of other contemporary CPUs. It is only 56KB: 24KB for data and 32KB for instructions.
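For a sense of scale, the die figures above can be combined into an average transistor density and a total L1 size. This is only simple arithmetic on the numbers quoted in the text, and the variable names are illustrative:

```python
transistors = 47_000_000  # transistor count from the text
die_area_mm2 = 25         # die size from the text, in sq.mm
l1_data_kb = 24           # L1 data cache
l1_instr_kb = 32          # L1 instruction cache

# Average density over the whole die (logic and cache together).
density_per_mm2 = transistors / die_area_mm2
l1_total_kb = l1_data_kb + l1_instr_kb

print(f"Average density: {density_per_mm2 / 1e6:.2f} million transistors per sq.mm")
print(f"Total L1 cache: {l1_total_kb}KB")
```

This works out to about 1.88 million transistors per sq.mm on average and confirms the 56KB L1 total.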
However, from the formal specifications perspective, Atom 230 is quite up-to-date. It uses a 533MHz system bus, supports SSE3 instructions and even the 64-bit Intel 64 extensions.
Besides Atom 230, the Intel nettop processor lineup includes one more solution: Atom 330. It is a dual-core processor designed as two Diamondville semiconductor dies placed in the same processor package. As a result, Atom 330 has the same specifications as Atom 230, but it has two cores instead of one. Even its clock frequency is the same 1.6GHz. Its TDP, naturally, is twice as high: the dual-core processor dissipates 8W of heat.