Intel Smart Memory Access
The technologies combined under this general name are designed to eliminate or reduce the delays that occur when the processor accesses data. Of course, prefetching data from memory into the lower-latency L1 and L2 processor caches is a great way to do that. I have to say that data prefetch algorithms have been used in Intel CPUs for a long time now. However, this functional unit is considerably enhanced in the new CPUs based on Core Microarchitecture.
Core Microarchitecture provides six independent data prefetch units. Two units prefetch data from memory into the shared L2 cache, and two more units work with the L1 cache of each CPU core. Each of these units independently tracks the data access patterns of the execution units (streaming access, or access with a constant increment within an array). Based on the accumulated statistics, the data prefetch units try to load data into the processor cache before the corresponding request is even made.
Also, the L1 cache of each processor core in CPUs based on the Intel Core Microarchitecture features an instruction prefetch unit that works in a similar way.
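The pattern tracking described above for the data prefetch units can be illustrated with a toy model. This is only a minimal sketch of generic constant-increment (stride) detection, not Intel's actual hardware logic: the class name `StridePrefetcher`, the `confirm` parameter and the single-stream tracking are all assumptions made for illustration.

```python
class StridePrefetcher:
    """Toy model of a stride-detecting data prefetcher (illustrative only)."""

    def __init__(self, confirm=2):
        self.last_addr = None     # address of the previous access
        self.last_stride = None   # increment between the last two accesses
        self.repeats = 0          # how many times in a row the stride repeated
        self.confirm = confirm    # assumed: repeats required before prefetching

    def access(self, addr):
        """Record one load address; return an address to prefetch, or None."""
        prediction = None
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if stride != 0 and stride == self.last_stride:
                self.repeats += 1
            else:
                self.repeats = 0  # pattern broken, start counting again
            self.last_stride = stride
            if self.repeats >= self.confirm:
                # The pattern looks stable: fetch the next element early.
                prediction = addr + stride
        self.last_addr = addr
        return prediction


def run(addresses):
    """Feed a sequence of addresses to a fresh prefetcher."""
    prefetcher = StridePrefetcher()
    return [prefetcher.access(a) for a in addresses]
```

With a regular walk through an array, such as `run([0, 64, 128, 192, 256])`, the model stays silent until the increment of 64 has repeated twice and then starts requesting the next address ahead of time. An irregular sequence never triggers a prefetch.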
Besides the improved data prefetch, Intel Smart Memory Access includes one more interesting technology called memory disambiguation. This technology is intended to improve the efficiency of the out-of-order algorithms when they read data from memory and write data to it. The thing is that contemporary processors supporting out-of-order execution do not allow a read to begin until the preceding data saving has been completed. The reason is that the scheduler does not know whether the loaded data depends on the saved data.
However, very often the successive saving and loading instructions are not connected with one another in any way. That is why the inability to change their execution order may sometimes leave the execution units underloaded, thus reducing the overall CPU efficiency. Memory disambiguation technology is intended to resolve this issue. It uses special algorithms that predict with very high probability whether successive saving and loading commands depend on one another, which allows out-of-order execution to be applied to these commands as well.
This way, if the memory disambiguation algorithm works correctly, the CPU can utilize its own execution units in a more efficient way. If the dependency between data loading and saving instructions has been determined incorrectly (which happens very rarely, according to the developers), memory disambiguation technology should detect the conflict, reload the correct data and initiate re-execution of the code.
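The predict, verify and re-execute cycle described above can be sketched as a small model. It is a hedged illustration under stated assumptions, not the algorithm Intel implements: the class `DisambiguationPredictor`, the per-instruction saturating counter, the `threshold` value and the method names are all hypothetical.

```python
class DisambiguationPredictor:
    """Toy model of memory disambiguation (illustrative only)."""

    def __init__(self, threshold=3):
        self.threshold = threshold  # assumed: clean runs needed to gain trust
        self.counters = {}          # loading instruction address -> counter
        self.replays = 0            # how often a wrong guess forced a re-run

    def may_hoist(self, load_pc):
        """Predict whether this load may run ahead of older, unfinished saves."""
        return self.counters.get(load_pc, 0) >= self.threshold

    def resolve(self, load_pc, load_addr, older_store_addrs):
        """Check the prediction once the older saves' addresses are known.

        Returns True when the load ran early on stale data and its code
        must be re-executed.
        """
        hoisted = self.may_hoist(load_pc)
        conflict = load_addr in older_store_addrs
        if conflict:
            # The load really depended on a save: stop trusting it.
            self.counters[load_pc] = 0
            if hoisted:
                self.replays += 1
                return True
            return False
        # No dependency this time: build up confidence, saturating at threshold.
        current = self.counters.get(load_pc, 0)
        self.counters[load_pc] = min(self.threshold, current + 1)
        return False
```

In this model a loading instruction is only moved ahead of earlier saves after it has proved independent several times in a row. A single detected conflict resets its counter, so the cost of re-execution is paid rarely, which matches the behavior the developers describe.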
The use of data prefetch algorithms together with memory disambiguation technology makes the processor work with memory more efficiently. It not only reduces the possible delays and idling of the processor execution units, but also lowers the latency of memory access and uses the bus bandwidth more efficiently.