A Few Words about Marketing: HSA
AMD’s marketing department used to be criticized for doing a poor job of promoting new products and technologies. Now the situation is just the opposite: AMD’s marketing manages to get users interested even in features that are not actually available yet. That’s what we have with HSA: the Kaveri only has the hardware support that lets its x86 and graphics cores share memory access, but AMD is already touting the new technology, showing impressive diagrams and promising huge performance benefits.
The fact is that there is no HSA yet. To implement and use HSA capabilities, the hardware support must be complemented with a compatible software infrastructure, which is currently not present even in basic form. First off, AMD has not yet released an HSA-compatible driver, so there can be no talk of publicly available software. Of course, HSA-enabled applications will come out sooner or later, but we suspect that the Kaveri APUs will have become obsolete by that time. As for today, the Kaveri’s HSA support may only be interesting for software developers, who can polish their future products on this hardware.
All currently available applications with heterogeneous computing support make use of the OpenCL 1.2 API, which doesn’t treat processor cores of different types as equals. From an ordinary user’s standpoint, the Kaveri is just the same as its predecessor Richland when it comes to hybrid computing, but we still think we have to say a few words about the hardware HSA support. Just keep in mind that it is all a matter of long-term prospects.
So the key point of heterogeneous computing is that many tasks can be accomplished on the GPU’s stream processors faster and more efficiently than on scalar x86 cores. By combining both types of hardware resources it is possible to ensure efficient execution of a wide variety of applications. Heterogeneous processors didn’t really come into their own at first, though: application developers found it hard to write software for them. The HSA technologies are meant to facilitate the development of heterogeneous program code and also make it execute faster.
The first HSA component is called Heterogeneous Uniform Memory Access (hUMA). It provides simple access to all system memory irrespective of which type of core issues the memory request. Thus, Kaveri’s x86 and graphics cores all have the same direct access to cache and system memory. The hardware hUMA implementation ensures cache coherency and lets the Kaveri’s integrated GPU work with physical and virtual memory in a 32-gigabyte address space. In short, hUMA removes the limitations and differences that used to separate system memory from graphics memory.
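To make the difference tangible, here is a toy model in plain Python. It is only a sketch, not real HSA or OpenCL code: the class names CopyBasedDevice and SharedMemoryDevice and the elements_copied counter are our own inventions for illustration. The first class mimics the older scheme where data is copied into a separate graphics buffer and back; the second mimics the hUMA idea of both sides working on the very same memory.

```python
class CopyBasedDevice:
    """Hypothetical device with private memory: data is copied in and out."""

    def __init__(self):
        self.elements_copied = 0

    def scale(self, host_data, factor):
        device_buf = list(host_data)            # copy host -> device
        self.elements_copied += len(device_buf)
        for i in range(len(device_buf)):
            device_buf[i] *= factor             # "GPU" works on its own copy
        result = list(device_buf)               # copy device -> host
        self.elements_copied += len(result)
        return result


class SharedMemoryDevice:
    """Hypothetical hUMA-style device: it works on the caller's buffer directly."""

    def __init__(self):
        self.elements_copied = 0

    def scale(self, shared_data, factor):
        for i in range(len(shared_data)):
            shared_data[i] *= factor            # same memory the "CPU" sees
        return shared_data                      # the same object, no copy made


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    old_style = CopyBasedDevice()
    print(old_style.scale(data, 2), old_style.elements_copied)
    new_style = SharedMemoryDevice()
    print(new_style.scale(data, 2), new_style.elements_copied)
```

In the copy-based case the original list is left untouched and every element crosses the boundary twice. In the shared case the result is the caller's own list and nothing is copied at all, which is the kind of overhead hUMA is meant to remove.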
The second important HSA-based technology that makes the Kaveri a truly heterogeneous processor is Heterogeneous Queuing (hQ). Currently, all computing load goes through the x86 cores one way or another, even if it is meant for the integrated GPU. It is the x86 cores that are responsible for submitting tasks to the GPU and checking on their completion, which involves certain latencies. With hQ technology, the GPU can interact with the application and the x86 cores directly, which puts the CPU and the GPU on an equal footing, reduces latencies and simplifies the parallel processing of data on computing cores of different types. Like the CPU, the GPU acquires the right to create computing threads and submit them for execution.
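The queuing idea can also be shown with a small toy model. Again, this is a sketch under our own assumptions and has nothing to do with the actual HSA runtime: the Agent class, its submit and run_all methods and the run_pipeline helper are hypothetical names. Each agent owns a work queue, and any agent may place a task on any queue, so the "GPU" can schedule its own follow-up work without going back through the "CPU".

```python
from collections import deque


class Agent:
    """Hypothetical compute agent (CPU or GPU) with its own work queue."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()
        self.log = []

    def submit(self, target, task):
        """Place a task on the target agent's queue, recording who sent it."""
        target.queue.append((self.name, task))

    def run_all(self):
        """Execute queued tasks in order, including ones added while running."""
        while self.queue:
            submitter, task = self.queue.popleft()
            self.log.append(f"{task.__name__} (from {submitter})")
            task(self)


def run_pipeline():
    cpu = Agent("cpu")
    gpu = Agent("gpu")

    def stage_two(agent):
        pass  # placeholder for the follow-up computation

    def stage_one(agent):
        # The GPU enqueues its own follow-up work; the CPU is not involved.
        agent.submit(agent, stage_two)

    cpu.submit(gpu, stage_one)   # the CPU only kicks off the first stage
    gpu.run_all()
    return gpu.log


if __name__ == "__main__":
    for entry in run_pipeline():
        print(entry)
```

The log shows the first stage arriving from the CPU and the second stage arriving from the GPU itself. Under the current model described above, that second submission would have to be issued by an x86 core.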
The whole HSA concept looks highly promising from a theoretical standpoint. AMD suggests it will become widespread in image/video playback and editing applications, in new-generation voice/gesture/facial recognition interfaces, and in games (for physics computations or AI modeling).
All that remains is to wait for applications built on HSA-enabled OpenCL 2.0 to come out, and that will not happen until next year.