Graphics Chips to Gain Horsepower and Programmability - Chief Scientist of Nvidia

CPUs and GPUs Will Get Close to Each Other, But Will Not Converge

by Anton Shilov
09/01/2010 | 05:06 PM

Graphics processing units (GPUs) of the future will require both higher levels of performance and greater programmability compared to today's GPUs, according to the chief scientist at Nvidia Corp. But while central processing units (CPUs) and GPUs will get closer to each other, they will not converge into something completely universal: the future belongs to heterogeneous multi-core chips rather than to universal processors.

Programmability and Performance - The Future of GPUs

"Future GPUs will be both far more powerful in terms of raw performance and more programmable - in the sense that the range of applications that they can accelerate will be much broader than today. A lot of our architecture research is focused at improving programmability without sacrificing performance," said Bill Dally, the chief scientist of Nvidia, during a public conference on Wednesday.

The progress made by graphics processors in terms of performance over the last ten years is colossal, and their evolution in terms of programmability is massive. But the progress made by video games over the same decade is much less evident, for many reasons. The budget of an advanced video game today may be similar to that of a movie, and so are the constraints. Apparently, additional GPU performance and programmability are needed to enable not only abstract GPU-accelerated consumer applications, but also games, the very thing graphics processors are developed for.

"There will always be demand for more performance, but game developers are also increasingly limited by the complexity of content creation. Techniques like ray tracing and stochastic rasterization offer more robust approaches to rendering problems that developers want to work with today. Greater programmability will make these techniques easier," said David Luebke, director of graphics research at Nvidia.

Right Core for the Right Task

During the conversation, Mr. Dally reiterated his earlier statement that the future of computing is heterogeneous systems in which CPUs and GPUs each perform the tasks they do best. In fact, Nvidia already has such a heterogeneous system in the form of Tegra, which combines ARM-based general-purpose processing cores with an Nvidia GeForce graphics processor.

"The future is heterogeneous computing in which we use CPUs (which are optimized for single-thread performance) for the latency sensitive portions of jobs, and GPUs (which are optimized for throughput per unit energy and cost) for the parallel portions of jobs. The GPUs can handle both the data parallel and the task parallel portions of jobs better than CPUs because they are more efficient. The CPUs are only needed for the latency sensitive portions of jobs - the serial portions and critical sections," said the chief scientist of the graphics company.

The chief scientist of Nvidia does not expect central processors and graphics processors to converge completely into a device with many similar cores that handle both serial and parallel workloads. He believes that the heterogeneous multi-core approach, such as AMD Fusion or Cell, is a better candidate for the longer-term future.

"I don't see convergence between latency-optimized cores and throughput optimized cores. The techniques used to optimize for latency and throughput are very different and in conflict. We will ultimately have a single chip with many (thousands) of throughput cores and a few latency-optimized cores so we can handle both types of code," said Mr. Dally.

The worst enemy of heterogeneous computing (or accelerated computing, in AMD's terminology) is the current generation of programming models. For software developers, it would be nice if the operating system could assign the right task to the right type of computing device, but Mr. Dally warns that any kind of load balancing between the CPU and the GPU (depending on their load at a particular time) may be very costly in terms of performance.

"We expect that operating system will eventually treat CPU cores and GPU cores as peers - scheduling work for both types of cores. However moving work from a CPU core to a GPU core or vice versa would be very sub-optimal," stressed the chief scientist of Nvidia.