The next big thing for supercomputers is projected to be exascale machines, and the leading chip designers are working on technologies that will enable the next leap in high-performance computing. According to an HPC expert from the University of Tennessee, exascale systems will require new central processors, graphics processors or hybrids that combine both on the same piece of silicon. Cell chips, which are heterogeneous multi-core processors, are a dead end, however.

Building an exaFLOPS machine is a huge challenge. Even Advanced Micro Devices and Intel Corp., whose x86 central processing units (CPUs) power the absolute majority of supercomputers, admit that building a machine capable of performing a quintillion (10^18) floating point operations per second, or one exaFLOPS, using x86 chips is hardly feasible. As a result, AMD is trying to incorporate special FireStream compute accelerators (which are based on massively parallel graphics processing units [GPUs]) into high-performance computing systems, whereas Intel is working on accelerators of its own, the first of which is code-named Knights Corner and is projected to be released sometime in 2012 or 2013.

GPUs Are Not Ideal for Supercomputers

But accelerators designed for the PCI Express bus are not ideal, since communication between CPUs and GPUs is a weak point of such systems. Moreover, GPUs are not easy to program. One way to overcome these issues is to integrate CPUs and GPUs.

“The obvious upside of GPUs is that they provide compelling performance for modest prices. The downside is that they are more difficult to program, since at the very least you will need to write one program for the CPUs and another program for the GPUs. Another problem that GPUs present pertains to the movement of data. Any machine that requires a lot of data movement will never come close to achieving its peak performance. The CPU-GPU link is a thin pipe, and that becomes the strangle-point for the effective use of GPUs. In the future this problem will be addressed by having the CPU and GPU integrated in a single socket,” said Jack Dongarra, director of the Innovative Computing Laboratory and of the Center for Information Technology Research at the University of Tennessee, in an interview with the Next Big Future website.

Cell is Dead for HPC

Chips that contain both x86 general-purpose processing cores and graphics processing cores, which AMD calls Fusion, are essentially heterogeneous multi-core processors. The vast majority of multi-core chips today are homogeneous: they contain a number of similar processing engines. Processors with different types of cores already exist: the Cell chips jointly developed by IBM, Sony Corp. and Toshiba Corp. originally promised to redefine the market for multimedia chips as well as CPUs for the HPC market. However, since all three companies have ceased developing Cell, it has no future.

“The Cell architecture is no longer being developed, so it is effectively dead. No new supercomputers will use Cell,” claimed Mr. Dongarra.

New Paradigms

But even when central processors and highly parallel accelerators share a single piece of silicon, exascale systems will still need certain optimizations at the platform and software levels to be efficient.

“The current memory paradigm is hierarchical, based on registers, L1 and L2 caches, local memory, shared memory, and distributed memory among nodes. That is a potential model for exaFLOPS systems. However, we want exaFLOPS systems to be designed to be relatively easy to program. We therefore want a globally shared address space, and explicit methods to pass data between the processors in order to orchestrate the unfolding computation. That paradigm may be necessary for a machine that has a billion threads,” explained the HPC specialist.

$200 Million, 20 Megawatts

Asked how much an exaFLOPS-capable machine will cost and what its specifications are likely to be, the professor said that the cost could be as high as $200 million and that power consumption could be gargantuan.

“The maximum price will be no more than $200 million, and the maximum power budget will be 20MW. It will contain about 64PB of RAM, so that alone will probably cost $100 million. Given that our Jaguar now consumes 7MW, keeping within a 20MW budget will be a major challenge,” said Mr. Dongarra.
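
As a rough check on these figures, the short Python sketch below works out the energy efficiency and memory price that the quoted budget implies. It assumes decimal units (1PB = 10^15 bytes, 1GB = 10^9 bytes) and that the full 10^18 operations per second are delivered within the full power budget; neither assumption comes from the interview, and the variable names are illustrative.

```python
# Back-of-the-envelope check of the quoted exascale budget.
# Assumptions (not from the interview): decimal units, and the full
# 10**18 FLOPS delivered within the full 20 MW power budget.

EXAFLOPS = 10**18            # floating point operations per second
POWER_BUDGET_W = 20 * 10**6  # 20 MW, as quoted
RAM_BYTES = 64 * 10**15      # 64 PB, as quoted
RAM_COST_USD = 100 * 10**6   # $100 million, as quoted
JAGUAR_POWER_W = 7 * 10**6   # 7 MW, as quoted

# Efficiency the whole machine would have to reach.
gflops_per_watt = EXAFLOPS / POWER_BUDGET_W / 10**9

# Memory price implied by spending $100 million on 64 PB.
usd_per_gb = RAM_COST_USD / (RAM_BYTES / 10**9)

# How much more power than Jaguar the budget allows.
power_headroom = POWER_BUDGET_W / JAGUAR_POWER_W

print(f"Required efficiency: {gflops_per_watt:.0f} GFLOPS per watt")
print(f"Implied RAM price:   ${usd_per_gb:.2f} per GB")
print(f"Power vs. Jaguar:    {power_headroom:.2f}x")
```

Under these assumptions the machine would need to sustain about 50 GFLOPS per watt while drawing less than three times the power Jaguar consumes.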

New Chips Needed

To actually stay within the 20MW power budget, exascale supercomputers will need either a huge number of very low-power, simple chips or highly integrated chips that feature special-purpose co-processors, built-in graphics cores or both.

“There are two models that we can use to get to an exaflop while staying within a 20MW budget. The first model employs huge numbers of lightweight processors, such as IBM Blue Gene Processor running at 1.0GHz. If we use 1 million chips, and each chip has 1000 cores, then we can get to a potential billion threads of execution. The other approach is a hybrid that makes extensive use of coprocessors or GPUs. It would use a 1.0GHz processor and 10 000 floating point units per socket, and 100 000 sockets per system,” explained the HPC expert.
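
The arithmetic behind the two models can be sketched in Python as follows. The sketch assumes one thread per core and one floating point operation per core (or per floating point unit) per clock cycle; neither assumption is stated in the interview, and the function names are illustrative.

```python
# Peak-rate arithmetic for the two exascale models described above.
# Assumptions (not from the interview): one thread per core, and each
# core or floating point unit completes one operation per clock cycle.

CLOCK_HZ = 10**9  # 1.0 GHz, as quoted for both models


def lightweight_model(chips=10**6, cores_per_chip=1000):
    """Many low-power chips: return (threads, peak FLOPS)."""
    threads = chips * cores_per_chip
    return threads, threads * CLOCK_HZ


def hybrid_model(sockets=100_000, fpus_per_socket=10_000):
    """Fewer, denser sockets: return (FPUs, peak FLOPS)."""
    fpus = sockets * fpus_per_socket
    return fpus, fpus * CLOCK_HZ


threads, light_peak = lightweight_model()
fpus, hybrid_peak = hybrid_model()

print(f"Lightweight: {threads:,} threads, {light_peak:.0e} FLOPS peak")
print(f"Hybrid:      {fpus:,} FPUs, {hybrid_peak:.0e} FLOPS peak")
```

Under these assumptions both configurations reach 10^18 operations per second from a billion parallel execution units; they differ in how those units are packaged into chips and sockets.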

Tags: HPC, GPGPU, AMD, Fusion, ATI, Knights Corner, x86, Intel, Nvidia, FireStream, Tesla

Discussion

Comments currently: 2
Discussion started: 07/07/10 07:26:43 PM
Latest comment: 07/09/10 04:16:10 AM


1. 
Very nice article!
One thing: 1 [mW] = 0.001 [W] and 1 [MW] = 1,000,000 [W]
That means Mega is Mega, not mega
and 20mW is 20 milliWatts (= 0.02 [W])

[edited]
0 0 [Posted by: cogee  | Date: 07/07/10 07:26:43 PM]

2. 
GPUs have 256-bit architecture, hundreds of millions more transistors and different designs/pipelines, with caches specifically designed for their purpose, while a CPU might be similar, but different, in its own league in its own world, and still a very important part. Let's hope one day it all unifies into just one chip with 128 cores and 100 billion transistors running at 10GHz with 1TB L1, L2 and L3 caches LOL just kidding
0 0 [Posted by: mike1101  | Date: 07/09/10 04:16:10 AM]