Nvidia Corp. shed some light on its exascale development project, known as Echelon, at the SC10 supercomputing trade show last week. The company's researchers are convinced that machines capable of performing at least a quintillion double-precision floating-point operations per second (10^18 FLOPS) should be heterogeneous, i.e. employ both highly parallel processors and high-performance serial processors.

Even though today's graphics processing units (GPUs) are more efficient than central processing units (CPUs) in terms of raw performance per watt, even modern GPUs cannot power supercomputers that are 1,000 times faster than today's while consuming a reasonable amount of energy. As a result, graphics processors must evolve radically and rapidly to enable exascale systems in the 2018-2020 timeframe. Moreover, new programming models and paradigms are required to program heterogeneous systems efficiently.

William Dally, Nvidia's chief scientist, who also heads development of the Echelon extreme-scale computing project, which is partly funded by DARPA under the Ubiquitous High Performance Computing (UHPC) program, shared his thoughts about future chips capable of powering an exaFLOPS-class supercomputer.


Sketch of Nvidia Echelon research system

According to Steve Keckler, the director of architecture research at Nvidia, the Echelon design incorporates a large number (~1024) of stream cores and a smaller number (~8) of latency-optimized CPU-like cores on a single chip, sharing a common memory system. Just like in current architectures, eight stream cores will form a streaming multiprocessor (SM), and 128 SMs will form the large pool of throughput-optimized processing elements. Such a chip could deliver 20 teraFLOPS in double precision, and a number of them will form a 2.6 petaFLOPS rack. At present, Nvidia's Fermi (GF110) chip with 512 stream processors operating at 1544MHz can deliver 0.79 TFLOPS of DP compute performance. Considering the 25-fold difference in performance, it is highly likely that Echelon will employ a post-Maxwell (~2013 ~ 2014) Nvidia GPU design.
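The throughput figures above can be cross-checked with a few lines of arithmetic. The sketch below is only a back-of-the-envelope illustration: the function name is made up for this example, and the Fermi rate of one double-precision FLOP per stream processor per cycle is an assumption chosen because it reproduces the 0.79 TFLOPS figure quoted in the article.

```python
def peak_dp_tflops(cores, clock_hz, dp_flops_per_core_per_cycle):
    """Peak double-precision throughput in teraFLOPS (illustrative helper)."""
    return cores * clock_hz * dp_flops_per_core_per_cycle / 1e12

# Fermi (GF110): 512 stream processors at 1544 MHz, as quoted in the article.
# Assumption: 1 DP FLOP per stream processor per cycle.
fermi_tflops = peak_dp_tflops(512, 1544e6, 1.0)

# Echelon figures quoted in the article.
ECHELON_CHIP_TFLOPS = 20.0
ECHELON_RACK_PFLOPS = 2.6

stream_cores = 8 * 128                                    # 8 cores per SM, 128 SMs
speedup = ECHELON_CHIP_TFLOPS / fermi_tflops              # Echelon chip vs. Fermi
chips_per_rack = ECHELON_RACK_PFLOPS * 1000 / ECHELON_CHIP_TFLOPS
racks_per_exaflops = 1000 / ECHELON_RACK_PFLOPS           # 1 exaFLOPS = 1000 petaFLOPS

print(f"Fermi peak DP:       {fermi_tflops:.2f} TFLOPS")
print(f"Echelon stream cores: {stream_cores}")
print(f"Echelon vs. Fermi:    {speedup:.1f}x")
print(f"Chips per rack:       {chips_per_rack:.0f}")
print(f"Racks per exaFLOPS:   {racks_per_exaflops:.0f}")
```

Under these assumptions a rack would hold roughly 130 such chips, and an exaFLOPS machine would need several hundred racks. The article itself only says "a number of them", so the chip count is an inference, not a quoted specification.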

In order to keep the power consumption of such a chip relatively low, stream processors have to process a double-precision floating-point operation using just 10 picojoules of energy, down from 200 picojoules on Nvidia's current Fermi chips, the EE Times web site quoted Mr. Dally as saying. To facilitate that drop in energy consumption, each of the 1024 stream processors per chip has to perform four floating-point operations per cycle.
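To see why the energy per operation matters so much, multiply it by the target throughput. The sketch below is a rough estimate under a simplifying assumption: it counts only the energy of the floating-point operations themselves and ignores memory, interconnect and the CPU-like cores. The helper name is hypothetical.

```python
def compute_power_watts(flops_per_second, joules_per_flop):
    """Power drawn by arithmetic alone: operations per second times energy each."""
    return flops_per_second * joules_per_flop

CHIP_FLOPS = 20e12       # 20 teraFLOPS per Echelon chip (from the article)
EXAFLOPS = 1e18          # 10^18 FLOPS
ECHELON_J = 10e-12       # 10 picojoules per DP operation (target)
FERMI_J = 200e-12        # 200 picojoules per DP operation (current Fermi)

echelon_chip_w = compute_power_watts(CHIP_FLOPS, ECHELON_J)
fermi_class_w = compute_power_watts(CHIP_FLOPS, FERMI_J)
exaflops_w = compute_power_watts(EXAFLOPS, ECHELON_J)

print(f"20 TFLOPS at 10 pJ/op:   about {echelon_chip_w:.0f} W")
print(f"20 TFLOPS at 200 pJ/op:  about {fermi_class_w:.0f} W")
print(f"1 exaFLOPS at 10 pJ/op:  about {exaflops_w / 1e6:.0f} MW")
```

At Fermi's energy cost, a 20 teraFLOPS chip would need kilowatts for arithmetic alone, which is why a 20-fold reduction per operation is a precondition for the design.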

To further trim power usage, Nvidia intends to integrate a large number (~1024) of configurable 256KB SRAM banks into the chip. The huge amount of on-chip memory should make it possible to keep as much data as possible on the chip and as close to the processing elements as possible, avoiding power-costly fetch operations wherever feasible. The SRAM banks should be configurable to act as a unified memory pool, as dedicated caches for processing elements, as shared memory for explicit management, and so on.
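The bank count and bank size imply the total on-chip capacity. The sketch below computes it and then shows one purely hypothetical way the banks might be divided among the three roles mentioned above; the split (512 cache banks, 256 shared-memory banks) and the function names are invented for illustration and do not come from Nvidia.

```python
BANK_COUNT = 1024   # approximate number of SRAM banks (from the article)
BANK_KIB = 256      # size of each bank in KB (from the article)

def total_sram_mib(bank_count=BANK_COUNT, bank_kib=BANK_KIB):
    """Total on-chip SRAM in MiB."""
    return bank_count * bank_kib / 1024

def partition_banks(cache_banks, shared_banks,
                    bank_count=BANK_COUNT, bank_kib=BANK_KIB):
    """Hypothetical split: banks not used as cache or shared memory
    fall into the unified pool. Returns capacities in MiB."""
    unified_banks = bank_count - cache_banks - shared_banks
    if min(cache_banks, shared_banks, unified_banks) < 0:
        raise ValueError("bank counts must be non-negative and fit on the chip")

    def to_mib(banks):
        return banks * bank_kib / 1024

    return {
        "cache_mib": to_mib(cache_banks),
        "shared_mib": to_mib(shared_banks),
        "unified_mib": to_mib(unified_banks),
    }

total = total_sram_mib()
layout = partition_banks(cache_banks=512, shared_banks=256)
print(f"Total on-chip SRAM: {total:.0f} MiB")
print(layout)
```

With about 1024 banks and about 1024 stream cores, the budget works out to roughly one 256KB bank per stream core.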

At present, Echelon is only a research project and not a chip on Nvidia's roadmap. In some respects, Echelon resembles Intel's single-chip cloud computer (SCC), which belongs to the Tera-Scale research project.

Tags: Nvidia, Tesla, GPGPU, Exascale, Maxwell, Kepler

Discussion

I hadn't heard of the "dragonfly" interconnect before. There seem to be a number of papers and presentations on it. The title paper is "Technology-Driven, Highly-Scalable Dragonfly Topology", I believe, and its lead authors are John Kim of NW U, William Dally of Stanford, Steve Scott of Cray, and Dennis Abts of Google.
http://www.lanl.gov/orgs/...t-salishan-2009-final.pdf

google webview:
http://webcache.googleuse...f+dragonfly+interconnectL
[Posted by: rektide | Date: 11/27/10 11:58:52 AM]