UPDATE: Correcting the number of stream processors in the GK110 chip.

Nvidia Corp. this week formally announced two new Tesla compute accelerators based on the code-named Kepler architecture. The Tesla K10, based on the well-known GK104 chip, will become available shortly and is aimed at those who need maximum raw single-precision compute performance here and now. The Tesla K20, which will show up late this year, promises to be a performance monster thanks to the GK110 chip with a whopping 7.1 billion transistors.

The Nvidia Tesla "K10" compute card is based on two GK104 GPUs that deliver an aggregate peak performance of 4.58TFLOPS single precision and 0.190TFLOPS double precision, along with 320GB/s of memory bandwidth. The Tesla "K10" is optimized for customers in oil and gas exploration and the defense industry. Since the K10 compute board is powered by the GK104 chip, which is generally meant for graphics processing, it does not deliver strong double-precision performance and will not be a good solution for many fields of high-performance computing.
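As a rough check on why the K10 is weak at double precision, the aggregate figures quoted above can be turned into a DP-to-SP ratio. This is a minimal Python sketch using only the numbers from the article; the helper names are illustrative, and the per-GPU split assumes both GK104 chips contribute equally, which the article does not state.

```python
# Aggregate peak figures quoted for the dual-GPU Tesla K10.
K10_SP_TFLOPS = 4.58       # single precision
K10_DP_TFLOPS = 0.190      # double precision
K10_BANDWIDTH_GBS = 320    # memory bandwidth, GB/s
K10_GPU_COUNT = 2          # two GK104 chips per board


def per_gpu(aggregate, gpu_count=K10_GPU_COUNT):
    """Split an aggregate figure evenly across the GPUs (assumption)."""
    return aggregate / gpu_count


def dp_to_sp_ratio(dp_tflops, sp_tflops):
    """Fraction of single-precision throughput available in double precision."""
    return dp_tflops / sp_tflops


ratio = dp_to_sp_ratio(K10_DP_TFLOPS, K10_SP_TFLOPS)

print(f"Per-GPU SP: {per_gpu(K10_SP_TFLOPS):.2f} TFLOPS")
print(f"Per-GPU DP: {per_gpu(K10_DP_TFLOPS):.3f} TFLOPS")
print(f"Per-GPU bandwidth: {per_gpu(K10_BANDWIDTH_GBS):.0f} GB/s")
print(f"DP throughput is roughly 1/{round(1 / ratio)} of SP")
```

With the quoted figures, double-precision throughput works out to roughly 1/24 of single-precision throughput, which is the gap the paragraph above refers to.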

The flagship compute solution based on the Kepler architecture will be the Tesla K20, built on the GK110 graphics processing unit. The latter will be a monster chip containing a whopping 7.1 billion transistors and 15 streaming multiprocessors with a total of 2880 stream processors, "delivering three times more double precision [performance] compared to Fermi architecture-based Tesla products", according to Nvidia. Given that the highest-end Tesla M2090 provides 0.665TFLOPS of DP performance, the Tesla "K20" has the potential to deliver up to an enormous 2TFLOPS DP, although Nvidia claims only "over 1TFLOPS" (which is not that impressive, considering that the AMD Radeon HD 7970 hits 0.947TFLOPS DP at 925MHz).
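The per-multiprocessor core count and the 2TFLOPS estimate above follow from simple arithmetic. The Python sketch below reproduces it from the article's figures; the variable names are illustrative, and the K20 number is an extrapolation from Nvidia's "three times" claim, not a published specification.

```python
# Figures quoted in the article.
SMX_COUNT = 15                  # streaming multiprocessors in GK110
TOTAL_STREAM_PROCESSORS = 2880  # total stream processors in GK110
M2090_DP_TFLOPS = 0.665         # highest-end Fermi-based Tesla, double precision
CLAIMED_DP_SPEEDUP = 3          # Nvidia: "three times more double precision"
HD7970_DP_TFLOPS = 0.947        # AMD Radeon HD 7970 at 925MHz (947 GFLOPS)

# Stream processors per streaming multiprocessor.
cores_per_smx = TOTAL_STREAM_PROCESSORS // SMX_COUNT

# Upper-bound estimate if the "three times" claim applied to peak DP throughput.
k20_dp_estimate = M2090_DP_TFLOPS * CLAIMED_DP_SPEEDUP

print(f"Stream processors per multiprocessor: {cores_per_smx}")
print(f"Estimated K20 DP peak: {k20_dp_estimate:.3f} TFLOPS")
print(f"Estimate vs. HD 7970: {k20_dp_estimate / HD7970_DP_TFLOPS:.2f}x")
```

The estimate lands just under 2TFLOPS, roughly twice the HD 7970 figure, whereas Nvidia's own "over 1TFLOPS" wording would put the K20 only slightly ahead of it.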

In addition to its enormous transistor count and high performance, the GK110 will offer new capabilities that are not available on other chips:

  • Dynamic Parallelism enables GPU threads to dynamically spawn new threads, allowing the GPU to adapt dynamically to the data. This simplifies parallel programming, enabling GPU acceleration of a broader set of popular algorithms, such as adaptive mesh refinement, fast multipole methods and multigrid methods.
  • Hyper-Q enables multiple CPU cores to simultaneously use the CUDA architecture cores on a single Kepler GPU, which increases GPU utilization, slashing CPU idle times and advancing programmability. Hyper-Q is ideal for cluster applications that use MPI, according to Nvidia.

The GK110 GPU is expected to be incorporated into the new Titan supercomputer at the Oak Ridge National Laboratory in Tennessee and the Blue Waters system at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign.

The Nvidia Tesla K20 is planned to be available in the fourth quarter of 2012.

Tags: Nvidia, Tesla, GK110, Kepler, Geforce, Quadro, 28nm

Discussion

Comments currently: 16
Discussion started: 05/16/12 06:14:31 PM
Latest comment: 05/19/12 09:19:53 AM

1. 
1 11 [Posted by: BestJinjo  | Date: 05/16/12 06:14:31 PM]

 
1 11 [Posted by: BestJinjo  | Date: 05/17/12 09:52:50 AM]
 
Sorry, I thought there were 256 stream processors per SMX. Corrected, thanks!
2 2 [Posted by: Anton  | Date: 05/17/12 12:17:48 PM]
 
1 11 [Posted by: BestJinjo  | Date: 05/17/12 01:53:39 PM]
 
Anton, but there were 256 CUDA cores per SMX: 192 for single precision and 64 extra cores for double precision.

So, can they run simultaneously? I don't know, but it's probable (independent schedulers and many ports to the CUDA core arrays for them).
1 0 [Posted by: er_wendigo  | Date: 05/17/12 02:45:25 PM]
 
Yes, the math is correct:

15 SMX x 192 CUDA Cores (SP) = 2,880 CUDA Cores.
15 SMX x 64 CUDA Cores (DP) = 960 CUDA Cores.

Then:

2,880 + 960 = 3,840 CUDA Cores (SP and DP Cores).
2 1 [Posted by: er_wendigo  | Date: 05/17/12 02:40:49 PM]
 
0 9 [Posted by: BestJinjo  | Date: 05/19/12 09:09:35 AM]

2. 
SM Cluster CUDA cores do not equal SMX ones, it's about 2:1; just a fyi.
2 0 [Posted by: LedHed  | Date: 05/16/12 10:14:15 PM]

 
0 10 [Posted by: BestJinjo  | Date: 05/17/12 09:53:46 AM]
 
The CUDA cores of the Fermi architecture run at twice the frequency of the Kepler ones, so one Fermi core can do the same work as two Kepler CUDA cores in the same amount of time.
1 1 [Posted by: er_wendigo  | Date: 05/17/12 02:47:13 PM]
 
0 9 [Posted by: BestJinjo  | Date: 05/19/12 09:19:53 AM]

3. 
0 3 [Posted by: kvarta  | Date: 05/16/12 10:29:02 PM]

 
3 11 [Posted by: BestJinjo  | Date: 05/17/12 09:55:58 AM]
 
A note:

The strength of Nvidia's architectures is something more than "good support or professional software".

Its architectures are more flexible and more efficient (the practical performance in many types of applications is nearer to the theoretical peak performance than with AMD's architectures).
0 0 [Posted by: er_wendigo  | Date: 05/17/12 02:50:36 PM]
 
0 9 [Posted by: BestJinjo  | Date: 05/19/12 09:11:55 AM]

4. 
The SI prefixes Giga and Tera seem to be randomly distributed ;-)
0 1 [Posted by: tmold  | Date: 05/17/12 04:52:03 AM]