
UPDATE: Correcting the number of stream processors in the GK110 chip.

Nvidia Corp. this week formally announced two new Tesla compute accelerators based on the code-named Kepler architecture. The Tesla K10, based on the well-known GK104 chip, will become available shortly and is aimed at those who need maximum raw single-precision compute performance here and now. The Tesla K20, which will show up late this year, promises to be a performance monster thanks to the GK110 chip with a whopping 7.1 billion transistors.

The Nvidia Tesla "K10" compute card is based on two GK104 GPUs that deliver an aggregate peak performance of 4.58TFLOPS single precision, 0.190TFLOPS double precision and 320GB/s of memory bandwidth. The Tesla "K10" is optimized for customers in oil and gas exploration and the defense industry. Since the K10 compute board is powered by the GK104 chip, which is generally meant for graphics processing, it does not deliver strong double-precision performance and will not be a good solution for many fields of high-performance computing.
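
As a rough cross-check, the K10 figures quoted above are consistent with the usual peak-throughput formula (cores x clock x 2 FLOPs per fused multiply-add). The Python sketch below is only illustrative: the per-GPU core count (1536), the 745MHz clock and the 1/24 double-precision rate are assumptions taken from Nvidia's published GK104/K10 specifications rather than from this article, and `peak_tflops` is a hypothetical helper name.

```python
# Assumed board parameters (not stated in the article).
GPUS_PER_BOARD = 2             # the article: two GK104 GPUs per K10
CUDA_CORES_PER_GK104 = 1536    # assumption: full GK104 configuration
CORE_CLOCK_GHZ = 0.745         # assumption: K10 core clock
DP_TO_SP_RATIO = 1 / 24        # assumption: GK104 double-precision rate


def peak_tflops(gpus, cores, clock_ghz, flops_per_clock=2):
    """Theoretical peak in TFLOPS; one fused multiply-add counts as 2 FLOPs."""
    return gpus * cores * clock_ghz * flops_per_clock / 1000.0


k10_sp = peak_tflops(GPUS_PER_BOARD, CUDA_CORES_PER_GK104, CORE_CLOCK_GHZ)
k10_dp = k10_sp * DP_TO_SP_RATIO

print(f"K10 peak: SP {k10_sp:.2f} TFLOPS, DP {k10_dp:.3f} TFLOPS")
```

Under these assumptions the result lands on the quoted single-precision figure and within rounding of the quoted double-precision figure, which illustrates why a graphics-oriented chip with a 1/24 DP rate is a weak fit for DP-heavy workloads.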

The flagship compute solution based on the Kepler architecture will be the Tesla K20, built around the GK110 graphics processing unit. The latter will be a monster chip containing a whopping 7.1 billion transistors and 15 streaming multiprocessors with a total of 2880 stream processors, "delivering three times more double precision [performance] compared to Fermi architecture-based Tesla products", according to Nvidia. Given that the highest-end Tesla M2090 provides 0.665TFLOPS of DP performance, the Tesla "K20" has the potential to deliver up to an enormous 2TFLOPS DP, although Nvidia claims only "over 1TFLOPS" (which is not that impressive, considering that the AMD Radeon HD 7970 hits 0.947TFLOPS DP at 925MHz).
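
The arithmetic behind these estimates can be sketched in a few lines of Python. The SMX count, the 0.665TFLOPS Fermi figure, the "three times" claim and the 925MHz clock come from the article; the 192 cores per SMX comes from the reader discussion, and the Radeon HD 7970's 2048 stream processors and 1/4 double-precision rate are assumptions based on AMD's published specifications. Treat it as a back-of-the-envelope check, not a benchmark.

```python
# Figures quoted in the article.
SMX_COUNT = 15
FERMI_TESLA_DP_TFLOPS = 0.665
KEPLER_DP_SPEEDUP = 3            # "three times more double precision"
HD7970_CLOCK_GHZ = 0.925

# Assumptions not stated in the article body.
CORES_PER_SMX = 192              # from the reader discussion
HD7970_STREAM_PROCESSORS = 2048  # assumption: AMD's published spec
HD7970_DP_TO_SP_RATIO = 1 / 4    # assumption: AMD's published DP rate

# 15 SMX x 192 cores = total stream processors in GK110.
gk110_cores = SMX_COUNT * CORES_PER_SMX

# Upper-bound K20 estimate: three times the Fermi-based Tesla figure.
k20_dp_tflops = KEPLER_DP_SPEEDUP * FERMI_TESLA_DP_TFLOPS

# HD 7970: cores x clock x 2 FLOPs per multiply-add, scaled by the DP rate.
hd7970_sp_gflops = HD7970_STREAM_PROCESSORS * HD7970_CLOCK_GHZ * 2
hd7970_dp_gflops = hd7970_sp_gflops * HD7970_DP_TO_SP_RATIO

print(f"GK110 stream processors: {gk110_cores}")
print(f"K20 DP estimate: {k20_dp_tflops:.3f} TFLOPS")
print(f"HD 7970 DP: {hd7970_dp_gflops:.1f} GFLOPS "
      f"({hd7970_dp_gflops / 1000:.3f} TFLOPS)")
```

Note that the Radeon figure is in GFLOPS: roughly 947 GFLOPS, or just under 1 TFLOPS, which is why "over 1TFLOPS" for the K20 reads as a modest claim.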

But in addition to an enormous transistor count and high performance, the GK110 will offer new capabilities that are not available on other chips:

  • Dynamic Parallelism enables GPU threads to dynamically spawn new threads, allowing the GPU to adapt dynamically to the data. This simplifies parallel programming, enabling GPU acceleration of a broader set of popular algorithms, such as adaptive mesh refinement, fast multipole methods and multigrid methods.
  • Hyper-Q enables multiple CPU cores to simultaneously use the CUDA architecture cores on a single Kepler GPU, which increases GPU utilization, slashing CPU idle times and advancing programmability. Hyper-Q is ideal for cluster applications that use MPI, according to Nvidia.
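
To give a feel for the kind of data-dependent work that Dynamic Parallelism targets, here is a CPU-side analogy in Python. It is not CUDA code and does not use any Nvidia API: it is a minimal sketch of one-dimensional adaptive refinement in which each work item inspects its own data and decides whether to spawn two child work items. The function name `refine`, the test function `step` and the tolerance are all hypothetical choices made for illustration.

```python
import math
from collections import deque


def refine(f, lo, hi, tol=1e-3, max_depth=20):
    """Split [lo, hi] adaptively until f is nearly linear on every cell."""
    work = deque([(lo, hi, 0)])  # pending work items: (start, end, depth)
    cells = []                   # accepted intervals
    while work:
        a, b, depth = work.popleft()
        mid = 0.5 * (a + b)
        # Error estimate: distance of f(mid) from the linear interpolation.
        err = abs(f(mid) - 0.5 * (f(a) + f(b)))
        if err > tol and depth < max_depth:
            # The work item itself decides, from the data, to spawn children.
            work.append((a, mid, depth + 1))
            work.append((mid, b, depth + 1))
        else:
            cells.append((a, b))
    return sorted(cells)


def step(x):
    """A smooth step centred at x = 0.3; flat almost everywhere else."""
    return math.tanh(50.0 * (x - 0.3))


cells = refine(step, 0.0, 1.0)
narrowest = min(cells, key=lambda c: c[1] - c[0])
widest = max(cells, key=lambda c: c[1] - c[0])
print(len(cells), "cells; narrowest:", narrowest, "widest:", widest)
```

The amount of work is not known up front: flat regions are accepted after a couple of splits, while the region around the step is subdivided many times. On a GPU without Dynamic Parallelism, each round of such spawning would typically require a trip back to the CPU to launch the next kernel.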

The GK110 GPU is expected to be incorporated into the new Titan supercomputer at the Oak Ridge National Laboratory in Tennessee and the Blue Waters system at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign.

The Nvidia Tesla K20 is planned to be available in the fourth quarter of 2012.

Tags: Nvidia, Tesla, GK110, Kepler, Geforce, Quadro, 28nm

Discussion

Comments currently: 40
Discussion started: 05/16/12 06:14:31 PM
Latest comment: 10/02/14 08:59:34 AM

1. 
1 11 [Posted by: BestJinjo  | Date: 05/16/12 06:14:31 PM]

 
1 11 [Posted by: BestJinjo  | Date: 05/17/12 09:52:50 AM]
 
Sorry, I thought there were 256 stream processors per SMX. Corrected, thanks!
2 2 [Posted by: Anton  | Date: 05/17/12 12:17:48 PM]
 
1 11 [Posted by: BestJinjo  | Date: 05/17/12 01:53:39 PM]
 
Anton, but there were 256 CUDA cores per SMX: 192 for single precision and 64 extra cores for double precision.

So, can they run simultaneously? I don't know, but it's probable (independent schedulers and many ports to the CUDA core arrays for them).
1 0 [Posted by: er_wendigo  | Date: 05/17/12 02:45:25 PM]
 
Yes, the math is correct:

15 SMX x 192 CUDA Cores (SP) = 2,880 CUDA Cores.
15 SMX x 64 CUDA Cores (DP) = 960 CUDA Cores.

Then:

2,880 + 960 = 3,840 CUDA Cores (SP and DP Cores).
2 1 [Posted by: er_wendigo  | Date: 05/17/12 02:40:49 PM]
 
0 9 [Posted by: BestJinjo  | Date: 05/19/12 09:09:35 AM]

2. 
SM Cluster CUDA cores do not equal SMX ones, it's about 2:1; just a fyi.
2 0 [Posted by: LedHed  | Date: 05/16/12 10:14:15 PM]

 
0 10 [Posted by: BestJinjo  | Date: 05/17/12 09:53:46 AM]
 
The CUDA cores of the Fermi architecture run at double the speed (frequency) of the Kepler ones, so each can do the same work as two Kepler CUDA cores in the same cycle.
1 1 [Posted by: er_wendigo  | Date: 05/17/12 02:47:13 PM]
 
0 9 [Posted by: BestJinjo  | Date: 05/19/12 09:19:53 AM]

3. 
0 3 [Posted by: kvarta  | Date: 05/16/12 10:29:02 PM]

 
3 11 [Posted by: BestJinjo  | Date: 05/17/12 09:55:58 AM]
 
A note:

The strength of Nvidia's architectures is something more than "good support or professional software".

Its architectures are more flexible and more efficient (practical performance in many types of applications is nearer to the theoretical peak performance than with AMD's architectures).
0 0 [Posted by: er_wendigo  | Date: 05/17/12 02:50:36 PM]
 
0 9 [Posted by: BestJinjo  | Date: 05/19/12 09:11:55 AM]

4. 
The SI prefixes Giga and Tera seem to be randomly distributed ;-)
0 1 [Posted by: tmold  | Date: 05/17/12 04:52:03 AM]