Intel Corp. on Tuesday demonstrated its compute accelerator for highly parallel workloads, code-named "Knights Corner" (KNC). The chip is made using 22nm process technology and will be Intel's first commercial product based on the many integrated cores (MIC) architecture. According to Intel, the KNC accelerator can deliver substantially higher compute performance than existing compute cards.

Intel's Knights Corner accelerator has over 50 cores and delivers 1TFLOPS of double precision floating point performance, as measured by the double-precision general matrix-matrix multiplication benchmark (DGEMM). Currently the most powerful special-purpose highly parallel accelerator is the Nvidia Tesla 2090, which offers 665GFLOPS (0.665TFLOPS) of peak performance, considerably below the figure Intel demonstrated for KNC.
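As a rough sanity check on this comparison, the ratio between the two quoted figures can be worked out directly. The sketch below uses only the numbers cited in this article; the constant and function names are illustrative. Note that it sets a DGEMM-measured figure against a peak figure, so the gap on a like-for-like basis may differ.

```python
# Figures as quoted in the article (vendor-reported, not independently verified).
KNC_DGEMM_GFLOPS = 1000.0   # Knights Corner: "1TFLOPS", measured with DGEMM
TESLA_PEAK_GFLOPS = 665.0   # Nvidia Tesla 2090: quoted peak performance


def speedup(new_gflops: float, old_gflops: float) -> float:
    """Return how many times larger new_gflops is than old_gflops."""
    if old_gflops <= 0:
        raise ValueError("old_gflops must be positive")
    return new_gflops / old_gflops


ratio = speedup(KNC_DGEMM_GFLOPS, TESLA_PEAK_GFLOPS)
print(f"KNC vs. Tesla 2090: {ratio:.2f}x")  # prints "KNC vs. Tesla 2090: 1.50x"
```

On these numbers the demonstrated KNC figure is about one and a half times the Tesla figure.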


Rajeeb Hazra holding Knights Corner accelerator

The first public presentation of “Knights Corner” silicon showed that Intel architecture is capable of delivering more than 1TFLOPS of double precision floating point performance; according to Intel, this was the first demonstration of a single processing chip achieving that performance level. Interestingly, the KNC accelerator is not just a PCI Express accelerator like its predecessor, the Knights Ferry compute accelerator for software developers, but looks like a CPU that plugs into a socket or a special adapter.

“Intel first demonstrated a Teraflop supercomputer utilizing 9680 Intel Pentium Pro processors in 1997 as part of Sandia Lab’s 'ASCI RED' system. Having this performance now in a single chip based on Intel MIC architecture is a milestone that will once again be etched into HPC history,” said Rajeeb Hazra, general manager of technical computing at Intel datacenter and connected systems group.

Knights Corner, the first commercial Intel MIC architecture product, will be manufactured using Intel’s latest 3D tri-gate 22nm transistor process and will feature more than 50 cores. When available, Intel MIC products will offer both high performance from an architecture specifically designed to process highly parallel workloads and compatibility with the existing x86 programming model and tools. One of the benefits of Intel MIC architecture is the ability to run existing applications without porting the code to a new programming environment. This will allow scientists to use both CPU and co-processor performance simultaneously with existing x86-based applications, saving the time, cost and resources that would otherwise be needed to rewrite them in alternative proprietary languages.

As previously announced at the International Supercomputing Conference 2011 in Hamburg, Germany, Intel’s goal is to deliver exascale-level performance by 2018 (more than 100 times the performance currently available) while requiring only two times the power of the current top supercomputer. Fundamental to achieving that goal is working closely with the HPC community, and today Intel announced several new initiatives intended to support it.
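The stated goal implies a specific improvement in energy efficiency. A minimal sketch, assuming exactly 100 times the performance at exactly two times the power (the article says "more than 100 times", so the result is a lower bound); the function name is hypothetical.

```python
def efficiency_gain(performance_factor: float, power_factor: float) -> float:
    """Return the implied improvement in performance per watt.

    Both arguments are multipliers relative to the current top system.
    """
    if power_factor <= 0:
        raise ValueError("power_factor must be positive")
    return performance_factor / power_factor


# Assumed inputs taken from the article's wording: 100x performance, 2x power.
gain = efficiency_gain(performance_factor=100.0, power_factor=2.0)
print(f"Implied performance-per-watt improvement: at least {gain:.0f}x")
```

In other words, meeting the 2018 target would take roughly a fiftyfold gain in performance per watt over the current top system, not just more hardware.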

Tags: Intel, Knights Corner, Knights Ferry, Larrabee, 22nm, MIC, Exascale

Discussion

Comments currently: 20
Discussion started: 11/15/11 06:58:10 PM
Latest comment: 03/19/12 07:50:10 PM

1. 
Wow, low power in a 50-core processor. That's new. Fermi will have a hard time competing, being the less efficient processor per watt.
2 1 [Posted by: xentar  | Date: 11/15/11 06:58:10 PM]

 
KC is great for Double-Precision performance. So what? HD5870 and HD6970 smoke Fermi in that metric too in the consumer space. What about games?

Kepler (Fermi II) and the HD7000 series will be another huge jump. Intel has no chance of competing with nV or AMD in high-end graphics.
4 6 [Posted by: BestJinjo  | Date: 11/15/11 07:27:28 PM]
 
Fermi offers around 1.5 TFlops and consumes 380 watts.

Radeon HD 6970: 2.7 TFlops, consuming 360 watts.

Intel KC's power efficiency for 1 teraflop is a breakthrough, which neither Fermi nor ATI has.

It's not designed for desktop use. I'm not sure if I see it right: a 1 TFlops machine at only 20 watts? Even if that's an estimate and we add 50 watts for the sake of argument, it's still a big difference from today's offerings.
2 4 [Posted by: xentar  | Date: 11/15/11 08:45:00 PM]
 
That's the estimate for exascale HPC in 2018, if you want power levels similar to today's and don't want to own a nuclear reactor.
0 0 [Posted by: Zool  | Date: 11/15/11 11:27:41 PM]
 
KNC isn't for gaming.
Comparing DP numbers of desktop video cards is pointless. 6970 is faster than geforce fermi cards because nvidia intentionally capped their DP performance, so that they won't interfere with tesla sales.
An uncapped 570 would have a 702 GFLOPS DP number, and a 6970 would do 675 GFLOPS. Still, they are pretty close.

You're right about Kepler and HD 7000. They'll improve DP performance at least twofold and will be available for purchase in H1 2012.
KNC is nowhere to be seen.
1 0 [Posted by: eddman  | Date: 11/16/11 04:33:14 AM]
 
That 20W you see there is NOT the power consumption of KNC. It is not known yet. Read this:

http://hothardware.com/Ne...ghts-Corner-MIC-Products/
0 0 [Posted by: eddman  | Date: 11/16/11 04:10:18 AM]

2. 
This is nothing special. If you put an Nvidia or AMD chip on 22nm tri-gate, it would make this Intel KC look like a toy.

Both Nvidia and AMD are moving to 28nm chips that will be on sale for private or commercial use first thing next year, with double the power in both compute and parallel workloads at half the power usage, on a larger process node than Intel's KC.

So even much smaller companies can make super-large Intel look like a child trying to play with the big boys. So Intel is looking forward to 2018 to offer these products??

At the rate graphics cards change and improve, KC will look like a pocket calculator in comparison come 2018.
3 3 [Posted by: vid_ghost  | Date: 11/15/11 10:47:49 PM]

 
This product is for 2012. 2018 will see one a hundred times more powerful.

And even if NVIDIA and AMD could beat Intel on a 22nm trigate, fact is they don't have access to it.
0 0 [Posted by: ET3D  | Date: 11/15/11 11:21:55 PM]

3. 
Co-processor means a second CPU for calculations only,
not a GPU.
Graphics cards??
The Fermi compute card is not a GeForce gaming toy,
and the 1,5gflop they quote is an estimate,
not high precision; 6xx GFlops is closer to real math performance.
1 0 [Posted by: skvx  | Date: 11/15/11 11:21:35 PM]

4. 
Wait, what?

The Tesla C2090 = 665 TFlops today.

Some future Intel 22nm chip will manage a stunning 1 TFlop? Only 665 times slower than what Nvidia has been shipping?

I think someone has their units wrong.

0 0 [Posted by: sbike  | Date: 11/15/11 11:48:09 PM]

 
You misread something.

A single Tesla C2070 = 1.2 TFlops SP, while DP is 515.2 GFlops.

665 TFlops? It would take only 10 of those to reach 6 petaflops.


1 0 [Posted by: xentar  | Date: 11/16/11 12:43:33 AM]

5. 
With AVX2 being almost equivalent to LRBni, they should instead go for a homogeneous architecture next.

By executing 1024-bit instructions on 256-bit AVX execution units, in four cycles, we'd implicitly get access to four times the physical register space, and thus four times the latency hiding. This is how GPU architectures work today, but with a 1024-bit successor to AVX2 that would become available at the heart of the CPU core. Note that the AVX encoding is readily extendible to 1024-bit.

Furthermore, since it would lower the instruction rate (without lowering the data throughput), it would allow clock-gating the front-end and thus save on power consumption.

This way all the advantages of GPU architectures would be unified into the CPU architecture. We'd get superior performance due to less inter-core traffic, and better data locality due to requiring fewer threads.
0 0 [Posted by: c0d1f1ed  | Date: 11/16/11 05:39:30 AM]

6. 
Great achievement by Intel there but it looks more like a paper presentation than proper launch. I'd like to see some real numbers there .. some realworld applications where this KC Chip shows a performance 50% better than nVIDIA's Tesla.

Besides, if this requires my servers to buy who knows what adapter cards that surpass the price of the simple addition of Tesla cards and some "non free" recompiling of our software to optimize for Intel's KC Chip ... I don't think the board would ever approve spending on Intel's new toy.
0 0 [Posted by: East17  | Date: 11/16/11 04:10:03 PM]

7. 
The factors that make or break this product were left out of the article, what the power consumption / and die size was to get 1tflop performance.
0 0 [Posted by: cashkennedy  | Date: 11/16/11 06:16:11 PM]

8. 
a product still a year away on a new production process beats a 3 year old design on a 2 year old production process...

ya, not all that impressive.
0 1 [Posted by: Countess  | Date: 11/21/11 04:11:46 AM]

9. 
I think that you all missed an important point here. A big factor that makes KC very appealing is in its ability to use existing x86 based applications, which directly addresses one of the most challenging problems to face the High Performance Computing (HPC) community (i.e. the ability to write a program that can actually utilize the entire system). Case in point, the latest Chinese supercomputer boasted the number one spot on the top500 list, but they did not have a scientific program that would actually utilize the entire system, which makes the machine almost useless. The ultimate goal of investing the time and MONEY into building a HPC system is to do science. If a new HPC system were developed based on KC, which one is supposed to come online in 2013, scientists should be able to install their existing parallel scientific codes and use the entire system almost from day one. This would be extremely beneficial to everyone, including all you gamers, as it is the science that provides the new technology that everyone loves.
0 0 [Posted by: bm  | Date: 01/06/12 12:38:35 PM]

11. 
Liquid Nitrogen Overclocking makes the fastest desktop computers in the world that you can actually buy TODAY. How about 6.0 GHz performance 24x7? Their 5.2 GHz i7-3960X is just that; 6.0 GHz on the Gulftown architecture scale (i7-980X) and that chip was awesome even at stock speeds. They also lay claim to the fastest chess computer in the world. See the link at http://www.liquidnitrogenoverclocking.com for more information.
0 0 [Posted by: XOR_gate  | Date: 03/19/12 07:50:10 PM]