Intel Corp. said its code-named "Knights Corner" (KNC) multi-core chip is on track with the undisclosed Intel roadmap. The first supercomputer powered by the KNC will be, as planned, turned on in 2013.

"Knights Corner is in great shape and is exactly where it has to be according to our internal schedule. We have not disclosed any information related to production or launch date of 'Knights Corner'," said Radoslaw Walczyk, a spokesman for Intel.

Earlier this month, Diane Bryant, vice president and general manager of Intel's server product group, said during a public Xeon E5 discussion that MIC was "set to go into production in about a year". Apparently, Ms. Bryant was talking about the 10PFLOPS “Stampede” supercomputer at TACC, which is based on Intel MIC Knights Corner accelerators and Intel Xeon E5 processors and is scheduled to be powered on in early 2013.

Intel's Knights Corner accelerator has over 50 cores and delivers 1TFLOPS of double-precision floating-point performance, as measured by the double-precision general matrix-matrix multiplication benchmark (DGEMM). Currently the most powerful special-purpose highly parallel accelerator is Nvidia's Tesla 2090, which offers 665GFLOPS (0.665TFLOPS) of peak performance, considerably below the performance of Intel's KNC.
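To put the DGEMM figure in perspective: multiplying an m×k matrix by a k×n matrix takes roughly 2·m·n·k floating-point operations, so a given rate maps directly to a run time. The sketch below is only illustrative. The function names and the 10,000×10,000 matrix size are assumptions, and it treats both quoted figures (1TFLOPS measured for KNC, 665GFLOPS peak for the Tesla) as if they were sustained rates, which real hardware will not necessarily reach.

```python
def dgemm_flops(m: int, n: int, k: int) -> int:
    """Approximate flop count for C = A * B, with A (m x k) and B (k x n).

    Each of the m*n output elements needs k multiplies and k adds.
    """
    return 2 * m * n * k


def seconds_at_rate(flops: int, rate_flops_per_s: float) -> float:
    """Time to finish `flops` operations at a constant rate."""
    return flops / rate_flops_per_s


# Rates quoted in the article, treated here as sustained (an assumption).
KNC_RATE = 1.0e12      # 1 TFLOPS
TESLA_RATE = 665e9     # 665 GFLOPS

N = 10_000             # hypothetical square matrix size
work = dgemm_flops(N, N, N)

knc_seconds = seconds_at_rate(work, KNC_RATE)
tesla_seconds = seconds_at_rate(work, TESLA_RATE)

print(f"{work:.1e} flops: KNC {knc_seconds:.2f} s, Tesla {tesla_seconds:.2f} s")
```

Under these assumptions the gap is about 1.5x, which is a ratio of headline numbers and not a prediction for any real application.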

When completed in early 2013, Stampede will comprise several thousand Dell "Zeus" servers, each with two eight-core Intel Xeon E5-series "Sandy Bridge-EP" processors and 32GB of memory. In addition, the supercomputer will integrate Intel MIC "Knights Corner" accelerators (with 50+ cores and made using 22nm process technology) to process highly parallel workloads. Several thousand Xeon processors will offer about 2PFLOPS of peak performance, whereas the MIC highly parallel accelerators will provide an additional 8PFLOPS of performance.

Furthermore, Stampede will offer 128 next-generation Nvidia Quadro graphics processing units (GPUs) code-named Kepler for remote visualization, 16 Dell servers with 1TB of shared memory and 2 GPUs each for large data analysis, and a high-performance Lustre file system for data-intensive computing. All components will be integrated with an InfiniBand FDR 56Gb/s network for extreme scalability.

Altogether, Stampede will have a peak performance of 10PFLOPS, 272TB of total memory, and 14PB of disk storage.
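The headline number is simply the sum of the two compute pools described above. A small sketch of that arithmetic follows; the variable names are illustrative, and the accelerator count is a back-of-the-envelope guess that assumes each Knights Corner card contributes about 1TFLOPS, which the article does not state.

```python
# Peak figures quoted in the article, in PFLOPS.
XEON_PFLOPS = 2.0   # Xeon E5 processors
MIC_PFLOPS = 8.0    # Knights Corner accelerators

total_pflops = XEON_PFLOPS + MIC_PFLOPS
mic_share = MIC_PFLOPS / total_pflops

# Assumption (not from the article): roughly 1 TFLOPS per accelerator card.
ASSUMED_TFLOPS_PER_CARD = 1.0
approx_cards = MIC_PFLOPS * 1000 / ASSUMED_TFLOPS_PER_CARD

print(f"total: {total_pflops} PFLOPS, accelerator share: {mic_share:.0%}")
print(f"implied accelerator count (assumed): {approx_cards:.0f}")
```

Note that four fifths of the quoted peak would come from the accelerators, so the usable performance of the machine depends heavily on how well applications can be ported to them.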

Stampede will be one of the world's most powerful supercomputers when completed.

Tags: Intel, Knights Corner, Knights Ferry, Larrabee, 22nm, MIC, Exascale


Comments currently: 10
Discussion started: 04/08/12 03:04:10 PM
Latest comment: 04/29/14 06:24:35 PM


So Intel's Knights Corner, which won't even be in production until April 2013, can beat a two-year-old Nvidia Tesla 2090 that was released on May 17, 2011.

Color me Unimpressed.

P.S. Kepler BigK GK110 will be available this year so when/if Knights Corner is ever released it will already be behind.
0 1 [Posted by: s23e7h4kf936hklnf7y8b  | Date: 04/08/12 08:45:38 PM]

Knights Corner MIC is not about maximum throughput in benchmark FLOPs but about the FLOPs it can deliver in more complex algorithms. If you only need raw FLOPs then Nvidia cards are the cheapest way to get there, but Knights Corner has its own niche between x86 CPUs and Tesla.
4 0 [Posted by: Andys  | Date: 04/08/12 09:09:48 PM]
It's just another marketing blab to sell millennium-old technology to some new dumba*ss kids who could only shift some money out of their sugar daddies' pockets (govs, brokers, shabbily funded non-profit orgs, etc.)

Believe it or not, it would sell itself like cotton candy. No matter how INEFFICIENT it really is.

Specialized algorithms for CUDA? C'mon, grow up. Since Ferminator, CUDA is a self-sustainable multicore supercruncher just like any ARM or x86 CPU.
0 2 [Posted by: OmegaHuman  | Date: 04/09/12 09:40:01 AM]
Heterogeneous computing makes for nice benchmarks, but taking advantage of it for any real purpose is difficult. Ask anyone that programmed for the Cray Jaguar XT5--it was a very, very expensive computer that accomplished the political goal of putting the US back in the #1 spot on the TOP500 list, but the power went nearly unused since only a few types of tasks could take advantage of the theoretical horsepower. It wasn't long ago that it claimed that #1 TOP500 spot, yet they're already swapping out the unholy x64/PPC/Cell architecture for a more usable x64/GPU hybrid (this is a lightly-publicized fact... essentially, it was a taxpayer boondoggle project).

The point is that higher FLOPs =/= higher usable performance. CPU/GPU hybrid solutions are better than the x64/PPC/Cell combo in Jaguar, but Knights Corner potentially offers peak performance across a broader spectrum of scenarios. There is also a lot to be said for the time and budget it takes to develop and debug a working solution for a machine of this size. Top peak performance makes for nice headlines, but it's wasteful and unproductive at best if it can't be used due to architectural and practical realities.
5 0 [Posted by: bluvg  | Date: 04/09/12 01:42:53 AM]
Exactly my thoughts when I read through the article. The Stampede arrangement looks like a heterogeneous cluster, and as such there's no way to program tasks for it that would take advantage of all its power at the same time. Single, highly parallel tasks would suffer coherency issues in a non-homogeneous environment and never reach the theoretical throughput of each of the cluster's individual units. Serialization would be the biggest obstacle to using much of its power, and it would be nearly impossible to program around efficiently. Put differently, it makes no sense to add up the processing power of each of its units and call the total the processing power of a supercomputer. It just never works that way in real-life scenarios. The sum of the specs for Stampede seems to suggest it's never going to be used in such a way anyway. Calculating the total computing power of such a cluster makes as much sense as saying the internet is the de facto number one supercomputer because the total computing power of all of its parts can't be matched by any other single machine in the world. That's just plain silly.
1 0 [Posted by: MyK  | Date: 04/09/12 07:41:24 AM]
Well, we might be unimpressed. But for most dumba*ss sys integrators this is much more appealing than any well-established CUDA environment because they could run a "gazillion" Solitaire card games w/o the need to retouch the code some dumba*ss wrote twenty years ago.

In the end it's cheaper, no matter if it burns 2 times more power for the same die space or has 2x less compute power for the same die space, which essentially makes it 4-5x MORE INEFFICIENT PER WATT.

1 2 [Posted by: OmegaHuman  | Date: 04/09/12 09:34:31 AM]

0 0 [Posted by: s23e7h4kf936hklnf7y8b  | Date: 04/09/12 06:25:46 AM]

The basic computation units in MIC are essentially Pentium (aka 80586) processors, so you get the advantages of both the x86 instruction set and the ease of programming on a traditional CPU over the Nvidia solutions.
0 0 [Posted by: gnaw89  | Date: 04/09/12 08:50:02 AM]

Some of the discussions around programming the upcoming MIC chips leave me scratching my head – particularly the notion that, because MIC runs the x86 instruction set, there’s no need to change your existing code, and your port will come for free.


No “Magic” Compiler

The reality is that there is no such thing as a “magic” compiler that will automatically parallelize your code. No future processor or system (from Intel, NVIDIA, or anyone else) is going to relieve today’s programmers from the hard work of preparing their applications for the future.
1 0 [Posted by: s23e7h4kf936hklnf7y8b  | Date: 04/09/12 09:32:57 AM]

Maybe Intel should throw in a six-pack of Atoms with every Corner accelerator they sell... FAIL
0 0 [Posted by: alpha0ne  | Date: 04/13/12 02:44:13 AM]

We, Psychsoftpc, make Tesla-based supercomputers as partners with Nvidia, and since we are Intel partners too, I suppose we'll be making some based on MIC (key mouse). But everyone here who says parallelizing code will not be automatic is correct. A simple compile switch can't do that. If Intel is working on a tool, however, that would be a different story.
0 0 [Posted by: psychsoftpc  | Date: 04/29/12 06:27:06 AM]

6. has produced a 64 core parallel processor that is low power, low heat, and high speed.

This small company based in the Boston area is leading the charge with new designs.
Why have Arista, Dell, and HP not signed a partnership deal with this company?

Ericsson stepped up to the plate and has offered financial funding.
0 0 [Posted by: Russ Bazz  | Date: 04/29/14 06:24:35 PM]

