After a relatively calm 2009, the year 2010 could have been called a storm. But only could, because 2010 was just the calm before the storm. The year 2011 not only promises to bring the industry's natural changes (acquisitions, bankruptcies, mergers and scandals), but also to fulfill what 2010 left unfinished: to introduce new realities and trends and to make companies prove their plans.
A number of unprecedented things happened this year: Nvidia disclosed its short-term and long-term roadmaps and, while no details were revealed, the direction of the company became even more obvious; OCZ Technology decided to drop its inexpensive memory module business to concentrate on solid-state drives; Panasonic grabbed exclusive rights to the Blu-ray 3D version of the Avatar movie; and ex-Sony executives expressed doubts that the PlayStation 3 could win the war against the Xbox 360 and Wii. However, those events and claims are hardly of major importance.
There are news-stories that you, our readers, read more than others. We picked those stories, analyzed them and chose the ones that actually introduced something truly important to the computing industry. Naturally, we combined several news-stories into single news-topics. This time we did not try to evaluate the importance of individual events; for us, they are all equally substantial.
This editorial is not about revealing news, but about recapping the events that most of you read about this year, events that made 2010 different and, perhaps, will make it into history.
Intel Cans Larrabee Graphics, Ships 48-Core Microprocessor, Touts 1000-Core Chip
Designing a graphics processor is definitely not an easy task, a lesson that Intel Corp.'s engineers learnt over 2009 and 2010. First, Intel "delayed" the roll-out of its code-named Larrabee graphics chip; then the company scrapped the project outright and officially said it was unlikely to release it as a discrete graphics product. But while the GPU from Intel is essentially dead, the company outlined plans for multi-core commercial accelerators for high-performance computing (HPC) systems and started to ship its single-chip cloud computer (SCC) processor with 48 cores to researchers.
“We will not bring a discrete graphics product to market, at least in the short-term. […] We are also executing on a business opportunity derived from the Larrabee program and Intel research in many-core chips. This server product line expansion is optimized for a broader range of highly parallel workloads in segments such as high performance computing. We will also continue with ongoing Intel architecture-based graphics and HPC-related R&D and proof of concepts,” said Bill Kircos, director of product and technology media relations at Intel.
Multi-Core Chips for HPC
HPC is a segment that will benefit tangibly from many-core architectures. This year IBM and other makers of HPC servers unveiled machines featuring Nvidia Corp.'s Tesla 2000-series many-core computing processors, which are designed specifically to rival x86 offerings from Advanced Micro Devices and Intel as well as non-x86 processors on the high-performance computing market. AMD also admits that eventually accelerators like AMD FireStream or Nvidia Tesla will substantially rival traditional chips.
But during the year Intel made it clear that it has plenty of trumps up its sleeve. Intel started to ship a system running a 48-core x86 microprocessor to select software developers in a bid to encourage interest in many-core x86 central processing units (CPUs) with the help of an actual personal computer and chip. Intel also began to ship its code-named Knights Ferry software development kit to parties interested in creating applications compatible with Intel's dedicated many-core HPC accelerators based on the MIC (many integrated core) architecture.
The Knights Ferry chip appeared to be what was previously known as the Larrabee GPU: it has 32 x86 cores clocked at 1.2GHz, each featuring four-way Hyper-Threading. The board, aimed at PCI Express 2.0 slots, carries up to 2GB of GDDR5 memory. The chip itself has 8MB of shared L2 cache, which is intriguing in itself, since highly parallel applications do not usually require a large on-chip cache.
During SC10, Intel conducted demonstrations showcasing the real-world capabilities of Knights Ferry. These included using the Intel MIC architecture as a co-processor running financial derivative Monte Carlo demonstrations that boasted twice the performance of those conducted with prior-generation technologies. The Monte Carlo application for Intel MIC was generated using standard C++ code with an Intel MIC-enabled version of the Intel Parallel Studio XE 2011 software development tools, demonstrating how applications for standard Intel CPUs can scale to future Intel MIC products. Intel also showcased compressed medical imaging developed with the Mayo Clinic on Knights Ferry. The demonstration used compressed signals to rapidly create high-quality images, reducing the time a patient has to spend having an MRI. Even earlier the company had demoed real-time ray-tracing rendering on its MIC development platform.
Intel Knights Ferry, image by ComputerBase.de
Perhaps MIC is exciting, but will it bring profits? To justify developing such architectures, it is necessary to sell them across a broad set of markets; meanwhile, the HPC market still relies on CPUs, which is why graphics processor vendors cannot generate high revenue there. Luxuries come and go; a long-term-oriented strategy stays.
"I do not see the economic model [with MIC]. We are able to produce those [Tesla] GPGPUs because the ultimately there is one GPU for GeForce consumer graphics, Quadro professional business. It costs $500 million to $1 billion to develop those new products every year, it is a hyge investment. Unless you have that [consumer and professional] economic engine in the background, I cannot imagine how one could make a GPU without having a graphics business," said said Sumit Gupta, product manager at Nvidia's Tesla business unit, in an interview with X-bit labs.
Single-Chip Cloud Computer
The Intel MIC architecture will indisputably find its place under the sun two to three years from now, and many of its elements will be akin to those of today's microprocessors and graphics cards. But the SCC chip, which is actually a small breakthrough of 2010, seems to carry a number of concepts that will become parts of both short-term and long-term processors.
The SCC prototype chip contains 24 tiles with two x86 cores each, which results in 48 cores – the largest number ever placed on a single piece of silicon. Each core can run a separate OS and software stack and act like an individual compute node that communicates with other compute nodes over a packet-based network. Every core sports its own L2 cache, and each tile includes special router logic that allows tiles to communicate with each other over a 24-router mesh network with 256GB/s bisection bandwidth. There is no hardware cache coherence support among cores, in order to simplify the design, reduce power consumption and encourage the exploration of datacenter-style distributed-memory software models on-chip. Each tile (2 cores) can have its own frequency, and groupings of four tiles (8 cores) can each run at their own voltage. The processor sports four integrated DDR3 memory controllers, or one controller per twelve cores.
Intel calls the x86 cores inside the SCC “Pentium-class” cores since they are superscalar in-order execution engines, but stresses that they are not the cores used inside the original Pentium (P5) processors: they have been enhanced to achieve certain goals and make the design suitable for implementation in the experimental chip. Considering that the SCC lacks any floating-point vector units, the raw horsepower of the chip is relatively weak.
The Intel SCC is not supposed to become an actual product by definition. The design, peculiarities and single-thread performance of the prototype will hardly satisfy actual users. The chip is purely a prototype that will help Intel and software developers determine directions for the future development of microprocessors and software.
"The SCC is a research vehicle, we wanted it be as experimental platform as possible. Having this architecture, we have software data flow, management of execution; it is much better – for a development platform – to have this kind of capability rather than to have a fixed-function unit. Maybe, a fixed-function [data scheduler] is more efficient, but having this program [allows us to] give more flexibility to software organizations," said Sebastian Steibl, the director of Intel Labs Braunschweig, which is a part of global Intel Labs organization.
Apparently, the SCC was needed not only to test the abilities of software makers, but also to explore how many simplistic x86 cores could fit on a single chip. At present, a co-designer of the SCC says it could be possible to build a thousand-core processor using the same architecture within the next eight to ten years.