Penguin Computing, experts in high performance computing (HPC) solutions, on Wednesday announced that the company has successfully installed the world’s first HPC cluster powered by AMD accelerated processing units (APUs) at Sandia National Labs in Albuquerque, New Mexico.

The Altus 2A00 system comprises 104 servers interconnected through a QDR InfiniBand fabric. Each server is powered by one A-series Fusion Llano APU with four x86 cores and 320 or 400 stream processors. The cluster delivers a theoretical peak performance of 59.6 TFLOPS. The Altus 2A00 was specifically designed by Penguin Computing, in partnership with AMD, to support the AMD Fusion APU architecture. It is the world's first Fusion APU system in a rack-mountable 2U chassis.
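As a rough sanity check on the quoted peak figure, the sketch below divides it across the 104 servers and then rebuilds a per-server estimate from clock speeds. The server count, core count, stream processor count and cluster peak come from the article; the clock speeds and FLOPs-per-cycle values are assumptions chosen for illustration (the article does not state which A-series model is installed), so treat the result as a plausibility check rather than the vendor's calculation.

```python
# Figures taken from the article.
SERVERS = 104
CLUSTER_PEAK_TFLOPS = 59.6
CPU_CORES = 4
STREAM_PROCESSORS = 400

# ASSUMPTIONS (not stated in the article): illustrative clocks and
# single-precision throughput per cycle for a top-end Llano part.
CPU_CLOCK_GHZ = 2.9          # assumed CPU clock
CPU_FLOPS_PER_CYCLE = 8      # assumed per-core single-precision FLOPs/cycle
GPU_CLOCK_GHZ = 0.6          # assumed GPU clock
GPU_FLOPS_PER_CYCLE = 2      # assumed: one multiply-add per stream processor


def per_server_share_gflops() -> float:
    """Quoted cluster peak divided evenly across the servers, in GFLOPS."""
    return CLUSTER_PEAK_TFLOPS * 1000 / SERVERS


def estimated_node_peak_gflops() -> float:
    """Peak for one APU under the assumed clocks: CPU part plus GPU part."""
    cpu = CPU_CORES * CPU_CLOCK_GHZ * CPU_FLOPS_PER_CYCLE
    gpu = STREAM_PROCESSORS * GPU_CLOCK_GHZ * GPU_FLOPS_PER_CYCLE
    return cpu + gpu


def estimated_cluster_peak_tflops() -> float:
    """Estimated node peak scaled to the whole cluster, in TFLOPS."""
    return estimated_node_peak_gflops() * SERVERS / 1000


if __name__ == "__main__":
    print(f"Share of quoted peak per server: {per_server_share_gflops():.1f} GFLOPS")
    print(f"Estimated peak per server:       {estimated_node_peak_gflops():.1f} GFLOPS")
    print(f"Estimated cluster peak:          {estimated_cluster_peak_tflops():.2f} TFLOPS")
```

The quoted figure works out to about 573 GFLOPS per server. Under the assumed clocks, most of that would come from the stream processors rather than the x86 cores.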

"With the Altus 2A00, Penguin is the first to bring AMD’s unique APU capabilities to the HPC community. We are extremely proud of our successful deployment of this platform on such a large scale. We believe that the high level of integration and the resulting benefits for HPC users will further accelerate the adoption of the GPU processing model in HPC. The APU architecture has the potential to become a key component of future exascale systems," said Phil Pokorny, chief technology officer of Penguin Computing.

Numerous high-performance computing (HPC) customers have deployed systems with compute accelerator cards, such as Nvidia Tesla or AMD FireStream, that are based on graphics processing units (GPUs). Those systems combine x86 microprocessors with compute cards and are used to achieve record HPC performance in highly parallelized tasks. GPU-accelerated clusters are very complex and consume a lot of power; using chips that integrate x86 cores and a multitude of stream processors greatly reduces both. The purpose of this particular machine is not to set performance records, but to allow AMD and Penguin to understand the challenges hybrid processors face in HPC in order to create efficient hardware and software in the future.

The APU includes 400 parallel processing cores that can be leveraged for HPC applications through the OpenCL programming framework. Unlike conventional GPU server architectures, the APU's parallel multiprocessors share the same physical memory space with the CPU cores. As a result, the programming model for APUs is simpler, bottlenecks for data movement between GPU and main memory are avoided, and data duplication is eliminated. These capabilities offer particularly compelling benefits when deployed in conjunction with low-latency RDMA interconnects such as InfiniBand, as they allow for building efficient distributed GPU applications.
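The data-movement argument can be illustrated with a toy timing model. This is only a sketch: the function names, the workload sizes and the 8 GB/s link bandwidth are hypothetical values picked for illustration, and the model deliberately assumes the same kernel time in both cases, which would not hold for real hardware. It shows only how copy time adds to a discrete-GPU run and drops out when CPU and GPU share physical memory.

```python
def transfer_seconds(num_bytes: float, bandwidth_gb_per_s: float) -> float:
    """Time to move num_bytes over a link with the given bandwidth (GB/s)."""
    return num_bytes / (bandwidth_gb_per_s * 1e9)


def discrete_gpu_seconds(input_bytes: float, output_bytes: float,
                         kernel_seconds: float, link_gb_per_s: float) -> float:
    """Copy the input to the card, run the kernel, copy the result back."""
    copy_in = transfer_seconds(input_bytes, link_gb_per_s)
    copy_out = transfer_seconds(output_bytes, link_gb_per_s)
    return copy_in + kernel_seconds + copy_out


def shared_memory_seconds(kernel_seconds: float) -> float:
    """With shared physical memory the model has no copy steps at all."""
    return kernel_seconds


if __name__ == "__main__":
    # Hypothetical workload: 8 GB in, 8 GB out, a 1-second kernel,
    # and an assumed 8 GB/s host-to-card link.
    discrete = discrete_gpu_seconds(8e9, 8e9, kernel_seconds=1.0, link_gb_per_s=8.0)
    shared = shared_memory_seconds(kernel_seconds=1.0)
    print(f"discrete GPU: {discrete:.2f} s, shared memory: {shared:.2f} s")
```

In this model the copies cost twice as much as the kernel itself. The smaller the kernel time relative to the data volume, the more the copy overhead dominates.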

"We are interested in research on next generation computer architectures and look forward to collaborating with Penguin and AMD to advance power-efficient computing strategies. This first-of-a-kind cluster of Altus 2A00 servers will support our exploration of advanced programming models like OpenCL, which seamlessly map MPI applications to the CPU and GPU cores, and research into system software support for advanced data movement capabilities," said James Ang, manager of the scalable computer architectures department at Sandia National Labs.

Tags: AMD, Penguin, HPC, GPGPU, ATI, FireStream, Radeon, Fusion, Llano


Comments currently: 5
Discussion started: 11/03/11 10:17:31 AM
Latest comment: 11/04/11 06:12:46 PM


This should make many HPC customers very happy as they can use an APU instead of discrete graphics cards. There are significant cost savings, lower power consumption and lower heat output, all good things. Mainstream desktops will be using this approach in a few more years, with primarily high-end enthusiast systems using discrete video cards.
1 0 [Posted by: beenthere  | Date: 11/03/11 10:17:31 AM]

0 4 [Posted by: madooo12  | Date: 11/03/11 01:24:36 PM]

comparison purposes and fine tuning software
0 0 [Posted by: aislanluiz  | Date: 11/04/11 04:26:13 AM]

Rory should not have fired AMD's sales people; they did something unbelievable: they sold a cat in a black box to somebody very stupid.
But where are the Opteron 62xx? The year is quickly running to its end. Will we see them? Apparently, to cook just one Opteron, AMD needs to burn a few dice due to very low 32nm SOI yields.
Sorry, guys, but next year AMD will be out of the server business.
0 1 [Posted by: Azazel  | Date: 11/04/11 04:33:18 AM]

OK - a layman's view - but what if the weak link for these kinds of applications is the bus between the CPU & the GPU, & of course power consumption?

A 6800-type GPU & a sufficient CPU for task scheduling for the GPU on 1 chip at 28nm is doable for AMD in the near future. The distance separating them is tiny & the bus could be cranked up to huge speeds yet be very power frugal. PCIe would seem a power-hungry slug by comparison.
0 0 [Posted by: msroadkill612  | Date: 11/04/11 06:12:46 PM]

