News
 


The graphics business unit of Advanced Micro Devices this week unveiled its new graphics processing unit (GPU), code-named Cayman, which significantly changes the architecture of ATI Radeon graphics processors but brings a relatively small performance improvement over the previous generation of products.

“AMD Radeon HD 6900 series graphics feature AMD’s second-generation, DirectX 11-capable architecture, new image quality improvements and up to 2GB of graphics frame buffer, making it a great choice for gamers and enthusiasts,” said Matt Skynner, corporate vice president and general manager of GPU division at AMD.

The Radeon HD 6900 "Cayman" GPU sports a number of major architectural changes compared to the previous-generation graphics processors.

  • Firstly, Cayman has two independent graphics engines, each with its own vertex assembly, geometry assembly, tessellation, backface culling, clipping and rasterization/HyperZ units, among others. As a result, the chip has a peak throughput of 2 primitives per clock while maintaining a peak rasterization rate of 32 pixels per clock, a significant improvement over the previous-generation products.
  • Secondly, the Radeon HD 6900 changes the stream core (SC) to the so-called VLIW4 (very long instruction word) architecture. Previous-generation graphics processors featured VLIW5 stream cores, each of which had four simple arithmetic logic units (ALUs, or processing elements, as developers sometimes call them) for simple operations and one so-called transcendental ALU capable of performing one complex instruction per clock. While that architecture survived for over four years, engineers from the company claim that it was almost impossible to utilize all five ALUs at once because of difficulties with register management and the natural complexity of writing a program that could keep all five units busy. The four new ALUs are neither simple nor complex and can perform up to 4 FMA (or 4 MAD, 4 MUL, 4 ADD, etc.) operations per clock. According to developers, such an architecture saves around 10% of die area while maintaining similar performance, and it simplifies scheduling and register management. As a result of the SC architectural change, each SIMD of the Cayman processor has 64 stream processors. AMD claims that the maximum number of SIMDs per chip is 24 (1536 ALUs), but earlier rumours claimed that it could be as high as 30.
  • Thirdly, the new chip supports so-called asynchronous dispatch, which allows the GPU to perform multiple completely independent tasks from completely different applications in parallel. For example, if a video game today requires the GPU to process both graphics and physics effects, the GPU has to compute physics first and then process graphics. In the case of Cayman, it is possible to assign certain SIMD [single instruction, multiple data] engines to certain tasks. Unfortunately, the DirectX 11 and OpenCL 1.1 application programming interfaces do not support this capability, which is why it will be exposed only in the future.
  • Fourthly, AMD introduces an intelligent power tuning technology called PowerTune, which automatically adjusts GPU power draw by dynamically controlling clock-speeds.
  • Additional improvements include new render back end units, support for higher-quality antialiasing and improvements aimed at general purpose computing on GPUs.
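
The stream-processor counts quoted in the list above follow from simple multiplication. A minimal Python sketch, assuming that the article's "stream processors" are individual ALUs; the function and constant names are illustrative and not AMD's terminology:

```python
# Illustrative arithmetic only; the figures are the ones quoted in the article.
ALUS_PER_VLIW4_STREAM_CORE = 4     # four ALUs per VLIW4 stream core
STREAM_PROCESSORS_PER_SIMD = 64    # stated for Cayman


def stream_cores_per_simd(alus_per_simd, alus_per_core):
    """Number of stream cores that make up one SIMD engine."""
    return alus_per_simd // alus_per_core


def total_alus(simd_count, alus_per_simd=STREAM_PROCESSORS_PER_SIMD):
    """Total ALU count for a chip with the given number of SIMD engines."""
    return simd_count * alus_per_simd


cores_per_simd = stream_cores_per_simd(
    STREAM_PROCESSORS_PER_SIMD, ALUS_PER_VLIW4_STREAM_CORE
)
cayman_alus = total_alus(24)    # the 24-SIMD maximum claimed by AMD
rumoured_alus = total_alus(30)  # the 30-SIMD configuration from earlier rumours

print(cores_per_simd, cayman_alus, rumoured_alus)
```

Under these assumptions each SIMD holds 16 VLIW4 stream cores, the 24-SIMD chip has the 1536 ALUs quoted by AMD, and the rumoured 30-SIMD configuration would have had 1920.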

 

“Delivering DirectX 11 performance with intelligent tessellation, image quality improvements with new anti-aliasing modes and AMD PowerTune technology, we believe AMD Radeon HD 6900 series graphics cards will make excellent gifts this holiday season,” added Mr. Skynner.

Initially, AMD plans to release two graphics cards based on the code-named Cayman GPU:

  • Radeon HD 6970 – 1536 stream processing units, 96 texture units, 32 render back ends, 880MHz clock-speed and 256-bit memory bus. The chip incorporates 2.64 billion transistors, has a die size of 389mm² and a maximum compute performance of 2.7TFLOP/s SP and 0.675TFLOP/s DP. The card can connect up to five displays at once (using 2x DVI, 2x mDP + HDMI connectors) and carries 2GB of 5.5GHz GDDR5 memory. The recommended e-tail price is $369.
  • Radeon HD 6950 – 1408 stream processing units, 88 texture units, 32 render back ends, 800MHz clock-speed and 256-bit memory bus. The chip has a maximum compute performance of 2.25TFLOP/s SP and 0.563TFLOP/s DP. The card can connect up to five displays at once (using 2x DVI, 2x mDP + HDMI connectors) and carries 2GB of 5.0GHz GDDR5 memory. The recommended e-tail price is $299.
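
The peak compute figures quoted for the two cards can be roughly reproduced from the ALU count and the clock speed. The sketch below rests on three assumptions that the article does not state: that one FMA counts as two floating-point operations, that double precision runs at a quarter of the single-precision rate (inferred from the ratio of the quoted figures), and that "5.5GHz GDDR5" means an effective 5.5Gbit/s per pin. The memory bandwidth values are derived here, not quoted by AMD, and all names are illustrative.

```python
def peak_tflops(alus, clock_mhz, flops_per_alu_per_clock=2):
    """Peak single-precision TFLOP/s, counting one FMA as 2 operations."""
    return alus * flops_per_alu_per_clock * clock_mhz * 1e6 / 1e12


def memory_bandwidth_gbs(bus_width_bits, data_rate_gbps):
    """Peak memory bandwidth in GB/s for a given bus width and per-pin rate."""
    return bus_width_bits / 8 * data_rate_gbps


DP_RATIO = 0.25  # assumption inferred from the quoted SP and DP figures

# (ALUs, core clock in MHz, effective memory data rate in Gbit/s per pin)
cards = {
    "first card (880MHz)": (1536, 880, 5.5),
    "second card (800MHz)": (1408, 800, 5.0),
}

for name, (alus, clock_mhz, data_rate) in cards.items():
    sp = peak_tflops(alus, clock_mhz)
    dp = sp * DP_RATIO
    bandwidth = memory_bandwidth_gbs(256, data_rate)
    print(f"{name}: {sp:.2f} TFLOP/s SP, {dp:.3f} TFLOP/s DP, {bandwidth:.0f} GB/s")
```

With these assumptions the results land close to the quoted numbers: roughly 2.70 and 2.25 TFLOP/s in single precision, with small differences in the last digit of the double-precision values due to rounding.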

Even though the new-generation graphics cards offer fewer stream processing elements than their predecessors from the Radeon HD 5800 series, they offer the same compute performance and even manage to leave the previous-generation products behind in actual games, based on an early look at the graphics boards' performance from X-bit labs.

The Cayman graphics processor was supposed to be made using 32nm process technology at Taiwan Semiconductor Manufacturing Company. Unfortunately for AMD's ATI division, TSMC scrapped the 32nm fabrication process, and the chip designer had to redesign the chip and trim the number of its processing elements so as to ensure high yields and moderate costs. Had it been made using 32nm technology, it would have incorporated more SIMD engines and stream processors, which would have ensured much higher performance.

ATI Radeon HD 6970 2GB and ATI Radeon HD 6950 2GB are available now from various suppliers across the globe, according to AMD.

Tags: ATI, Radeon, Cayman, Northern Islands, AMD, DirectX, 40nm

Discussion

Comments currently: 2
Discussion started: 12/17/10 02:05:08 AM
Latest comment: 12/19/10 09:19:57 AM


1. 
I wonder if AMD will roll out a 28nm based 6000 series card like they did with the 4770, 40nm v 55nm for the other 4000 series cards, to see how they are?
0 0 [Posted by: GavinT  | Date: 12/17/10 02:05:08 AM]

2. 
Call me unimpressed. I will commend ATi for designing such an efficient GPU as they've obtained a lot of performance by increasing the number of transistors only by 25% but I'm not sure that the VLIW4 bet is one that they'll win. I hope they will and I hope that, as soon as the 28nm process becomes available, they'll hit us with a 3,5 billion chip that will rock the GPU Compute world.

The thing is that they still don't improve their tessellation performance clearly and they still have applications where their older VLIW5 tech beats the newest flagship. And this is something that should NEVER HAPPEN on a launch of a new product.

Congratulations for surpassing the 32nm cancellation obstacle but if you decide to build a BIG chip, then why the f don't you go just a tad bigger, making sure that your older architecture isn't equal or better to the current one?

I'm not sure how many aspects were taken into account when deciding onto the current design but I'd have waited just a little bit more and launched a "refreshed" HD 5800 version before going VLIW4 and make the smaller 6870, the experimental design.

Optimize the software and SDK for it. See how it works out and then introduce VLIW4 into the BIG chip design if everything went well and the move proved to be a good one performance wise.

I'm scared that AMD could lose this bet and, as their ATi division is the ONLY division scoring any success story right now, if they fail, AMD could very well be doomed. And we, as paying consumers along with it.
0 0 [Posted by: East17  | Date: 12/19/10 09:19:56 AM]
