http://forums.amd.com/dev...d=208&threadid=112934
Furthermore, you could probably combine those 2x 128-bit FP units to execute one 256-bit instruction.
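To make the idea concrete, here is a rough conceptual sketch of what "combining" could mean: one 256-bit multiply-accumulate is split into a low and a high 128-bit half, and each half goes to its own unit. This is only a model with Python lists standing in for vector lanes. The function names `fmac_128` and `fmac_256_split` are made up for illustration, and nothing here claims to describe how AMD's hardware actually schedules such an operation.

```python
# Conceptual sketch only: lists of floats stand in for vector registers.
# Assumption: lanes are 32-bit floats, so 128 bits = 4 lanes, 256 bits = 8 lanes.
LANES_128 = 4
LANES_256 = 8


def fmac_128(a, b, c):
    """Hypothetical 128-bit FMAC unit: computes a*b + c on four lanes.

    Note: Python rounds the multiply and the add separately, so this does
    not model the single rounding of a true fused multiply-add.
    """
    assert len(a) == len(b) == len(c) == LANES_128
    return [x * y + z for x, y, z in zip(a, b, c)]


def fmac_256_split(a, b, c):
    """Run one 256-bit multiply-accumulate as two independent 128-bit halves."""
    assert len(a) == len(b) == len(c) == LANES_256
    low = fmac_128(a[:LANES_128], b[:LANES_128], c[:LANES_128])
    high = fmac_128(a[LANES_128:], b[LANES_128:], c[LANES_128:])
    # Concatenate the halves back into one 8-lane result.
    return low + high
```

The two halves do not depend on each other, which is why two 128-bit units could in principle work on them at the same time.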

AMD Discloses First Details About Next-Generation Bulldozer Processor
[11/12/2009 03:44 PM] Advanced Micro Devices this week disclosed the first details about its next-generation Bulldozer processor, which is due in 2011. Although the chip's specifications look rather promising at this point, in about one and a half years the central processing unit (CPU) may face too strong a rival and repeat the history of its predecessors.
Based on the information provided by AMD during its annual Analyst Day in November, the first Bulldozer chip, code-named Zambezi (which belongs to the Orochi family, according to the firm), will feature eight x86 processing engines with multithreading technology, two 128-bit FMAC floating point units, shared L2 cache, shared L3 cache and an integrated memory controller. AMD also states that the new CPU will feature “extensive new power management innovations”.

The implementation of 128-bit FMAC is quite logical: AMD’s SSE5 set of extensions does feature 128-bit multimedia instructions as well as 128-bit three-operand instructions. In fact, there has been a trend toward wider floating point instructions, as we can observe from the last decade.
What is important to note is that Intel Corp.’s forthcoming Sandy Bridge processor features Advanced Vector Extensions (AVX), which support 256-bit FP operations, something very progressive. Both AMD and Intel have already released documentation regarding AVX and SSE5 for developers, but Intel managed to release a new compiler supporting AVX in June ’09, whereas AMD has not yet rolled out an SSE5-supporting tool. As a result, the vast majority of developers are already capable of creating AVX-capable software; however, almost no designers can make SSE5-capable programs at the moment.
Nevertheless, based on the diagram that AMD demonstrated, the company intends to dramatically improve the multithreading performance of its CPUs: two INT schedulers, an FP scheduler and separate data caches for each of four cores should do the job very well.
AMD has not released any performance data for the Bulldozer chip, unfortunately, but since the chip designer positions the unit as a solution for desktops and servers in 2011, it does expect this 32nm SOI powerhouse with high-k metal gate to be a high performer.
Tags: AMD, Bulldozer, Zambezi, Orochi, SSE5, AVX, 32nm
