The future is FUSION.
The AMD FirePro™ S10000 (28nm 2D) delivers up to 5.91 TFLOPS of peak single-precision and 1.48 TFLOPS of peak double-precision floating-point performance, compared with the Nvidia Tesla K10, which is capable of up to 4.58 TFLOPS of peak single precision and 190 GFLOPS of peak double precision. It is even better than the latest K20X (28nm 2D) and Xeon Phi (22nm 3D) in double-precision performance. The best thing about the S10000 is its unified design.
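To put the S10000 vs. K10 numbers side by side, here is a quick sketch that just divides the peak figures quoted above. The dictionary and the `peak_ratio` helper are my own names for illustration, and these are vendor peak numbers, not measured results.

```python
# Peak figures as quoted in the comment above, converted to GFLOPS.
# These are theoretical peaks, not benchmark results.
PEAK_GFLOPS = {
    "FirePro S10000": {"single": 5910.0, "double": 1480.0},
    "Tesla K10": {"single": 4580.0, "double": 190.0},
}


def peak_ratio(card_a, card_b, precision):
    """Return card_a's quoted peak divided by card_b's for one precision."""
    return PEAK_GFLOPS[card_a][precision] / PEAK_GFLOPS[card_b][precision]


if __name__ == "__main__":
    for precision in ("single", "double"):
        ratio = peak_ratio("FirePro S10000", "Tesla K10", precision)
        print(f"{precision} precision: S10000 peak is {ratio:.2f}x the K10 peak")
```

On those quoted figures the gap is modest in single precision (about 1.29x) and large in double precision (about 7.79x).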
AMD can do even better than Xeon Phi with modified ARMv8 cores, which would mean a lower TDP and a smaller die size (one of the most important factors for building denser servers).
I'm sure Intel will use 3D memory stacking to achieve that performance at a lower TDP. By that time AMD will also have access to 14nm-XM, and as for 3D memory, AMD has had that running successfully in its labs since 2011.
Actually, all of this is beside the point because of AMD's heterogeneous APU/SoC. Scientists have said that with the help of HSA APUs/SoCs it's possible to build a petascale supercomputer at one third of the power consumption, which is beyond the reach of Xeon Phi, Tesla, or FirePro.
AND FOR EVERYONE'S INFORMATION: IN 2013 AMD WILL LAUNCH AN SOC DELIVERING 25 GFLOPS/WATT, WHICH IS EVEN HIGHER THAN THE 2015 XEON PHI AND NVIDIA TESLA, BUT WE CAN'T BUY THAT AMD SILICON. IT'S ONLY FOR SOME PARTNERS.
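For a sense of scale, here is a rough sketch of what 25 GFLOPS/watt would mean for a petascale machine. It assumes the efficiency figure holds for the whole system and ignores cooling, networking, and storage, so treat it as a lower bound on power and not a real design number. The `power_watts` helper is just an illustrative name.

```python
PETAFLOP = 1e15  # 1 PFLOPS expressed in FLOPS


def power_watts(target_flops, gflops_per_watt):
    """Power needed to reach target_flops at a given efficiency.

    Assumes the quoted GFLOPS/watt applies to the whole system,
    which is an idealization.
    """
    return target_flops / (gflops_per_watt * 1e9)


if __name__ == "__main__":
    watts = power_watts(PETAFLOP, 25.0)
    print(f"1 PFLOPS at 25 GFLOPS/W needs about {watts / 1000:.0f} kW")
```

Under those assumptions, 1 PFLOPS works out to about 40 kW for the compute silicon alone.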