ATI RADEON X800: R420 Totally Exposed (page 7)


Category: Video

by Tim Tscheblockov, Alexey Stepin, Anton Shilov

[ 05/04/2004 | 06:42 AM ]



The number of temporary registers available during shader processing has grown from 12 in R3x0 to 32 in R420. This makes R420 even less sensitive to shader complexity. As you remember, the pixel processors of NVIDIA NV3x lost much of their efficiency on complex shaders that required a lot of temporary registers.

The internal precision of floating-point calculations in the pixel processors remains the same: RADEON X800 accepts data in 16-bit and 32-bit floating-point formats, but performs all calculations in 24-bit format only.
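To get a feel for what 24-bit internal precision means next to the 16-bit and 32-bit formats, here is a minimal sketch that rounds a value to a given number of mantissa bits. It assumes the commonly cited FP24 layout of 1 sign bit, 7 exponent bits and 16 mantissa bits, which the text above does not state. The function name `quantize` is hypothetical, and exponent range and denormals are ignored.

```python
import math


def quantize(value: float, mantissa_bits: int) -> float:
    """Round value to the nearest float with the given number of explicit mantissa bits.

    Only precision is modeled; exponent range limits and denormals are ignored.
    """
    if value == 0.0 or not math.isfinite(value):
        return value
    # value == mantissa * 2**exponent, with 0.5 <= |mantissa| < 1
    mantissa, exponent = math.frexp(value)
    # +1 because the leading significant bit is implicit in the stored format
    scale = 2 ** (mantissa_bits + 1)
    return math.ldexp(round(mantissa * scale) / scale, exponent)


# Assumed explicit mantissa widths: FP16 = 10, FP24 = 16, FP32 = 23
FORMATS = (("FP16", 10), ("FP24", 16), ("FP32", 23))

if __name__ == "__main__":
    x = 1.0 / 3.0
    for name, bits in FORMATS:
        print(name, abs(quantize(x, bits) - x))
```

Under these assumptions the rounding error for FP24 falls between that of FP16 and FP32, which matches the position of the 24-bit format between the two formats the chip accepts.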

The computational power of the RADEON X800 pixel processors is significantly higher than that of the previous architecture: the number of scalar and vector arithmetic logic units (ALUs) has doubled. The former ATI pixel processors, with one vector ALU, one scalar ALU and one texture addressing unit, could perform up to three instructions per clock cycle for each pixel. With two vector ALUs, two scalar ALUs and one texture addressing unit, the RADEON X800 pixel processors can perform up to five instructions per pixel per clock cycle.

Besides all other improvements, RADEON X800 pixel processors can now process much longer shaders than the previous generation could. The maximum number of vector and of scalar mathematical instructions has been increased from 64 to 512 each, and the maximum number of texturing instructions from 32 to 512. Altogether, the maximum shader length grew from 160 instructions (64 + 64 + 32) to 1536 instructions (512 + 512 + 512).
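The totals quoted above can be checked with a few lines of arithmetic. The sketch below reads the 64 and 512 limits as applying separately to vector and to scalar instructions, which is the reading the totals of 160 and 1536 imply. The class and constant names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShaderLimits:
    """Maximum instruction counts per pixel shader, by instruction type."""
    vector_alu: int
    scalar_alu: int
    texture: int

    @property
    def total(self) -> int:
        return self.vector_alu + self.scalar_alu + self.texture


# Figures taken from the paragraph above
R3X0 = ShaderLimits(vector_alu=64, scalar_alu=64, texture=32)
R420 = ShaderLimits(vector_alu=512, scalar_alu=512, texture=512)

if __name__ == "__main__":
    print(R3X0.total)  # 160
    print(R420.total)  # 1536
```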

The drastic increase in the maximum number of shader instructions allows RADEON X800 to run much more complex and resource-hungry pixel shaders more efficiently. If a shader is so complicated that the VPU resources are insufficient to process it within a single pass, the RADEON X800 graphics processor will split the work into several stages: the shader is divided into a few fragments, which are then processed one by one, and the intermediate results for each fragment are temporarily stored in a special buffer called the Fragment Stream FIFO Buffer. This is the same F-Buffer that was officially announced with the RADEON 9800/9800 PRO solutions, along with a number of improvements intended to make multi-pass shader operations more efficient.
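The multi-pass idea described above can be sketched in software. This is only a toy model under stated assumptions, not a description of how the hardware works: a shader is a list of functions applied to one value per pixel, a `deque` stands in for the F-Buffer, and the names `split_into_passes` and `run_multipass` are hypothetical.

```python
from collections import deque
from typing import Callable, List, Sequence

Instruction = Callable[[float], float]


def split_into_passes(shader: Sequence[Instruction],
                      max_instructions: int) -> List[List[Instruction]]:
    """Cut a shader into consecutive pieces that each fit the per-pass limit."""
    if max_instructions <= 0:
        raise ValueError("max_instructions must be positive")
    return [list(shader[i:i + max_instructions])
            for i in range(0, len(shader), max_instructions)]


def run_multipass(shader: Sequence[Instruction],
                  pixels: Sequence[float],
                  max_instructions: int) -> List[float]:
    """Run every pass over all pixels, keeping intermediate results in a FIFO."""
    fifo = deque(pixels)  # stand-in for the F-Buffer: one value per pixel, in order
    for shader_pass in split_into_passes(shader, max_instructions):
        for _ in range(len(fifo)):
            value = fifo.popleft()      # read the intermediate result
            for instruction in shader_pass:
                value = instruction(value)
            fifo.append(value)          # write it back for the next pass
    return list(fifo)


EXAMPLE_SHADER: List[Instruction] = [
    lambda v: v + 1.0,
    lambda v: v * 2.0,
    lambda v: v - 3.0,
]

if __name__ == "__main__":
    # Two passes (2 + 1 instructions) give the same result as a single pass
    print(run_multipass(EXAMPLE_SHADER, [1.0, 2.0], max_instructions=2))
```

Because the FIFO preserves pixel order, each pass picks up exactly the intermediate value the previous pass left for that pixel, so the final result does not depend on where the shader was cut.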

Compared with the previous architecture, ATI RADEON X800 boasts wider functionality; nevertheless, it doesn’t support dynamic branching and looping in pixel shaders. Therefore, ATI can claim support for shader version 2.x but not for version 3.0, because dynamic loops and branching are a requirement of shader version 3.0.

ATI certainly sees the benefits of the 3.0 shader model, but they believe that the time of 3.0 shaders hasn’t come yet: the use of dynamic loops and branching on the existing architectures inevitably results in a performance drop, which is definitely not what the manufacturers want. Even NVIDIA warns against careless use of dynamic branching. At the same time, introducing the corresponding support would require a significant revision of the RADEON X800 pixel processor architecture, which has been intended for linear shader processing from the very beginning. As a result, having weighed all the pros and cons, ATI engineers decided to give up fully-fledged shader version 3.0 support. Instead, they are most likely to introduce a “cut-down” version of the standard without support for dynamic loops and branching, which will be called 2.b.

The biggest advantage of the RADEON X800 pixel processor architecture is its stable efficiency and predictable performance. Unlike NVIDIA’s reference cards, ATI RADEON X800, just like the previous ATI VPUs, reacts much more calmly to an increase in the number of temporary registers or to changes in the mathematical and texturing instructions during shader processing. It means that developers will be able to create efficient shaders for RADEON X800 with less effort.

Here I should definitely say that with the launch of RADEON X800, ATI decided to follow in NVIDIA’s footsteps and introduce their own shader compiler optimizations. The higher computational capacity of the RADEON X800 pixel processors should be used in the most efficient way, so the primary goal is to minimize the number of situations in which some ALUs of the pixel processors stay idle. To do that, the compiler will analyze the initial shader code and rearrange the instructions so that they can be processed in parallel.
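To illustrate why rearranging instructions matters, here is a minimal sketch of a greedy scheduler that packs independent instructions into the same clock cycle. It is not ATI's compiler. It assumes the unit counts given earlier on this page (two vector ALUs, two scalar ALUs and one texture unit for RADEON X800, one of each for the previous architecture), a latency of one cycle for every instruction, and a made-up six-instruction shader. All names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# (name, unit type, names of instructions whose results are needed)
Instr = Tuple[str, str, Tuple[str, ...]]

R3X0_UNITS = {"vector": 1, "scalar": 1, "texture": 1}
R420_UNITS = {"vector": 2, "scalar": 2, "texture": 1}


def schedule(instructions: List[Instr], units: Dict[str, int]) -> List[List[str]]:
    """Greedily issue every ready instruction that still finds a free unit."""
    done: Set[str] = set()
    pending = list(instructions)
    cycles: List[List[str]] = []
    while pending:
        free = dict(units)  # all units become available again each cycle
        issued: List[Instr] = []
        for instr in pending:
            _name, unit, deps = instr
            # Ready only if all inputs were produced in an EARLIER cycle
            if free.get(unit, 0) > 0 and all(d in done for d in deps):
                free[unit] -= 1
                issued.append(instr)
        if not issued:
            raise ValueError("unsatisfiable dependency or unsupported unit")
        for instr in issued:
            pending.remove(instr)
            done.add(instr[0])
        cycles.append([instr[0] for instr in issued])
    return cycles


# Hypothetical shader: instruction names must be unique
SHADER: List[Instr] = [
    ("tex0", "texture", ()),
    ("mul0", "vector", ()),
    ("add0", "vector", ("tex0", "mul0")),
    ("rcp0", "scalar", ()),
    ("mul1", "vector", ("rcp0",)),
    ("tex1", "texture", ()),
]

if __name__ == "__main__":
    print(len(schedule(SHADER, R3X0_UNITS)))  # 3
    print(len(schedule(SHADER, R420_UNITS)))  # 2
```

In this toy case the second vector ALU only pays off because `add0` and `mul1` do not depend on each other. A shader consisting of one long dependency chain would take the same number of cycles on both configurations, which is exactly the idle-ALU situation the compiler tries to avoid.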

A little later, I will try to estimate in practice how efficiently RADEON X800 copes with pixel shaders. For now, let’s dwell on the vertex pipelines of RADEON X800.
