NVIDIA Editor's Day 2004 Coverage (page 4)
Category: Editorial [ 12/15/2004 | 09:08 PM ]

This is where programmable shading comes in handy. As shaders grow more complex, more complex pixels are generated per pass, and more math has to be performed per pixel, programmable shading will ultimately prove more efficient. The table below compares the bandwidth involved in pixel processing with DX8 and with programmable shading:
As you can see, in the DX8 case every final pixel involves 128 bytes of traffic. With programmable shading or TurboCache technology active, we get two clock cycles per pixel, or 175 million pixels every second, which is 22GB/sec worth of bandwidth. Instead of relying on multi-pass rendering, developers write shader programs and use math, which involves only 44 bytes per final pixel. Pressure on the frame buffer is much lower this way. Applications make increasing use of shaders and are becoming more programmable, so this bandwidth saving is significant.

NVIDIA re-architected the 3D pipeline for TurboCache. Take a look:

TurboCache features the ability to read from system memory and to render into system memory very efficiently (100%). The cache in TurboCache is always at least one piece of memory available locally or allocated from the frame buffer directly attached to the GPU. In practice, this makes it possible to use less local memory and to bring down the cost of the graphics system, since fewer physical memory chips are needed on the PCB than before.

Within the TurboCache concept, the local frame buffer is nothing but a software-managed graphics cache. Some memory still must be local, but not too much. All graphics drivers already use system memory; TurboCache simply extends that functionality to renderable surfaces. Memory allocation and de-allocation are still limited by the total available memory, but this memory is used and released only as needed, so no memory remains locked down.

Now let's say a few words about the bandwidth provided by the TurboCache technology.

As you can see in the second diagram, the situation is a little different: less data can be transferred upstream. The amount of memory consumed is limited by the bus between the GPU and the core logic.
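
The per-pixel traffic figures quoted earlier on this page (128 bytes for DX8, 44 bytes for programmable shading, 175 million pixels every second) can be sanity-checked with a few lines of Python. This is only a rough sketch: it assumes the same pixel rate applies to both rendering styles and that 1GB means 10^9 bytes, neither of which is stated in the presentation. Under those assumptions, the 22GB/sec figure corresponds to the 128-byte DX8 case.

```python
# Figures quoted in the article; the helper and its name are illustrative only.
PIXELS_PER_SECOND = 175_000_000   # "175 million pixels every second"
DX8_BYTES_PER_PIXEL = 128         # multi-pass, DX8-style rendering
SHADER_BYTES_PER_PIXEL = 44       # single-pass programmable shading


def bandwidth_gb_per_sec(bytes_per_pixel, pixels_per_second=PIXELS_PER_SECOND):
    """Return memory traffic in GB/sec, assuming 1 GB = 10**9 bytes."""
    return bytes_per_pixel * pixels_per_second / 1e9


dx8 = bandwidth_gb_per_sec(DX8_BYTES_PER_PIXEL)
shader = bandwidth_gb_per_sec(SHADER_BYTES_PER_PIXEL)

# Fraction of per-pixel traffic avoided by moving from 128 to 44 bytes.
saving = 1 - SHADER_BYTES_PER_PIXEL / DX8_BYTES_PER_PIXEL

print(f"DX8 multi-pass:       {dx8:.1f} GB/sec")
print(f"Programmable shading: {shader:.1f} GB/sec")
print(f"Traffic saved:        {saving:.0%}")
```

If the assumption of an equal pixel rate holds, programmable shading needs roughly a third of the DX8 traffic for the same number of final pixels, which is why the reduced pressure on the frame buffer matters for a design that renders into system memory.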




