The core now has four pixel pipelines. To be more exact, GMA 900, like the previous Extreme Graphics 2 core, has only one pixel pipeline, but unlike its predecessor, it processes four pixels at a time. All modern graphics processors organize their pixel pipelines in the same way.
GMA 900’s “four-pixel” pipeline has four texture-sampling units, which is equivalent to four independent pixel pipelines with one texture-mapping unit each. Graphics Media Accelerator 900 can apply one texture to a pixel per cycle, and each additional texture requires an additional cycle. That’s exactly how modern GPUs work.
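The cost of multitexturing described above can be put into numbers with a small sketch. It assumes only what the text states (four pixels in flight, one texture per pixel per cycle); the function names and the 300 MHz clock in the usage line are illustrative assumptions, not figures from Intel.

```python
def pixels_per_clock(pipeline_width: int, textures_per_pixel: int) -> float:
    """Average number of pixels finished per clock cycle.

    Assumes each texture layer costs one cycle and that a pixel
    always takes at least one cycle, even with no textures.
    """
    if pipeline_width < 1 or textures_per_pixel < 0:
        raise ValueError("invalid pipeline width or texture count")
    cycles_per_pixel = max(1, textures_per_pixel)
    return pipeline_width / cycles_per_pixel


def fill_rate_mpixels(core_mhz: float, textures_per_pixel: int,
                      pipeline_width: int = 4) -> float:
    """Theoretical peak fill rate in megapixels per second."""
    return core_mhz * pixels_per_clock(pipeline_width, textures_per_pixel)


if __name__ == "__main__":
    # Hypothetical 300 MHz core clock, chosen only for illustration.
    for layers in (1, 2, 3, 4):
        print(layers, "texture(s):", fill_rate_mpixels(300.0, layers), "Mpix/s")
```

With these assumptions, every extra texture layer divides the peak pixel rate by the number of layers, so a dual-textured scene fills at half the single-texture rate.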
GMA 900 features hardware support for DirectX 9 pixel shaders. This means that modern applications using Shader Model 2.0 can start and run on a GMA 900 system without any problems or quality loss. Unfortunately, Intel hasn’t yet published any information about the calculation precision supported during shader execution, and special-purpose test utilities like Xbitmark, Shademark and others refused to run on the i915G, as if by conspiracy: they all require hardware support for DirectX 9 vertex shaders.
The new graphics core from Intel has no hardware support for vertex shaders or T&L. All the geometry transformations are calculated by the system’s central processor. The company presents this as an efficient use of system resources and relies on the power of its processors: the user shouldn’t pay for a more complex graphics core if the CPU can handle geometry calculations well enough. However, things are not so bright in reality: discrete graphics processors have long had hardware support for vertex shaders, and their special-purpose vertex units are in no way inferior even to the fastest Intel CPUs when it comes to shader execution speed.
Intel’s GMA 900 employs a tile-based architecture; “Zone Rendering Technology 3” is Intel’s term for it. The technology works as follows:
First, before drawing the image, the driver waits until the application has supplied all the polygons needed for the frame. Then, for each tile (a “zone” in Intel’s terminology: a rectangular fragment of the image), it builds a list of the triangles that fully or partially cover that tile.
Next, the graphics processor renders the frame tile after tile, using the polygon lists created in the first step as source data, until the entire frame is rendered.
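The two steps can be sketched in a few lines. This is a minimal illustration of tile binning in general, not Intel’s actual implementation: the 32-pixel tile size, the function name and the use of bounding boxes (which may list a triangle for a tile it only comes near) are all assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]
TileId = Tuple[int, int]


def bin_triangles(triangles: Sequence[Triangle], width: int, height: int,
                  tile_size: int = 32) -> Dict[TileId, List[int]]:
    """Step 1: build, for every tile, the list of triangle indices
    whose bounding box overlaps that tile (a conservative test)."""
    tiles_x = math.ceil(width / tile_size)
    tiles_y = math.ceil(height / tile_size)
    bins: Dict[TileId, List[int]] = {
        (tx, ty): [] for ty in range(tiles_y) for tx in range(tiles_x)
    }
    for index, tri in enumerate(triangles):
        xs = [p[0] for p in tri]
        ys = [p[1] for p in tri]
        # Tile range covered by the bounding box, clamped to the screen.
        x0 = max(0, math.floor(min(xs) / tile_size))
        x1 = min(tiles_x - 1, math.floor(max(xs) / tile_size))
        y0 = max(0, math.floor(min(ys) / tile_size))
        y1 = min(tiles_y - 1, math.floor(max(ys) / tile_size))
        for ty in range(y0, y1 + 1):
            for tx in range(x0, x1 + 1):
                bins[(tx, ty)].append(index)
    return bins


if __name__ == "__main__":
    scene = [
        ((1.0, 1.0), (10.0, 1.0), (1.0, 10.0)),      # fits in one tile
        ((20.0, 20.0), (50.0, 20.0), (20.0, 50.0)),  # spans four tiles
    ]
    # Step 2: visit tile after tile, using only that tile's list.
    for tile, tri_ids in sorted(bin_triangles(scene, 64, 64).items()):
        print("tile", tile, "renders triangles", tri_ids)
```

Note that binning cannot finish until the last triangle of the frame has arrived, because any triangle may land in any tile.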
This operation scheme has both advantages and shortcomings. The advantages include:
The use of small fragments, tiles, allows for efficient use of the GPU’s caches, since small amounts of homogeneous data are operated upon.
Having drawn a tile, the GPU never returns to it during the creation of the frame. Because a tile is small, the fragments of the frame buffer and the Z-buffer that correspond to it can be loaded into the GPU’s cache in their entirety. Thus, the graphics processor does all of its calculations “on-die”, using the cache rather than system memory. After the tile is drawn, the contents of the tile’s frame buffer and Z-buffer are written to system RAM. Caching the frame buffer and Z-buffer reduces the load on the memory bus because data is transferred in larger blocks. This is most important for an integrated chipset, whose graphics core has to share the memory bus with the central processor.
Intel’s GMA 900 has a special unit for checking the Z values of pixels. If the check shows that a group of pixels won’t be visible, the group is excluded from further processing. This Z-checking helps GMA 900 avoid unnecessary work, such as texturing or shader execution, for invisible pixels, and makes it most efficient at rendering scenes with a high overdraw factor, which reflects how much the objects overlap (or the number of “redraws” of a pixel).
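The benefit of checking Z before texturing can be illustrated with a toy model. This is only a sketch: it tests individual pixels, whereas the text says the hardware rejects groups of pixels, and the function name and data layout are assumptions made for the example.

```python
from typing import Iterable, Tuple


def shade_with_early_z(fragments: Iterable[Tuple[int, float]],
                       tile_pixels: int) -> Tuple[int, int]:
    """Count fragments that get shaded versus rejected by the Z check.

    Each fragment is (pixel index within the tile, depth); a smaller
    depth means closer to the viewer. The Z-buffer lives entirely in
    this local list, standing in for the on-die tile cache.
    """
    z_buffer = [float("inf")] * tile_pixels
    shaded = 0
    rejected = 0
    for pixel, depth in fragments:
        if depth >= z_buffer[pixel]:
            rejected += 1   # hidden: no texturing or shader work needed
            continue
        z_buffer[pixel] = depth
        shaded += 1         # visible so far: expensive work is done
    return shaded, rejected


if __name__ == "__main__":
    # Pixel 0 is covered three times (overdraw of 3), pixel 1 once.
    frags = [(0, 0.5), (0, 0.9), (0, 0.2), (1, 0.3)]
    print(shade_with_early_z(frags, tile_pixels=4))
```

In this model the saving depends on submission order: fragments drawn front to back are rejected more often than the same fragments drawn back to front.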
The disadvantages of a tile-based architecture are mostly related to how it processes geometry data:
In order to create the polygon lists correctly, a tile-based graphics processor has to wait for all the geometry data needed to build a frame to arrive, and only then does it start rendering the scene. GPUs of the traditional architecture begin to process streams of geometry data and render the scene as soon as they start receiving the data.
The need to sort the polygons and create lists for each tile fits poorly with the well-established streaming, pipelined operation of vertex processors. This is probably why GMA 900 offers no hardware support for T&L and vertex shaders, and why all the geometry processing as well as the polygon sorting is performed by the central processor.
Graphics Media Accelerator 900 doesn’t seem to have full-screen antialiasing. At least, the driver’s control panel doesn’t offer this function. Of course, FSAA is not a very important feature for an integrated graphics core, considering its overall low level of performance. However, it would come in handy in simple 3D games where the core would have some performance reserve.
GMA 900 supports anisotropic texture filtering up to the 4x level. Anisotropic filtering cannot be combined with tri-linear filtering: the latter is disabled when you enable the former.
GMA 900 supports Dynamic Video Memory Technology version 3.0. Thanks to DVMT, system memory becomes “graphics memory” when necessary and in the necessary amounts; it is freed up for the needs of the OS once the GPU no longer uses it. Thus, the OS and the GPU share the system memory in the most efficient and balanced way.
Here is how GMA 900 works with memory:
The memory needed by the graphics core is divided into two parts. The first and smaller part, Preallocated Memory, is the GPU’s domain; the operating system cannot use it and regards it as regular graphics memory. You can set the size of this memory area in the BIOS to 1MB or 8MB.
The other part is provided for GMA 900 by DVM Technology. Three DVMT modes are supported:
In the “Fixed” mode, a fixed-size fragment of the system memory is allocated to the graphics core. It can only be used by the graphics core; its size can be set to 64 or 128MB.
In the “DVMT” mode, the driver of the graphics core uses the system memory like any other OS component or application does. If a “heavy” 3D game starts up, requiring a lot of memory for textures, geometry data and so on, and there are no other memory-hungry applications running, the required amount of memory is automatically allotted to the graphics core. When the GPU no longer needs the surplus memory, it automatically hands it back to the OS. The maximum amount of memory given to the GPU in this mode is 224MB (the preallocated memory included).
In the “Fixed+DVMT” mode, the graphics processor gets a fixed-size chunk of 64MB of memory (preallocated memory included) and up to 64MB of dynamically-allotted memory. This mode guarantees that at least 64MB of memory is available to the graphics core, with a possibility to increase this amount to 128MB, if necessary.
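The three modes can be summarized as a small lookup that returns the guaranteed and the maximum amount of graphics memory. It is a sketch built only from the figures quoted above; the function name is hypothetical, and two points are assumptions because the text does not settle them: that in “DVMT” mode only the preallocated area is guaranteed, and that the “Fixed” size is the whole amount available to the core.

```python
from typing import Tuple


def graphics_memory_range_mb(mode: str, preallocated_mb: int = 8,
                             fixed_mb: int = 64) -> Tuple[int, int]:
    """Return (guaranteed, maximum) graphics memory in MB for a DVMT mode."""
    if preallocated_mb not in (1, 8):
        raise ValueError("preallocated memory is 1MB or 8MB")
    if mode == "Fixed":
        if fixed_mb not in (64, 128):
            raise ValueError("fixed memory is 64MB or 128MB")
        # Assumption: the fixed size is the total visible to the core.
        return fixed_mb, fixed_mb
    if mode == "DVMT":
        # Assumption: only the preallocated area is always reserved.
        return preallocated_mb, 224
    if mode == "Fixed+DVMT":
        # 64MB fixed (preallocated included) plus up to 64MB dynamic.
        return 64, 64 + 64
    raise ValueError(f"unknown DVMT mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("Fixed", "DVMT", "Fixed+DVMT"):
        low, high = graphics_memory_range_mb(mode)
        print(f"{mode}: at least {low}MB, at most {high}MB")
```

Seen this way, “Fixed+DVMT” is a compromise: it guarantees as much as the smaller “Fixed” setting while still allowing growth, though to a lower ceiling than pure “DVMT”.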
So, the new graphics processor from Intel is an ambiguous figure. It combines an efficient tile-based architecture, support for DirectX 9 pixel shaders and flexible memory management with such deficiencies as the lack of hardware support for T&L and vertex shaders, and the unavailability of FSAA and of the higher-quality texture filtering mode (tri-linear plus anisotropic filtering).
Today I’m going to test Graphics Media Accelerator 900 and compare it to potential and actual competitors. The description of our testbed and testing methodology follows.