NVIDIA Editor's Day 2004 Coverage

by Anna Filatova
12/15/2004 | 09:08 PM

On December 7 NVIDIA arranged a technology seminar in San Jose called NVIDIA Editor’s Day, where hardware editors and testers from multiple online and offline publications could meet NVIDIA developers and engineers and talk about the latest technology and upcoming solutions. From the very beginning we expected to finally get some answers to the questions that had been troubling us for a long time, especially since a lot of innovative initiatives and new products have just recently come to market.

Well, since I am just one hour away from San Jose, I couldn’t miss this opportunity. Today I would like to share some of the interesting things we discussed with NVIDIA people at the event.

Latest and Greatest News from NVIDIA

The day started with an opening speech by Jen-Hsun Huang, NVIDIA President and CEO. If you read our reports from last year’s NVIDIA Editor’s Day, you should remember that last time we talked a lot about cinematic computing, its potential and its importance for the market. It is clear that programmable shaders have changed computing today. All kinds of new effects have come along because of the implementation and development of programmable shaders. The great variety of programmable media allows you to express your artistic goals with the highest efficiency.

Jen-Hsun admitted that in the last year they made not only quite a few good decisions, but also a few bad ones. Among the good things they managed to accomplish last year he named making everything 32-bit compatible, investing in compiler technology for real-time compilation, and enhancing and expanding programmable shader technology. As for the “bad” decisions, NVIDIA’s CEO admitted that one of them was the GeForce FX architecture, which turned out hard to program and costly to manufacture.

Despite the fact that NVIDIA went through a tough period, the last couple of years turned out to be the most fruitful. “We did the best work we have ever done at NVIDIA,” Jen-Hsun said.

Right now NVIDIA has the following major goals. First of all, the company is very aggressive about recapturing its leadership position in the market. This will only be possible if it continues to introduce successful, in-demand solutions. NVIDIA is also planning to work hard on bringing more highly competitive features to the marketplace, so that it can expand the reach of its products and bring the benefits of GPUs to a broader market.

In fact, NVIDIA started working towards these goals several months ago with the launch of GeForce 6. Besides that, a few important initiatives have also been introduced. The first one was implemented in the mobile segment. They decided to create a form, fit, electrical and thermal standard that could be supported on an ongoing basis in the laptop industry. It could help the industry adopt new technology more rapidly.


The second initiative was to extend the reach of gaming platforms, mostly through more powerful graphics. It was one of the first steps towards building a gaming supercomputer around the PC architecture. Here the main idea was to redesign the system architecture, the GPU architecture and the software accordingly. As a result we got the solution known as SLI. It will not just bring dual graphics to the gaming system. Its point is to enable new categories of computers in the market. You are going to see dual GPUs (maybe more than dual) and dual-core processors (maybe more than dual) bringing you to a new level of graphics performance. First and foremost, it addresses a very specific audience that loves to get extra performance.

Besides that, NVIDIA also undertook a fundamental redesign of the way graphics cards are built by introducing TurboCache technology. We are going to talk more about it later in this article.

Another initiative, which you might have heard of already, is called PureVideo. A computer has much more technology inside than any consumer electronics device nowadays. As we go into the high-definition transition next year, NVIDIA believes they have to develop technology that can deliver one of the best experiences in high-definition digital video.

The goal is to invest in digital media technology so that it can merge organically with the culture of our time. This is where the MCP (media communications processor) comes in as the actual implementation of this initiative. Unlike many other core logic developers, NVIDIA doesn’t call it a chipset, as the word “chipset” doesn’t actually say anything. The idea behind the MCP is not to be just a bridge. It is not only a large package connecting the CPU and memory. It certainly performs this function, but in the future it will do a lot of other things. That is why NVIDIA decided to call it a Media Communications Processor. This business is very important to the company, according to Jen-Hsun, especially since NVIDIA has just finalized its cross-licensing agreement with Intel and will finally be able to bring nForce to the Intel platform (see this news story for more details). The nForce franchise will stretch across various markets: platforms for AMD and Intel, and solutions for desktops, workstations and laptops.

Of course, NVIDIA is perfectly aware that AMD would have preferred nForce 4 to be an exclusive AMD platform. However, the way NVIDIA sees it, both AMD and NVIDIA want the same thing: that consumers have a choice. So going with Intel will supposedly not change NVIDIA’s relationship with AMD. The market needs an Intel platform too, and right now there is a lot of enthusiasm there.

There is nothing about the Intel platform that limits NVIDIA’s innovation. The question is how innovation should actually be encouraged. NVIDIA’s major strategy here is not to be afraid of making mistakes and not to punish people for making them. “When we do make a mistake, we step back, enjoy it and try to make sure we avoid it next time,” said NVIDIA’s CEO.


System-on-a-chip is the future of computing. Right now GeForce 6 has 250 million transistors. The next generation will have between 0.5 and 1 billion transistors. These systems are going to come one day; the only question is how they become relevant in the market.

Image processing is also becoming one of the most popular tasks nowadays. The media processor business is very important for consumer electronics as well, and we already see a lot of steps being taken in this direction. On Tuesday night, for instance, the next-generation Sony PlayStation was announced. So far NVIDIA has refused to disclose any more details, but we know that it is going to be a very powerful platform. NVIDIA is building a custom GPU for PlayStation 3 on an architecture beyond GeForce 6. Graphics technology is certainly one of the most important things NVIDIA is working on for this platform, but there is a whole variety of other things, including software tools and other technologies, that are also worth special attention. You can find more info about this upcoming gaming console in this news story of ours.

As for the long-term projects and the direction NVIDIA is going to take, it looks like the next 20 years will be about digital media, consumer electronics and the intersection of computers and consumers. This is where NVIDIA is strong right now and intends to grow stronger, because this promises to be a very large market.

Introducing NVIDIA GeForce 6200: More Details about TurboCache Technology

The next presentation was delivered by Ujesh Desai, General Manager for desktop GPUs, and his colleagues. The major highlight of this presentation was the new TurboCache technology, which is supposed to break new ground in the entry-level market segment.

According to NVIDIA, the new technology brings several major advantages. First of all, TurboCache is going to fundamentally redefine the price-to-performance ratio that NVIDIA delivers to an entry-level PC. TurboCache allows direct rendering to system memory and significantly reduces the local frame buffer requirement. TurboCache also delivers Shader Model 3.0 to entry-level solutions.

As an example of TurboCache technology in operation, we had the chance to see a demo of the new Half-Life 2 game running on a system with a GeForce 6200, with high quality textures at 1024x768 screen resolution.

NVIDIA architected the 6200 around the PCI Express interface. As you know, it is a very high-bandwidth bi-directional bus. The local frame buffer is typically 64 bits wide. PCI Express offers more bandwidth than most graphics processors get from their attached local frame buffer.
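To put rough numbers on that comparison, here is a small back-of-the-envelope sketch. The PCI Express figures follow from the first-generation specification (2.5 GT/s per lane with 8b/10b encoding, which gives 250MB/sec per lane in each direction). The 400 MT/s memory data rate is purely our assumption for illustration and was not quoted in NVIDIA's presentation, so treat the result as an example rather than a measurement.

```python
# First-generation PCI Express: 2.5 GT/s per lane with 8b/10b encoding,
# i.e. 250 MB/s of payload per lane in each direction.
PCIE1_LANE_MB_S = 250


def pcie_bandwidth_gb_s(lanes=16):
    """Peak one-way bandwidth of a PCI Express 1.x link, in GB/s (decimal)."""
    return lanes * PCIE1_LANE_MB_S / 1000


def local_memory_bandwidth_gb_s(bus_width_bits, transfers_per_sec):
    """Peak bandwidth of a local frame buffer, in GB/s (decimal)."""
    return bus_width_bits / 8 * transfers_per_sec / 1e9


pcie_one_way = pcie_bandwidth_gb_s(16)      # upstream or downstream only
pcie_both_ways = 2 * pcie_one_way           # both directions combined

# ASSUMPTION: a 64-bit frame buffer running at 400 MT/s (illustrative only).
local_64bit = local_memory_bandwidth_gb_s(64, 400e6)

print(f"PCI Express x16, one way:   {pcie_one_way:.1f} GB/s")
print(f"PCI Express x16, both ways: {pcie_both_ways:.1f} GB/s")
print(f"64-bit local memory:        {local_64bit:.1f} GB/s")
```

With a faster memory clock the local figure grows proportionally, so the outcome of the comparison depends on the memory actually fitted to a given card.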

Modern 3D applications are increasingly shader, texture and storage intensive. They have a lot of surfaces to render and very high texture and effects requirements. With the quality settings turned all the way up, they require 256MB of addressable memory on average.


This is where programmable shading comes in handy. Given increasing shader complexity, a larger number of complex pixels generated per pass and more math performed per pixel, programmable shading turns out to be more efficient in the end. The table below compares the bandwidth involved in pixel processing with DX8 and with programmable shading:

As you see, in the case of DX8 every final pixel involves 128 bytes of traffic. At two clock cycles per pixel, which works out to 175 million pixels every second, that is about 22GB/sec worth of bandwidth. With programmable shading, which TurboCache relies on, developers write shader programs and use math instead of multi-pass rendering, so only 44 bytes are involved per final pixel. Pressure on the frame buffer is much smaller this way. Applications make increasing use of shaders and are becoming more programmable, so this bandwidth saving pays off more and more.
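The figures quoted above can be checked with simple arithmetic: memory traffic is just the pixel rate multiplied by the bytes moved per final pixel. The sketch below is ours, not NVIDIA's. It assumes decimal gigabytes, and the 350MHz clock it derives is only what "two clock cycles per pixel" and "175 million pixels per second" imply together; the clock itself was not stated.

```python
def traffic_gb_s(pixels_per_sec, bytes_per_pixel):
    """Memory traffic in GB/s (decimal) for a given pixel rate."""
    return pixels_per_sec * bytes_per_pixel / 1e9


PIXELS_PER_SEC = 175e6      # quoted: 175 million pixels every second
CLOCKS_PER_PIXEL = 2        # quoted: two clock cycles per pixel

dx8_traffic = traffic_gb_s(PIXELS_PER_SEC, 128)     # multi-pass DX8 path
shader_traffic = traffic_gb_s(PIXELS_PER_SEC, 44)   # programmable shading

# Implied by the two quoted figures, not stated by NVIDIA.
implied_clock_mhz = PIXELS_PER_SEC * CLOCKS_PER_PIXEL / 1e6

print(f"DX8, 128 bytes/pixel:    {dx8_traffic:.1f} GB/s")
print(f"Shaders, 44 bytes/pixel: {shader_traffic:.1f} GB/s")
print(f"Implied clock:           {implied_clock_mhz:.0f} MHz")
```

So the 22GB/sec figure corresponds to 128 bytes per pixel, while 44 bytes per pixel at the same pixel rate needs under 8GB/sec, roughly a third of the traffic.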

They re-architected the 3D pipeline for TurboCache. Take a look:

TurboCache can read from system memory and render into system memory very efficiently (100%). The cache in TurboCache is always at least one piece of memory that is available locally or allocated from the frame buffer directly attached to the GPU.

In practice, this makes it possible to use less local memory and bring down the cost of the graphics system, as there is no need for as many physical memory chips on the PCB as before.

Within the TurboCache concept the local frame buffer is nothing but a software-managed graphics cache. Some memory still must be local, but not too much. All graphics drivers take system memory. TurboCache simply extends that functionality to renderable surfaces. Memory allocation and de-allocation is still limited by the total available memory, but this memory is used and released only as needed, so that no memory remains locked down.
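To illustrate the idea of a small local cache backed by system memory that is borrowed and returned on demand, here is a toy model. Everything in it is hypothetical: the class, its names, the sizes and the "local first, then system memory" placement policy are our own assumptions for illustration and say nothing about how NVIDIA's driver actually decides where a surface lives.

```python
class ToySurfacePool:
    """Toy model of a local graphics cache that spills into system memory.

    Hypothetical sketch only; not NVIDIA's driver logic.
    """

    def __init__(self, local_mb, system_limit_mb):
        self.local_free_mb = local_mb            # small on-board memory
        self.system_limit_mb = system_limit_mb   # most we may borrow
        self.system_used_mb = 0                  # currently borrowed
        self.surfaces = {}                       # name -> (location, size)

    def allocate(self, name, size_mb):
        """Place a surface locally if it fits, otherwise in system memory."""
        if name in self.surfaces:
            raise ValueError(f"surface {name!r} already allocated")
        if size_mb <= self.local_free_mb:
            self.local_free_mb -= size_mb
            location = "local"
        elif self.system_used_mb + size_mb <= self.system_limit_mb:
            self.system_used_mb += size_mb
            location = "system"
        else:
            raise MemoryError("total available memory exhausted")
        self.surfaces[name] = (location, size_mb)
        return location

    def release(self, name):
        """Return a surface's memory so nothing stays locked down."""
        location, size_mb = self.surfaces.pop(name)
        if location == "local":
            self.local_free_mb += size_mb
        else:
            self.system_used_mb -= size_mb


pool = ToySurfacePool(local_mb=16, system_limit_mb=112)
print(pool.allocate("back buffer", 6))    # local: fits in the on-board cache
print(pool.allocate("shadow map", 32))    # system: too big for what is left
pool.release("shadow map")                # borrowed system memory is returned
print(pool.system_used_mb)                # 0
```

The point of the sketch is the last two lines: system memory is taken only while a surface needs it and is handed back afterwards.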

Now let’s say a few words about the bandwidth provided by the TurboCache technology.

As you can see in the second diagram, the situation is a little bit different: less data can be transferred upstream. The amount of memory consumed is limited by the bus between the GPU and the core logic.


As we saw in the diagrams, the more bandwidth you have, the better the result. TurboCache uses this bandwidth effectively. For instance, according to NVIDIA, a GeForce 6200 with TurboCache and 1 piece of memory is 25% faster in most games than an ATI RADEON X300 SE with 4 physical pieces of memory. Compared against the ATI RADEON X300 LE, the performance difference is about 50%.

Local memory is attached directly to the GPU. With system memory the latency is longer, which is why the pipelines have to be designed longer to cover for it. The local frame buffer can also be faster than system memory if it runs at a higher frequency.

TurboCache makes it possible to deliver better performance at a lower price. One way to do it is to reduce the cost of memory and thus increase the investment in the processor.

I believe you might have already heard about ATI's Hyper Memory technology. So has NVIDIA. According to them, the story behind this technology is as follows. Architecting a GPU in hardware is not trivial. During the development of TurboCache some info leaked, and ATI somehow managed to learn quite a lot about it. At first they started saying bad things about NVIDIA’s TurboCache technology, so that customers got somewhat concerned about its success. And then all of a sudden Hyper Memory was announced. ATI didn’t make any hardware modifications to the chips. The data in system memory is only read, which everyone has been doing since the AGP interface came on stage. There is no rendering into system memory, which means that this technology can't work in a whole class of applications. NVIDIA didn’t respond with an immediate announcement of its own technology, because they claim they wanted to have a product ready first.

Again, I would like to stress that this is NVIDIA’s story. It is up to you to believe it or not. As for our opinion about the efficiency of Hyper Memory and TurboCache, we will withhold our verdict until the official release of the graphics cards supporting TurboCache, when we will be able to offer you a bit more than just speculation. So, be patient :)

On the mobile side TurboCache will also be highly efficient. Mobile solutions are small, so fewer physical memory chips onboard make a lot of sense. The second advantage is that DRAM takes power, so with less DRAM we need less power and hence get more battery life. So, TurboCache is a double win on the mobile side.

As you can see, these boards are equipped with a passive heatsink, may have 1 or 2 pieces of memory onboard and are designed for the PCI Express x16 interface. NVIDIA’s partners should start shipping their solutions based on GeForce 6200 with TurboCache technology around late January or early February 2005. The cards will be priced from $79 to $129 depending on the specification. However, the first samples should start appearing as early as next week.

Since we mentioned the price, a good question arises: how many customers considering the purchase of a $100 card actually have the required 512MB of system memory, which is supposed to be the minimum requirement for a PCI Express system? In fact, NVIDIA didn’t give a definite answer to that. They only claimed that TurboCache also works in systems with a smaller amount of system memory, which however means that less system memory will be dynamically allocated for graphics needs.

To conclude the desktop part, I would like to add that NVIDIA also announced that they would start shipping GeForce 6600 for the AGP interface in February. This way they will have a complete product line-up for both PCI Express and AGP interfaces.
