DDR vs. DDRII: Fight!

If you follow the PC market closely, you have probably noticed that the term “DDR2” is coming up more and more often. So it is worth finding out how this second generation differs from the previous one, and what its advantages and shortcomings are. In other words, let’s try to figure out why the industry needs to adopt this memory type.

by Victor Kartunov
05/19/2004 | 11:30 AM

If you follow the PC market closely, you have probably noticed that the term “DDR2” is coming up more and more often. It stands for the second generation of DDR SDRAM (Double Data Rate Synchronous Dynamic Random-Access Memory, in case you need reminding). Platforms that support the new memory type are being launched this year, and next year DDR2 is expected to become a widespread, or even the predominant, memory type on the PC.

So it makes sense to learn beforehand how this second generation differs from the previous one, and what its advantages and shortcomings, if any, are. In other words, let’s try to figure out why the industry needs to adopt this memory type.

Current processors gobble up every chunk of data the main memory can supply. There’s no end to this process, as CPU performance only grows with time. Accordingly, the memory must deliver more data per second to keep the processor well-fed. No achievable memory speed ever seems to be quite enough, but the transition to DDR2 opens the way to higher memory bandwidths, which eases the problem for a while.

As you know, performance of any memory is calculated by the formula:

Speed = Width x Frequency,

where Speed is the memory performance (in megabits per second; divide by 8 to get megabytes per second), Width is the width of the memory bus (in bits), and Frequency is the rate at which data are transferred (in megahertz, that is, millions of transfers per second).
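To make the units concrete, here is a minimal sketch of the formula. The function names are hypothetical and chosen for this illustration only; the numbers are those of a 64-bit bus transferring data at 100MHz.

```python
def memory_speed_mbit(width_bits, frequency_mhz):
    """Speed = Width x Frequency, in megabits per second."""
    return width_bits * frequency_mhz


def to_megabytes(speed_mbit):
    """Convert megabits per second to megabytes per second."""
    return speed_mbit / 8


speed = memory_speed_mbit(64, 100)   # 64-bit bus, 100 million transfers/s
print(speed, to_megabytes(speed))    # 6400 800.0
```

Doubling either argument doubles the result, which is exactly why the rest of the discussion revolves around bus width and transfer frequency.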

Thus, to improve performance, we need to either widen the memory bus or raise its operating frequency. Or both. Let’s see what the industry can do today.

Although there are many variations of RAM, they are all based around the DRAM cell, which is in fact a combination of a transistor and a capacitor. There have been numerous attempts to discard this elementary cell as obsolete, offering other data storage technologies like MRAM (Magnetoresistive RAM), FRAM (Ferroelectric RAM) and others, but without much success. No other memory type can provide a similar combination of capacity, cost and speed as the good old DRAM does.

There are faster elementary cells (like Static RAM, or SRAM), but they are much costlier and larger, so a memory chip built on them cannot reach the same capacity. There are cheap alternatives, but they are noticeably slower and thus cannot be used as the computer’s main memory. There are cheap and fast alternatives, but they have a very short service life.


In other words, the elementary DRAM cell remains the foundation of all the variety of modern memory types. Thus, all modern memory types have inherited both the advantages and the shortcomings of DRAM: the need for refresh and the ceiling on operating frequency. As for the latter, the clock rate of DRAM has grown by only about one order of magnitude throughout its long history, while other PC subsystems have progressed much faster. This is only because the classic organization of the memory cell makes it difficult to increase its clock rate. In fact, the frequency grows only as the geometric dimensions of the cell shrink with each improvement of the manufacturing process.

Today, specially selected (but still industrially produced) memory has a cell frequency of 275MHz (I mean the DDR550 that some manufacturers like Hynix have announced). Progressing this way requires heavy spending on the transition to a finer manufacturing process: a single up-to-date semiconductor factory costs over $2 billion, and that’s not the end of the expenses. Note also that the real frequency of the memory cells is not 550MHz, as the marking suggests, but half of that. We’ll explain this shortly.

Thus, we’ve got only one way left: to widen the memory bus. Again, our opportunities are limited: today, standard platforms use a dual-channel 128-bit memory bus. Designing, routing and producing mainboards with such a bus is much more complex than doing the same for a 64-bit bus. And routing a 256-bit bus is so costly that it makes sense for servers only. Of course, this situation changes with time, but today such mainboards would be too expensive for the mass market.

So we find ourselves in a kind of deadlock. Memory cells don’t grow in frequency and the memory bus is not easily expandable. Where’s the way out?

There’s a method once suggested by Rambus. The idea can be expressed in one word: multiplexing. We can describe it in more detail, too.

SDRAM (Synchronous Dynamic Random-Access Memory)

First, let’s recall how the now-obsolete SDRAM operates. It consists of an array of cells, Input/Output buffers and power/refresh circuitry. The last item is of no importance to us for now.

All three subsystems work at the same sync frequency – that’s where “Synchronous” comes from. Let this frequency be 100MHz and the bus width – 64 bits, for example. The memory data are taken to the I/O buffers and then to the memory controller. A memory module on such chips is known as PC100 memory and has a bandwidth of 800MB/s (100MHz x 8 bytes or 64 bits). The data is transferred once per clock, on the rising edge of the clock signal.


DDR (Double Data Rate SDRAM)

DDR gets its name from the fact that it can output data twice as fast as SDRAM of the same frequency: twice per clock cycle, on both the rising and the falling edges of the signal. But these data have to come from somewhere, right? The developers resorted to a trick: the memory cells work at the same frequency, but the internal bus is wider to boost the data-transfer rate inside the chip. In other words, the internal path from the array of cells to the buffer is twice as wide as the external path from the buffer to the controller, while the data-transfer rate from the buffer to the controller is twice the frequency of the memory cells. That is, the data leave the cells along a wide bus, but then travel to the controller along a narrower, faster one.

Let’s see it in numbers. Suppose the internal bus of the chip is 32 bits wide and the cell array works at 100MHz. Then, with SDRAM, the buffer transfers data along the external 32-bit bus at the same 100MHz frequency. The data flow does not change anywhere along the way. Data are read from two chips at once, so the whole module is 64-bit.

DDR is another matter. Now we have an array of memory cells that pumps data along the internal 100MHz 64-bit bus to the I/O buffer (or the level amplifier). But the data go on to the controller along a bus half as wide, 32 bits. On the other hand, data are now transferred twice per clock, on the rising and falling edges of the signal. That is, the resulting data-transfer rate is twice the frequency of the memory cells. Now we have a simple and evident picture: the data flow slowly along a wide pipe, but then get into a pipe half as wide and start flowing faster. It is a kind of “Bernoulli’s principle” applied to computer science. The module is 64-bit, so two chips of the module are read simultaneously. I am leaving aside certain peculiarities here, such as the fact that an address can be set for only one memory bank at a time, and the other bank cannot receive its address until a clock cycle later.
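The “wide slow pipe into a narrow fast pipe” picture can be checked with a few lines of arithmetic. The sketch below uses the per-chip figures from the two examples above; the function names are hypothetical, and the model deliberately ignores the bank-addressing peculiarities just mentioned.

```python
def internal_rate_mbit(width_bits, core_mhz):
    """Data rate from the cell array to the I/O buffer, in Mbit/s."""
    return width_bits * core_mhz


def external_rate_mbit(width_bits, clock_mhz, transfers_per_clock):
    """Data rate from the I/O buffer to the controller, in Mbit/s."""
    return width_bits * clock_mhz * transfers_per_clock


# SDRAM chip: 32-bit internal bus, 32-bit external bus, one transfer per clock.
sdram_in = internal_rate_mbit(32, 100)
sdram_out = external_rate_mbit(32, 100, 1)

# DDR chip: 64-bit internal bus, 32-bit external bus, two transfers per clock.
ddr_in = internal_rate_mbit(64, 100)
ddr_out = external_rate_mbit(32, 100, 2)

print(sdram_in, sdram_out)  # 3200 3200
print(ddr_in, ddr_out)      # 6400 6400
```

In both cases what flows into the buffer equals what flows out of it; DDR simply moves twice as much per chip without touching the 100MHz cell array.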

Such memory was named DDR200 (after the resulting data-transfer frequency) or PC1600 (after the module bandwidth of 1600MB/s). Accordingly, the DRAM cells in DDR266 memory work at 133MHz, in DDR333 at 166MHz and in DDR400 at 200MHz. Currently mass-produced DDR SDRAM (I don’t count “memory for overclockers”) has reached a data-transfer frequency of 550MHz; that’s why I wrote 275MHz above, talking about the frequency of the array of DRAM cells. But raising this clock rate further is very problematic. The industry may overcome the 300MHz barrier, but what then? The DDR technology has no performance reserve left. The industry needs a new memory standard that would ensure stable frequency and performance growth for some time to come.
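The relation between the DDR marking and the cell frequency is simply a division by two. A minimal sketch, with a hypothetical helper name, using integer division so that the results match the rounded figures quoted above:

```python
def cell_frequency_mhz(ddr_rating):
    """Cell-array frequency for a DDR module marked DDR<rating>.

    DDR makes two transfers per cell clock, so the cells run at half
    the marked rate. Integer division reproduces the rounded marketing
    numbers (266 -> 133, 333 -> 166).
    """
    return ddr_rating // 2


for rating in (200, 266, 333, 400, 550):
    print(f"DDR{rating}: cells at {cell_frequency_mhz(rating)}MHz")
```

This prints 100, 133, 166, 200 and 275MHz for the five ratings mentioned in the text.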

DDR2 memory is going to be the answer.


DDR2

The key principle of DDR2 is easy to understand once you know how DDR SDRAM works. As with DDR, the internal bank issues data to the I/O buffers along a broad 64-bit 100MHz internal bus. But now the data leave the buffer along a faster and narrower bus (16 bits, 200MHz), which also uses the Double Data Rate trick. Thus, we achieve a resulting data-transfer frequency of 400MHz! Accordingly, the 64 bits on the module output are made up by simultaneous transmission from four banks. Such a memory module goes under the name of DDR2-400: the marking system is similar to that of DDR and tells the resulting data-transfer rate to the memory controller.


Picture taken from http://www.lostcircuits.com/

Thus, for one and the same frequency of the array of DRAM cells – 100MHz – we have different memory module bandwidths. It is 800MB/s for SDRAM and 1600MB/s for DDR SDRAM and 3200MB/s for DDR2 SDRAM! Thanks to multiplexing, the memory module has a higher bandwidth although the memory cells work at the same low frequency. That’s what we need to be able to fetch data for the processor.
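The three bandwidth figures follow from one parameter: how many transfers the 64-bit module interface makes per clock of the cell array. The sketch below is an illustration under that simplification; the names are hypothetical.

```python
MODULE_WIDTH_BYTES = 8  # 64-bit module interface

# Transfers on the module interface per one clock of the DRAM cell array.
TRANSFERS_PER_CORE_CLOCK = {
    "SDRAM": 1,
    "DDR SDRAM": 2,
    "DDR2 SDRAM": 4,
}


def module_bandwidth_mb(core_mhz, transfers_per_core_clock):
    """Peak module bandwidth in MB/s for a given cell-array frequency."""
    return core_mhz * transfers_per_core_clock * MODULE_WIDTH_BYTES


for name, transfers in TRANSFERS_PER_CORE_CLOCK.items():
    print(f"{name}: {module_bandwidth_mb(100, transfers)}MB/s")
```

With the cell array at 100MHz this gives 800, 1600 and 3200MB/s, the same figures as in the text.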

This is the main idea that distinguishes DDR2 from DDR. However, the difference between the two memory types is not only in their bandwidth.

Besides bandwidth, there is another important characteristic called latency. As I said above, a memory cell is not always available because of the refresh procedure. Moreover, even if the cell is available, its contents cannot be read instantaneously: there are other latencies, such as the time it takes to set up the column or row address and the minimum time between setting different addresses. Such latencies are not specific to any particular memory type; they are always there because all memory types use the same elementary DRAM cell.

Let’s now see what we have with latencies. Suppose the cell array in the above example works with 2-2-2 timings. Since the array works at the same frequency in all cases, all the modules (I mean PC100, DDR200 and DDR2-400) will have the same latencies. Only the bandwidth differs. By the way, the three numbers in 2-2-2 stand for CAS Latency, RAS-to-CAS Delay and RAS Precharge time. The first is the delay between issuing the column address and getting the data, the second is the delay between setting the row address and the column address, and the third is the time it takes to precharge a row before the next one can be opened.


In real life, cells do not all work at the same frequency. For example, PC133, whose DRAM cells work at 133MHz, was once widespread. Accordingly, DDR200 had a higher bandwidth than PC133, but was slower in terms of latencies: with the number of latency clocks being the same, PC133 has a 33% higher clock rate, so each clock is a quarter shorter (7.5ns against 10ns). As a result, only DDR266, which had the same latencies as PC133, showed the real advantages of DDR SDRAM.
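The comparison becomes clearer once latency clocks are converted into nanoseconds. A minimal sketch, assuming a 2-clock latency for both modules as in the running example (the helper name is hypothetical):

```python
def latency_ns(clocks, core_mhz):
    """Latency in nanoseconds for a number of clocks at a given frequency."""
    return clocks * 1000 / core_mhz


pc133 = latency_ns(2, 133)    # cells at 133MHz
ddr200 = latency_ns(2, 100)   # cells at 100MHz

print(f"PC133:  {pc133:.1f}ns")   # PC133:  15.0ns
print(f"DDR200: {ddr200:.1f}ns")  # DDR200: 20.0ns
```

The same two clocks cost DDR200 about 5ns more than PC133, which is why its higher bandwidth did not translate into an all-round win.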

We see a similar situation today. Yes, the latencies are the same for DDR200 and DDR2-400 and the second module will have a higher bandwidth. But in fact DDR2-400 will compete with DDR400 rather than DDR200! And that’s where everything changes: first, the bandwidth of the modules is the same, 3.2GB/s. Second, the frequency of the DRAM array is 200MHz with DDR400, while DDR2 works at 100MHz. As a result, the latencies are noticeably smaller with DDR400, even considering that the latencies of DDR400 are typically three, rather than two clocks.

Let’s again look at the numbers. For DDR400, we usually have 2 or 2.5 latency clocks, sometimes 3, that is, from 10 to 15 nanoseconds. For DDR2-400 we calculate the latency in the following way: suppose the core has 2 latency clocks at 100MHz. That gives a latency of 20ns, or 4 clocks at the interface frequency of 200MHz (the interface works at a higher clock rate than the core). Thus, the resulting latencies for a DDR2 module will be 4-4-4 clocks. Considering the relatively low core clock rate, we may hope to see DDR2-400 modules with 3-3-3 timings in the future. But even such DDR2-400 modules will lose to a DDR400 module in terms of data access time.
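The calculation above can be sketched as two small conversions: clocks to nanoseconds, and core clocks to interface clocks. The function names are hypothetical, and the 200MHz interface clock for both DDR400 and DDR2-400 is taken from the example in the text.

```python
def clocks_to_ns(clocks, clock_mhz):
    """Duration of a number of clocks at the given frequency, in ns."""
    return clocks * 1000 / clock_mhz


def interface_clocks(core_clocks, core_mhz, interface_mhz):
    """Express a core latency in clocks of the (faster) interface."""
    return core_clocks * interface_mhz / core_mhz


# DDR400: 2, 2.5 or 3 clocks at 200MHz.
print([clocks_to_ns(c, 200) for c in (2, 2.5, 3)])  # [10.0, 12.5, 15.0]

# DDR2-400: 2 core clocks at 100MHz, seen from the 200MHz interface.
print(clocks_to_ns(2, 100))           # 20.0
print(interface_clocks(2, 100, 200))  # 4.0
```

The hoped-for 3-3-3 module would come to `clocks_to_ns(3, 200)`, that is 15ns, which only matches the slowest DDR400 timing listed above.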

Overall, the situation seems absurd. Yes, DDR2 is potentially faster, since it provides a much higher bandwidth. But the transition to DDR2 will first somewhat slow down the systems that use it compared to systems with DDR. As you understand, the industry needs to produce some other advantages to attract the end-buyer.

Besides the Speed

DDR memory modules typically use TSOP-packaged chips (familiar to you from numerous photos of memory modules). This packaging is all right at frequencies up to 200MHz, but starts to fail at higher clock rates. Chips in a TSOP package have too high a resistance and inductance, which hinders further frequency growth.

That’s why another packaging, BGA or Ball Grid Array, has long been employed for higher frequencies, for example on graphics cards. This packaging is good because it has lower resistance and capacitance and smaller dimensions, and allows heat to be removed more efficiently. That’s why DDR2 modules will use BGA packaging. There are no miracles in this world, though, and this package costs more to manufacture.

Note that BGA packaging is not exclusive to DDR2 memory. For example, Kingmax uses BGA for DDR memory, as the company produces overclocker memory modules and this packaging makes the memory more tolerant of high frequencies.

Anyway, this is rather an exception to the rule. DDR typically uses TSOP packaging to make the memory cheaper.

That’s not the only difference between DDR2 and DDR. Now we should recall such a trivial thing as signal termination. You will know about it if you have used a SCSI controller and hard disk drives. In brief, a high-frequency signal reflects from the end of the signal line, and the reflection noise masks the useful signal. To prevent this, a set of resistors is placed at the end of the line to absorb the signal completely.

With DDR memory, these termination resistors are on the mainboard: you may notice numerous tiny resistors and capacitors in the neighborhood of the DIMM slots. Of course, the necessity to route all this stuff on the board doesn’t make the mainboard cheaper or easier to manufacture. With DDR2, the termination is placed directly in the memory, making all those onboard components unnecessary. Of course, this puts stricter demands on the tolerances of a module’s nominal characteristics, since modules from different manufacturers must work together properly, but everyone involved is going to win from this solution in the long run.

There’s yet another aspect where DDR2 is preferable to DDR: heat generation. DDR modules of ordinary capacities (256MB to 512MB) do not get very hot at work. But a module becomes hotter as its capacity grows. For example, with 4GB of RAM installed in the slots, the memory may dissipate 35-40W under peak loads, and that’s not a small amount. Yes, such memory amounts are rare today, but what about tomorrow? Thus, the problem has to be solved beforehand by writing the prerequisites for lower heat dissipation into the new memory standard. Moreover, the operating frequency of the memory is going to grow significantly, and heat is proportional to the clock rate, all other factors being equal.

Well, DDR2 has something to boast of here. The DDR2 core will work at 1.8V, compared to 2.5-2.6V for the current DDR (higher voltage means more heat, too). Thus, DDR2 should produce less heat. The reduction is estimated at 30%; practice will show whether that holds. Note that a lot of today’s DDR already works at 1.8V internally, which it converts from the 2.5V input voltage. However, such conversion heats the module up, too, so without it the heat dissipation will decrease.
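As a rough cross-check of why the lower voltage matters, the sketch below uses the textbook first-order model for switching power in CMOS logic, where power scales with the square of the voltage and linearly with the frequency. Applying this model to a memory chip is an assumption of this example, not something taken from the DDR2 specification, and the function name is hypothetical. It ignores everything else on a module (I/O, termination, voltage conversion), so it should not be read as a prediction of the 30% estimate quoted above.

```python
def relative_dynamic_power(v_new, v_old, f_new=1.0, f_old=1.0):
    """Ratio of switching power under the first-order model P ~ V^2 * f.

    Assumption: the switched capacitance is the same in both cases.
    """
    return (v_new / v_old) ** 2 * (f_new / f_old)


# 1.8V versus 2.5V at an unchanged clock rate.
ratio = relative_dynamic_power(1.8, 2.5)
print(f"{ratio:.2f}")  # 0.52
```

Under these assumptions the voltage drop alone roughly halves the switching power at the same clock rate, which leaves room for the clock rate to grow before the heat returns to the old level.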

Another innovation employed in DDR2 is called Additive Latency. To understand its point, we should know that in real circumstances it is sometimes impossible to transfer data even at moments when this is formally allowed, because the memory control bus has a limited number of states. So sometimes the command to activate the next memory bank cannot be sent simultaneously with the command to read an already-activated bank, simply because these two commands would require signals of two opposite levels to be sent over the same bus at once. As a result, a bubble appears in the data stream from the module: a slot with no useful data, which the memory couldn’t provide due to this organizational conflict.


Additive latency was introduced in DDR2 to solve this problem. Its point is to move the read command automatically to the next clock in case of conflict. Thus, we seem to get the data a clock cycle later, but there are no bubbles in the data stream and the efficiency of the memory subsystem increases.

The next difference between the memory types is DDR2’s Variable Write Latency. DDR always has a write latency of 1T. This time is dictated by the specification and cannot be changed. For DDR2, the write latency depends on the read latency and equals the read latency minus one clock. For example, with a read latency of 7 clocks, the write latency will be 6 clocks. This sounds terrible compared to 1 clock for DDR. In reality, it’s not that bad, because the write procedure with DDR requires some special preparation that is unnecessary in DDR2. So there is a difference, but it is smaller than it may seem: the resulting write latency of DDR2 is about three times that of DDR.
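The write-latency rule is simple enough to state as code. A minimal sketch with hypothetical names; it covers only the nominal figures discussed here and does not model the extra preparation that DDR writes need.

```python
DDR_WRITE_LATENCY = 1  # fixed at 1T for DDR


def ddr2_write_latency(read_latency):
    """DDR2 write latency in clocks: read latency minus one clock."""
    if read_latency < 1:
        raise ValueError("read latency must be at least one clock")
    return read_latency - 1


print(ddr2_write_latency(7))  # 6
```

Because the value is derived from the read latency, tightening the read timings of a DDR2 module shortens its write latency as well.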

Conclusion

Let’s summarize. DDR2 memory comes in 240-pin modules of the same length as the ordinary 184-pin DIMMs (the pins are smaller). The modules are characterized by their potential to reach higher operating frequencies. Moreover, they contain certain improvements that allow the memory to work more efficiently. However, besides the evident advantages, the memory has drawbacks: first, it has much higher latencies at the same interface clock rate. Second, write latencies grow considerably. Third, this memory will cost more due to its much more expensive packaging. Other differences are listed in the following comparative table:

 

                      DDR                      DDR-II
Data transfer rate    200/266/333/400 Mbps*    400/533/(667) Mbps*
Bus frequency         100/133/166/200 MHz      200/266/(333) MHz
Memory frequency      100/133/166/200 MHz      100/133/(166) MHz
Batch reading size    2/4/8                    4/8**
Data Strobe           Single DQS               Differential Strobe: DQS, /DQS***
CAS Latency           1.5, 2, 2.5              3+, 4, 5
Write Latency         1T                       Read Latency-1

* Megabit/pin/sec
** The specification originally described a packet length of 4QW, but later added the 8QW mode, proposed by Intel and Samsung.
