by FastSite
10/03/2002 | 12:00 AM
Today we're going to review the new 3ware Escalade 7850 controller, which differs from the previously reviewed 3ware 7810 (3ware 7810 IDE RAID Controller Review) only in its support for large drives (over 137GB) and the new R5 Fusion technology, intended to raise controller performance with RAID5 arrays.
On the other hand, this is already the third generation of 3ware controllers that I work with and every time I couldn't conceal my admiration of their performance and functionality. What does this controller have to strike me with this time?
3ware sent us a boxed version of their 7850 controller. Besides the card itself, the package included eight 24-inch (!) 80-conductor ATA cables, four power supply splitters, a CD and floppies with drivers.
I was rather intrigued by the unusual length of the ATA cables. The common ATA cables are 18" (45cm) long. 3ware claims its cables fully comply with the ATA standard (see here). Well, we'll soon test them at work!
The exterior of the controller didn't undergo any significant modifications:

But we know that great things are concealed in small details. Look at the cache-memory chips:

This is the same high-speed SRAM memory, but the capacity of one chip is 1MB now. We see two chips like that on the 7850 controller, so the overall cache-memory capacity makes 2MB. If you check our 3ware 7810 review, you will see that the cache memory of 7810 was only 1.125MB.
So here's the difference between 7850 and 7810 - the size of the cache-buffer. Of course, the difference is not very big, but let's recall the way 3ware 7810 successfully competed with Adaptec 2400A, which had a much larger cache. Every MB of the cache-buffer in 3ware controllers is as precious as gold.
So, what does the 7850 controller need the extra cache-memory for? I mentioned above that one of the differences between 3ware 7850 and 7810 was the support of some R5 Fusion technology. Let's find out what this super technology looks like.
R5 Fusion technology significantly improves controller performance during write operations in RAID5 arrays. The performance growth is achieved both at writing large sequential data blocks and small random blocks.
3ware claims the technology boosts linear write speed in RAID5 up to 40-60MB/sec and that the 3ware 7850 outperforms traditional SCSI RAID solutions, which is surely attractive as an ATA RAID disk subsystem costs about half as much as a SCSI RAID one.
The R5 Fusion's underlying concept of "lazy" write is nothing new and is actively implemented by all manufacturers of RAID5-supporting RAID controllers. We remember that it was lazy write that helped Adaptec 2400A to surpass 3ware 7810 in our High-End IDE RAID Controllers Roundup.
But R5 Fusion is something more than simple lazy write. The controller doesn't just put the write request into the cache to be executed at its convenience. The smart controller waits for a while for the "fun to continue", i.e. until the cache gets "a stripe full of data".
To understand the advantages of R5 Fusion, let's consider the situation when it's not involved. :)
Generally, when writing a block of data into RAID5 array, the controller has to:
In case the data block to be written is smaller than the stripe block, it's possible to read not the entire stripe-block at steps 2 and 3, but just a part of it of the same size as the block to be written. In this case the controller saves some time as the XOR processor will work faster and the read/write operations will also take less time (as smaller blocks are read/written).
But anyway, every write to the RAID5 array generates two read requests to the drives, two XOR operations and two write requests to the drives (let's denote it as "2-2-2").
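The "2-2-2" sequence described above can be sketched in a few lines. This is only an illustration of the general RAID5 read-modify-write idea, not 3ware's firmware: the drives are modelled as plain dictionaries, and the names `xor_blocks` and `small_write` are made up for this example.

```python
# Sketch of a RAID5 small write (read-modify-write).
# Drives are modelled as dicts {block_address: bytes}; all names are illustrative.

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same size")
    return bytes(x ^ y for x, y in zip(a, b))


def small_write(data_drive: dict, parity_drive: dict, addr: int, new_data: bytes) -> dict:
    """Update one data block and its check sum; return the operation counters."""
    ops = {"reads": 0, "xors": 0, "writes": 0}

    # Two reads: the old data block and the old check sum block.
    old_data = data_drive[addr]
    old_parity = parity_drive[addr]
    ops["reads"] += 2

    # Two XORs: remove the old data from the check sum, then add the new data.
    delta = xor_blocks(old_data, new_data)
    new_parity = xor_blocks(old_parity, delta)
    ops["xors"] += 2

    # Two writes: the new data block and the new check sum block.
    data_drive[addr] = new_data
    parity_drive[addr] = new_parity
    ops["writes"] += 2
    return ops


# A tiny three-drive stripe: two data blocks and their check sum.
d0 = {0: b"\x0f" * 4}
d1 = {0: b"\xf0" * 4}
parity = {0: xor_blocks(d0[0], d1[0])}

ops = small_write(d0, parity, 0, b"\xaa" * 4)
print(ops)
```

After the update the check sum still equals the XOR of both data blocks, even though the second data drive was never touched. That is exactly why the controller can get away with two reads instead of reading the whole stripe.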
Let's now imagine that the controller receives requests to write large data blocks (larger than (N-1) x stripe block size).
As soon as the cache-memory accumulates a block of sequential write requests whose size is the stripe block size multiplied by N-1 (the number of "useful" HDDs in a RAID5 array of N drives), the controller doesn't need to perform read operations any more, as all the blocks of a complete stripe are to be changed! It only has to calculate the check sum and perform N writes (N-1 data blocks and the check sum).
To clear it up, this approach eliminates:
And the more drives are in the array, the more time and money the 3ware 7850 controller saves for us! :)
As the 3ware controller doesn't load HDDs up with unnecessary requests, the requests are processed faster resulting in higher write speed! We're going to see it work soon.
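To put rough numbers on the saving, here is a small counting sketch. It assumes the alternative to a full-stripe write is updating each of the N-1 data blocks separately with the 2-2-2 scheme; the function names are invented for the example and say nothing about how the firmware is actually organised.

```python
# Counting drive requests for one complete stripe in an N-drive RAID5 array.
# Assumption: without full-stripe writes, every data block costs 2 reads + 2 writes.

def per_block_requests(n_drives: int) -> dict:
    """Requests when the N-1 data blocks are updated one at a time."""
    data_blocks = n_drives - 1
    return {"reads": 2 * data_blocks, "writes": 2 * data_blocks}


def full_stripe_requests(n_drives: int) -> dict:
    """Requests when the whole stripe is written at once: no reads, N writes."""
    return {"reads": 0, "writes": n_drives}


def requests_saved(n_drives: int) -> int:
    """Total drive requests avoided by writing the stripe in one go."""
    slow = per_block_requests(n_drives)
    fast = full_stripe_requests(n_drives)
    return (slow["reads"] + slow["writes"]) - (fast["reads"] + fast["writes"])


for n in range(3, 9):
    print(n, per_block_requests(n), full_stripe_requests(n), requests_saved(n))
```

Under this simple model the number of avoided requests grows with every extra drive, which is the effect described above.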
When arranging the arrays we set the stripe block size to 64KB. For WinBench tests the arrays were formatted in FAT32 and NTFS as one logical drive with the default cluster size. WinBench tests were run five times each; the average result was taken for further analysis. The HDDs didn't receive any extra cooling between the tests. When building the arrays, we increased the number of HDDs by adding them to the IDE channels one by one. This approach prevents the 3ware 7850 from showing its maximum performance, but at the same time it allows us to test the 3ware 7450 controller, though "virtually".
We used the following benchmarking software:
To evaluate the controller performance in RAID arrays of different types in IOMeter, we used the new StorageReview patterns. They were introduced in the third edition of the HDD testing methodology:

These patterns are intended to measure the disk subsystem performance under workload typical of file- and web-servers.
Our colleague, Sergey Romanov aka GreY, developed a pattern for Intel IOMeter based on StorageReview's study of the disk subsystem workload in ordinary Windows applications. The pattern was based on the average IPEAK statistics StorageReview provides for Office, High-End and Bootup work modes:

The pattern serves to determine the attractiveness of the HDDs and RAID controllers for an ordinary Windows user.
We also compared the controller performance in RAID arrays of different types for varying write-to-read operations ratio. We made a pattern in which 100% random 8KB blocks were used and the write-to-read ratio was changing from 100/0 to 0/100 with the step of -10/+10.
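The set of access specifications described above is easy to enumerate. The sketch below only lists the eleven read/write mixes; the dictionary keys are made up for illustration and are not IOMeter's own file format.

```python
# Enumerate the read/write mixes: 100% random 8KB blocks,
# write share going from 100% down to 0% in steps of 10.
# The field names are illustrative, not IOMeter's configuration syntax.

def read_write_mix_patterns(block_kb: int = 8, step: int = 10) -> list:
    patterns = []
    for write_pct in range(100, -1, -step):
        patterns.append({
            "block_kb": block_kb,
            "random_pct": 100,
            "write_pct": write_pct,
            "read_pct": 100 - write_pct,
        })
    return patterns


patterns = read_write_mix_patterns()
for p in patterns:
    print(p)
```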
Well, and in the end we checked the ability of the controllers to work with sequential write and read requests of variable size in different RAID arrays.
Our testbed was configured as follows:
The controller was tested with 7.5 firmware and the same version of the drivers. Eight IBM DTLA 307015 HDDs were used to build the arrays. To comfortably accommodate all the hard drives in the testbed we needed a big case equipped with active-cooling bays for HDDs. SuperMicro SC901D (Chieftec Dragon Series) suited us best:

But there's always one problem with too many hard drives in a single case - short IDE cables. As the number of 3.5" bays is limited, we have to use mobile racks or external chassis to put the HDDs into 5" ones. And very often the length of a standard IDE cable (18 inches or 45cm) is not enough to reach the 5" bays. Thank goodness 3ware were smart enough to foresee the problem and ship longer (24" or 60cm) IDE cables with their controllers. That's how I managed to arrange the cables so neatly (see the snapshot) :).
WinBench will be the first of our tests. Previous reviews showed that hardware IDE RAID controllers do not perform well in it. But there's always some hope for the miracle! :)


As we see, there's certain dependence of the performance on the number of the HDDs, but the 3ware controller shows not a single sign of the optimization most firmware RAID controllers have for this test. Well, maybe it doesn't need it. :)


The same picture appears in NTFS. The array performance depends on the number of hard drives, but the scaling factor is rather low.
But look at the way controller bandwidth is utilized:

During linear reading, the maximum read speed is directly proportional to the number of the drives! Take a look at the X-axis scale: it is marked with 36400 steps, which is exactly the performance of a single IBM DTLA 307015 drive. We see that linear read speed from a RAID0 array of N drives is nearly equal to N x 36400. But this proportion is valid only up to the 5-drive array. Unfortunately, with more drives in the array, controller bandwidth is limited by the central chip or by the PCI bandwidth (64-bit/33MHz). Anyway, the controller notched up 194MB/sec linear read speed.
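The arithmetic behind that observation can be checked quickly. The sketch assumes the 36400 figure and the 194MB/sec peak are expressed in comparable units (thousands of bytes per second), which the text does not state explicitly; the constant and function names are made up for the example.

```python
# Ideal RAID0 linear read speed versus the observed ceiling.
# Assumption: 36400 and 194MB/sec (taken as 194000) use the same units.

SINGLE_DRIVE_RATE = 36400   # single IBM DTLA 307015, as quoted in the review
OBSERVED_CEILING = 194000   # peak linear read speed reported in the review


def ideal_raid0_read(n_drives: int) -> int:
    """Perfectly linear scaling: N drives read N times faster."""
    return n_drives * SINGLE_DRIVE_RATE


def first_capped_array(max_drives: int = 8):
    """Smallest array whose ideal speed exceeds the observed ceiling."""
    for n in range(1, max_drives + 1):
        if ideal_raid0_read(n) > OBSERVED_CEILING:
            return n
    return None


for n in range(1, 9):
    ideal = ideal_raid0_read(n)
    print(n, ideal, min(ideal, OBSERVED_CEILING))
```

Five drives give an ideal 182000, still under the ceiling, while six would need 218400. This agrees with the proportion holding only up to the five-drive array.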


We see the controller performance decreasing in case of RAID5. Well, nothing surprising here: WinBench has write operations, too, and they negatively impact the speed of a controller with a smaller cache-buffer.


Win2000 sets default cluster size in NTFS at 4KB. As the stripe block size for RAID5 by 3ware 7850 is equal to the same 64KB, it turns out the controller does too much unnecessary work. This results in lower performance.
Let's see how the type of RAID array involved affected the linear read speed:

It turns out the linear read speed of an N-drive RAID5 array equals that of an (N-1)-drive RAID0. It's not surprising, though, as every HDD in RAID5 gives up 1/N of its capacity to store the check sum. At linear read, the controller skips those blocks as they don't contain (in this case) any useful information.
Note that linear read speed "saturation" reached its peak much earlier in RAID5 than in RAID0.


We see that RAID1 performs a bit faster than the single drive, while the RAID10 arrays scale poorly with the number of drives used. This is all the more disappointing because each step up in RAID10 takes two more HDDs, and not just one, as in the case of RAID0 and RAID5.


We see the same picture in NTFS.
But there's something interesting in the Disk Transfer Rate diagram:

Watch the linear read speed from RAID1 array! The TwinStor technology boosted it due to intelligent routing of the read requests to both drives of the mirror pair. However, RAID10 arrays do not speed up in this case. The read speed in these arrays is equal to the read speed in RAID0 built of N/2 drives, where N is the number of HDDs in RAID10.
Well, seems like TwinStor doesn't work for sequential requests in RAID10 anymore?!
Let's see how the controller will cope with random requests.
So, the controller is bombarded by requests to read and write 8KB blocks with randomly calculated addresses. The diagrams show the dependence of controller performance on the write operations share at 256 queue depth. The X-axis has the percentage of writes, the Y-axis - the speed of requests processing (Total I/O) per second.

We see that the graphs for RAID0 with different number of drives are pretty much the same and repeat the graph of the single drive. All this tells that StorSwitch works well (sorting and routing the request to the appropriate HDD).

The mirrored arrays undoubtedly use TwinStor technology, because in case of random read, the RAID10 array outperforms the RAID0 array! Note, though, the graph of the eight-HDD RAID10. The bigger the share of write operations gets, the closer this array runs to the six-HDD RAID10 one. Note also that the "steps" of the three lower graphs in RandomWrite mode were equally high. But the eight-HDD RAID10 array didn't rise to the occasion… Maybe its cache-buffer simply proved too small for the lazy writes in case of eight hard drives…

We see the same ordered picture with RAID5. Performance loss on every X-axis step is the same for all the arrays, i.e. it doesn't depend on the number of HDDs in the array (that's what the 2-2-2 algorithm was invented for).
Let's compare RAID10 and RAID5 arrays built of the same number of HDDs, which should be pretty helpful for those of you who want to build a RAID array for an SQL base.

It's evident that all the arrays are almost equal at random read. But as soon as there appear write requests, RAID5 arrays slow down a lot. Although with a big write operations share, the absolute performance drop gets lower and becomes stable, while the relative performance advantage of RAID10 over RAID5 is growing.
So, the choice of the appropriate array type comes down to price-capacity-performance question. If we disregard the capacity factor (as present-day HDDs have large enough capacities), then the price-to-performance coefficient would be better for RAID10 than RAID5. RAID10 also boasts higher fault-tolerance.
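The trend can be reproduced with a very rough textbook-style model. It is not derived from the measurements in this review: it ignores the cache, lazy write and TwinStor, assumes every random RAID5 write costs four drive requests (two reads and two writes) and every RAID10 write costs two, and uses an arbitrary per-drive request rate of 100 purely for illustration.

```python
# Rough model of random I/O throughput versus the share of writes.
# Assumptions: no caching, each drive handles a fixed number of requests
# per second, RAID5 write = 4 drive requests, RAID10 write = 2 drive requests.

RAID5_WRITE_COST = 4    # 2 reads + 2 writes
RAID10_WRITE_COST = 2   # one write to each drive of the mirror pair


def array_iops(n_drives: int, per_drive_iops: float,
               write_frac: float, write_cost: int) -> float:
    """Host-visible requests per second for a given write share (0.0 to 1.0)."""
    read_frac = 1.0 - write_frac
    drive_requests_per_host_request = read_frac + write_frac * write_cost
    return n_drives * per_drive_iops / drive_requests_per_host_request


def raid10_advantage(write_frac: float, n_drives: int = 8,
                     per_drive_iops: float = 100.0) -> float:
    """How many times faster RAID10 is than RAID5 in this model."""
    r10 = array_iops(n_drives, per_drive_iops, write_frac, RAID10_WRITE_COST)
    r5 = array_iops(n_drives, per_drive_iops, write_frac, RAID5_WRITE_COST)
    return r10 / r5


for pct in range(0, 101, 10):
    print(pct, round(raid10_advantage(pct / 100), 2))
```

In this model both arrays are equal at pure random read, and the relative advantage of RAID10 grows steadily with the write share, reaching a factor of two at 100% writes. Real results will differ because of the controller cache.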
Let's see how well the controller will cope with sequential read requests. The point of the test is to measure the time necessary to read data blocks of different sizes. The request queue depth is four requests.


It's clear that big RAID0 arrays acquire visible advantage only when the requested data block is rather large. In other words, when the controller splits the large request into a few smaller ones that are processed simultaneously by different hard drives.


The situation turns more interesting with the mirror arrays: do you see how sharply the read speed changed when the requested block exceeded 64KB! Seems like TwinStor shows its best again.
For example, take RAID1. As the requested block becomes larger than the stripe block, the controller divides the request in two and gives out two requests to both HDDs in the array. Of course, the speed is growing, but why doesn't it double? Strange that RAID1 performs even a little slower with 512KB blocks. Seems like four blocks like that (queue = 4) cannot be stored in the controller's cache (it's 2MB, but part of it is reserved for storing system information, such as request statistics, etc.).
RAID10 arrays perform similarly, on the whole. They only have "higher actuation threshold" for TwinStor technology than RAID1 (i.e. the size of the block which pushes the data to be read from both drives of the mirror-pair). It's all right as RAID10 arrays include more hard disk drives.


Well, once again we see that RAID5 of N drives shows equal read speed with RAID0 of (N-1) drives.
Now, to sequential write. We're more interested in the results of RAID5, as in the beginning of the review I kept praising 3ware as a great solution for that. Let's see if this promotion made any sense. :)
But first, I would like to offer you the controller performance in RAID0 array:


Well, do you see what I see? The unexpected performance boost for the arrays of over 4 HDDs and 32KB data blocks confused me a lot, I should say…
Is it the effect of the small cache that tells once again here?


Interesting that we see the same local performance boost when the data blocks are written to RAID10, too. But here the boost takes place when the block is only 16KB big, which is half the size at which it appears with RAID0.
I would like to draw your attention to the fact that the write speed of RAID1 equals that of a single drive (and that's all right), but ALL RAID10 arrays write data only as fast as the two-HDD RAID0 does.


When writing to RAID5 array, there were no boosts or slumps in performance. Seems like R5 Fusion technology uses caching algorithms to smooth the influence of the increasing block size. The controller performance still depends on the size of the data block, but 3ware managed to "smooth" it quite nicely.
As promised, the controller reached 70MB/sec write speed in the five-HDD array already! Excellent! And that's the result shown with the rather old IBM DTLA 307015 HDDs. I guess with a more up-to-date HDD, the controller could reach 70MB/sec even with four hard drives (the prices of 3ware 7850 and 7450 differ quite a lot, you know).
If we return to the previous 3ware 7810 IDE RAID Controller Review, that was 9 months ago, we will see the following picture:

Note that on the green graph (RAID5) the speed spontaneously rises up as soon as we reach 5 drives. Moreover, the five-HDD RAID5 proved faster than 6- 7- or 8-drive array.
The explanation appeared very simple: we used 256KB blocks for Sequential patterns then and that made the full stripe (4x64KB) for five-HDD RAID5 configuration with four "working" drives. Controller firmware got the point and boosted write speed to 50MB/sec in that case.
And now the R5 Fusion technology boosted write speed of the same five-HDD RAID5 array up to 70MB/sec.
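The 256KB coincidence is simple to verify. The sketch below uses the 64KB stripe block from the review and checks which array sizes make a 256KB request exactly one full stripe; the function names are invented for the example.

```python
# Which RAID5 array sizes turn a 256KB request into exactly one full stripe?
# The stripe block size of 64KB is taken from the review.

STRIPE_BLOCK_KB = 64


def full_stripe_kb(n_drives: int) -> int:
    """Useful data in one stripe: N-1 data blocks (one block holds the check sum)."""
    return (n_drives - 1) * STRIPE_BLOCK_KB


def is_exact_full_stripe(request_kb: int, n_drives: int) -> bool:
    return request_kb == full_stripe_kb(n_drives)


matches = [n for n in range(3, 9) if is_exact_full_stripe(256, n)]
for n in range(3, 9):
    print(n, full_stripe_kb(n), is_exact_full_stripe(256, n))
```

Among arrays of three to eight drives, only the five-drive one has a 256KB stripe (4 x 64KB); the six-, seven- and eight-drive arrays need 320KB, 384KB and 448KB respectively.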
Now let's see what we're going to get in a pattern that isn't formally considered synthetic:


Well, the usual picture. The more hard drives are in the array, the faster it processes the requests. Six- and seven-HDD arrays break the order a little, though. In case of big queue depth, the performance of these arrays doesn't grow as fast as with the other arrays. Can the 3ware controller be unable to handle these configurations? Well, it shows excellent performance with the eight-drive array! Maybe the workload model we have here doesn't suit those arrays at 64KB stripe block size? Overall, it's not quite clear.


As you see, there's no such problem with RAID10. The graphs for the arrays made of different number of HDDs formed neat "stairs"...
I included here the results of a single HDD on purpose: I wanted to demonstrate the TwinStor technology efficiency. RAID1 clearly outperforms the single drive even under small workloads.


The eight-HDD RAID5 array appeared a little slower under high workload.
If we compare RAID10 and RAID5 performance in this pattern:

We will see that RAID5 has been utterly defeated. Not much of a surprise as this pattern features a big share of write operations and each operation of the kind, especially a random operation, tells negatively on the overall RAID5 performance.
In the FileServer pattern, we were changing the workload range, but it didn't influence the controller performance with any types of arrays.


We see a certain performance reduction in case of six- and seven-HDD RAID0 arrays, which is the same as we have just seen in the WorkStation pattern.


RAID1 is still faster than the single drive. RAID10 shows good scalability: the more drives are involved - the higher is the performance.


With RAID5, the controller seems to have no problems with any number of hard drives.
If we compare RAID10 and RAID5, we will see that RAID5 arrays look much better in this pattern than in WorkStation.

That's because FileServer features only 20% of write operations, while WorkStation - about 40%.
In WebServer pattern, there are no write requests at all so we can expect RAID5 to perform as fast as RAID0. Let's check it out! :)


Oops! The eight-HDD array worked well only under high workload. Does the controller lack speed? Or, on the contrary, the drives are dragging the controller down…


Interesting that there are no problems with eight-drive configurations in RAID10. It is also interesting that RAID1 array is not much faster than the single hard disk drive compared with what we saw in FileServer or WorkStation. That's really weird as the TwinStor technology should prove most efficient in the pattern with no write operations!
Now, let's see whether there's any effect of TwinStor in RAID10 arrays. We'll compare the speeds of RAID10 and RAID0 made of eight drives each.

Well, TwinStor does work, though its effect is not very tangible. It is interesting, that this effect depends on the request queue depth, but not linearly. :)


RAID5 tests again show the small effect of the eighth HDD under small workloads.
In conclusion (most of you might have given a sigh of relief) we would like to compare the performance of RAID0, RAID5 and RAID10 arrays made of four, six and eight drives in the WebServer pattern.

The best result is marked in blue, the worst one - in red. We see RAID10 outperforming RAID0 and RAID5 under small workloads due to smaller access time in the arrays with mirror pairs of drives. However, this advantage is not free: RAID10 requires twice as many hard drives as RAID0 of the same capacity.
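The capacity cost mentioned above, together with the N-1 rule for RAID5 seen earlier, fits in a tiny helper. Capacity is counted in whole drives to avoid assuming any particular drive size; the function name is made up for the example.

```python
# Usable capacity, counted in drives, for the array types compared in this review.

def usable_drives(level: str, n_drives: int) -> int:
    if level == "raid0":
        return n_drives                  # every drive stores data
    if level == "raid5":
        if n_drives < 3:
            raise ValueError("RAID5 needs at least three drives")
        return n_drives - 1              # one drive's worth goes to check sums
    if level == "raid10":
        if n_drives < 4 or n_drives % 2:
            raise ValueError("RAID10 needs an even number of drives, four or more")
        return n_drives // 2             # every drive has a mirror copy
    raise ValueError(f"unknown array type: {level}")


for n in (4, 6, 8):
    print(n, {lvl: usable_drives(lvl, n) for lvl in ("raid0", "raid5", "raid10")})
```

An eight-drive RAID10 holds as much as a four-drive RAID0, which is the "twice as many hard drives" price mentioned above.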
Again I really enjoyed the 3ware controller. 3ware 7850 has no rivals among contemporary IDE RAID controllers at processing sequential read and write requests.
The 70MB/sec speed it showed in RAID5 array allows it to successfully compete with SCSI RAID controllers in tasks connected with processing large streams of information (such as high-quality video editing).
Excellent work with RAID10 makes the 3ware 7850 controller a good choice for "budget" file- and web-servers while high speed of random reads in RAID5 suggests it as an appropriate solution for data storage systems with random access.
Now, as usual, the list of all the advantages and shortcomings of the controller reviewed.
Highs:
Lows:
We're going to compare 3ware 7850 with its competitors (including SCSI RAID controllers) in one of our upcoming articles.
During this test session, I discovered a few things, which I didn't quite understand. So, I decided to express my concern to you, guys, hoping you might be able to help me out:
1. When the RAID10 arrays included more than four drives, the controller BIOS froze at the array initialization stage (duplicating the mirror pairs). It mostly happened when 45% of the process had been complete. When reset, the controller thought the array was ready to work. But when I got to tests, it found out that the array was "incorrect" and restarted the initialization. When the whole thing for Windows was through, the array was ready and no more problems cropped up.
2. Second issue is the fixed stripe block in RAID5, i.e. I couldn't change the stripe block size. It is always equal to 64KB whatever we might wish. I guess it's connected with the R5 Fusion implementation. Well, if it's not a "bug", but a "feature", they could have told us about it beforehand.
3. Unexpected performance drops by our controller in case of arrays involving many hard drives.
As you see, I didn't reveal anything deadly dangerous and I'll have to double check all these issues once again in the near future, as I've already got…

P.S.: That's not a fake, believe me :)