by Alexander Yuriev , Nikita Nikolaichev
02/05/2004 | 10:14 PM
In our previous roundup of SerialATA RAID controllers, we promised to dedicate a separate review to the Escalade 8500 controller from 3ware, which we didn’t include in that roundup. Today we would like to introduce a product from a well-known manufacturer that has always pleased us with its high-performance solutions. However, the good name of the developer is no guarantee that the product will be a success, although it does raise the probability significantly.
So what about the 8500-series controllers? 3ware took it easy when developing them: the engineers simply equipped their ATA RAID controllers with bridges from Marvell. The Escalade 8500 series consists of three models that differ in the number of supported drives (4, 8 and 12). We received an eight-channel controller, but tested it with four hard disk drives as if it were four-channel (we simply didn’t have eight Raptor HDDs at the time).
The 3ware 8500-8 controller supports up to eight SerialATA-150 devices and allows combining them into RAID arrays of levels 0, 1, 5, 10 and JBOD. It connects to the computer via a 64-bit PCI bus running at 33MHz. The controller carries two SRAM chips of 1.125MB each, so the total amount of available cache memory is 2.25MB. The controller card looks like this:
The front side of the PCB accommodates numerous chips: the controller chip proper, memory and BIOS. There is so little free room on the PCB that the SATA connectors had to be installed in two rows. You can’t see it in the snapshots, so just take my word for it: there are eight of them in total, not four. On the other hand, the expedient placement of the components around the PCB left enough space for another four SATA connectors and the corresponding controller chips while still keeping the card small (half-length, to be exact).
The basic characteristics of 3ware 8500-8 controller are listed below:
2.25MB SRAM (3.3V)
64bit 33MHz PCI bus
8 SerialATA 150 channels
RAID 0, 1, 5, 10 and JBOD
Besides the controller, you receive a CD with the drivers, a controller specification, and as many as eight SATA-150 cables. So our initial impression about the product is good; let’s see if it remains good after the tests.
The testbed was configured as follows:
We tested the controller in WinBench 99 2.0 and IOMeter 2003.02.15.
We created one logical partition for the whole storage capacity of the array in WinBench 99. We ran each of the WinBench tests seven times and took the best result for further analysis.
For Intel IOMeter, we used FileServer and WebServer patterns.
These patterns are intended to measure the disk subsystem performance under workloads typical of file- and web-servers.
The WorkStation pattern for Intel IOMeter was developed by Sergey Romanov aka GReY based on the disk request statistics for different applications that StorageReview provided for the Office, High-End and Bootup work modes in the NTFS5 file system and mentioned in its Testbed3 description.
This pattern helps to estimate the performance of the RAID arrays in typical Windows applications.
After that, we checked the ability of the controller to process mixed streams of read/write requests in the DataBase pattern, which sends SQL-like requests to the disk subsystem.
We flashed firmware version 7.6.3 and used the controller with the drivers from the same suite (7.6.3). The 3DM Disk Management utility helped us monitor the status of the RAID arrays and synchronize them. The controller was installed into a PCI-X (133MHz) expansion slot (although it only supports the 64-bit PCI bus working at 33MHz).
We created RAID arrays on channels 1-4 of the controller to imitate the operation of Escalade 8500-4 controller. There are two AccelerATA chips, each of which is responsible for four drives, so when we filled up channels 1-4 we actually employed only one of the chips. If we were to use two channels of each chip, the controller would work faster, but that wouldn’t be correct, as we are testing a “four-channel” controller (Regrettably, we couldn’t find 8 Raptors for our today’s testing session).
WD360GD (Raptor) drives were installed into the standard bays of the SC5200 case and were fastened with four screws at the bottom.
When performing the basic testing program, we enabled lazy write for the drives; the driver’s request caching mode (WriteBack or WriteThrough) was changed when necessary.
We start out as usual with checking the controller performance when processing mixed request streams.
This pattern serves to send a mixed stream of requests to read and write 8KB data blocks with a random address. By changing the ratio of reads to writes, we can find out how good the controller driver is at sorting the requests out.
The largest table comes first: the results of the controller in the WriteBack mode:
The following diagrams show the dependence of the data-transfer rate on the reads/writes ratio for different request queue depths. For easier reading, I created two diagrams:
All arrays show similar speeds under linear workload, at the beginning of the graph (the Random Read mode). However, the “mirror” arrays, RAID1 and RAID10, are faster thanks to TwinStor technology, which doesn’t just alternate read requests between the two disks of a mirror couple, but does so intelligently: it determines which disk will perform a given request faster, according to the current position of its read/write heads (or, to be more precise, according to the request history).
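This routing idea can be sketched in a few lines. The sketch below is our simplified illustration, not 3ware’s actual algorithm; the head-position model (tracking the last serviced LBA per disk) is an assumption for demonstration:

```python
# Hypothetical sketch of TwinStor-style read routing for a mirror pair.
# Assumption: the controller estimates each disk's head position from the
# last LBA it serviced and routes a read to the disk with the shorter seek.

def pick_mirror_disk(last_lba, request_lba):
    """last_lba: {disk_id: last serviced LBA}. Returns the disk whose
    estimated head position is closest to the requested LBA."""
    return min(last_lba, key=lambda d: abs(last_lba[d] - request_lba))

heads = {0: 1_000_000, 1: 50_000_000}
disk = pick_mirror_disk(heads, 2_000_000)  # disk 0 needs the shorter seek
heads[disk] = 2_000_000                    # update the head-position estimate
```

A real controller would also weigh queued requests per disk, but even this toy version shows why a mirror can beat a single drive on random reads: each request goes to the member that can serve it sooner.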
The performance of the single drive goes up as more write requests appear in the queue (in our case, as the probability of such a request grows). The speed of RAID0 grows in proportion to the number of disks in the array, but this proportion only holds in the modes with a big share of write requests. RAID1 and RAID10 arrays also speed up, but more slowly. RAID5 arrays draw a downward-sloping graph: write requests impede them greatly, and the more write requests are in the queue, the harder it is for the controller. Curiously, though, when the share of writes is very high, the controller even picks up the pace a little.
As the workload increases, we see the arrays perform at different speeds. The “mirror” arrays run much faster when the reads share is high, but lose their speed as the share of writes increases. These arrays are efficient at reading random-address data due to TwinStor technology. Ideally, you get a double performance gain in reading speed.
Anyway, RAID1 is always faster than the single drive, while RAID10 is in its turn faster than the two-drive RAID0 across all the operational modes. RAID0 arrays should actually be the fastest, but as we explained in our previous review, random selection of the drive to read from may lead to a situation when some drives are overloaded while others stay idle. Meanwhile, “mirror” arrays alternate requests between the disks (based on the access statistics), thus creating the same or a similar workload for all drives in the array. That’s why when the share of writes is low (below 30%), RAID1 outperforms even the two-disk RAID0, while RAID10 outperforms RAID0 arrays of three and four disks when the share of writes is below 40% and 20%, respectively.
Of course, we couldn’t pass by the small slump in performance RAID0 arrays of three and four drives suffered on 10% writes. As you remember, we saw a similar slump with Intel SRCS14L and Promise FT S150 TX4 controllers. It seems like 3ware 8500-8 controller takes some time to execute the lazy write algorithm, and the resulting optimization doesn’t compensate for that time when there are only 10% of write requests.
The graphs for RAID0 arrays have the same shape as the graph for the single drive, indicating that the StorSwitch architecture shows excellent results at sorting requests out and sending them to the appropriate hard disk. On the other hand, we still see the slumps in performance of the RAID0 arrays of three and four drives on 10% writes. The influence of TwinStor on “mirror” arrays, RAID1 and RAID10, is the highest in the Random Read mode and when there are enough requests in the queue, so these arrays are doing faster than RAID0 arrays of two and four disks, respectively.
To check out the influence of lazy write on the results, let’s compare the numbers we have just got (in the WriteBack mode) to the results in the WriteThrough mode:
To compare the speeds of the RAID arrays in different caching modes, we fill the table with ratios of the controller’s speed in the WB mode to its speed in the WT mode. The higher the number, the higher the efficiency of WB caching in that mode. If the number is below 1 (marked red), WB caching is harmful. If the number is above 1 (marked blue), WB caching brings a performance gain. If you see “1.0”, then the WB and WT caching modes are equally useful.
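The table construction is simple arithmetic; here is a minimal sketch of it. The function name and the sample numbers are ours, purely for illustration, not measured values:

```python
# How the WB/WT efficiency table is built: the ratio of the speed in
# WriteBack mode to the speed in WriteThrough mode, classified the same
# way the table colors the cells.

def wb_efficiency(wb_iops, wt_iops):
    """Returns the rounded WB/WT ratio and a verdict on WB caching."""
    ratio = wb_iops / wt_iops
    if ratio > 1.0:
        verdict = "WB helps"       # marked blue in the table
    elif ratio < 1.0:
        verdict = "WB is harmful"  # marked red in the table
    else:
        verdict = "no difference"
    return round(ratio, 2), verdict

# Illustrative: a Random Write cell where WB is six times faster.
print(wb_efficiency(600, 100))  # (6.0, 'WB helps')
```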
As you see, you can speed up RAID arrays by enabling WB caching in the controller’s BIOS. We see no speed reduction even in the Random Read mode (when there are no write requests at all), while in the Random Write mode the speed gets six times higher in some cases. Only the RAID5 array slows down with WB caching at a request queue depth of 256, but such a long queue is highly improbable in real tasks.
The results of WB caching are more illustrative when demonstrated on the graphs. We created three of them, for three different queue depths.
RAID0 is speeding up when caching is enabled and the writes share is high. As the number of requests in the queue grows, the gap between WriteBack and WriteThrough modes becomes smaller, but the advantages of WB caching are perfectly seen everywhere, save for the Random Read mode where there are no write requests and, accordingly, there is nothing to optimize!
WB caching influences the array’s performance in case of single requests. The speed boost is as high as 137%. However, the efficiency of WB caching goes down as the request queue becomes longer. When the request queue depth hits 256, WB caching even slows down the operation of the array by about 10%.
WB caching brings the same effect as with the RAID0 array, but its value is smaller this time: only 245%. :)
So, WB caching doesn’t actually slow down the array, while sometimes it can on the contrary provide a very substantial performance boost.
We chose the WriteBack option in the controller driver for this pattern, but also checked out some of the arrays in the WriteThrough mode. So, the array receives a stream of read/write requests with a request queue depth of 4. Every minute the size of the data block changes, so we can see the dependence of the linear read/write speed on the size of the data block. The results (the correlation of the controller’s data-transfer rate and the data block size) are listed in the following tables:
For easier analysis, we divided the arrays into two groups in the diagrams:
The advantages of the RAID arrays that consist of many HDDs become apparent when the data block is big enough, that is, when the request is so big that the drives of the array work simultaneously (in parallel).
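A simple stripe-mapping sketch illustrates why only big requests engage all member disks. The 64KB stripe unit and the helper below are assumptions for illustration, not the controller’s actual parameters:

```python
# Which member disks of a striped array a request touches, assuming a
# hypothetical 64KB stripe unit laid out round-robin across the disks.

STRIPE_KB = 64

def disks_touched(offset_kb, size_kb, n_disks):
    """Returns the set of disk indices a request covers."""
    first_stripe = offset_kb // STRIPE_KB
    last_stripe = (offset_kb + size_kb - 1) // STRIPE_KB
    return {s % n_disks for s in range(first_stripe, last_stripe + 1)}

# A small 8KB request lands on a single disk -> no parallelism.
print(disks_touched(0, 8, 4))    # {0}
# A 256KB request spans all four disks -> they work simultaneously.
print(disks_touched(0, 256, 4))  # {0, 1, 2, 3}
```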
The read speed graphs for the “mirror” and RAID5 arrays, which improve their reading speed through certain optimization algorithms, are not monotonic.
Let’s see how the controller behaves in the WriteThrough mode and compare it to the results we have just seen:
As might have been expected, the difference in the results is negligible when there are no write requests. Turning deferred write on has practically no effect on the shape of the graphs.
Let’s turn to sequential write now:
And the graphs:
Here you go: these are the first problems for today. The speed of the 4-HDD RAID0 array is indecently low on big requests. It seems the controller lacks cache memory in these circumstances. Well, it is altogether weird to have a situation when one drive of the array has four times the controller’s cache memory…
The performance of RAID1 array nearly coincides with that of the single drive, while the speed of RAID10 is a little lower than that of the two-disk RAID0. It’s more complicated with the write speed of RAID5 arrays. When the data blocks are small, the write speed of RAID5 is higher than that of a single drive or of the “mirror”. This is where R5 Fusion works (it “glues” write requests together in the controller cache). But when the requests are big, there is simply not enough space in the controller cache…
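The “gluing” can be pictured with a small sketch. This is our guess at the spirit of R5 Fusion, not 3ware’s implementation; the interval-merging helper is an assumption for illustration:

```python
# Hedged sketch of write coalescing in the spirit of R5 Fusion: adjacent
# small writes are merged in the controller cache, so a larger (ideally
# full-stripe) write can be committed with fewer parity updates.

def coalesce(writes):
    """writes: list of (offset, size) in KB. Merges adjacent or
    overlapping requests into larger contiguous ones."""
    merged = []
    for off, size in sorted(writes):
        if merged and off <= merged[-1][0] + merged[-1][1]:
            last_off, last_size = merged[-1]
            merged[-1] = (last_off, max(last_size, off + size - last_off))
        else:
            merged.append((off, size))
    return merged

# Two adjacent 8KB writes become one 16KB write; the distant one stays.
print(coalesce([(0, 8), (8, 8), (32, 8)]))  # [(0, 16), (32, 8)]
```

This is exactly why the technique helps on small blocks and runs out of steam on big ones: once the requests are already large, there is little left to glue, and the small cache fills up.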
By disabling lazy write we greatly reduce the performance of every array; this is especially catastrophic for the RAID5!
These patterns imitate the operation of the disk subsystem of a typical file- or web-server.
RAID0 arrays show nice scalability in speed: the more drives are included into the array, the faster the array is.
Only 20% of the requests here are writes, so all the arrays show high results, although I can’t explain the slump in the graph of RAID1 at a request queue depth of 64.
Let’s calculate the performance rating for each array by averaging the controller performance under each workload:
I would like to single out the “mirror” arrays here. Thanks to TwinStor technology, RAID10 took the honorable second place, while RAID1 could have been at a better position if it hadn’t been for the above-mentioned slump in its graph.
The WriteBack caching mode provides slightly higher results than the WriteThrough mode, as the amount of writes to be optimized is quite small here (20%).
Let’s see how the results change in the WebServer pattern, whose typical trait is the complete lack of write requests.
The graphs for RAID0 arrays haven’t practically changed since the FileServer pattern, while the performance of RAID5 arrays has grown up significantly because the WebServer pattern with no write requests is the optimal operational mode for them. RAID1 has the same slump in the graph, and RAID10 has acquired a flat stretch in its graph.
These are the ratings of the arrays, which we calculated according to the same rules as in the FileServer pattern:
RAID10 has finally won the race! RAID5 arrays are high enough, and the performance of the four-disk RAID5 nearly matched that of the four-disk RAID0. The performance rating of the RAID1 array is somewhat lower because of the slump in one of the modes, just like in the FileServer pattern.
Lazy write doesn’t boost the performance of our arrays in this test. It’s got nothing to optimize as there are absolutely no write requests!
Workstation pattern imitates intensive work of a user in various applications in the NTFS5 file system. The table for the WriteBack mode comes first:
And here is the more illustrative graphical representation:
The situation with RAID0 is quite ordinary: the more disks the array has, the faster it processes requests. RAID1 is faster than the single drive even under small workloads, while RAID10 outperforms the three-disk RAID0. RAID5 arrays seem weak against them, but this is actually no surprise: numerous random requests for writing small data blocks slow a RAID5 array down greatly.
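The reason is the classic RAID5 small-write penalty: every small random write costs four disk operations, because the parity block must be updated along with the data. A minimal sketch of the parity arithmetic (the function and values are ours, for illustration):

```python
# Why small random writes hurt RAID5: each one costs four disk I/Os
# (read old data, read old parity, write new data, write new parity),
# because parity is recomputed as P_new = P_old XOR D_old XOR D_new.

def raid5_small_write(old_data, old_parity, new_data):
    """Returns the updated parity and the number of disk I/Os spent."""
    new_parity = old_parity ^ old_data ^ new_data
    io_count = 4  # 2 reads + 2 writes
    return new_parity, io_count

# Sanity check: the stripe parity stays consistent after the update.
d1, d2 = 0b1010, 0b0110
parity = d1 ^ d2            # initial parity of the stripe
new_d1 = 0b1111
parity, ios = raid5_small_write(d1, parity, new_d1)
assert parity == new_d1 ^ d2 and ios == 4
```

RAID0 and the mirrors pay one or two I/Os per write, so at a high share of small random writes RAID5 simply cannot keep up.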
We calculated the performance rating for the WorkStation pattern according to the following formula:
Performance Rating = Total I/O (queue=1)/1 + Total I/O (queue=2)/2 + Total I/O (queue=4)/4 + Total I/O (queue=8)/8 + Total I/O (queue=16)/16 + Total I/O (queue=32)/32.
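The formula above weights short queues more heavily by dividing each queue depth’s Total I/O by the depth itself. A sketch of the calculation (the sample numbers are illustrative, not our measured results):

```python
# The WorkStation performance rating: each queue depth's Total I/O is
# divided by the depth, so short queues dominate the total.

def workstation_rating(total_io_by_queue):
    """total_io_by_queue: {queue_depth: Total I/O per second}."""
    return sum(io / depth for depth, io in total_io_by_queue.items())

# Illustrative numbers: deeper queues add little to the rating.
rating = workstation_rating({1: 100, 2: 120, 4: 140, 8: 150, 16: 155, 32: 158})
print(round(rating, 2))  # 228.38
```

With these weights, an array that is strong at queue depths 1-2 can out-rate one that only shines under deep queues, which is exactly the behavior discussed below for RAID1.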
RAID5 arrays fall behind the single drive. RAID0 arrays lined up in “size order” (according to the number of disks in them), while RAID10 was just a little faster than the three-disk RAID0. RAID1 has a higher performance rating than the two-disk RAID0, although we saw it the other way around in the graphs. That’s because we assume that short queues are more likely to occur in a workstation; therefore, short queues have higher weights in the total result. So, the performance advantage of RAID1 when processing short queues is more than enough to outpace the two-disk RAID0 in the performance rating chart.
Let’s see how lazy write affects the operation of the arrays in this pattern:
After switching to the WriteThrough mode, the arrays rank the same way, but perform 25-35% slower (depending on the array type).
WinBench is going to be our last test today. This benchmarking set helps to estimate the disk subsystem performance in desktop applications.
The following table compares the arrays in two integral subtests: Business Disk Winmark and High-End Disk Winmark:
WinBench 99 involves a lot of write requests, so it is not at all surprising that RAID5 arrays demonstrate poor results because of their low write speed. That’s also the reason why RAID1 and RAID10 arrays are just a little faster than the single drive. RAID0 arrays show a dependence of speed on the number of drives in the array, but the proportionality coefficient is small.
We see RAID10 profiting most of all from lazy write (it was slower than RAID5 in the WriteThrough mode!). As for other arrays, WriteBack does have a positive effect on them, but this effect is not so impressively great.
Now we shift to FAT32 file system.
The arrays ranked up just like they did in NTFS. The speed of RAID0 arrays slightly depends on the number of drives in the array. The performance of RAID5 is the lowest since WinBench contains write operations, which negatively affect the performance of the controller with its small cache buffer. RAID1 just managed to outperform the single drive, while RAID10 runs faster than the latter two only because it consists of many drives.
The WriteBack mode influences the speed of the arrays in FAT32 more than in NTFS. The ranking doesn’t change, but the speed does change noticeably.
The linear read speeds are the same for both file systems, so we will show you just one general diagram here:
The linear read speed scales linearly with the number of drives, so RAID0 arrays of two and three devices show twice and three times the speed of the single drive, respectively. Unfortunately, the four-disk RAID0 array doesn’t keep up this tendency. We could blame either the low performance of the central chip or the insufficient bandwidth of the PCI 64-bit/33MHz bus.
Unlike the situation with the 3ware 7850 controller, the linear read speed of the N-disk RAID5 array is not analogous to that of the (N-1)-disk RAID0 array. The dependence of RAID5 speed on the number of disks is evident, but it is not 4:3. The RAID1 array is slower at linear reading than the single drive, while RAID10 falls behind the two-HDD RAID0. It looks like TwinStor technology (which provides a higher read speed from a “mirror” array due to intelligent alternation of read requests between the two drives of the mirror couple) doesn’t provide any advantages in linear reading on the 8500-8 controller.
The linear read speed doesn’t depend on the lazy write status. Well, that’s reading after all!
For the most meticulous sirs (and ladies!), I would like to offer the linear read graphs for every array:
The 3ware Escalade 8500-8 controller left a good lasting impression, passing through our tests with flying colors. Of course, there are minor problems you have to put up with. For example, TwinStor technology doesn’t always behave correctly, and, just like with the 3ware 7850 controller, there is no support for PCI 64-bit/66MHz, which is critical even for the four-channel version of the controller if you use modern hard disk drives.
Anyway, this one showed most stable and predictable results among all SATA RAID controllers we have tested so far. We will learn soon if its supremacy is for long. Stay tuned!
You can always find the driver package for 3ware 8500-8 Escalade controller on the manufacturer’s website.
When I was working on the review, version 7.7.0 of the package was already available. This set of firmware, drivers and utilities works in Windows 2003/XP/2000, SuSE Linux 8x, Red Hat Linux 8x and 9x, and FreeBSD 4.8 Beta. By the way, this is the only driver release for Escalade 8500 to support FreeBSD.
Previous versions of the driver set support older operating systems. For example, the 7.6.3 release we used in our tests supports Red Hat 7x and SuSE 7x, but doesn’t support FreeBSD or Red Hat 9x. Overall, it’s quite possible to find a driver for the 3ware Escalade 8500 controller that supports the particular OS you use.