HighPoint RocketRAID 2320 Controller Review

Today we are introducing a hardware RAID 5 controller from HighPoint that supports SATA-II and SATA150 hard disk drives, the advanced PCI Express interface, and both command queuing technologies, TCQ and NCQ. The controller demonstrated pretty good scalability and high performance in our tests. Read more in our review!

by Alexander Yuriev, Nikita Nikolaichev
02/24/2006 | 11:53 PM

It looks like we haven’t had any reviews of HighPoint controller cards on our site for quite some time. We decided to correct this omission, especially since we managed to get our hands on a new and very interesting product from this manufacturer.

 

After long and not very successful experiments with software RAID 5 implementations, HighPoint caught up with its competitors in one big jump by launching a product that combines today’s latest technologies.

Closer Look

The HighPoint RocketRAID 2320 controller supports eight SATA-II HDDs (it will, of course, also work perfectly well with SATA150 HDDs). It allows building arrays of the following types: RAID 0, 1, 5, 10 and JBOD. The controller uses today’s most advanced interface, PCI Express x4.

Thanks to 64-bit LBA addressing support, the RocketRAID 2320 controller allows creating arrays with a capacity over 2TB. Moreover, it can work with both command queuing technologies, TCQ and NCQ.
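As a quick check of where the 2TB figure comes from (assuming the conventional 512-byte sector size), 32-bit LBA addressing tops out at 2^32 sectors × 512 bytes per sector = 2^41 bytes = 2 TiB, or roughly 2.2TB, while 64-bit LBA pushes the ceiling far beyond anything an eight-port SATA controller could realistically host.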

The controller is shipped in a light-blue box:

The controller itself is designed as a low-profile card. To reduce its height, HighPoint installed the SATA connectors in two rows and two layers. The XOR processor is hidden under a heatsink, which looks quite impressive on a small PCB like this. The PCI Express x4 connector also allows installing this controller card into PCI-E x8 and x16 slots.

Besides the controller card itself, the box also contains a low-profile bracket, a cable for the RocketGuard100 monitoring card, the corresponding software, eight SATA cables in HighPoint’s traditional blue color and a user’s guide.

RocketRAID 2320 controller is claimed to support the following operating systems:

You can download all the latest driver versions from the manufacturer’s website.

Testbed and Methods

Our testbed was configured as follows:

We tested the controller in FC-Test 1.0 build 13 and Intel IOMeter 2003.02.15. We used FileServer and WebServer patterns in our Intel IOMeter tests.

These patterns are intended to measure the performance of the disk subsystem under workloads typical of file and web servers.

We also use the WorkStation pattern created by Sergey Romanov (a.k.a. GReY). It is based on statistical data about the disk subsystem workload given in the StorageReview Testbed 3 description. The statistics for the NTFS5 file system were gathered in three operational modes: Office, Hi-End and Boot-up.

This pattern shows how well the controller performs in a typical Windows environment.

Lastly, we checked out the controller’s ability to process sequential read/write requests of variable size and its performance in the DataBase pattern, which loads the disk subsystem with SQL-like requests.

For FC-Test we used our five standard file-sets (Install, ISO, MP3, Programs and Windows) which we wrote to the array, read from it and then copied.

Our controller was tested with firmware version 1.01 and the drivers from the same set. It was installed into a PCI-E x4 slot.

WD740GD (Raptor) hard disk drives were installed into the ASATAHSDB (SATA 4-Hot Swap Drive Bay Upgrade) basket.

Performance in Intel IOMeter DataBase Pattern

We traditionally start out by checking the controller’s operation with mixed streams of requests.

This pattern sends a stream of requests to read and write 8KB data blocks at random addresses. By changing the ratio of reads to writes we can check how well the controller’s driver sorts them out. The results of the controller in WriteBack mode are presented in the table:
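For readers who would like to reproduce something of this kind without IOMeter, below is a minimal sketch of such a workload generator in Python: 8KB requests at random offsets with a configurable share of writes, at a queue depth of one (IOMeter varies the queue depth and measures far more rigorously). It uses the POSIX-only os.pread/os.pwrite calls, and the file name is a hypothetical placeholder; never point it at a disk holding real data.

import os, random, time

BLOCK = 8 * 1024               # 8KB requests, as in the DataBase pattern
TARGET = "scratch.bin"         # hypothetical scratch file that may be overwritten
WRITE_SHARE = 0.2              # 20% writes, 80% reads
DURATION = 10                  # seconds to run

def run():
    blocks = os.path.getsize(TARGET) // BLOCK
    buf = os.urandom(BLOCK)
    fd = os.open(TARGET, os.O_RDWR)
    done, deadline = 0, time.time() + DURATION
    try:
        while time.time() < deadline:
            offset = random.randrange(blocks) * BLOCK   # random-address access
            if random.random() < WRITE_SHARE:
                os.pwrite(fd, buf, offset)              # 8KB random write
            else:
                os.pread(fd, BLOCK, offset)             # 8KB random read
            done += 1
    finally:
        os.close(fd)
    print(f"{done / DURATION:.0f} IO/s at {WRITE_SHARE:.0%} writes")

if __name__ == "__main__":
    run()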

Let’s view these numbers as diagrams, which will show the dependence of the controller’s speed on the percentage of write requests for queue depths of 1, 16 and 256 requests. For better readability we divide the arrays into two groups.

As the share of write requests increases, the efficiency of lazy writing grows and the speed of the single drive rises. The speed of the RAID0 arrays also grows with the number of disks per array, but it doesn’t scale exactly proportionally to the number of disks even in RandomWrite mode (100% writes). In RandomRead mode under linear workload all the arrays perform close to one another, but this time the performance is inversely proportional to the number of HDDs in the array.

The arrays with mirrored pairs (RAID1 and RAID10) evidently alternate read requests between the two disks of the mirror, because their performance rises above that of the JBOD and the two-disk RAID0 at higher percentages of reads. When the probability of writes is high, the RAID1 and RAID10 arrays run much slower than a single HDD or a two-disk RAID0 array.

RAID5 array performance should theoretically decrease as the share of write requests grows. With the exception of RandomWrite mode, the performance of these arrays corresponds to the theory. Only the four-HDD array for some reason fell behind the three-HDD array.

Now let’s increase the workload:

Under the higher workload the RAID0 array speed becomes proportional to the number of HDDs in the array as we get close to 100% reads. However, as the share of write requests increases, the picture gets less rosy. Arrays built of an odd number of hard disk drives show their lowest speed at 50% writes, while arrays built of an even number of disks hit their minimum at 60% writes.

Let’s check out the performance of other arrays:

Just like under linear workload, the RAID1 and RAID10 arrays perform better than a single HDD and a two-HDD RAID0 array respectively in test modes with a high share of reads, and worse in modes with a high share of writes. RAID1 and RAID10 owe their advantage at high read percentages to intelligent selection of the optimal hard disk drive from the mirrored pair.

As for RAID5, the situation again leaves much to be desired. The four-HDD array is faster than the three-HDD array only when read requests dominate. As the share of writes picks up, the two arrays level out and then the one with more HDDs falls behind.

As the workload increases to 256 requests, nothing really changes in the arrays’ performance. At the same time, the performance of the RAID0 arrays in RandomRead mode indicates with all certainty that this controller supports TCQ.

Performance in Intel IOMeter Sequential Read and Write Patterns

IOMeter sends a stream of read or write requests to the array with a request queue depth of 4. Every minute the size of the data block changes, so we can see how the linear read/write speed depends on the data block size.
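To make the idea of this test more tangible, here is a rough sketch in Python (not IOMeter’s actual implementation) that reads a test file front to back with progressively larger blocks and reports the throughput for each size. It runs at a queue depth of one and does not bypass the OS cache, so treat its numbers as purely illustrative; the file name is a hypothetical placeholder.

import os, time

TARGET = "scratch.bin"                          # hypothetical test file
BLOCK_SIZES = [512 * 2**i for i in range(12)]   # 512 bytes .. 1MB

for block in BLOCK_SIZES:
    fd = os.open(TARGET, os.O_RDONLY)
    size = os.fstat(fd).st_size
    read, start = 0, time.time()
    while read < size:
        chunk = os.read(fd, block)              # strictly sequential: no seeks between requests
        if not chunk:
            break
        read += len(chunk)
    os.close(fd)
    print(f"{block:>7} B blocks: {read / (time.time() - start) / 2**20:.1f} MB/s")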

The dependence of the controller data read speed on the size of the data block is given in the table below:

Now let’s build the graphs for two groups of RAID arrays showing the dependence of their performance on the data block size:

The advantage of a RAID0 array made of many hard drives only starts showing when the data blocks are really big, i.e. when the controller can split a large data block into a few smaller ones and use the hard disk drives in parallel. In this respect the RAID0 arrays proved pretty efficient. The arrays of two, three and four hard drives reach their maximum request speed at 16, 32 and 64KB data blocks respectively. Moreover, the read speed scales almost ideally with the number of HDDs in the array.

This group of arrays looks somewhat worse. All the arrays actually perform pretty well until the data blocks reach a certain size, which differs from array to array. The graphs for RAID1, RAID10 and the RAID5 arrays of three and four drives are exactly the same as the graphs for a single hard disk drive and for RAID0 arrays of two and three HDDs respectively. However, when the data block size reaches 64, 128 or 256KB (again depending on the array type), the array speeds drop quite rapidly.

Now let’s see what the controller does during sequential writing. The controller data transfer rates depending on the size of the data block are all given in the table below:

Now let’s take a look at the graphs showing the dependence of the array speed on the data block size. The graphs will again split into groups:

Just as in sequential reading, RAID0 arrays built of multiple HDDs only show their real advantage when the requested data blocks are considerably large. However, during reading this array group jumped to its maximum performance at small requests already, while during writing the speed grows noticeably more smoothly.

When there are no read requests, the performance of the RAID1 and RAID10 mirrored arrays almost coincides with that of a single HDD and a two-drive RAID0 array respectively. The RAID5 arrays demonstrate pretty good performance scalability with the number of hard disk drives in the array and reach their maximum speed at relatively small data blocks already. However, they still cannot reach the maximum speed of RAID0 arrays built of n-1 drives.

Performance in Intel IOMeter Fileserver and Webserver Patterns

Let’s see how the controller copes with the test modes emulating file server and web server operation.

First come the results for the Fileserver test mode:

Let’s take a look at the results represented as dependence of the data transfer rate on the queue depth. For easier analysis the arrays were split into two groups:

There are only 20% writes in this pattern, which is why most arrays show pretty decent results. The RAID0 arrays demonstrate very good performance scalability with the number of hard drives in the array. The performance of the RAID1 and RAID10 arrays is much higher than that of a single HDD or a two-HDD RAID0, which is clear evidence that the optimization algorithms for mirrored reading work perfectly well here.

RAID5 array performance under heavy loads turned out lower than that of RAID1, which once again suggests that the XOR processor of our controller is not powerful enough.

Let’s now compare the performance of the different arrays using our rating system. Considering all workloads equally probable, we calculate the general performance rating as the average speed of request processing under the three types of workload:

The RAID0 arrays are far ahead of the others here. The RAID10 array is close behind the three-drive RAID0, and the RAID1 array is just a tiny bit behind the two-drive RAID0. The RAID5 arrays managed to outperform only the single HDD this time.

Now let’s take a look at the results in Webserver pattern:

The RAID0 graphs haven’t really changed compared to Fileserver; all other arrays, however, are affected much more by the absence of write requests. The mirrored RAID1 and RAID10 arrays, which use mirror read optimization algorithms, turn out faster than RAID0 arrays of two and four hard disk drives respectively in all test modes except the one with a 256-request queue depth. It is even more noticeable that the RAID5 arrays of three and four drives outperform RAID0 arrays of the corresponding number of drives in almost all test modes.

Now let’s compare the results for the different arrays using our rating system. Considering all workloads equally probable, we calculate the general performance rating as the average speed of request processing under all types of workload:

Since there are no tasks for the XOR processor, the four-drive RAID5 array turned out the fastest of all, and the three-drive RAID0 and RAID5 arrays run almost equally fast. RAID1 and RAID10 also proved quite fast when there are no write requests to process; they even outpaced the RAID0 arrays with the corresponding number of HDDs.

Performance in Intel IOMeter WorkStation Pattern

Now let’s move on to the WorkStation pattern, which imitates a user actively working in various applications on the NTFS5 file system:

The situation is pretty typical of RAID0 arrays: the more HDDs are used to build the array, the faster it processes the requests. As for the RAID1 and RAID10 arrays, they are at least a little faster than a single HDD and a two-drive RAID0 array respectively.

The WorkStation pattern contains quite a lot of random write requests, which normally reduce RAID5 performance rather noticeably. In this case they have simply buried the RAID5 arrays completely.

Let’s compare the performance of RAID arrays of different types. The ratings for the WorkStation pattern will be calculated according to the formula below:

Performance Rating = Total I/O (queue=1)/1 + Total I/O (queue=2)/2 + Total I/O (queue=4)/4 + Total I/O (queue=8)/8 + Total I/O (queue=16)/16 + Total I/O (queue=32)/32
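For clarity, here is a small Python sketch of how this rating could be computed from the Total I/O figures at each queue depth (the sample numbers are placeholders, not our measured results):

# Sketch of the WorkStation performance rating: Total I/O at each queue depth,
# weighted by 1 / queue depth, then summed.
def workstation_rating(total_io_by_queue):
    return sum(io / q for q, io in total_io_by_queue.items())

# Placeholder figures, not our data:
sample = {1: 120.0, 2: 150.0, 4: 180.0, 8: 210.0, 16: 230.0, 32: 240.0}
print(round(workstation_rating(sample), 1))   # prints 288.1 (120 + 75 + 45 + 26.25 + 14.375 + 7.5, rounded)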

As we expected, random writes allowed the RAID1 and RAID10 arrays to outperform only the single HDD and the two-HDD RAID0, while the RAID5 arrays were the slowest of all here.

Performance during Multi-Threaded Writing / Reading

In this pattern we test the controller’s ability to perform multi-threaded sequential writing/reading by emulating the simultaneous workload imposed on the storage subsystem by a few applications that have requested “large files”. The special test agent of the IOMeter program that emulates these applications (called a Worker in Intel IOMeter terms) reads/writes a sequence of 64KB data blocks starting from some initial segment. By increasing the number of outstanding requests from each Worker (from 1 to 8 in increments of 1) we study how well the controller can reorganize and arrange requests (i.e. combine several requests for sequential data into a single request). By raising the number of simultaneously running Workers, we make things harder for the storage subsystem, because in real life several simultaneously running programs compete with one another for priority access to the hard disk drives. Each Worker processes its own data (i.e. the addresses of the data blocks requested by different Workers do not coincide).
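As a rough illustration of this workload (again in Python, not IOMeter itself), the sketch below starts several threads, each reading 64KB blocks sequentially from its own region of a test file, and reports the combined throughput. The file name, thread count and duration are hypothetical placeholders, and os.pread is POSIX-only.

import os, threading, time

TARGET = "scratch.bin"     # hypothetical test file
BLOCK = 64 * 1024          # 64KB blocks, as in the pattern described above
THREADS = 4                # number of emulated Workers
DURATION = 10              # seconds to run

def worker(region_start, region_len, results, idx):
    fd = os.open(TARGET, os.O_RDONLY)
    offset, read, deadline = region_start, 0, time.time() + DURATION
    while time.time() < deadline:
        chunk = os.pread(fd, BLOCK, offset)         # each Worker stays in its own region
        read += len(chunk)
        offset += BLOCK
        if offset >= region_start + region_len:     # wrap around within the region
            offset = region_start
    os.close(fd)
    results[idx] = read

size = os.path.getsize(TARGET)
region = size // THREADS
results = [0] * THREADS
threads = [threading.Thread(target=worker, args=(i * region, region, results, i))
           for i in range(THREADS)]
for t in threads: t.start()
for t in threads: t.join()
print(f"{sum(results) / DURATION / 2**20:.1f} MB/s total across {THREADS} threads")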

Let’s build the diagrams of the different arrays’ performance at a queue of 1 request, as this is the most probable real-life situation. All RAID0 arrays are marked with the same color; RAID5 arrays have a different color, and so do the mirrored arrays. The arrays appear on the diagrams in the same order as in the legend: the higher an array is listed, the more hard disk drives it includes.

With a workload of 1 request per thread we didn’t see any steady scaling of the read speed with the number of HDDs per array, either in RAID0 or in RAID5. The single HDD turned out to be almost as fast as RAID1. Since we didn’t manage to squeeze out anything more than that, neither the controller nor the HDDs can boast aggressive read-ahead.

With two simultaneous threads the mirrored RAID1 and RAID10 arrays showed their best (the ideal situation is when each HDD returns the data of its own thread). The performance of all other arrays dropped quite unexpectedly, although the RAID0 and RAID5 arrays still showed some speed scaling with the number of HDDs in the array. Of course, with two operational streams the HDD heads have to shift between two work zones all the time, so reading from the array at its linear speed is completely out of the question here.

A further increase in the number of working threads affects the performance of all arrays, but the general tendency remains unchanged.

Now let’s take a look at multi-threaded writing:

During writing, the RAID1 and RAID10 mirrored arrays turned out slower than a single HDD and a two-HDD RAID0 array respectively in almost all test modes. The RAID5 arrays, on the contrary, run very fast and scale perfectly with the number of hard drives in the array.

RAID0 arrays are pretty slow and hardly scale at all when there is only one working thread. However, with two threads of data processed simultaneously both the performance and the scalability get much better. With three simultaneous threads the RAID0 arrays again prove highly scalable with the number of HDDs involved, but the speed is not that high anymore. With four threads running at the same time, performance drops and scalability disappears completely.

Performance in FC-Test

We stick to our traditional FC-Test methodology: we create two logical volumes, 32GB each, on the array and format them in NTFS and then in FAT32. We create a set of files on the first volume; then this file-set is read from the array, copied into a folder on the same partition (copy-near, i.e. within one and the same logical volume), and finally copied onto the other partition (copy-far).
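For those who want to approximate this procedure by hand, here is a hedged Python sketch of the read and copy steps (FC-Test also times the creation of the file-set and controls caching far more strictly; all paths below are hypothetical placeholders):

import os, shutil, time

SRC_DIR = "E:/fileset"      # file-set on the first test partition (hypothetical)
NEAR_DIR = "E:/copy_near"   # copy target on the same partition ("copy-near")
FAR_DIR = "F:/copy_far"     # copy target on the second partition ("copy-far")

def tree_size(path):
    return sum(os.path.getsize(os.path.join(root, name))
               for root, _, names in os.walk(path) for name in names)

def timed(label, action, nbytes):
    start = time.time()
    action()
    print(f"{label}: {nbytes / (time.time() - start) / 2**20:.1f} MB/s")

def read_all():
    for root, _, names in os.walk(SRC_DIR):
        for name in names:
            with open(os.path.join(root, name), "rb") as f:
                while f.read(1 << 20):              # read in 1MB chunks, discard the data
                    pass

nbytes = tree_size(SRC_DIR)
timed("read", read_all, nbytes)
timed("copy-near", lambda: shutil.copytree(SRC_DIR, NEAR_DIR), nbytes)
timed("copy-far", lambda: shutil.copytree(SRC_DIR, FAR_DIR), nbytes)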

The test system is rebooted between the tests to avoid the influence of the OS’s caching on the results. We use five file patterns here:

NTFS File System

Let’s start with NTFS. Due to the abundance of data we have received, we are going to examine the results of each test action for each pattern separately. The first action is the creation of a set of files on the array.

We will consider the three most interesting file-sets on the diagrams:

As we expected, RAID0 arrays are the fastest here and scale well with the number of HDDs. The RAID1 and RAID10 arrays always fall behind the single drive and the two-drive RAID0 array respectively.

The RAID5 arrays are the slowest here, although they show good scalability of the results in all patterns.

Now let’s check out the reading:

We have already seen this result in our article called LSI MegaRAID SATA 300-8X Controller Review.

The performance levels of the three- and four-drive RAID0 and RAID5 arrays are very close and aren’t too high. The read speed of the RAID1 and RAID10 arrays is almost the same as that of a single hard disk drive and a two-HDD RAID0 respectively.

Now let’s take a look at file copying speed within the same partition:

RAID0 and RAID5 arrays scale perfectly with the number of hard drives in the array in all file patterns. However, the overall level of performance is not very high, I should say.

RAID10 performance is almost the same as that of a two-drive RAID0, and RAID1 is consistently faster than a single HDD.

Now it is the time for another copying test – from one partition to another:

Just as in the previous case, the RAID0 and RAID5 arrays scale up very nicely with the number of drives in the array, and RAID10 performs on a par with the two-drive RAID0.

The RAID5 arrays are pretty fast in the Install and ISO patterns. The RAID1 array performs very well in the ISO pattern, too.

FAT32 File System

The performance is slightly higher than in the NTFS file system. Other than that, everything looks pretty much the same.

RAID1 array is practically as fast as a single hard disk drive. The speeds of all other arrays are very similar.

During the copy tests the results are again very similar to what we saw in the NTFS file system discussed above.

Linear Read Speed

As usual, we will wind up our review with the linear read graphs:

Conclusion

Well, we have just tested the first hardware RAID5 controller from HighPoint, and the results show that it is a pretty good product. RAID0 arrays built with this controller are quite fast and demonstrate excellent performance scalability with the number of hard disk drives in the array. The RAID1 and RAID10 mirrored arrays also proved highly efficient.

RAID5 arrays, however, still leave much to be desired: I dare say that HighPoint’s software developers should pay a little more attention to this array type. Still, I am confident that the HighPoint RocketRAID 2320 will become a very successful product in the future.