Intel SRCS14L Four-Channel SerialATA RAID Controller Review

We are going to start a series of reviews devoted to multi-channel (that is, 4 channels and up) SerialATA RAID controllers. The first one to open this series will be the four-channel Intel SRCS14L SATA RAID controller. The in-depth analysis is now available in our new review!

by Nikita Nikolaichev, Alexander Yuriev
11/24/2003 | 11:36 PM

The closer winter gets, the stronger the presence of the SerialATA interface in the market becomes. I get the impression that almost every mainboard now comes equipped with a SATA RAID controller, which is integrated into the chipset South Bridge in more than half of all cases. However, hard disk drives supporting the SATA interface are still less widespread than those with the traditional ATA one. So I dare suppose that most users still prefer regular hard disk drives to the new SATA ones. In other words, it doesn’t yet make much sense for users to move to the SATA interface, especially taking into account the price they would have to pay.

However, there is a certain field where SATA hard disk drives are in high demand: I am talking about Low-End servers and high-performance workstations. They do not require the super-high performance of SCSI systems, and care more about low cost.

Therefore, we are going to start a series of reviews devoted to multi-channel (that is 4 and up) SerialATA RAID controllers. The first one to open this series will be a four-channel Intel SRCS14L SATA RAID controller.

In our previous review of the dual-channel SATA RAID controllers we used Seagate Barracuda SATA V HDDs. We decided on these particular drives for the following reasons. Firstly, we didn’t have any other SATA HDDs at our disposal at that time. And secondly, I believe that most users prefer to read about widespread controllers tested with widespread hard drives. In other words, we believe that it makes much more sense to test dual-channel controllers with hard drives boasting 7,200rpm spindle rotation speed. And as for more serious controllers (with 4 channels and more), we will test them with two WD Raptor drives, especially since we have enough of them now :)

By the way, since we have mentioned dual-channel controllers: as promised, they have already been retested with the new driver versions, and the results will be up on our site very soon. So, stay tuned!

Intel SRCS14L: Closer Look

To tell the truth, I hadn’t expected an Intel controller to be the first four-channel SATA RAID controller tested in our lab. But since it was the first one we managed to get hold of, we think we have every right to state that Intel is very serious about the promotion of the SerialATA interface :)

The SRCS14L controller is shipped in a nice box:

Besides the controller card, the package also includes a CD with the drivers, a bracket for the use of this controller card in low-profile PC cases and a user’s manual.


And this is the Intel SRCS14L controller:

The first thing that immediately catches your eye on the box as well as on the controller card itself is the input/output Intel 80303 processor working at 100MHz frequency and featuring hardware support for XOR operation intended to increase the performance of RAID 4/5 arrays. The controller processor takes over the entire work with the drives and creates different RAID arrays from them, thus unloading the CPU and increasing the overall system performance. Intel SRCS14L controller supports RAID 0, 1, 4, 5 and 10.
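To show why hardware XOR matters for RAID 4/5, here is a minimal Python sketch of parity calculation and recovery. It is an illustration of the general principle only, not the controller firmware; the function name `xor_blocks` and the sample bytes are made up for this example.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) walks the blocks column by column, one byte from each.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, as they might sit on three drives of a four-drive array
# (illustrative values only).
data = [b"\x0f\xf0\xaa", b"\x33\x55\x00", b"\xff\x01\x10"]
parity = xor_blocks(data)

# Simulate losing the second drive and rebuilding its block from the
# surviving data blocks plus the parity block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Every write to such an array needs the parity to be recalculated, which is the work a dedicated XOR engine takes off the host CPU.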

The work with the SerialATA drives is actually performed by two SiliconImage Sil3112A controllers. Each chip has two channels, so that we get the total of four SATA-channels. The controller features integrated cache-memory, so that you will not be able to increase its size. There are 64MB of PC100 ECC SDRAM memory onboard.

The controller itself is designed as a low-profile PCI card. Since it is intended for the simplest servers, the decision to design it as a low-profile card seems to be quite logical.

The 64bit 66MHz PCI 2.2 bus interface supports both 3.3V and 5V signaling and is backward compatible with the 32bit 33MHz bus. Finally, the controller is equipped with a speaker (emergency alarm) and status LEDs.

Here is a list of Intel SRCS14L controller specifications, according to the manufacturer:

Testbed and Methods

Our testbed was configured as follows:

We used the following software:

For WinBench tests the hard drive was formatted as one partition with the default cluster size. The WinBench tests were run seven times each; the best result was then taken for further analysis.

To compare the hard disk drives performance in Intel IOMeter we used the FileServer and WebServer patterns:

These patterns are intended to measure the disk subsystem performance under workloads typical of file- and web-servers.


Our colleague, Sergey Romanov aka GreY, developed a WorkStation pattern for Intel IOMeter based on StorageReview's study of the disk subsystem workload in ordinary Windows applications. The pattern was based on the average IPEAK statistics StorageReview provided for Office, High-End and Bootup work modes in NTFS5 file system and mentioned in Testbed3 description.

The pattern serves to determine the attractiveness of the HDDs for an ordinary Windows user.

Well, and in the end we checked the ability of the drives to work with sequential write and read requests of variable size, and tested the drive’s performance in DataBase pattern, which imitates the work of the disk subsystem with SQL-like requests.

The controller featured firmware version 2.36.02-R048. We used the driver version 3.05.

The controller was installed into the PCI-X/133MHz slot (even though the Intel controller supports only PCI64/66MHz), and the J4 jumper cap was removed, i.e. we allowed the controller to work in PCI64/66 mode.

The WD360GD Raptor hard disk drives were installed into the default chassis of SC5200 case and were fastened at the bottom with four screws.

The workmode, arrays status and the like can be managed not only via the controller BIOS accessed during the system reboot, but also via the special Intel RAID Storage Console, which can be started right from Windows:

For our test session we used the following controller settings:

During the major tests we enabled lazy write algorithms for the drives:

And for some tests we had to disable them in order to evaluate the performance differences in both cases.

This time, when we tested the Intel SRCS14L controller, I had my first experience working with the BIOS of an Intel controller. And to tell the truth, there was something that nearly shocked me. This screenshot shows the integrated options for workload monitoring working in real time!


Performance in Intel IOMeter DataBase Pattern

As you remember, this pattern is used to check how well the controller can cope with a mixed stream of reads and writes for random 8KB data blocks. By changing the share of reads and writes we can evaluate how well the controller driver can sort them out:
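A request stream of this kind can be sketched in a few lines of Python. This is not IOMeter itself, only an illustration of what a "mixed stream of random 8KB reads and writes" means; the function name `make_requests`, the 1GB test area, the request count and the 10% step of the sweep are assumptions made for the example.

```python
import random

BLOCK_SIZE = 8 * 1024  # 8KB requests, as in the DataBase pattern


def make_requests(total, write_percent, disk_bytes, seed=0):
    """Build a shuffled list of (operation, offset) pairs with an exact write share."""
    rng = random.Random(seed)  # fixed seed keeps the stream reproducible
    writes = total * write_percent // 100
    ops = ["write"] * writes + ["read"] * (total - writes)
    rng.shuffle(ops)
    blocks_on_disk = disk_bytes // BLOCK_SIZE
    # Each request lands on a random 8KB-aligned offset inside the test area.
    return [(op, rng.randrange(blocks_on_disk) * BLOCK_SIZE) for op in ops]


# Sweep the write share from 0% to 100% (the 10% step is an assumption).
for percent in range(0, 101, 10):
    requests = make_requests(1000, percent, disk_bytes=1 << 30)
    writes = sum(1 for op, _ in requests if op == "write")
    print(f"{percent:3d}% writes -> {writes} of {len(requests)} requests")
```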

Let’s have a look at a more illustrative picture shown on the graphs:

As the number of write requests increases, the HDD starts performing lazy writes more efficiently, therefore you can see RAID 0 array run faster in case of more HDDs involved only when the share of writes grows bigger. During reading, all arrays run equally fast.

The same situation can be observed with mirroring arrays. RAID 10 of four hard drives manages to run twice as fast as RAID 1 of two HDDs only when the share of writes is big enough, because RAID 10 is nothing but a pair of RAID 1 arrays combined into a RAID 0 type one. Moreover, random writing onto a RAID 0 array is always twice as fast as onto a single hard disk drive.

With the RAID 5 arrays the situation is not so smooth any more. We have already seen with the 3Ware 7850 controller how a RAID 5 array should actually behave. However, in today’s case we see a somewhat different situation. At first the arrays demonstrate similar performance growth, and as soon as the number of writes increases, the performance starts dropping equally fast for both arrays (as it should for RAID 5). The three-HDD array demonstrates higher performance than the four-HDD array. When the share of reads is small, the RAID 5 array of four drives is no faster than the same array of three, although it should be. All in all, this time I came across a whole bunch of strange things…


The workload of 16 requests makes things look right for the RAID 5 arrays. The graphs now behave as they should (the performance drops a little with every new step along the X axis for all arrays and no longer depends on the number of HDDs in an array). And the array of four drives is faster than the one of three. RAID 1, RAID 10 and RAID 0 of four drives suffered a pretty interesting performance drop when the writes reached 10% of all requests. Here the Intel SRCS14L controller starts using lazy writing algorithms, since write requests start appearing, but they are still too few to make the caching efficient. As a result, the whole thing slows down because of this caching. However, as soon as the share of writes reaches 20%, the contribution of lazy writing to the overall performance improvement is indisputable.

Under the workload of 256 requests the performance drop in case of 10% writes becomes even bigger for the reasons mentioned above. The most interesting thing here is probably the RandomRead. Since the performance of 2-HDD RAID 1 is almost the same as that of a single drive at that point, Intel SRCS14L controller doesn’t use any technology similar to Twinstor from 3Ware. On the other hand, if we compare the performance of two-drive RAID 0 and RAID 1 arrays, we will see that the latter is almost as fast as the former (but not in case of 0% and 10% writes, as these are the errors of the controller algorithms sending requests to the drives of the array). In other words, Intel’s alternating technology does work, but is not stable enough.

The graphs for RAID 0 array look almost the same as the graph for a single HDD (of course, we also take into account the adjustment coefficient depending on the number of the HDDs involved), which means that the controller copes with this workmode perfectly well. The graphs for RAID 5 array are also close to being impeccable.

Now let’s consider the RAID 0 array of four drives and see how the disabled lazy writing will affect the performance.

As you can see, this influence is pretty high in case of small queues and drops down to naught as the queue depth increases. However, this influence is anyway highly positive. So, why not leave the lazy writing enabled by default? Because this way if the power goes down for some emergency reasons, everything stored in the buffer will be lost. Well, is higher performance worth sacrificing the reliability? Anyway, it is always great to have a choice: high speed or high data security, trick or treat… :)


Performance in Intel IOMeter Sequential Read and Write Patterns

Now let’s see how the controller will cope with Sequential Reads and Writes.

IOMeter utility sends a stream of read and write requests with the requests queue depth equal to 4. Once per minute the test automatically changes the size of the processed data blocks. As a result, we can evaluate the dependence of the linear read or write speed on the data block size. The obtained results (the dependence of the controller performance on the data block size) have been summed up in the tables below for your convenience:

Let’s split the graph into two parts for better understanding:

The scalability of the array performance depending on the number of HDDs involved is pretty evident, although the maximum speed can only be achieved for large requests, that is when the controller splits the big request into a few smaller ones, which are performed by a few HDDs simultaneously. This is probably one of the reasons for the four-drive array to run almost as fast as the three-drive one. On the other hand, we know how big the stripe is: 64KB. As soon as the data block reaches 256KB, the array of 4 HDDs should already be 4 times faster than the single HDD. We do not see this, however, and the controller is the one to blame.
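The arithmetic behind the 256KB figure can be checked with a short Python sketch. It assumes the simplest RAID 0 layout, where consecutive 64KB stripe blocks go to consecutive drives in round-robin order; the actual layout used by the controller is not documented in this review, and `drives_touched` is a name made up for the example.

```python
STRIPE_SIZE = 64 * 1024  # 64KB stripe block, as stated in the review


def drives_touched(offset, length, drive_count, stripe_size=STRIPE_SIZE):
    """Return the set of drive indexes a request touches in a RAID 0 array."""
    if length <= 0:
        return set()
    first_stripe = offset // stripe_size
    last_stripe = (offset + length - 1) // stripe_size
    # Round-robin placement: stripe block N lives on drive N mod drive_count.
    return {stripe % drive_count for stripe in range(first_stripe, last_stripe + 1)}


for size_kb in (16, 64, 128, 256):
    touched = drives_touched(0, size_kb * 1024, drive_count=4)
    print(f"{size_kb:>3}KB request -> {len(touched)} drive(s)")
```

Under this assumption an aligned request has to reach 256KB before all four drives of the array take part in serving it, which is why the full fourfold speed-up is expected only from that block size on.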

As we remember, some manufacturers alternate read requests between the drives of the mirrored pair. This way, RAID 1 array appears similar to RAID 0 array during reading, so that the reading from the array can (theoretically) become twice as fast. But as we see in the next benchmark, Intel SRCS14L controller doesn’t use this algorithm. The read speed from RAID 0 array is almost the same as the read speed from RAID 1. RAID 10 proves to be almost ideally twice as fast as RAID 1. RAID 5 arrays of three and four drives speed up a little bit as the data block grows bigger, however, the performance growth typical of RAID 5 only starts somewhere around 256KB data blocks. As we remember, the stripe-block is 64KB big, therefore the array of three HDDs starts working faster only when the data block size reaches 192KB (we simply do not have this mark on the axis). And for RAID 5 of four drives this point is at 256KB. At the same time, the latter array speeds up so greatly that with 1MB data blocks we can even see our Intel SRCS14L controller work at the maximum of 180MB/sec.

Now let’s check if the lazy writing affects the performance in Sequential Read pattern at all.

As we see, it doesn’t, which is really nice :)


Now let’s pass over to SequentialWrite pattern:

Here are the graphs:

Just like in case of SequentialRead, only RAID 0 of four drives stands out here.

Since the four-HDD RAID 0 array manages to write 64KB data blocks much faster than the three-HDD RAID 0 array and starts lagging behind only when the processed data blocks grow really big, Intel SRCS14L controller should be offering too little space for the data in its onboard cache. Of course, the controller processor might be unable to handle such big data packs, however, the XOR-processor is hardly loaded at all in case of RAID 0, so that the data write speed is primarily determined by the controller cache speed and the HDDs performance.

Note that RAID 1 and RAID 10 arrays fall behind the single drive when they read data blocks less than 64KB in size. RAID 5 arrays, on the contrary, start falling behind the single HDD only when the data blocks reach 32KB.

If we disable the lazy write algorithms, the HDDs will work noticeably slower. By the way, this difference in performance proves that the cache of Intel SRCS14L controller, which is always on, cannot handle the entire data flow. In other words, we shouldn’t have blamed the innocent processor.

Now let’s check the controller performance in real practical applications.


Performance in Intel IOMeter File- and WebServer Patterns

The more hard disk drives form an array, the faster it processes requests. Though, I noticed that an array of four drives is sometimes a little slow…

RAID 5 array of four drives is slightly slower than RAID 10, and RAID 5 of three drives is even slower than RAID 1.

Since FileServer pattern works with about 20% of writes, the performance in it doesn’t depend on the lazy writing algorithms that much. On the other hand, the cache of Intel SRCS14L controller has definitely contributed to lowering the performance differences, which is especially noticeable in the end points of the graph.

To compare the performance of various RAID arrays we suggest using the rating system. Considering all workloads are equally probable, we will calculate the rating coefficient as the average controller performance under all types of workload:
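The rating described above is a plain arithmetic mean. A minimal Python sketch follows; the array names and numbers in `sample` are made-up figures for illustration, not measurements from this review, and the function name `rating` is likewise hypothetical.

```python
def rating(results):
    """Average the measured results over all workloads, weighting each equally."""
    if not results:
        raise ValueError("no results to average")
    return sum(results) / len(results)


# Made-up I/O-per-second figures for five workloads (not data from the review).
sample = {
    "RAID 0 x4": [110, 190, 320, 410, 450],
    "RAID 10": [120, 200, 310, 380, 360],
}
for name, values in sample.items():
    print(f"{name}: {rating(values):.1f}")
```

Because every workload gets the same weight, a large drop at one queue depth pulls the rating down just as much as an equal gain at another raises it.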

The arrays using mirroring, namely RAID 1 and RAID 10, demonstrate excellent results, as we can see. RAID 10 array outperformed RAID 0 of three HDDs. RAID 0 of four HDDs managed to retain its leadership only due to the fact that there is a certain amount of writes in the FileServer pattern, which are performed faster by RAID 0 array rather than by RAID 10 array. RAID 5 of N HDDs falls behind RAID 0 of N-1 HDDs because of the same 20% of writes.


Now let’s have a look at the WebServer pattern, which is known for having no writes at all:

The major feature distinguishing these graphs from what we have just seen in FileServer pattern, is the fact that the single HDD graph and the ones for RAID 0 arrays start in the same point. The thing is that under linear workload RAID 0 arrays take advantage of their ability to make the HDDs work in parallel. The received request is simply sent to a corresponding drive and the HDDs work independently of one another (the requests data block is smaller than the stripe).

The performance drop you can see in case of large queue depth blocks processed by RAID 10 and RAID 1 arrays is a feature of the Intel SRCS14L controller, although I personally would regard it as a bug…

Just like in SequentialRead mode, the lazy writing doesn’t affect the performance in WebServer pattern in any way.

Again we suggest comparing the performance of different RAID arrays using our rating system. Considering all the workloads equally probable we will calculate the rating coefficient as the average controller performance under all types of workload:

The absence of writes changes the situation completely. RAID 10 array still manages to outperform RAID 0 array of four hard disk drives, however, since its performance drops down under heavy workload it yields to RAID 5 of four HDDs. RAID 5 array of three HDDs occupies the fourth place in this case. It was really interesting to see RAID 5 array of four drives outperform RAID 0 composed of the same four drives.


Performance in Intel IOMeter WorkStation Pattern

WorkStation pattern should imitate the workload created by user working hard in different applications under NTFS5 file system.

The performance of the single hard disk drive drops as the request queue depth increases. RAID 0 arrays demonstrate a certain performance improvement, although the array of three drives still cannot be said to scale with the number of HDDs, as it did in the previous patterns.

You can notice the RAID 1 and RAID 10 performance depends on the number of HDDs involved only when the queue depth exceeds four requests. At the same time, RAID 5 of four drives starts running faster than RAID 5 of three.

The large share of writes in the WorkStation pattern makes the performance depend more on the lazy write algorithms.

The WorkStation rating is calculated according to the following formula:

Performance = Total I/O (queue=1)/1 + Total I/O (queue=2)/2 + Total I/O (queue=4)/4 + Total I/O (queue=8)/8 + Total I/O (queue=16)/16 + Total I/O (queue=32)/32
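The formula above translates directly into Python. The sketch below only restates the formula; the input values in `example` are made up and are not results from this review.

```python
QUEUE_DEPTHS = (1, 2, 4, 8, 16, 32)


def workstation_rating(total_io):
    """Sum Total I/O divided by queue depth over the depths 1, 2, 4, 8, 16 and 32."""
    missing = [q for q in QUEUE_DEPTHS if q not in total_io]
    if missing:
        raise ValueError(f"missing queue depths: {missing}")
    return sum(total_io[q] / q for q in QUEUE_DEPTHS)


# Made-up Total I/O values keyed by queue depth (not data from the review).
example = {1: 100, 2: 120, 4: 160, 8: 200, 16: 240, 32: 256}
print(workstation_rating(example))
```

Dividing by the queue depth gives the short queues the biggest weight in the rating.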

Since the writes share is pretty high here, the results appeared quite predictable. RAID 5 arrays fell even behind a single HDD. RAID 1 array gave way to RAID 0 of two HDDs, while RAID 10 array fell behind RAID 0 of four.

Another very interesting rating is to come now. We have already shown you how greatly the performance of Intel SRCS14L controller depends on the lazy write algorithms involved. Now we suggest comparing the three last patterns in this respect:

The performance difference in one and the same pattern depends on the lazy write algorithms of the HDDs used, and the performance difference between various patterns is determined by the writes share, and by the constantly enabled controller lazy writing (if the HDD lazy writing is disabled). Therefore, the performance difference in WebServer pattern with 0% of writes depends on the HDD lazy writing and lies within the measuring error, and shows the slowest performance compared with the other patterns. FileServer with 20% of writes is the second fastest in both: performance and performance difference. And the first prize, of course, belongs to WorkStation boasting the highest writes share of all three.


Performance in WinBench99

In conclusion I ran some tests in WinBench99 software package. This benchmark has been used in our lab for years and still serves as an excellent tool for desktop performance tests.

We will start with NTFS this time:

The table is certainly not so illustrative as the diagrams. So, let’s compare the performance of different arrays in two integral subtests: Business Disk WinMark and High-End Disk WinMark:

RAID 5 arrays finished behind everyone else. RAID 1 array defeated the single hard disk drive, while RAID 10 yielded to RAID 0 of three drives.

The efficiency of lazy writing is very important here, as we can see.

Since the linear read speed and Average Access Time are the same for NTFS and FAT32 file systems, we will offer you just two diagrams for both.

As you can see, the way the read speed depends on the array type is a little unusual for Intel’s controller (to put it mildly). The leader in reading turned out to be the RAID 10 array of four drives. And the read speed achieved on it is only 114MB/sec (which is equal to the read speed from two WD Raptor drives). And what about the other arrays?


Here are the linear read graphs for your reference:

I was very curious to dig a little bit more into the read speed being inadequate to the RAID array type and the number of HDDs involved. Therefore, I carried out a few additional tests for RAID 4:

The Intel SRCS14L controller does support RAID 4, but this array type is not as widespread as RAID 5, for instance, which is why we didn’t dwell on the results for it in this review.

Now come the results for Average Access Time:

The leadership here belongs to the mirrored arrays, namely to RAID 1 and RAID 10. All the others do not show any noticeable performance difference.

Now let’s check how fast our controller is in FAT32:

RAID 1 and RAID 10 moved one step lower compared with what we saw in NTFS. However, we immediately notice that the performance of RAID 0 array of three HDDs is higher than that of a RAID 0 array of four drives. As we have seen earlier today, something like that has already happened in SequentialWrite pattern.

Just like in NTFS, disabling the lazy writing matters a lot for the entire array performance.


Fault-Tolerance

To test how well the controller can secure the stored data in case one of the array drives fails, we imitated the typical “emergencies” for RAID 1, RAID 5 and RAID 10.

It turned out to be much easier to cause a SATA drive failure than it was with the good old PATA HDDs. Since the WD Raptor drives can be powered via the old power supply connector as well as via the SerialATA one, it is pretty easy to imitate a drive failure by simply unplugging the SATA cable.

Intel SRCS14L controller reported a drive failure in all three cases and after a short pause started restoring the array integrity by using a HotSpare disk (in case of RAID 1 and RAID 5). For RAID 10 emergency we disconnected one of the drives, waited for the array indication in Degraded mode, replaced the failed drive and made sure that the controller recognized a new one and started restoring the array. I should say that the “pause” took much longer in the latter case than for the HotSpare drive. Moreover, during this pause the controller didn’t do anything at all…

Restoring the array under workload (that is when the controller not only checks the data integrity and regenerates it, but also processes user requests) takes a lot of time and we couldn’t wait until it was completely over. Therefore, we simply unloaded the controller (terminated the test) and waited for the array restoring to be completed.

It took about 30 minutes to restore RAID 1 array, while RAID 5 and RAID 10 required a bit over an hour.
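These times can be turned into a rough average rebuild rate. The Python sketch below assumes that a rebuild rewrites one whole drive and that a WD360GD holds about 36.7GB; neither figure is stated in this review, and "a bit over an hour" is taken as 65 minutes purely for illustration.

```python
def rebuild_rate_mb_s(drive_bytes, minutes):
    """Average rebuild rate in MB/s (1 MB = 10**6 bytes) for one drive's worth of data."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return drive_bytes / (minutes * 60) / 1e6


DRIVE_BYTES = 36.7e9  # assumed capacity of one WD360GD; not stated in the review

print(f"RAID 1, about 30 min: {rebuild_rate_mb_s(DRIVE_BYTES, 30):.1f} MB/s")
# 65 minutes is an assumed stand-in for "a bit over an hour".
print(f"RAID 5/10, about 65 min: {rebuild_rate_mb_s(DRIVE_BYTES, 65):.1f} MB/s")
```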

Conclusion

So, the benchmark results for the Intel SRCS14L controller show that it managed to pass all the tests pretty successfully. We saw its arrays scale with the number of hard disk drives involved, and we saw it protect the data in fault-tolerant arrays (such as RAID 1, RAID 5 and RAID 10).

At the same time, we have to admit that it proved quite slow in many work modes. Here I definitely have to stress its poor performance in RAID 5 and RAID 10 arrays when the workload wasn’t that high, as well as pretty low read speed from RAID 5 array…

As for the performance of Intel SRCS14L controller against the competitors, we are going to discuss it pretty soon, so stay tuned! :)