USB 3.0: Theory and Practice

New interfaces appear in the computer world relatively rarely because they entail quite serious changes to the data-transfer infrastructure. That is why it is especially interesting to see what USB 3.0, the long-anticipated replacement for the outdated but extremely popular USB 2.0 interface, actually is.

by Aleksey Meyev
06/07/2010 | 05:47 PM

Over the long years of its existence, the USB 2.0 interface has become a habitual feature of personal computers. Every computer has USB ports, and in good measure, too: a more or less modern mainboard offers a dozen or more of them. This popularity rests on multiple factors: easy connection (for a decade, generic drivers for the basic USB device classes have been available by default in every popular operating system), broad availability, compact connectors, versatility, and the ability to power the peripheral from the interface connector. External disk drives, sound cards, printers, scanners, modems, mice and keyboards all come with USB connectors now. And there are also a lot of diverse accessories, from desk fans to illuminated Christmas trees, that are plugged into a USB port just for power.

 

Speed, however, has become a problem recently, just as you would expect from an interface developed a decade ago. The theoretical peak bandwidth of 480 Mbps (60 MBps) sounds quite high, but you cannot get a practical data-transfer rate of more than about 35 MBps out of USB 2.0. Low-speed devices like a computer mouse don’t care about that limitation, but external storage products have long felt constrained by USB 2.0 because today’s hard disk drives, even 2.5-inch ones, can read data from their platters much faster. In fact, even flash drives may be faster than the USB 2.0 interface, which has pushed their manufacturers to develop flash drives with eSATA, even though such products still have to be powered from a USB port (the current version of the eSATA interface does not provide power to the connected device).

So, the next version of USB has been anticipated by both users and manufacturers, and it is here now. USB 3.0 has been implemented in over a dozen mainboards already but there are still very few peripheral devices complying with it. We’ve managed to get a couple of samples for our tests, though.

USB 2.0 and 3.0

Before talking about the special features implemented in the new version of the interface, we should briefly recall its history, which began in 1995 when the first version of the Universal Serial Bus specification was introduced.

It had been developed by a group of companies including Intel and Microsoft with the aim of replacing the abundance of external interfaces existing at that time, such as the parallel port, serial port, joystick port and external SCSI (all of which indeed vanished from mainboards eventually). USB was conceived as a fast yet inexpensive external interface. Three years later, in 1998, the updated 1.1 version was introduced. And in 2000 the version 2.0 specification brought USB widespread popularity. To the existing Low Speed (up to 1.5 Mbps) and Full Speed (up to 12 Mbps) modes, that version added Hi-Speed (up to 480 Mbps) to compete with the FireWire IEEE 1394a interface (400 Mbps). There was no real competition, actually. Thanks to its simpler implementation and licensing, USB 2.0 quickly ousted FireWire into a small niche of digital camcorders, notwithstanding some technical advantages that FireWire possessed.

The Universal Serial Bus is quite easy to understand. First, there is a host controller which manages the whole data transfer process. This host controller is connected to hubs and to end devices (directly or via hubs). There can be a total of up to 127 devices in such a tree. A USB hub can be either passive or active. The active variety has a dedicated power source and can power the connected devices without drawing any current from the host controller. A passive hub is not really passive, either: it is a rather complex electronic device.

So, the host controller polls the end devices regularly and allots them time intervals during which they can transfer data. The shortcomings of this mechanism are clear enough. The USB bandwidth is shared by all the devices: the more devices you connect, the less bandwidth is allotted to each of them. This problem is somewhat mitigated by the multiple types of logical channels that can be established between the host and peripherals. It can be a control channel for transferring short commands, an interrupt channel for short transfers with a guaranteed delivery time, an isochronous channel with guaranteed bandwidth and latency for a certain number of packets within a given period (but no retransmission of lost data), or a bulk transfer channel that guarantees delivery but does not specify the speed or latency. Thus, there are different channel types for different devices: an interrupt channel for a mouse or keyboard, an isochronous channel for audio or video streaming, a bulk channel for disk drives, and so on. During each period of operation, the bus transmits interrupt packets first. Next go the required number of isochronous packets. The rest of the time is allotted to control and bulk transfer packets.
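To make the channel types more tangible, here is a minimal sketch of our own (it uses the third-party pyusb library, which is not part of the article’s toolkit) that lists the endpoints of every attached device together with their transfer types. A mouse will typically show an interrupt IN endpoint, while a USB flash drive shows a pair of bulk endpoints.

```python
# Our own illustration, not part of the original tests: enumerate every
# attached USB device with pyusb and print each endpoint's transfer type.
import usb.core
import usb.util

TYPE_NAMES = {
    usb.util.ENDPOINT_TYPE_CTRL: "control",
    usb.util.ENDPOINT_TYPE_ISO:  "isochronous",
    usb.util.ENDPOINT_TYPE_BULK: "bulk",
    usb.util.ENDPOINT_TYPE_INTR: "interrupt",
}

for dev in usb.core.find(find_all=True):
    print(f"Device {dev.idVendor:04x}:{dev.idProduct:04x}")
    for cfg in dev:                      # configurations
        for intf in cfg:                 # interfaces
            for ep in intf:              # endpoints
                kind = TYPE_NAMES[usb.util.endpoint_type(ep.bmAttributes)]
                is_in = (usb.util.endpoint_direction(ep.bEndpointAddress)
                         == usb.util.ENDPOINT_IN)
                direction = "IN" if is_in else "OUT"
                print(f"  EP 0x{ep.bEndpointAddress:02x} {direction:3} {kind}")
```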

Again, the whole process is governed by the host controller which polls connected devices, listens to interrupts in the dedicated time periods and sends devices to sleep. A connected device cannot go to sleep or wake up, initiate a data transfer or say something important to the host (e.g. about a buffer overflow) of its own accord. Besides, every such channel is half duplex and cannot both send and receive data simultaneously. There is no equality in the USB architecture. Whatever devices you connect, one of them has to be the host and the others have to submit to it.

As USB devices were getting more popular, mainboards offered more and more USB ports. The manufacturers solved the problem of having to share a single USB bus by simply introducing several buses. For example, the popular Intel P55 chipset turns out to have as many as seven UHCI controllers (responsible for Low Speed and Full Speed devices) combined with seven dual-port hubs, plus two EHCI controllers (responsible for Hi-Speed devices). This is an intricate tree with multiple roots and a few trunks!

The final aspect of USB to be discussed is the power it provides. The load capacity of one port is limited to 0.5 amperes, and the host must make sure that the devices connected to it won’t overload the port. There is a simple mechanism for that. When connected, a device must tell the host how much electric power it needs and remain in a low-power state until the host allows it to turn on. If the total consumption current would exceed 0.5 amperes, the host won’t permit the last connected device to turn on. This mechanism has one vulnerability. Although it is possible to check whether the device really consumes as much power as it asks for, such checking would make the USB controller too complex and expensive. Therefore, the majority of USB hosts just trust what the device tells them. On one hand, this may overload the host and even damage it. On the other hand, it lets USB devices that consume somewhat more than 0.5 amperes work. External hard disk drives are in this category. According to our tests, they need about 0.7 to 0.9 amperes when spinning up. They declare a consumption of 0.5 A (and cannot report a higher current even theoretically because the USB specification does not provide for that), and their further operation depends on whether the host controller can actually supply the amount of power they really need. Various USB fans, lamps and the like behave even more irresponsibly. They often don’t have any USB controller inside and do not tell the host anything about their power requirements. No matter how many such devices you plug in, the host controller will think that they consume nothing at all.
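The declared power budget is easy to inspect in practice. The sketch below (again our own, based on the pyusb library) reads the bMaxPower field of each device’s configuration descriptor. The field is stored in 2 mA units, so the largest value a device can legally declare is 0xFA * 2 mA = 500 mA, exactly the 0.5 A ceiling discussed above; USB 3.0 reinterprets the same field in 8 mA units when a device runs at Super Speed.

```python
# Our own illustration: print how much current each attached USB device
# declares in its configuration descriptor (bMaxPower, 2 mA units), and
# whether it reports itself as self-powered (bmAttributes bit 6).
import usb.core

for dev in usb.core.find(find_all=True):
    for cfg in dev:
        self_powered = bool(cfg.bmAttributes & 0x40)
        declared_ma = cfg.bMaxPower * 2          # descriptor units of 2 mA
        print(f"{dev.idVendor:04x}:{dev.idProduct:04x} "
              f"config {cfg.bConfigurationValue}: "
              f"declares {declared_ma} mA, "
              f"{'self-powered' if self_powered else 'bus-powered'}")
```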

Of course, it is not normal for a large and popular class of devices, external HDDs, to rely upon undocumented behavior, so the low load capacity of a USB 2.0 port is a drawback, too. Many other consumers such as scanners, compact speaker systems, mini-monitors and various chargers would not refuse extra power, either.

Winding up this overview of USB 2.0, we want to recall the physical level, i.e. cables. A USB cable has four wires: two for data, one for ground and one +5V line for power. The original USB specification intended the standard flat type A connector for the host controller’s side and the type B connector for the device’s side, but there soon appeared a lot of compact connectors (a few versions of mini-USB and micro-USB).

Now let’s talk about USB 3.0. The new version brings about a new operation mode called Super Speed which has a peak data-transfer rate of 4.8 Gbps. The developers of the new version of USB tried to keep it compatible with all existing USB devices and make it as simple as before.

So, they complemented the UHCI and EHCI controllers with one more controller, which is responsible for the Super Speed mode. This ensures compatibility and adds a new data channel that old and slow devices won’t affect.

The cables and connectors have changed as a result. Besides the existing four wires, there are now two additional pairs of signal wires, one for transferring data towards the controller and another away from it, plus one more ground wire. The USB connectors have acquired five extra pins while retaining compatibility with the older connectors. This makes it easy to identify a USB 3.0 device by simply looking at its connector.


USB 3.0 Type A


USB 3.0 Type B


USB 3.0 Type Micro-B

Besides the higher speed, USB 3.0 brings a lot of other innovations. First, it increases the current available to a peripheral device to 0.9 amperes. This is especially good for external storage devices based on 2.5-inch HDDs, which can now do without the Y-shaped cable they used to draw power from two USB ports at once. Second, the two data transfer lanes mean that USB 3.0 can send and receive data simultaneously. Third, the new version of USB introduces a full-featured interrupt mechanism that makes it possible to get rid of time-consuming polling. Fourth, a device can now establish more than one data transfer channel.

Power saving has not been forgotten, either. The interrupt mechanism makes it possible to manage the power consumption of devices using low-power modes initiated by the peripheral device itself. In fact, the whole architecture has been revised so dramatically that USB 2.0 compatibility may even look like an add-on to an entirely new interface.

But that’s enough of theory (you can get more documentation at the official site). Let’s check out how good the new USB is in practice!

Testing Participants

Buffalo HD-H1.0TU3

 

              

The Buffalo drive doesn’t look exceptional. It is a neat plastic brick with a 3.5-inch Samsung HD103SJ hard disk drive inside. The brick is supposed to stand upright (although it does not look steady considering the lack of any feet). On its side panel you can spot a power connector (alas, 0.9 amperes is still not enough for full-size HDDs), a small fan, and a type B USB 3.0 connector which differs noticeably from the connector of the old standard.

Vantec NextStar 3

 

Next goes an enclosure from the well-known Vantec. It stands upright on its longer side and uses a small stand. It still doesn’t look very steady, though.

The Buffalo enclosure could not be taken apart, whereas the Vantec one revealed an ASMedia ASM1051 USB 3.0-to-SATA bridge chip inside. On the host side, USB 3.0 is provided by a NEC µPD720200 controller, which is virtually the only such chip available today.

 

It must be noted that the ASUS USB 3.0 controller card uses four PCI Express lanes and thus is not limited by the PCI Express bandwidth. At the moment, the add-on card is the best option: the same NEC controller is integrated into some mainboards, but the bandwidth of its PCI Express connection there is unclear (a single PCI Express 1.1 lane won’t be enough, as its bandwidth is lower than that of USB 3.0). And there are no chipsets with integrated USB 3.0 controllers as yet.

Testbed and Methods

The following testing utilities were used:

Testbed configuration:

We installed the OS’s generic drivers for the tested drives. We formatted them in FAT32 and NTFS as one partition with the default cluster size. For some tests, 32GB partitions were created on the drives and formatted in FAT32 and NTFS with the default cluster size, too. The internal drives were connected to a mainboard port and worked with AHCI enabled. The external drives were connected to a USB 3.0 port of the ASUS expansion card or to a mainboard’s USB 2.0 port.

We thought it would be interesting to compare not only the two external drives with each other but also the two versions of the USB standard, as well as USB against SATA 300. Therefore you will see four sets of data: two for the USB 3.0 external drives, one for the Vantec connected via USB 2.0, and one for a SATA 300 hard disk drive. Ideally, we would have used the very same Samsung HD103SJ hard disk drive for every test, but the Buffalo enclosure could not be taken apart. Therefore, knowing that the Buffalo contained an HD103SJ, we took another sample of the same drive for the Vantec and for the bare SATA configuration. Of course, different samples of the same HDD may vary somewhat in performance, yet this is the best we could do to make the comparison as accurate as possible.

We also used a 160GB Intel X25-M G2 solid state drive in a couple of tests.

Performance in Intel IOMeter

Sequential Read & Write Patterns

IOMeter is sending a stream of read and write requests with a request queue depth of 4. The size of the requested data block is changed each minute so that we could see the dependence of the drive’s sequential read/write speed on the size of the processed data block. This test is indicative of the maximum speed the drive can deliver.
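For readers who want to reproduce the idea without IOMeter, the following sketch (our own approximation; the test file name is a placeholder) reads a large pre-created file sequentially with several block sizes and reports the throughput. It does not bypass the OS cache, so the absolute numbers will be more optimistic than IOMeter’s raw-device results.

```python
# Our own sketch of a sequential-read sweep over block sizes.
import time

TEST_FILE = "testfile.bin"                  # hypothetical pre-created multi-GB file
BLOCK_SIZES = [512, 4096, 65536, 1 << 20, 2 << 20]

for block in BLOCK_SIZES:
    total = 0
    start = time.perf_counter()
    with open(TEST_FILE, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    print(f"{block:>8} B blocks: {total / elapsed / 1e6:7.1f} MB/s")
```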

The numeric data can be viewed in tables by clicking the links below. We will discuss graphs and diagrams.

The new USB version is obviously superior to the old one. The maximum data-transfer rate with USB 3.0 is as high as with SATA 300 whereas USB 2.0 is limited to 33.5 MBps. Thus, the new interface is enough for today’s HDDs. However, it is still not free from high latencies as you can see by checking out the results of the drives with small data blocks where USB 3.0 is inferior to SATA 300. Interestingly, we get the same speed when we install the SSD into the external enclosure, so there is indeed some performance limitation. We don’t know for sure if this is due to low performance of this USB controller or some fundamental limitation of the new bus architecture.

We are also surprised at the results of the SSD in terms of maximum speed. We rechecked them and even tried other SSDs, but had the same speed of 160 MBps. Yes, this is much better than the speed of 35 MBps we have with USB 2.0, but nowhere near the promised tenfold performance boost. Hopefully, this is only due to some imperfections of early USB 3.0 implementations and we will see data-transfer rates closer to the declared 4.8 Gbps in the future.

We’ve got the same picture at writing: USB 3.0 is much better than its predecessor and has enough bandwidth to service a modern 3.5-inch HDD. However, we can still see a performance hit on small data blocks. This effect is too repetitive to be accidental.

Disk Response Time

In this test IOMeter is sending a stream of requests to read and write 512-byte data blocks with a request queue depth of 1 for 10 minutes. The total amount of data requested from the drive is much larger than its cache buffer, so we get a sustained response time that doesn’t depend on the amount of cache memory the drive has.
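The same idea can be approximated with a simple script of our own (not the article’s IOMeter configuration): issue random 512-byte reads one at a time for a fixed period and average the service time. The test file name is a placeholder, and because the OS cache is not bypassed the figures will be lower than those from a raw device.

```python
# Our own sketch of a queue-depth-1 response-time measurement.
import os
import random
import time

TEST_FILE = "testfile.bin"      # hypothetical large test file
BLOCK = 512
DURATION = 600.0                # seconds (10 minutes, as in the IOMeter test)

size = os.path.getsize(TEST_FILE)
latencies = []
deadline = time.perf_counter() + DURATION
with open(TEST_FILE, "rb", buffering=0) as f:
    while time.perf_counter() < deadline:
        offset = random.randrange(0, size // BLOCK) * BLOCK   # aligned offset
        t0 = time.perf_counter()
        f.seek(offset)
        f.read(BLOCK)
        latencies.append(time.perf_counter() - t0)

print(f"{len(latencies)} requests, "
      f"average response time {sum(latencies) / len(latencies) * 1000:.2f} ms")
```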

We can see obvious improvements here. The new interface has lower latencies than its predecessor but cannot match SATA 300. You can see this from the results of the Vantec enclosure, in which we used the same HDD sample as we connected via SATA. The Buffalo enclosure contained another sample of the same HDD model and performed differently. It may have a slower bridge chip with poorly optimized firmware, yet we are inclined to attribute this to the difference between the HDD samples. For example, the results of the SSD installed into the Vantec show that the interface does not add much to the SSD’s own response time. Thus, the interface’s influence is rather low in this test.

Random Read & Write Patterns

Now we will check out how the drives’ performance in random read and write modes depends on the size of the processed data block.

We will discuss the results in two ways. For small data chunks we will draw graphs showing the dependence of the number of operations per second on the data chunk size. For large chunks we will compare the drives’ performance based on the data-transfer rate in megabytes per second.

The new interface does not differ much from the older one when it comes to processing small data blocks. Both are somewhat inferior to SATA 300, yet in this case it is the HDD rather than the interface that determines the resulting speed. But when it comes to large requests (1 or 2 megabytes, which is similar to viewing photos from a fragmented disk), the new interface performs much better than the old one. Vantec’s USB 3.0 implementation is definitely the better of the two: it is only slightly slower than the HDD connected via SATA. The gap between the two USB 3.0 enclosures grows larger as the size of the processed data blocks increases.

We’ve got a different picture at writing. The SATA drive is faster with small data blocks whereas the peripheral interfaces deliver similar performance, being about half as fast as the leader. With large data blocks the USB 3.0 interface gets closer to the leader. USB 2.0 reaches its maximum speed at 2MB data blocks whereas SATA and USB 3.0 keep on accelerating. The Vantec is again considerably better than the Buffalo although the latter behaves more predictably.

Database Patterns

In the Database pattern the tested drive processes a stream of requests to read and write 8KB random-address data blocks. The ratio of read to write requests changes from 0% to 100% in steps of 10% throughout the test, while the request queue depth varies from 1 to 256.
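As a rough, do-it-yourself approximation of this pattern (the file name and request count below are our own placeholders), one can sweep the share of reads over random 8KB requests. A plain synchronous loop like this only reproduces a queue depth of 1; IOMeter also raises the queue up to 256, which would require asynchronous I/O.

```python
# Our own sketch of a Database-style random 8 KB load with a variable
# read/write mix against a pre-created test file.
import os
import random
import time

TEST_FILE = "testfile.bin"   # hypothetical large test file
BLOCK = 8192
REQUESTS = 2000              # per mix, kept small for the sketch

size = os.path.getsize(TEST_FILE)
buf = os.urandom(BLOCK)

with open(TEST_FILE, "r+b", buffering=0) as f:
    for read_share in range(0, 101, 10):
        start = time.perf_counter()
        for _ in range(REQUESTS):
            f.seek(random.randrange(0, size // BLOCK) * BLOCK)
            if random.randrange(100) < read_share:
                f.read(BLOCK)
            else:
                f.write(buf)
        iops = REQUESTS / (time.perf_counter() - start)
        print(f"{read_share:3d}% reads: {iops:7.0f} IOps")
```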

You can click this link to view the tabled results for the IOMeter: Database pattern.

We will build diagrams to illustrate each drive’s performance at different request queue depths.

This is all quite illustrative, especially due to the peculiar behavior of the Samsung hard disk drive. It slows down when connected via USB 2.0, losing nearly all of its deferred writing and hardly showing any read request reordering. It is only at a request queue depth of 16 that you can see any notable performance growth.

Vantec’s USB 3.0 implementation looks better and shows a more considerable increase in performance at long queue depths. However, the graph for a queue depth of 4 requests is still almost the same as the graph for a queue depth of 1 request. The Buffalo’s USB 3.0 produces zigzagging graphs. If it were a SATA-connected HDD, we’d say that its firmware is poor. The enclosure’s controller seems to be trying to help the HDD at long queue depths as much as it can, but does not do that consistently. There is one thing that doesn’t change with this enclosure, though. There is almost no difference in its performance at short queue depths.

Web-Server, File-Server and Workstation Patterns

The drives are now going to be tested under loads typical of servers and workstations.

The names of the patterns are self-explanatory. The Workstation pattern is used with the full capacity of the drive as well as with a 32GB partition created on it. The request queue is limited to 32 requests in the Workstation pattern.

The results are presented as performance ratings. For the File-Server and Web-Server patterns the performance rating is the average speed of the drive under every load. For the Workstation pattern we use the following formula:

Rating (Workstation) = Total I/O (queue=1)/1 + Total I/O (queue=2)/2 + Total I/O (queue=4)/4 + Total I/O (queue=8)/8 + Total I/O (queue=16)/16.
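Expressed as code, the rating is a simple weighted sum. The total_io mapping below is our own naming for the measured results at each queue depth, and the figures in the usage example are made up purely to show the weighting.

```python
# Our own helper computing the Workstation rating from the formula above.
# total_io maps each tested queue depth to the drive's total I/O per second.
def workstation_rating(total_io: dict[int, float]) -> float:
    return sum(total_io[q] / q for q in (1, 2, 4, 8, 16))

# Example with made-up figures just to show the weighting (result: 210.0):
print(workstation_rating({1: 100.0, 2: 110.0, 4: 120.0, 8: 130.0, 16: 140.0}))
```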

We’ve got rather surprising results here. The Buffalo’s USB 3.0 implementation is better for server applications than the Vantec’s, although both are inferior to the SATA-connected hard disk drive. It’s the same with the workstation load, except that the Buffalo only enjoys a big advantage when the test zone is limited to 32 gigabytes. USB 3.0 is faster than USB 2.0 with the Vantec, and the gap is even larger with the Buffalo.

Multithreaded Read & Write Patterns

The multithreaded tests simulate a situation when there are one to four clients accessing the disk subsystem all at the same time – the clients’ address zones do not overlap. The number of simultaneous requests from each of them varies from 1 to 8, but we will discuss diagrams for a request queue of 1 as the most illustrative ones. When the queue is 2 or more requests long, the disk subsystem’s performance doesn’t depend much on the number of applications. You can also click the following links for the full results:

Offering more bandwidth to the hard disk drive, USB 3.0 is about twice as fast as USB 2.0 under multithreaded load. Interestingly, USB 3.0 is, for some unknown reason, even better than SATA in this test when reading three or four data threads.

There are no unexpected results at multithreaded writing. The HDDs have the same standings irrespective of the number of data threads. USB 2.0 is so much of a bottleneck here that the HDD connected via USB 2.0 is indifferent to the number of data threads whereas the other HDDs slow down when there are more threads to be processed.
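The essence of this load, several clients reading their own non-overlapping zones of the same drive at once, can be sketched as follows (our own simplified version with a placeholder test file, not the IOMeter configuration used above). Each thread effectively works at a queue depth of 1.

```python
# Our own sketch: four "clients" read non-overlapping quarters of a test
# file in parallel threads, and the combined throughput is reported.
import os
import threading
import time

TEST_FILE = "testfile.bin"   # hypothetical large test file
BLOCK = 64 * 1024
CLIENTS = 4

size = os.path.getsize(TEST_FILE)
zone = size // CLIENTS
done = [0] * CLIENTS

def client(i: int) -> None:
    read = 0
    with open(TEST_FILE, "rb", buffering=0) as f:
        f.seek(i * zone)                         # each client gets its own zone
        while read < zone:
            chunk = f.read(min(BLOCK, zone - read))
            if not chunk:
                break
            read += len(chunk)
    done[i] = read

threads = [threading.Thread(target=client, args=(i,)) for i in range(CLIENTS)]
start = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start
print(f"{CLIENTS} threads: {sum(done) / elapsed / 1e6:.1f} MB/s combined")
```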

Performance in FC-Test

For this test two 32GB partitions are created on the tested drive and formatted in NTFS and then in FAT32. A file-set is then created, read from the drive, copied within the same partition and copied into another partition. The time taken to perform these operations is measured and the speed of the drive is calculated. The Windows and Programs file-sets consist of a large number of small files whereas the other three patterns (ISO, MP3, and Install) include a few large files each. The ISO pattern has the largest files.

We’d like to note that the copying test is indicative of the drive’s behavior under complex load. In fact, the disk drive is working with two threads (one for reading and one for writing) when copying files.
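A much-simplified, FC-Test-like measurement of the copy speed can be scripted as below (our own sketch; the directory names are placeholders): time the copying of a prepared file-set and convert the result into megabytes per second.

```python
# Our own sketch: time the copying of a prepared file-set to another
# directory and report the resulting speed in MB/s.
import shutil
import time
from pathlib import Path

SRC = Path("fileset")        # hypothetical directory holding the prepared file-set
DST = Path("fileset_copy")   # target directory on the other partition

total_bytes = sum(p.stat().st_size for p in SRC.rglob("*") if p.is_file())

start = time.perf_counter()
shutil.copytree(SRC, DST)
elapsed = time.perf_counter() - start

print(f"Copied {total_bytes / 1e6:.0f} MB in {elapsed:.1f} s "
      f"= {total_bytes / elapsed / 1e6:.1f} MB/s")
```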

This test produces too much data, so we will only discuss the results achieved in NTFS. You can use the following link to view the FAT32 results.

There is no point in commenting upon each diagram as they all show similar and predictable results. Overall, USB 3.0 indeed proves its ability to reveal the full speed potential of modern HDDs under any loads, unlike its predecessor USB 2.0. The overhead for the external design is rather low in these file-processing tasks: the HDDs in the USB 3.0 enclosures are but slightly slower than the same HDD connected via SATA. The gap is less than 10% at reading and about 15% at writing. It is the biggest when copying files, but you don’t often do this with an external storage device which is mostly used for either reading or writing. USB 2.0 looks downright poor and outdated in comparison with the newer version.

Conclusion

Our tests have shown that USB 3.0 offers enough bandwidth to reveal the full speed potential of a modern HDD. The dramatic innovations in the USB architecture promise a bright future as well. On the other hand, we have not seen the promised tenfold performance boost. The devices we have tested cannot yield more than 160 MBps, whereas SATA 300 easily delivers 250 MBps.

Early USB 2.0 implementations were not optimal in terms of data-transfer rate either, so we do hope that we will see faster USB 3.0 controllers. We are also looking forward to mainboard chipsets with native USB 3.0 support. Until then, the new standard can hardly take off for real because it has a serious opponent, eSATA, when it comes to external HDDs. Although eSATA cannot power the connected device, it is still more widespread than USB 3.0 and delivers higher speed. USB 3.0 will prevail eventually; the question is only how much time it will take.