You might think that SSDs should be tested the same way as HDDs, but that is not quite so. With HDDs, test results are usually highly repeatable, and potential deviations are easy to minimize: use identical (or as similar as possible) testbeds, warm the HDDs up before running your tests, and clear the buffer memory with a cold reboot before launching the most sensitive tests. That's all. With SSDs, it is not that simple.
The problem lies in the controller's algorithms, which are responsible for leveling the load across different memory cells. Besides, operations on SSD memory are carried out in whole blocks or pages even when small chunks of data are processed. If an SSD has been working for a long time under a load with a large share of random writes, its data become highly "fragmented". This is not file fragmentation as on HDDs: a file may reside at sequential LBA addresses, yet those addresses will belong to different memory chips and to different pages within those chips.
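The difference between logical and physical fragmentation can be illustrated with a toy flash translation layer. The model below is purely hypothetical (real controllers use far more sophisticated wear-leveling schemes); it only shows how sequential LBAs can end up on scattered physical pages after random rewrites:

```python
import random

# Toy flash translation layer: maps logical block addresses (LBAs)
# to (chip, page) locations. Illustrative sketch, not a real
# controller's algorithm.
CHIPS, PAGES_PER_CHIP = 4, 64
free_pages = [(c, p) for c in range(CHIPS) for p in range(PAGES_PER_CHIP)]
random.shuffle(free_pages)  # wear leveling picks physical pages unpredictably

mapping = {}  # LBA -> (chip, page)

def write(lba):
    """Each (re)write lands on a fresh physical page."""
    mapping[lba] = free_pages.pop()

# A file stored at sequential LBAs 0..7...
for lba in range(8):
    write(lba)
# ...then a burst of random rewrites scatters it further.
for lba in random.sample(range(8), 4):
    write(lba)

# The LBAs are still sequential, but the physical locations are not.
for lba in range(8):
    print(lba, mapping[lba])
```

From the file system's point of view the file is perfectly contiguous; only the controller knows where the pages actually are.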
One more performance-affecting factor is that today's OSes, when deleting data, only update the allocation table and do not tell the drive to clean up the corresponding LBA addresses. Thus, from the controller's point of view, the SSD soon becomes completely full, with every cell occupied. The controller does not know that, from the OS's point of view, most of those addresses no longer hold valuable data. As a consequence, the SSD's performance degrades considerably. And then the controller's algorithms come into play once again: having identified the type of load, a modern SSD controller tries to adjust data access and data placement so as to achieve maximum performance. It is easy to guess that how long this process takes, and how well it succeeds, depends on the controller's algorithms. For a hardware tester, this means that an SSD's performance depends heavily on the type of previous load and on the time elapsed between two subsequent tests.
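The mismatch between the two points of view can be sketched in a few lines. This is a hypothetical model, not any real drive's bookkeeping; it only shows that, without a command informing the drive about deletions, the controller's "valid data" set can only grow:

```python
# Sketch of why a controller sees the drive as full while the OS does not.
# The OS "deletes" files only in its own allocation table; no command
# tells the controller which LBAs became invalid. Hypothetical model.
TOTAL_LBAS = 100

os_allocated = set()      # what the file system considers in use
controller_valid = set()  # what the controller considers in use

def os_write(lba):
    os_allocated.add(lba)
    controller_valid.add(lba)  # the controller sees every write

def os_delete(lba):
    os_allocated.discard(lba)  # the allocation table is updated...
    # ...but no cleanup request is sent: controller_valid is untouched

for lba in range(TOTAL_LBAS):
    os_write(lba)
for lba in range(0, TOTAL_LBAS, 2):  # the OS then frees half the data
    os_delete(lba)

print(len(os_allocated))      # the OS view: half the drive is free
print(len(controller_valid))  # the controller still thinks it is full
```

In reality this gap is what the later TRIM command was designed to close, but at the time of these tests the OS gave the controller no such hint.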
How big can this discrepancy be in practice? Let’s see. First, we will show you the variance of test results in PCMark05 at five successive runs of the benchmark.
The Corsair SSD took these tests after a long idle period (during which it could have carried out all its planned optimizations), without a preceding random-write load and without storing any data. As you can see, even under such comfortable conditions the benchmark provokes some changes in the data placement and invokes the controller's optimization algorithms, because it applies as many as five different loads. As a consequence, the results of some iterations of the benchmark differ by up to 10% from the average of the five iterations.
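The 10% figure is simply each run's relative deviation from the five-run mean. With invented scores (not our measured results), the calculation looks like this:

```python
# Deviation of each benchmark run from the mean of five runs.
# The scores below are made up for illustration, not measured data.
runs = [4210, 4050, 4480, 3980, 4330]  # hypothetical PCMark05 scores
mean = sum(runs) / len(runs)

for score in runs:
    deviation = (score - mean) / mean * 100
    print(f"{score}: {deviation:+.1f}% from the average")
```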
And what if we complicate things by benchmarking an SSD that already stores some data, occupying part of its capacity, after tormenting it with a generous portion of write requests in IOMeter?
Yes, the SSD feels worse now. The average result across the five runs is obviously lower; you can see this from the graphs alone, without looking at the numbers. And, most disturbingly for a hardware tester, the variance between the different runs of the benchmark has changed. Some tests now deliver repeatable results, but the results of the file write test vary by 15% from the average, and the results of the Windows XP startup test vary by over 30%! The SSD's speed fluctuates without any discernible pattern.
Perhaps this is only a problem of the Corsair P128's controller? Alas, all modern SSDs behave in the same way. But again, how big can this variance be? Let's check out the performance hit caused by stored data and a preliminary load on two SSDs: a Corsair P128 and an 80GB Intel X25-M.
So, the performance drop depends on the type of load and on the controller installed in the specific SSD. The hit varies from negligible to serious (up to 46%)!
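The performance hit quoted here is the usual relative drop between the "fresh" and "used" scores. A quick calculation with hypothetical numbers (invented to illustrate how a 46% hit arises, not our measurements):

```python
# Relative performance hit between a fresh and a preconditioned SSD.
# The scores are hypothetical; only the formula matters.
fresh_score = 150.0  # e.g. MB/s on a clean, idle drive
used_score = 81.0    # same test after storing data + random-write load

hit = (fresh_score - used_score) / fresh_score * 100
print(f"performance hit: {hit:.0f}%")
```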
This diagram suggests that SSD controllers might be compared by the size of the performance hit, but we won't make such a comparison because the hit depends on the type of load and on the history of previous requests. Our results would not be repeatable and thus would not be verifiable.
Well, Intel's early SSDs did indeed have performance problems, and the firmware update was meant to address that very issue.
To sum up this section, we will show you one more interesting thing we have found. It is the performance hit of the SSDs in FC-Test:
In fact, there is no performance hit at all! The SSDs even speed up somewhat. We suspect that the controller uses the time between the tests (the reboot of the testbed plus two minutes of waiting) to perform optimizations. In any case, this is one more example of the unpredictable behavior of modern SSDs that lowers our confidence in the repeatability of test results.