by Sergey Lepilov
10/15/2012 | 10:21 AM
Every graphics card from both Nvidia and AMD supports multi-GPU technology (called SLI and CrossFireX, respectively), the only exception being the junior product series, which are not really meant for gaming. Frankly speaking, these technologies are far from popular, yet their developers keep optimizing them with each new graphics architecture to ensure maximum performance scalability. Even so, hardcore gamers, let alone ordinary users, prefer to buy a single top-end graphics card instead of two mainstream ones running in SLI or CrossFireX mode. This approach is perfectly right in the majority of applications, but there are situations when two mainstream cards are faster than a single top-end card in both average and bottom (minimum) frame rate, the latter being important for playability.
So, in this review we will study the efficiency of Nvidia’s SLI technology with two and three GeForce GTX 660 Ti graphics cards and will compare such multi-GPU configurations with the fastest single-processor card and with the dual-processor GeForce GTX 690. We won’t describe the cards themselves because we’ve recently posted a large roundup of eight GeForce GTX 660 Ti products. Three of them were used to prepare this article.
All participating graphics cards were tested in the following testbed:
We used MSI, Gigabyte and KFA2 graphics cards to build our 2-way and 3-way SLI configurations:
The GPU frequencies of these graphics cards were locked at 1020 MHz, and the graphics memory frequency remained unchanged at 6008 MHz:
We will compare the performance of the 2-way and 3-way GeForce GTX 660 Ti systems against the performance of Asus GeForce GTX 680 DirectCU II TOP and Nvidia GeForce GTX 690:
While the nominal clock frequencies of the first graphics card remained unchanged, we increased the GK104 frequencies of the dual-processor card to match the speed of the GeForce GTX 660 Ti, i.e. to 1020 MHz. There are no “red” (AMD) cards in today’s test session.
To reduce the dependence of graphics card performance on the overall platform speed, we overclocked our 32 nm six-core CPU to 4.625 GHz by setting the multiplier to 37x and the BCLK frequency to 125 MHz, with “Load-Line Calibration” enabled. The processor Vcore was increased to 1.49 V in the mainboard’s BIOS:
Hyper-Threading technology was enabled. The 16 GB of system DDR3 memory ran at 2 GHz with 9-11-10-28 timings at 1.65 V.
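The 4.625 GHz CPU clock quoted above is simply the multiplier times the base clock. A minimal sketch of that arithmetic follows; the function name is ours and is used only for illustration.

```python
def core_clock_mhz(multiplier: int, bclk_mhz: float) -> float:
    """CPU core clock in MHz: multiplier times base clock (BCLK)."""
    return multiplier * bclk_mhz


# Settings from the testbed description: 37x multiplier, 125 MHz BCLK.
clock = core_clock_mhz(37, 125)
print(f"{clock / 1000:.3f} GHz")
```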
The test session started on September 8, 2012. All tests were performed in Microsoft Windows 7 Ultimate x64 SP1 with all critical updates as of that date and the following drivers:
We ran the tests in two resolutions: 1920x1080 and 2560x1440 pixels. The tests were performed in two image quality modes: “Quality+AF16x” (default texturing quality in the drivers with 16x anisotropic filtering enabled) and “Quality+AF16x+MSAA 4(8)x” (16x anisotropic filtering plus full-screen 4x or 8x antialiasing, the latter used if the average frame rate was high enough for a comfortable gaming experience). We enabled anisotropic filtering and full-screen antialiasing from the game settings. If the corresponding options were missing, we changed these settings in the GeForce driver Control Panel. We also disabled Vsync there. There were no other changes in the driver settings.
The list of games and applications used in this test session includes two semi-synthetic benchmarking suites, one technical demo and seventeen games of various genres with all updates installed as of the start of the test session. We included two new gaming titles, F1 2012 and Borderlands 2:
If the game allowed recording the minimum fps readings, they were also added to the charts. We ran each game test or benchmark twice and took the best result for the diagrams, but only if the difference between the two runs didn’t exceed 1%. If it did exceed 1%, we ran the test at least one more time to achieve repeatability of results.
SLI and CrossFireX tandems always show good performance scalability in Futuremark benchmarks, especially at high settings. The GeForce GTX 660 Ti is no exception, the second card adding 50 to 82% to the single card’s performance in 2-way SLI mode. The performance growth is smaller in 3-way SLI mode. It only amounts to 14-32% compared to the 2-way SLI configuration. The three GeForce GTX 660 Ti cards are a mere 2-8% ahead of the dual-processor GeForce GTX 690 which costs less. So, 3DMark Vantage doesn’t think much of the GeForce GTX 660 Ti when it comes to multi-GPU configurations. What about 3DMark 2011?
The newer and heavier version of 3DMark is more favorable towards the 2- and 3-way GeForce GTX 660 Ti configurations, reporting that they increase the single card’s performance by 65 to 86% and an additional 27 to 44%. Unfortunately, the 3-way SLI configuration is still not fast enough to justify its cost, even though it enjoys a larger advantage over the GeForce GTX 690 than in 3DMark Vantage.
The 2-way SLI has a large advantage of 84 to 94% over the single card in Unigine Heaven but the 3-way SLI is less efficient, adding 21-45% to the performance of the 2-way GeForce GTX 660 Ti and being but slightly faster than the single GeForce GTX 690.
This game produces interesting results. Where not limited by the platform, the SLI configurations can substantially boost both the average and bottom frame rate. The latter is rather an exception, as we’ll see in the next game.
The SLI configurations improve the average frame rate in this game well enough. Even the 3-way SLI adds 44 to 53% to performance. The bottom frame rate is a different story, though. It actually drops as we add more GeForce GTX 660 Ti cards to our system.
Unfortunately, the benchmark integrated into Just Cause 2 does not report the bottom frame rate but the average speed data indicate that the GeForce GTX 660 Ti cards provide high scalability in the 2-way tandem and above-average scalability in the 3-way configuration. We should also note that the dual-processor GeForce GTX 690 isn’t much slower than the GTX 660 Ti trio.
SLI technology works even better in Aliens vs. Predator (2010): the tandem offers twice the average frame rate of the single card whereas the 3-way SLI adds 42 to 47% more.
Being a CPU-dependent application, Lost Planet 2 can only show us the SLI effect in the heaviest mode, i.e. at the resolution of 2560x1440 pixels. But even with antialiasing enabled, the two GeForce GTX 660 Ti cards are only 66% ahead of the single card and the 3-way SLI configuration is a mere 16% faster than the tandem. This isn’t high scalability at all.
Although CPU-dependent as well, Civilization V allows the multi-GPU configurations built out of GeForce GTX 660 Ti to show their best in every test mode. At 1920x1080 the 2-way SLI is 89 to 95% faster than the single card while the 3-way SLI is 45 to 46% ahead of the tandem. That’s very good. As for the resolution of 2560x1440 pixels, the 2-way SLI configuration boosts performance by 92-93% and the 3-way SLI adds an extra 46-48%. Running ahead, we can tell you that SLI technology shows its near-maximum scalability in this test. Meanwhile, the GeForce GTX 690 with its two top-end GK104 processors is almost as fast as the 3-way GeForce GTX 660 Ti setup.
The SLI configurations look good in Total War: Shogun 2, even though their scalability isn’t as impressive as in Sid Meier’s Civilization V. The bottom speed increases at any settings, which is good news for those who plan to build a multi-GPU setup to play this game.
It’s different from the previous test:
The average frame rate is okay in this shooter, but the bottom speed is just awful. The single-processor GeForce GTX 680 and the dual-processor GeForce GTX 690 ensure higher bottom speeds and, consequently, higher comfort for the gamer. Moreover, the GTX 690 is just as good as the 3-way SLI in average frame rate, so the choice of graphics hardware for playing this game is obvious.
Hard Reset is too easy for high-performance graphics hardware. It is only at 2560x1440 pixels with enabled antialiasing that SLI technology can boast 80% scalability.
SLI works in Batman: Arkham City well enough, the 3-way SLI being 42% ahead of the 2-way configuration. The bottom speed isn’t perfect, but the GeForce GTX 690 has problems with it, too.
The two GeForce GTX 660 Ti are faster than the single card by 80% and more, both in average and bottom frame rates. The third card adds 17-47% more to their performance.
It is in Nexuiz that the SLI configurations built out of GeForce GTX 660 Ti cards show their maximum scalability in our today’s test session. The 2-way setup is almost 100% faster than the single card whereas the 3-way SLI is 50 to 98% ahead of the 2-way SLI. If it were not for the low bottom speed, the multi-GPU technology from Nvidia would be just perfect in this game.
DiRT Showdown has always worked well with SLI and CrossFireX. Here, the two GeForce GTX 660 Ti cards are 84-86% ahead of the single one while the 3-way SLI is 35-41% faster than the tandem. This can hardly justify the price of the third card, though. But we can note that the bottom speed increases in a linear manner here.
While the scalability of the SLI configurations in Sniper Elite V2 is okay in terms of average frame rate, the bottom speed is too low.
The 2-way SLI is 65 to 90% ahead of the single card in Sleeping Dogs. The 3-way SLI adds up to 47% more, being not much faster than the dual-processor GeForce GTX 690 card.
The recently released F1 2012 shows odd results. SLI technology only works with MSAA turned on. Otherwise the speed of the multi-GPU configurations plummets to an extremely low level. Of course, the single GeForce GTX 680 looks preferable in this case. This problem will hopefully be corrected with driver updates or game patches.
MSAA didn’t work in Borderlands 2 at the time of writing, so the results were obtained without MSAA but with FXAA:
The 2-way GeForce GTX 660 Ti configuration is worth the trouble of building it as it adds 62 to 79% to the performance of the single such card. The 3-way SLI doesn’t look justifiable, though. It is inferior to the single GeForce GTX 690.
Here is a table with the full test results:
The first pair of our summary charts shows you the scalability of our 2-way SLI configuration in comparison with the single GeForce GTX 660 Ti, the latter serving as a baseline.
So, we can see performance benefits in most of our tests with the exception of the CPU-dependent Lost Planet 2, the AA-less mode of Hard Reset, and the new F1 2012. On average, the two GTX 660 Ti cards are 60-80% faster than the single such card at 1920x1080 and 69-82% faster at 2560x1440. The scalability is higher when antialiasing is turned on. We can also add that the GeForce GTX 660 Ti tandem is generally faster than the single factory-overclocked GeForce GTX 680, but more expensive, too.
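For readers who want to reproduce the scalability percentages from their own measurements, the calculation is just the relative gain over the baseline configuration. The sketch below assumes that definition; the function name and the frame rates in the usage line are ours and purely illustrative.

```python
def scaling_gain_percent(baseline_fps: float, multi_gpu_fps: float) -> float:
    """Percentage gained by the multi-GPU setup over the baseline."""
    return (multi_gpu_fps - baseline_fps) / baseline_fps * 100.0


# Hypothetical numbers: a single card at 40 fps, a 2-way tandem at 72 fps.
gain = scaling_gain_percent(40.0, 72.0)
print(f"The second card adds {gain:.0f}% to the single card's result")
```

A gain of 100% would mean perfect doubling from the second card; the same function applies to the third card if the 2-way result is used as the baseline.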
The second pair of charts helps you evaluate the effect from adding the third card to the 2-way SLI.
It is only in Nexuiz at 2560x1440 that the 3-way GeForce GTX 660 Ti configuration shows its very best. In the rest of the games and benchmarks its scalability is less impressive than that of the 2-way SLI. The average scalability is 22-34% at 1920x1080 and 33-46% at 2560x1440. SLI technology is the least efficient in Lost Planet 2, Hard Reset, Borderlands 2 and F1 2012.
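The average gains quoted for the second and third cards can be chained to estimate how the 3-way setup compares with a single card. This is a rough sketch only: it assumes the low and high ends of the two ranges belong together, which the averages themselves do not guarantee, and the names in the code are ours.

```python
# Average gains in percent, taken from the summary paragraphs above.
AVERAGE_GAINS = {
    "1920x1080": {"second_card": (60, 80), "third_card": (22, 34)},
    "2560x1440": {"second_card": (69, 82), "third_card": (33, 46)},
}


def combined_speedup(second_gain_pct: float, third_gain_pct: float) -> float:
    """Speedup of 3-way SLI over one card, chaining the two relative gains."""
    return (1 + second_gain_pct / 100) * (1 + third_gain_pct / 100)


for resolution, gains in AVERAGE_GAINS.items():
    # Assumption: low ends pair with low ends, high ends with high ends.
    low = combined_speedup(gains["second_card"][0], gains["third_card"][0])
    high = combined_speedup(gains["second_card"][1], gains["third_card"][1])
    print(
        f"{resolution}: 3-way is roughly {low:.2f}x to {high:.2f}x a single card, "
        f"or {low / 3:.0%} to {high / 3:.0%} of ideal 3x scaling"
    )
```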
Finally, let’s compare the 3-way GeForce GTX 660 Ti SLI configuration with the dual-processor GeForce GTX 690 card, the latter serving as a baseline in the charts:
The 3-way SLI is a little faster than its opponent in most of our tests. It is only in Hard Reset, Borderlands 2, F1 2012, and certain test modes of Total War: Shogun 2, Sleeping Dogs and S.T.A.L.K.E.R.: Call of Pripyat, that the GeForce GTX 690 is preferable in terms of average frame rate.
We measured the power consumption of computer systems with different graphics cards using a Zalman ZM-MFC3 multifunction panel, which can report how much power a computer (the monitor not included) draws from a wall socket. There were two test modes: 2D (editing documents in Microsoft Word and web surfing) and 3D (the benchmark from Metro 2033: The Last Refuge at 2560x1440 with maximum settings). Here are the results:
It is no wonder that the 3-way GeForce GTX 660 Ti SLI configuration needs the most power. Its system draws over 650 watts, which is 100 watts more than the system with one GeForce GTX 690. Remarkably, the system with two GeForce GTX 660 Ti cards needs 155 watts more than the system with one such card, whereas the third GTX 660 Ti only increases the power draw by 113 watts. The smaller increment suggests that the cards are not fully loaded in 3-way SLI mode, which would be consistent with its lower scalability.
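The differences quoted above are enough to back out the other systems' draw from the 3-way figure. The sketch below does that arithmetic; note that 650 W is only the floor mentioned in the text ("over 650 watts"), not an exact measurement, so the derived values are floors as well. The function and constant names are ours.

```python
SECOND_CARD_DELTA_W = 155  # extra draw of the 2-way system over the single card
THIRD_CARD_DELTA_W = 113   # extra draw of the 3-way system over the 2-way one
GTX690_GAP_W = 100         # how much less the GTX 690 system draws than 3-way


def implied_system_draw(three_way_watts: float) -> dict[str, float]:
    """Derive the other systems' wall-socket draw from the 3-way figure."""
    two_way = three_way_watts - THIRD_CARD_DELTA_W
    single = two_way - SECOND_CARD_DELTA_W
    return {
        "1x GTX 660 Ti": single,
        "2x GTX 660 Ti": two_way,
        "3x GTX 660 Ti": three_way_watts,
        "GTX 690": three_way_watts - GTX690_GAP_W,
    }


# 650 W is a lower bound from the text, not a measured value.
for system, watts in implied_system_draw(650).items():
    print(f"{system}: at least {watts} W")

ratio = THIRD_CARD_DELTA_W / SECOND_CARD_DELTA_W
print(f"The third card adds {ratio:.0%} of the power the second card adds")
```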
Our tests show that building an SLI configuration out of GeForce GTX 660 Ti cards is a viable solution. We’re talking primarily about the 2-way SLI tandem, as it showed excellent scalability in terms of average frame rate and was ahead of the factory-overclocked GeForce GTX 680 from ASUS in nearly all tests. However, if you want to build such a tandem, you must keep in mind that the bottom speed may even be lower than that of the single card. But if you don’t mind losing something in terms of comfort in order to get higher average performance, the GeForce GTX 660 Ti is going to be an interesting solution indeed.
As for joining three such cards in 3-way SLI mode, there are certain problems that must be noted. First, the third card adds only about half as much performance to the tandem as the second card adds to the single one. Second, the minimum speed may be downright unplayable, like 10 or 12 frames per second. Third, the high power consumption and heat dissipation of this configuration call for a high-quality (and expensive) power supply and efficient cooling (perhaps even liquid cooling). If these downsides do not bother you, you can go ahead and build your 3-way GeForce GTX 660 Ti setup and enjoy high benchmarking results that even GeForce GTX 690 owners would envy.