Intel Pentium III 500 with SSE Technology Review

Well, Intel has finally introduced its Pentium III, which is none other than the same Pentium II with added support for SSE technology.

by FastSite
04/03/1999 | 12:00 AM

We were all looking forward to the end of February, when two leading hardware manufacturers were due to launch new products. The sensational Intel Pentium III and AMD K6-III were expected to come out at that time. Everybody was impatient to witness another battle between these two giants for domination in our computers, a battle that seemed inevitable. Moreover, both parties tried to draw our attention and win public support with various marketing tricks. However, winter is over and spring has already set in, but the show hasn't even started! AMD announced its product, but the actual launch has not yet taken place. So now the Intel Pentium III is the only one left to satisfy the growing curiosity and impatience of disappointed users, since its main competitor is still unavailable on the market. Unfortunately, the reality turned out to be not as cool as it seemed.

What did everybody expect? And what was this Pentium III supposed to be like? About half a year ago we imagined that the Pentium III, codenamed Katmai, would be a Pentium II with 64KB of L1 cache, operating at a 133MHz FSB speed, using AGP 4x through the Camino chipset, and carrying a set of SIMD instructions for accelerating 3D graphics as well as other applications working with images and speech.

And what did we get in reality? In brief, it is none other than the well-known and widespread Pentium II with a set of new SIMD instructions. And that's it. Below we'll try to find out whether that's cool or not. So, imagine a Pentium III. It looks like this:

Actually, it is not quite right to say that SSE was the only change in the Katmai core. Even if we close our eyes to the CPU serial number (processor ID), it is still worth mentioning that the Pentium II MMX instruction set has been extended with a couple of new instructions. Besides, memory streaming was also improved. However, these "drastic" changes do not influence performance that much and mostly play a fairly insignificant role.

The practical value of the Intel Pentium III seems more interesting to us, so let's get down to it. First, bear in mind that you don't need a new motherboard to work with the new CPU. The only thing required is the latest BIOS revision, which by now is offered by almost every mainboard manufacturer. This updated BIOS should be able to recognize the new core correctly and should contain the corresponding microcode. As for the Pentium III core voltage, it remains the same, notwithstanding all the forecasts: 2V. However, a new motherboard with 1.8V core voltage support will soon be advisable for upcoming Pentium III models, which are going to be specified for operation at 1.8V only.

Like all its predecessors (the Pentium II line), the new processor operates at a 100MHz FSB speed. By old tradition it has a locked clock multiplier, which is why it can be overclocked only by increasing the FSB speed.
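To make the arithmetic behind a locked multiplier concrete, here is a minimal sketch. The 5x multiplier is our assumption, inferred from 500MHz at a 100MHz FSB; the 112MHz bus value is the one quoted for the overclocked sample in the SECC2 section. The function name is purely illustrative.

```python
def core_clock_mhz(multiplier: float, fsb_mhz: float) -> float:
    """Core clock is the (locked) multiplier times the front-side bus speed."""
    return multiplier * fsb_mhz


# Assumed multiplier: 500MHz core / 100MHz FSB = 5x.
MULTIPLIER = 5

stock = core_clock_mhz(MULTIPLIER, 100)        # nominal FSB
overclocked = core_clock_mhz(MULTIPLIER, 112)  # raised FSB, multiplier untouched

print(stock, overclocked)
```

Since the multiplier cannot be changed, the bus speed is the only variable left, and everything else tied to the FSB gets overclocked along with the core.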

Performance

And now a few words about tests. We studied Intel Pentium III 500MHz performance in all standard applications. In non-specialized tasks the Pentium III 500 showed practically the same results as a Pentium II 500 would have shown, since there are almost no changes in the main Katmai core compared to Deschutes.

The testing system was configured as follows:

The results obtained:

The chart shows that the Pentium III doesn't provide any particular benefit except the additional 50MHz. However, considering that applications such as Word or Excel usually wait for data to be input by the user, using such a powerful processor for them may seem an excessive luxury.

In the next test the Pentium III performed a bit quicker than the Pentium II, though the test didn't involve any SIMD instructions. These results were probably obtained due to improved memory streaming.

The following test shows that the arithmetic co-processor of the Pentium III remained unchanged compared to the Pentium II. The increase in performance occurred only due to the additional megahertz.

And now let's have a closer look at 3D-graphics gaming performance in non-optimized graphics applications.

The test was carried out with the massive1 demo. As we had expected, the results were nothing special. Without optimization and use of the new instructions, the game hardly performed any quicker.

This chart is much more interesting. There isn't any tangible gain, though we performed this test under DirectX 6.1 with SSE optimization. And the explanation is very simple: 3Dmark99, like the overwhelming majority of modern games, does not use the part of DirectX where the new SIMD instructions can be applied. We have already seen the same effect with 3DNow! and the AMD K6-2 processor. It turns out that agreeing with Microsoft on support for the new instructions in DirectX is not enough. You should either persuade developers to use these new features in their own programs or get them to use the DirectX Lighting and Transformation Engine. However, all previous experience shows that programmers prefer writing their own lighting and transformation algorithms to using the ones already written for this purpose in DirectX, and it is the relatively low speed and limited capabilities of the Lighting and Transformation Engine that cause this mistrust. That is why this engine is currently used only in 3D WinBench. :)

None of the above results illustrates well the real advantages of the new Pentium III over its predecessors. However, this doesn't mean that Intel failed to create a cool processor, and the reasons that make us think so follow below. By launching the Pentium III, Intel has offered a springboard for further innovations.

SECC2

Before going over to the main part of this article, the SSE review, we'd like to draw your attention to something that never goes unnoticed: the new SECC2 CPU package.

Some time ago, when the Pentium II was about to appear, Intel designed a unique processor cartridge. In order to get rid of its pesky competitors, Intel decided not to license the design to anybody. But the outcome of this policy left much to be desired. As a result Intel simply lost the entire sub-$1000 PC market, and now it spares no effort to win back the lost demand. The cost of a processor cartridge certainly adds to the final price of the whole product, which is why Celeron and similar products intended for the lower segment of the market do not have any cartridge at all.

SECC2 is a sort of intermediate link between a standard SECC and no cartridge at all. The SECC2 cartridge has lost its front part, the one where the cooler is usually fastened. And this is another big advantage, because now the heatsink touches the chip itself and not a separating metal plate pressed against the core. It means superior heat dissipation in SECC2, and the evidence of this is the look of the Pentium III sample we received for testing: it has just a pin heatsink without any fans. Nevertheless, it operated perfectly not only at the nominal frequency of 500MHz but also at 560MHz (5x112MHz) after overclocking.

But this is not the last point worth noting here. A revolutionary new die package has been introduced: the Organic Land Grid Array (OLGA) is now used instead of the former Plastic Land Grid Array (PLGA). So even the processor core has changed its look, which results in a smaller size on the one hand and better cooling due to better heat conduction on the other. In the photo below you can see the Pentium III without its cartridge:


Serial number (Processor ID)

The second, and the most scandalous, innovation is the so-called serial number given to each Pentium III processor, which serves as a unique identifier. The idea itself was actually not that bad. Each processor could be easily identified, and hence the problems of user authentication and of CPU protection against unauthorized overclocking could be solved in no time. Besides, Intel also developed special software that allowed PC owners to register the serial number of their CPU over the Internet.

However, this turned out to be a stumbling block for Intel's otherwise successful marketing policy. As soon as the opportunity to register one's processor over the Internet had been announced, the mass media offered us a far from optimistic view of the possible outcome: intensive tracking of and spying on users over the Internet and, as a result, interference with their privacy! No wonder that everybody was displeased, to say the least!

But please, don't give way to panic and despair! Frankly speaking, it is not as easy to read someone's processor ID over the Internet as it might seem. And even if some crazy guy makes up his mind to find out your CPU serial number, his success will depend on good luck and on the fulfillment of the following requirements:

These arguments make me feel pretty safe as far as my privacy is concerned, and I do not regard the introduction of this serial number as a persistent effort by Intel to violate my rights. Moreover, all hard disks produced within the last 10 years have unique serial numbers, which can also be read. But in that case no questions arise and nobody seems to care.

So, taking all this into account, we arrive at the conclusion that the CPU serial number should be counted among Intel's achievements and regarded positively.

SSE

And now we come to the most important part of our review. It is devoted to the thing that was first known as MMX2, then as KNI, and now goes under the new name SSE (Streaming SIMD Extensions). The Pentium III got 70 new SIMD instructions, most of which operate on the new 128-bit registers XMM0-XMM7. Each register stores 4 single-precision floating-point numbers. It means that when performing an operation on just 2 registers, SSE is in fact operating on 4 pairs of numbers. In other words, this feature allows the processor to perform up to 4 operations simultaneously, and this is reflected directly in the abbreviation "SIMD": Single Instruction Multiple Data.
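The idea of "one instruction, four pairs of numbers" can be modeled in a few lines. This is only an illustrative sketch in Python, not real SSE code: the helper names are hypothetical, `addps` merely borrows the mnemonic of the SSE packed single-precision add, and a 16-byte string stands in for a 128-bit XMM register.

```python
import struct


def pack_ps(a, b, c, d):
    """Pack four single-precision floats into one 16-byte (128-bit) value."""
    return struct.pack("<4f", a, b, c, d)


def addps(x: bytes, y: bytes) -> bytes:
    """Model of a packed add: one call performs four lane-wise additions."""
    lanes_x = struct.unpack("<4f", x)
    lanes_y = struct.unpack("<4f", y)
    # Python adds in double precision; packing rounds back to single precision.
    return struct.pack("<4f", *(p + q for p, q in zip(lanes_x, lanes_y)))


xmm0 = pack_ps(1.0, 2.0, 3.0, 4.0)      # stand-in for register XMM0
xmm1 = pack_ps(10.0, 20.0, 30.0, 40.0)  # stand-in for register XMM1

register_bits = len(xmm0) * 8
result = struct.unpack("<4f", addps(xmm0, xmm1))
print(register_bits, result)
```

A scalar FPU would need four separate additions to produce the same four sums.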

However, in order to perform 4 operations at one stroke, a software developer has to use the special instructions and to arrange the data so that it fits a 4-wide register. So if you intend to use all the Pentium III enhancements to the full, optimization is a must.
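What "arranging the data" might involve can be sketched as follows. This is our own illustration under stated assumptions, not Intel's recommended method: a flat list of values is regrouped into chunks of 4 (padding the tail), so that each chunk could be handled by a single packed operation. The function names and the padding value are hypothetical.

```python
def to_lanes(values, width=4, pad=0.0):
    """Split a flat list into groups of `width`, padding the last group."""
    groups = []
    for i in range(0, len(values), width):
        group = list(values[i:i + width])
        group += [pad] * (width - len(group))  # fill the unused lanes
        groups.append(group)
    return groups


def scale_packed(groups, factor):
    """Each inner list stands for one 4-wide register scaled in one step."""
    return [[v * factor for v in group] for group in groups]


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
groups = to_lanes(xs)
scaled = scale_packed(groups, 2.0)

# Six scalar multiplications become two packed ones.
print(len(xs), len(groups), scaled)
```

The padded lanes are wasted work, which is one reason code not designed around 4-wide data gains less than the theoretical maximum.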

That means that the SSE unit of the Pentium III can be regarded as a block for floating-point numbers similar to the already existing MMX block for integers. And this innovation proves very useful for a wide range of applications:

As for the new name of the processor, it seems a bit strange that Intel decided not to call its offspring Pentium II SSE (remember the similar case of the Pentium MMX). Does it mean that SSE has brought real qualitative changes? Actually, that sounds rather doubtful. The true reason lies in Intel's skilful marketing. By adding a new abbreviation to the old name, Intel would simply have given away the fact that SSE is the only valuable novelty in the new processor. And this could threaten Intel's privileged position in the market, since, following Intel's bright example, all its competitors could also announce SSE support for their products. That is exactly what happened with MMX, and Intel tried to avoid repeating the mistake. So it launched not a Pentium II SSE but a "fully new" Pentium III, forcing its competitors to dance to Intel's tune and to agree to "compatibility" with Intel's cool product. Let me remind you of the mysterious change of AMD K6-3 to AMD K6-III. :)

And now let's have a look at the main advantages of applying the new SIMD instructions in various SSE-optimized applications. Intel hasn't forgotten its bitter MMX experience, when new instructions appeared without any corresponding software, which led to a kind of "boycott". This time Intel exercised more prudence and sent dozens of working samples to various software developers for optimization long before the official launch of the new processor. As a result we already have some SSE-optimized applications.

This test performs standard MPEG-1 encoding and MPEG-1 playback, carries out image processing (rotation), merging of two images and color enhancement, as well as sound processing. All these tasks can be SSE-optimized, and the result is shown in the diagram: the new processor's performance is about 40-50% higher than that of the Pentium II.

The peculiarities of 3D-graphics and games optimization should be discussed separately. A 3D game can be optimized in 3 different ways:

  1. Video drivers optimization;
  2. DirectX optimization (the application is supposed to use DirectX optimized functions);
  3. Application optimization.

As for the first point, almost all video card manufacturers have expressed their intention to provide the required drivers. Some of them, for example Nvidia, already have SSE-optimized drivers, which we actually used for our tests. However, according to the Quake2 test this optimization once again proved to be absolutely useless (remember the results obtained with 3DNow! not so long ago). There isn't even the slightest increase in speed.

As for DirectX, version 6.1 and all later releases are SSE-optimized. The changes were made in the Lighting and Transformation Engine, i.e. in the set of functions that transform 3D scenes and calculate lighting parameters. In fact, optimization here helps to achieve really good performance only if the application uses the optimized functions for their intended purpose instead of making all the calculations itself. Just take a look at the results achieved in the 3D Winbench 99 Lighting and Transformation test and all your doubts about the value of DirectX optimization will vanish at once.
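To show why transformation work suits 4-wide instructions, here is a rough sketch of transforming one vertex by a 4x4 matrix using only 4-wide multiplies and adds. It does not reproduce the actual DirectX code, which we have not seen; the column-wise layout, the helper names and the sample translation matrix are all our assumptions.

```python
def mul4(a, b):
    """Lane-wise multiply of two 4-wide values (one packed multiply)."""
    return [x * y for x, y in zip(a, b)]


def add4(a, b):
    """Lane-wise add of two 4-wide values (one packed add)."""
    return [x + y for x, y in zip(a, b)]


def splat(s):
    """Broadcast one scalar into all four lanes."""
    return [s] * 4


def transform(columns, v):
    """Multiply a 4x4 matrix, given as four columns, by the vertex (x, y, z, w)."""
    acc = mul4(columns[0], splat(v[0]))
    for col, comp in zip(columns[1:], v[1:]):
        acc = add4(acc, mul4(col, splat(comp)))
    return acc


# Hypothetical example: a translation by (10, 20, 30), stored column by column.
translate = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [10.0, 20.0, 30.0, 1.0],
]

moved = transform(translate, [1.0, 2.0, 3.0, 1.0])
print(moved)
```

Written this way, one vertex takes 4 packed multiplies and 3 packed adds instead of 16 scalar multiplies and 12 scalar adds.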

With SSE we achieved an 80-90% gain, which is the top value for cases when the optimized DirectX is used. Nevertheless, this option still exists only in theory, because you can hardly find a program that uses the DirectX Lighting and Transformation Engine or Retained Mode. And the reasons are again very down-to-earth: slow performance and limited capabilities. However, the situation with DirectX 7.0 is not that hopeless and may soon change for the better, since Microsoft has announced that it is working on an improved engine of its own. So 3D Winbench is so far the only program that vividly demonstrates the advantages of optimized DirectX and gives us a clear idea of its true purpose.

And now let's turn to optimized programs. The best known one is a 3D game called Rage Dispatched, due in the 2nd quarter of this year. All game scenes are made up of 55000 triangles and have several light sources. Such detail is not found anywhere else, because presently existing processors lack the power for it. While running the test on an Intel Pentium II 450 at a resolution of 800x600x16, the frame rate dropped below 10 fps and jerky motion could be noticed. The same test on an Intel Pentium III with the same resolution settings showed that the frame rate didn't drop below 25 fps. The chart below shows more detailed results of this test.

Due to the new SIMD instructions there is an evident 50% gain. That means cool new opportunities for both game fans and game developers.

Another test optimized for the new processor is the new release of 3Dmark99 MAX.

This synthetic test, though it is based on a real engine, shows a 20% gain due to SSE. As in Dispatched, mentioned earlier, the developers didn't use any of the DirectX optimized functions. The achieved increase is smaller because 3Dmark not only generates and displays 3D scenes, but also measures such characteristics as video memory bandwidth (which is absolutely independent of the CPU) and includes them in the final index.

To evaluate the effectiveness of the Pentium III in 3D games, 3Dmark99 MAX offers a special index, CPU 3Dmark, which performs all the necessary calculations for 3D scenes but does not display them on the screen. So the result depends only on the CPU's ability to process 3D graphics and on system memory bandwidth. In this case we see that SSE provides a 60-70% gain. Not bad, eh? This is exactly the theoretical top value that can be reached in various 3D games thanks to the new Pentium III. And if you compare these results to those of the 3D Winbench 99 Lighting and Transformation test, you'll get another proof of their correctness.

If you have managed to read this far, you seem to have got almost all the information about the new processor. The only ones who may still remain disappointed are probably AMD fans. That is why we decided to touch upon 3DNow! technology and compare it to SSE. In theory SSE operates on 128-bit registers, and 3DNow! on 64-bit ones. It means that the SSE pipeline of the Pentium III can handle 4 pairs of values in one clock cycle, while a 3DNow! pipeline is able to process only 2. However, the Pentium III has only 1 SSE pipeline, while the K6-2 and K6-3 have 2 of them. In other words, both CPUs can process 4 pairs of numbers in 1 clock cycle, but the K6-2 and K6-3 pipelines are organized in such a way that they cannot perform the same operation simultaneously, though this limitation seems to me absolutely insignificant.
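The peak-throughput comparison in this paragraph boils down to simple arithmetic, sketched below. The figures (register widths and pipeline counts) are the ones quoted above; the function name is hypothetical, and the model deliberately ignores instruction latencies and pairing restrictions, so it describes the theoretical peak only.

```python
FLOAT_BITS = 32  # one single-precision value


def pairs_per_cycle(register_bits: int, pipelines: int) -> int:
    """Peak number of value pairs processed per clock cycle."""
    lanes = register_bits // FLOAT_BITS  # values held by one register
    return lanes * pipelines


sse = pairs_per_cycle(register_bits=128, pipelines=1)    # Pentium III
three_dnow = pairs_per_cycle(register_bits=64, pipelines=2)  # K6-2 / K6-3

print(sse, three_dnow)
```

Both come out at 4 pairs per cycle, which is why the comparison cannot be settled on paper.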

Thus, the Pentium III has twice as much register space for effective optimization, since KNI uses eight 128-bit registers instead of the eight 64-bit ones in 3DNow!, and this advantage seems to be the only important difference between them. However, having twice the register space does not imply having code that is optimized twice as well. And even though SSE contains over three times as many instructions (70, compared to the 21 of 3DNow!), there is no reason to envy Pentium III owners, because 3DNow! runs all the most important 3D-graphics operations in SIMD mode.

Summing up, I would like to say that it's pretty hard to tell which is better from theory alone, because the features proved to be almost the same in both cases. Moreover, there is no information about the time needed to carry out the various operations. So let's pass on to the tests. We tested the AMD K6-2 on a Chaintech 5AGM2 mainboard with an analogous configuration.

Here you can see that the AMD K6-2 was left far behind. As for the SIMD instructions, the situation is not so simple to characterize.

Well, speaking of 3DNow!, it is probably true that good things come in small packages. Despite all the "buts", despite 64-bit registers and the smaller number of instructions, 3DNow! unexpectedly demonstrated better geometry processing and lighting calculation. According to software developers, 3DNow! shows more flexibility when working with small amounts of data, and this is the explanation of its wonderful performance. But all this applies only to gaming. The picture may be quite different if we take non-gaming applications with different initial conditions. The result of the test can be taken as true only on the assumption that 3Dmark99 MAX is equally well optimized for both sets of instructions, but we are pretty sure that the FutureMark programmers would never dare to juggle with the facts and present a fake superiority of 3DNow!. So we prefer to believe that everything is fair, first of all in terms of quality and new gaming capabilities. But until AMD processors get a pipelined FPU (it is allegedly appearing in the K7), they will never dominate in gaming. This may also be regarded as another reason for 3DNow!'s victory: 3DNow! easily makes up for the slow arithmetic co-processor in many cases, and developers try to make 3DNow! take over some FPU functions. The Pentium III doesn't have problems like that, because each CPU block has its own task, and so SSE brings a smaller net increase. So let's hope that the AMD K7 will finally become a unique combination of all the cool microprocessor technologies, if AMD keeps its promise and improves the FPU.

Conclusions

So, let's sum up in brief what we've got. The Intel Pentium III 500 is a faster (50 MHz faster) Pentium II with additional instructions, which are hardly used anywhere nowadays. That is why spending some $500-700 on it seems not the best use of your money, though its new instructions have huge potential (up to 80% gain in optimized applications). Nevertheless, SSE will definitely find its way, since these instructions are of great use to software developers. But by that time the Pentium III will surely have become cheaper. Besides, the new Coppermine may also come out later. This new processor with SSE support and a 133MHz bus frequency will have a 256KB on-die cache working at the processor frequency. Isn't that a much better purchase? :)
