Microprocessors and servers developed and manufactured by Sun Microsystems are widely used around the world for many applications, including mission-critical tasks such as telecommunication companies’ databases. One of the advantages of Sun CPUs and servers is that they are regarded as absolutely reliable; this opinion is well grounded, since Sun has a long history as a maker of CPUs and of the Solaris operating system. Sadly, it appears that even legendarily reliable microprocessors and systems may have problems.
In a recent article published as part of the Sun Alert program, Sun Microsystems warns its customers about possible problems with the floating-point unit (FPU) of its UltraSPARC III family of processors and their derivatives. Sun says that, due to an electrical degradation effect, the FPU may produce inaccurate or “unpredictable” results. The main problem, however, is not this fact alone, but that Solaris will not detect such a fault in the FPU, will not issue an error message and will not reboot the system. Since the FPU results are unpredictable, this may be a problem even for the simplest applications, not to mention mission-critical tasks.
Sun claims that this error may occur in only about 1 in 83,000 CPUs, and only after a long period of operation. Additionally, a diagnostic tool called SunVTS, which ships with Solaris 8 and 9, is able to detect this error, so running it regularly is recommended. Hopefully, system administrators will heed Sun’s recommendations.
This is not the first time glitches have been found in CPUs designed for mission-critical applications. Earlier this year Intel notified its customers about a glitch that causes some of its Itanium 2 processors at 900MHz and 1000MHz to behave erratically or crash in certain cases. According to an Intel spokesperson, the errors occur only with a specific set of operations in a specific sequence with specific data. Reportedly, the electrical glitch was found by a PC maker during stress-testing, and only with 900MHz and 1000MHz processors; 800MHz chips were unaffected.
You can read the full description of the problem on the Sun website.