User Aware Systems Features: Closer Look
Take Care of Yourself
Another aspect of user aware systems is their ability to take care of themselves. This feature will be of great practical value in datacenters, where tens of thousands of discrete server elements can be packed into a limited space. This server space needs constant monitoring, so that a failure can be identified and handled before any data is lost. Here is an example:
The picture above shows a thermal scan of a datacenter room. Should the air conditioning fail, we need to act quickly. This is not the kind of situation in which a human administrator can respond in time to prevent problems. This is where platforms that can detect a change in environmental conditions and respond appropriately come in, for instance by enabling reserve air conditioning units and backing up the data from the systems at risk.
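The response described above can be sketched as a simple threshold policy. This is only an illustrative sketch: the `ZoneReading` type, the `plan_response` function, the zone labels and the 35 °C limit are all assumptions made for the example, not part of any product described here.

```python
from dataclasses import dataclass


@dataclass
class ZoneReading:
    zone: str             # hypothetical label for an area of the server room
    temperature_c: float  # air temperature reported for that zone


def plan_response(readings, limit_c=35.0):
    """Return the ordered list of actions for one monitoring pass.

    limit_c is an assumed threshold chosen for illustration only.
    """
    hot_zones = [r.zone for r in readings if r.temperature_c > limit_c]
    if not hot_zones:
        return []  # everything is within limits, nothing to do
    # Cooling comes first, then the data in the affected zones is protected.
    actions = ["enable reserve air conditioning"]
    actions += [f"back up data from zone {zone}" for zone in hot_zones]
    return actions


readings = [ZoneReading("A", 24.5), ZoneReading("B", 41.0)]
print(plan_response(readings))
```

In a real deployment the actions would presumably be calls into the facility and backup systems rather than strings; returning a plain list keeps the decision logic easy to inspect.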
IBM introduced the notion of autonomic computing to describe systems like that. Over the past years we have gone from dozens to tens of thousands of servers that need to be managed. We are now at the point where IT shops spend 80% of their investment on maintenance. Today's environments combine many different pieces of hardware into a tremendous conglomeration of technologies. These environments lack standards, so we need some kind of instrumentation to resolve problems. Autonomic computing is about building intelligence into systems at different levels. It also has to be based on open standards, so that intelligent behavior can be built on top of it. Intel has delivered hundreds of self-managing features in numerous products. One of these features is a mechanism for capturing event information.
Take, for example, Intel’s Active Management Technology (AMT), which gives us tools for reading and analyzing events. Intel also does a lot of research into the potential of autonomic computing. One of the latest developments looks as follows.
Wireless Intel motes measure conditions in and around the servers in a datacenter: server utilization, internal temperature, and the external temperature and humidity in the room.
Say we have two servers, and one of them becomes dangerously overheated. What happens in this case? The Intel motes pick up the change in ambient thermal conditions. The overheated server is detected, and its load migrates to the other server, so that the overheated system can hibernate without the risk of losing data.
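The two-server scenario can be illustrated with a minimal sketch. The `Server` type, the `migrate_if_overheated` function, the job names and the 70 °C limit are hypothetical and exist only for this example; they do not describe how Intel's demonstration was actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    temperature_c: float                      # reading reported by a sensor
    jobs: list = field(default_factory=list)  # workloads currently running
    hibernating: bool = False


def migrate_if_overheated(servers, limit_c=70.0):
    """Move jobs off overheated servers, then mark those servers as hibernating.

    limit_c is an assumed threshold chosen for illustration only.
    """
    healthy = [s for s in servers if s.temperature_c <= limit_c]
    for server in servers:
        if server.temperature_c <= limit_c:
            continue  # this server is fine
        if not healthy:
            continue  # no safe target: keep the load where it is
        # Pick the healthy server that currently runs the fewest jobs.
        target = min(healthy, key=lambda s: len(s.jobs))
        target.jobs.extend(server.jobs)
        server.jobs = []
        server.hibernating = True
    return servers


hot = Server("server-1", 92.0, ["db", "web"])
cool = Server("server-2", 48.0, ["mail"])
migrate_if_overheated([hot, cool])
print(cool.jobs, hot.hibernating)
```

The sketch hibernates a server only after its jobs have been handed over, which mirrors the point made above: the overheated system goes to sleep without putting its data at risk.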