|
If the only computer you're responsible for is your desktop or
notebook, it is easy to keep an eye on the health of the system and to
get an idea of how well it is running. As soon as a component fails,
the service relying on that component fails as well. If a disk starts
to fail or the CPU gets too hot, you get fairly immediate feedback.
However, in the server space there is a continual push to provide
services on redundant hardware. The systems we can put in place these
days may sport redundant components such as RAID controllers and disks,
power supplies, cooling elements, memory and CPUs. Some services can
even use multiple computers to provide failover in the event of a
whole machine dying. All of this means that if one of these components
fails, the service will just keep running. The problem with this is
that we no longer get the friendly phone call from our users telling us
when something isn't working anymore. Instead, we need to be able to
get the status of the components so that in the event of a failure we
can take appropriate action and maintain redundancy.
Fortunately, servers these days are built with a variety of onboard
sensors so you can monitor their health. In the case of RAID, the
controller is aware of the health of the disks and is able to report
their state. However, as with all hardware, without appropriate
software you cannot take advantage of this information, simply
because you can't get at it. We intend to present the sensors and bio
interfaces and several drivers in OpenBSD as examples of how you can
take advantage of this hardware to keep an eye on your systems.
speakers: David Gwynne, Marco Peereboom
location: SITE H0104
|