Linux Today: Linux News On Internet Time.

Linux high-performance cluster monitoring with Ganglia

Mar 07, 2009, 20:03
By Vallard Benincosa

[ Thanks to An Anonymous Reader for this link. ]

"As data centers grow and administrative staffs shrink, the need for efficient monitoring tools for compute resources is more important than ever. The term monitor when applied to the data center can be confusing since it means different things depending on who is saying it and who is hearing it. For example:

"* The person running applications on the cluster thinks: 'When will my job run? When will it be done? And how is it performing compared to last time?'
* The operator in the network operations center (NOC) thinks: 'When will we see a red light that means something needs to be fixed and a service call placed?'
* The person in the systems engineering group thinks: 'How are our machines performing? Are all the services functioning correctly? What trends do we see and how can we better utilize our compute resources?'

"Somewhere in this frenzy of definitions you are bound to find terabytes of code to monitor exactly what you want to monitor. And it doesn't stop there; there are also myriads of products and services. Fortunately though, many of the monitoring tools are open source -- in fact, some of the open source tools do a better job than some of the commercial applications that try to accomplish the same thing."
