Ganglia is a scalable, distributed
monitoring
Monitoring may refer to:
Science and technology Biology and healthcare
* Monitoring (medicine), the observation of a disease, condition or one or several medical parameters over time
* Baby monitoring
* Biomonitoring, of toxic chemical compounds, ...
tool for high-performance computing systems, clusters and networks. The software is used to view either live or recorded statistics covering metrics such as
CPU
A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, an ...
load averages or network utilization for many nodes.
Ganglia software is bundled with enterprise-level Linux distributions such as Red Hat Enterprise Level (RHEL) or the CentOS repackaging of the same. Ganglia grew out of requirements for monitoring systems by Berkeley (University of California) but now sees use by commercial and educational organisations such as Cray, MIT, NASA and Twitter.
Ganglia
It is based on a hierarchical design targeted at federations of clusters. It relies on a
multicast
In computer networking, multicast is group communication where data transmission is addressed to a group of destination computers simultaneously. Multicast can be one-to-many or many-to-many distribution. Multicast should not be confused wit ...
-based listen/announce protocol to monitor state within clusters and uses a tree of point-to-point connections amongst representative cluster nodes to federate clusters and aggregate their state. It leverages widely used technologies such as
XML
Extensible Markup Language (XML) is a markup language and file format for storing, transmitting, and reconstructing arbitrary data. It defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. ...
for data representation,
XDR for compact, portable data transport, and
RRDtool
RRDtool (''round-robin database tool'') aims to handle time series data such as network bandwidth, temperatures or CPU load. The data is stored in a circular buffer based database, thus the system storage footprint remains constant over time. ...
for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on over 500 clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.
Ganglia Monitoring System
/ref>
The ganglia system comprises two unique daemons, a PHP
PHP is a General-purpose programming language, general-purpose scripting language geared toward web development. It was originally created by Danish-Canadian programmer Rasmus Lerdorf in 1993 and released in 1995. The PHP reference implementati ...
-based web front-end, and a few other small utility programs.
Ganglia Monitoring Daemon (gmond)
''Gmond'' is a multi-threaded daemon which runs on each cluster node you want to monitor. Installation does not require having a common NFS filesystem or a database back-end, installing special accounts or maintaining configuration files.
Gmond has four main responsibilities:
# Monitor changes in host state.
# Announce relevant changes.
# Listen to the state of all other ganglia nodes via a unicast or multicast channel.
# Answer requests for an XML description of the cluster state.
Each gmond transmits information in two different ways:
* Unicast
Unicast is data transmission from a single sender (red) to a single receiver (green). Other devices on the network (yellow) do not participate in the communication.
In computer networking, unicast is a one-to-one transmission from one point in ...
ing or Multicast
In computer networking, multicast is group communication where data transmission is addressed to a group of destination computers simultaneously. Multicast can be one-to-many or many-to-many distribution. Multicast should not be confused wit ...
ing host state in external data representation (XDR) format using UDP messages.
* Sending XML over a TCP connection.
Ganglia Meta Daemon (gmetad)
Federation in Ganglia is achieved using a tree of point-to-point connections amongst representative cluster nodes to aggregate the state of multiple clusters. At each node in the tree, a Ganglia Meta Daemon (gmetad) periodically polls a collection of child data sources, parses the collected XML, saves all numeric, volatile metrics to round-robin databases and exports the aggregated XML over a TCP socket to clients. Data sources may be either gmond daemons, representing specific clusters, or other gmetad daemons, representing sets of clusters. Data sources use source IP address
An Internet Protocol address (IP address) is a numerical label such as that is connected to a computer network that uses the Internet Protocol for communication.. Updated by . An IP address serves two main functions: network interface ident ...
es for access control and can be specified using multiple IP addresses for failover. The latter capability is natural for aggregating data from clusters since each gmond daemon contains the entire state of its cluster.
Ganglia PHP Web Front-end
The Ganglia web front-end provides a view of the gathered information via real-time dynamic web pages. Most importantly, it displays Ganglia data in a meaningful way for system administrators and computer users. Although the web front-end to ganglia started as a simple HTML
The HyperText Markup Language or HTML is the standard markup language for documents designed to be displayed in a web browser. It can be assisted by technologies such as Cascading Style Sheets (CSS) and scripting languages such as JavaScri ...
view of the XML tree, it has evolved into a system that keeps a colorful history of all collected data.
The Ganglia web front-end caters to system administrator
A system administrator, or sysadmin, or admin is a person who is responsible for the upkeep, configuration, and reliable operation of computer systems, especially multi-user computers, such as servers. The system administrator seeks to en ...
s and users. For example, one can view the CPU
A central processing unit (CPU), also called a central processor, main processor or just processor, is the electronic circuitry that executes instructions comprising a computer program. The CPU performs basic arithmetic, logic, controlling, an ...
utilization over the past hour, day, week, month, or year. The web front-end shows similar graphs for memory usage, disk usage, network statistics, number of running processes, and all other Ganglia metrics.
The web front-end depends on the existence of the gmetad which provides it with data from several Ganglia sources. Specifically, the web front-end will open the local port 8651 (by default) and expects to receive a Ganglia XML tree. The web pages themselves are highly dynamic; any change to the Ganglia data appears immediately on the site. This behavior leads to a very responsive site, but requires that the full XML tree be parsed on every page access. Therefore, the Ganglia web front-end should run on a fairly powerful, dedicated machine if it presents a large amount of data.
The Ganglia web front-end is written in PHP
PHP is a General-purpose programming language, general-purpose scripting language geared toward web development. It was originally created by Danish-Canadian programmer Rasmus Lerdorf in 1993 and released in 1995. The PHP reference implementati ...
, and uses graphs generated by gmetad to display history information. It has been tested on many flavours of Unix
Unix (; trademarked as UNIX) is a family of multitasking, multiuser computer operating systems that derive from the original AT&T Unix, whose development started in 1969 at the Bell Labs research center by Ken Thompson, Dennis Ritchie, a ...
(primarily Linux
Linux ( or ) is a family of open-source Unix-like operating systems based on the Linux kernel, an operating system kernel first released on September 17, 1991, by Linus Torvalds. Linux is typically packaged as a Linux distribution, which i ...
) with the Apache webserver
The Apache HTTP Server ( ) is a free and open-source cross-platform web server software, released under the terms of Apache License 2.0. Apache is developed and maintained by an open community of developers under the auspices of the Apache Sof ...
and the PHP5 module.
References
External links
*
*
*
Wikimedia Ganglia instance
{{DEFAULTSORT:Ganglia (Software)
Free network management software
Free software programmed in C
Free software programmed in Perl
Free software programmed in Python
Internet Protocol based network software
Parallel computing
System administration
System monitors
Software using the BSD license