Shinken is an open source computer system and network monitoring software application compatible with Nagios. It watches hosts and services, gathers performance data and alerts users when error conditions occur and again when the conditions clear.
Shinken's architecture aims to offer easier load balancing and high availability. The administrator manages a single configuration; the system automatically "cuts" it into parts and dispatches them to worker nodes. It takes its name from this functionality: a shinken is a Japanese sword.
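The "cutting" step can be pictured with a small sketch. This is not Shinken's actual dispatching code: the function name `cut_configuration`, the host and worker names, and the round-robin strategy are all assumptions chosen for illustration, and the sketch ignores any relations between hosts. It only shows the idea of splitting one configuration into parts for several worker nodes.

```python
from typing import Dict, List


def cut_configuration(hosts: List[str], workers: List[str]) -> Dict[str, List[str]]:
    """Split one list of hosts into one part per worker (round-robin).

    Hypothetical helper for illustration; not part of Shinken.
    """
    if not workers:
        raise ValueError("at least one worker node is required")
    parts: Dict[str, List[str]] = {worker: [] for worker in workers}
    for index, host in enumerate(hosts):
        # Hand hosts out in turn so every worker gets a similar share.
        worker = workers[index % len(workers)]
        parts[worker].append(host)
    return parts


if __name__ == "__main__":
    # Host and worker names are made up for the example.
    single_config = ["web1", "web2", "db1", "mail1", "dns1"]
    parts = cut_configuration(single_config, ["scheduler-a", "scheduler-b"])
    for worker, part in parts.items():
        print(worker, part)
```

Adding a third worker name to the list is enough to spread the same configuration over three parts, which is the property the single-configuration design relies on.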
Shinken was written by Jean Gabès as a proof of concept for a new Nagios architecture. Believing the new implementation was faster and more flexible than the old C code, he proposed it as the new development branch of Nagios 4. This proposal was turned down by the Nagios authors, so Shinken became an independent network monitoring software application compatible with Nagios.
Shinken is designed to run under all operating systems where Python runs. Development is done on Linux, but Shinken also runs well on other Unix variants and Windows. The reactionner process (responsible for sending notifications) can also be run under the Android OS. It is free software, licensed under the terms of the Affero General Public License as published by the Free Software Foundation.
Overview
* Design
** Monitoring system written in Python
** Distributed architecture using Pyro remote objects
* Active and Passive monitoring methods
** Monitoring of network services (SMTP, POP3, HTTP, NNTP, ICMP, SNMP, FTP, SSH)
** Monitoring of host resources (processor load, disk usage, system logs) on a majority of network operating systems, including Microsoft Windows
*** Using agents such as NSClient++, send_nsca, Check MK, Thrift TSCA
*** Using agents permitting remotely run scripts via Nagios Remote Plugin Executor (an embedded pure-Python implementation is included with Shinken)
*** Using agent-less methods such as SNMP, WMI, scripted SSH or HTTP(SSL)
*** Send check results directly from programs using Apache Thrift (Java, Python, Ruby)
** Monitoring of systems which have the ability to send collected data via a network to specifically written plugins (e.g. VMware ESX 3/4/5, collectd)
** Remote monitoring supported through SSH or SSL encrypted tunnels
** Simple plugin design that allows users to easily develop their own service checks depending on needs, by using the tools of choice (shell scripts, C++, Perl, Ruby, Python, PHP, C#, etc.)
** Ability to calculate KPIs from State and performance data in the Shinken core to create new services and performance data
* System external interfaces
** Livestatus-compatible API that exposes state, configuration and performance information
** Exports data to graphing modules (PNP4Nagios, Graphite, and others available)
** Support for native messaging API of Android
** Export event data to logging systems using syslog and RabbitMQ
** Modules can be attached to any Shinken process to extend its capabilities in very efficient ways
* Performance
** Parallelized service and host checks available
** Ability to distribute poller processes on multiple servers
** Support for easily implementing redundant and load-balanced monitoring hosts
** Support for multiple redundant external interfaces
** Ability to route checks to dedicated pollers (processes specialized in executing plugins)
* Correlation and business intelligence
** Parent child relations
*** Ability to define network host hierarchy using "parent" hosts, allowing detection of and distinction between hosts that are down and those that are unreachable
*** 1 to 1, 1 to N
** Free form dependency trees between any service and host
*** 1 to 1, 1 to N
** Support for integrated business rules
*** Calculated hosts or services representing the state of a business service
*** Support assigning a business impact to each service, host or business process
** Ability to show only root problems
** Automatically changes child states to unknown when parent is unavailable
* Other features
** Contact notifications when service or host problems occur and get resolved (via e-mail, pager, SMS, or any user-defined method through plugin system)
** Ability to define event handlers to be run during service or host events for proactive problem resolution
** Ability to redefine the severity of an alert based on regular expression rules
** Support for UTF-8 object names
** Support for monitoring multiple customers with one administration point
** Support for recurring downtimes through the maintenance_period attribute
** Advanced template system with inheritance and overloading
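As an illustration of the plugin design listed above, here is a minimal sketch of a service check written in Python. Because Shinken is compatible with Nagios, the sketch follows the Nagios plugin convention: one line of status text, optional performance data after a `|`, and an exit code of 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The function names, the default path and the 80/90 percent thresholds are assumptions made for the example, not values taken from Shinken.

```python
#!/usr/bin/env python3
"""Minimal disk-usage check following the Nagios plugin convention."""
import shutil

# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(used_percent, warn=80.0, crit=90.0):
    """Return (exit_code, status_line) for a disk usage percentage.

    The default thresholds are arbitrary example values.
    """
    if used_percent >= crit:
        code, label = CRITICAL, "CRITICAL"
    elif used_percent >= warn:
        code, label = WARNING, "WARNING"
    else:
        code, label = OK, "OK"
    # Text after the pipe is performance data: label=value;warn;crit
    line = "DISK %s - %.1f%% used | used=%.1f%%;%.0f;%.0f" % (
        label, used_percent, used_percent, warn, crit)
    return code, line


def main(path="/"):
    """Measure usage of `path`, print the status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print("DISK UNKNOWN - %s" % exc)
        return UNKNOWN
    code, line = evaluate(100.0 * usage.used / usage.total)
    print(line)
    return code


# As a standalone plugin, end the script with: raise SystemExit(main())
```

Keeping the threshold logic in `evaluate` separate from the measurement in `main` makes the check easy to test without touching a real file system. The same contract (status line plus exit code) can be met by a plugin written in any of the languages named in the list.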
Architecture
A Shinken installation consists of several processes, each optimized for a specific task.
* Arbiter
** Loads the configuration files and dispatches the host and service objects to the scheduler(s)
** Watchdog for all other processes and responsible for initiating failovers if an error is detected
** Can route check result events from a Receiver to its associated Scheduler
** Arbiter modules
*** There is a variety of modules to manipulate configuration data
* Scheduler
** Plans the next run of host and service checks
** Dispatches checks to the poller(s)
** Calculates state and dependencies
** Applies KPI triggers
** Raises Notifications and dispatches them to the reactionner(s)
** Updates the retention file (or other retention backends)
** Sends broks (internal events of any kind) to the broker(s)
* Poller
** Gets checks from the scheduler, executes plugins or integrated poller modules and sends the results to the scheduler
** Poller modules
*** NRPE - Executes active data acquisition for Nagios Remote Plugin Executor agents
*** SNMP - Executes active data acquisition for SNMP enabled agents (In beta stage using PySNMP)
*** CommandPipe - Receives passive status and performance data from check_mk script, will not process commands
* Reactionner
** Gets notifications and eventhandlers from the scheduler, executes plugins/scripts and sends the results to the scheduler
* Broker
** Has multiple modules (usually running in their own processes)
** Gets broks from the scheduler and forwards them to the broker modules
** Modules decide if they handle a brok depending on a brok's type (log, initial service/host status, check result, begin/end downtime, ...)
** Modules process the broks in many different ways. Some of the modules are:
*** webui - updates in-memory objects and provides a webserver for the native Shinken GUI
*** livestatus - updates in-memory objects which can be queried using an API by GUIs like Thruk or Check_MK Multisite
*** graphite - exports data to a Graphite database
*** ndodb - updates an ndo database (MySQL or Oracle)
*** simple_log - centralizes the logs of all the Shinken processes
*** status_dat - writes to a status.dat file which can be read by the classic cgi-based GUI
* Receiver (optional)
** Receives data passively from local or remote protocols
** Passive data reception that is buffered before forwarding to the appropriate Scheduler (or Arbiter for global commands)
** Allows setting up a "farm" of Receivers to handle a high rate of incoming events
** Modules for receivers
*** NSCA - NSCA protocol receiver
*** Collectd - Receives performance data from collectd via the network
*** CommandPipe - Receives commands, status updates and performance data
*** TSCA - Apache Thrift interface to send check results using a high rate buffered TCP connection directly from programs
*** Web Service - A web service that accepts HTTP POSTs of check results (beta)
There can be multiple instances for each type of process, either on a single host or spread over many hosts. Adding more processes automatically distributes the load.
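The division of labour between these processes can be sketched as a single-process simulation. In a real installation the scheduler, poller, broker and reactionner are separate daemons that may run on different hosts; here they are plain functions, and every name, the fake check results and the shape of the brok tuples are assumptions made for illustration only.

```python
from collections import deque

# Hosts that "fail" in this simulation (made-up data).
FAILING = {"db1"}


def scheduler_plan(hosts):
    """Scheduler role: plan one check per host."""
    return deque({"host": host, "command": "check_ping"} for host in hosts)


def poller_run(check):
    """Poller role: pretend to execute the plugin and return its exit status."""
    status = 2 if check["host"] in FAILING else 0  # 0 = OK, 2 = CRITICAL
    return {"host": check["host"], "status": status}


def scheduler_consume(result, broks, notifications):
    """Scheduler role: turn a result into a brok and, on failure, a notification."""
    broks.append(("check_result", result["host"], result["status"]))
    if result["status"] != 0:
        notifications.append("PROBLEM on %s" % result["host"])


def run_cycle(hosts):
    """Run one scheduling cycle and return what broker and reactionner would receive."""
    broks, notifications = [], []
    checks = scheduler_plan(hosts)
    while checks:
        result = poller_run(checks.popleft())
        scheduler_consume(result, broks, notifications)
    return broks, notifications


if __name__ == "__main__":
    broks, notifications = run_cycle(["web1", "db1"])
    print(broks)          # events a broker would forward to its modules
    print(notifications)  # messages a reactionner would send out
```

Because each role only consumes the output of the previous one, several pollers could drain the same queue of checks, which is the intuition behind distributing load by adding processes.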
The Shinken WebUI is the built-in web interface that provides near-real-time status information, configuration, interaction, a dashboard to visualize trending data from Graphite databases and the visualization of dependency tree graphs.
The Shinken skonfUI is an independent web front-end used to manage the discovery process and configuration tasks.
The shinken-admin CLI script is used to manage process-level aspects of the system at runtime, such as changing logging levels and getting health reports.
The install.sh CLI script is the main management script to install, remove or update Shinken and its associated software.
Development
Shinken has an open and test-driven development approach, with contributors to the project providing new features, code refactoring, code quality and bug fixing.
The source code is hosted on GitHub.
An integration server runs tests at each commit and in-depth tests at regular intervals.
The Shinken documentation is hosted on a wiki.
See also
* Comparison of network monitoring systems
* NRPE
* Icinga
* Nagios
* Collectd
External links
* Monitoring Plugins: the home of the official plugins
* Linux Magazin: article about Shinken in the German Linux Magazin 04/2010