Introduction
Shinken Enterprise can be configured to support distributed monitoring of network checks and resources.
...
The goal in the distributed monitoring environment is to offload the overhead (CPU usage, etc.) of performing and receiving checks from a "central" server onto one or more "distributed" servers. Most small to medium-sized shops will not have a real need for setting up such an environment. However, when you want to start monitoring thousands of hosts (and several times that many checks) using Shinken Enterprise, this becomes quite important.
The global architecture
Shinken Enterprise's architecture has been designed according to the Unix Way: one tool, one task.
Shinken Enterprise has an architecture where each part is isolated and connects to the others via standard interfaces. Shinken Enterprise is based on an HTTP backend. This makes building a highly available or distributed monitoring architecture quite easy.
Shinken core uses distributed programming, meaning a daemon will often do remote invocations of code on other daemons. To ensure maximum compatibility and stability, the core language, paths and module versions must therefore be the same everywhere a daemon is running.
Shinken Enterprise Daemon roles
This part is analysed on the daemon page.
The smart and automatic load balancing
Shinken Enterprise is able to cut the user configuration into parts and dispatch them to the schedulers. The load balancing is done automatically: the administrator does not need to remember which host is linked to which other one in order to create shards; Shinken Enterprise does it for him.
The dispatch is host-based: all checks of a host will be in the same scheduler as this host. The major advantage of Shinken Enterprise is the ability to create independent configurations: an element of a configuration will not have to call an element of another shard. That means that the administrator does not need to know all relations among elements like parents, hostdependencies or check dependencies: Shinken Enterprise is able to look at these relations and put the related elements into the same shard.
...
- parent relationship for hosts (like a distant server and its router)
- hostdependencies
Shinken Enterprise looks at all these relations and creates a graph from them. Each graph is a relation shard. This can be illustrated by the following picture:
...
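The grouping described above can be sketched as a search for connected components. This is only an illustration and not the actual Shinken Enterprise code: the function name `build_relation_shards`, the host names and the relation list are all made up for the example, and every relation (parent link or host dependency) is assumed to tie two hosts together regardless of direction.

```python
from collections import defaultdict

def build_relation_shards(hosts, relations):
    """Group hosts into relation shards (connected components of the graph)."""
    # Undirected adjacency: a parent link or a host dependency ties both hosts.
    neighbours = defaultdict(set)
    for a, b in relations:
        neighbours[a].add(b)
        neighbours[b].add(a)

    seen = set()
    shards = []
    for host in hosts:
        if host in seen:
            continue
        # Collect everything reachable from this host.
        shard, stack = [], [host]
        seen.add(host)
        while stack:
            current = stack.pop()
            shard.append(current)
            for other in neighbours[current]:
                if other not in seen:
                    seen.add(other)
                    stack.append(other)
        shards.append(sorted(shard))
    return shards

# Hypothetical inventory used only for this example.
hosts = ["router-paris", "srv-web", "srv-db", "srv-mail", "standalone"]
relations = [
    ("srv-web", "router-paris"),  # parent relationship
    ("srv-db", "router-paris"),   # parent relationship
    ("srv-mail", "srv-db"),       # host dependency
]
relation_shards = build_relation_shards(hosts, relations)
print(relation_shards)
```

In this example the four related hosts end up in one relation shard and the unrelated host forms a shard of its own, so no element ever needs to reach into another shard.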
When all configurations are created, the Arbiter sends them to the N active Schedulers. A Scheduler can start processing checks once it has received and loaded its configuration, without having to wait for all Schedulers to be ready (v1.2). For larger configurations, having more than one Scheduler, even on a single server, is highly recommended, as they will load their configurations (new or updated) faster. The Arbiter also creates configurations for satellites (pollers, reactionners and brokers) with links to Schedulers so they know where to get jobs to do. After sending the configurations, the Arbiter begins to watch for orders from the users and is responsible for monitoring the availability of the satellites.
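One possible way to turn relation shards into N balanced scheduler configurations is a greedy fill, sketched below. The balancing rule (largest shard first, always onto the least loaded scheduler), the function name `build_configurations` and the scheduler names are assumptions made for illustration; the text above does not say which rule the Arbiter actually applies.

```python
import heapq

def build_configurations(relation_shards, scheduler_names):
    """Assign whole relation shards to schedulers, filling the lightest first."""
    # Heap of (host_count, scheduler_name): the least loaded scheduler pops first.
    heap = [(0, name) for name in scheduler_names]
    heapq.heapify(heap)
    configurations = {name: [] for name in scheduler_names}
    # Place the biggest shards first; a shard is never split across schedulers.
    for shard in sorted(relation_shards, key=len, reverse=True):
        load, name = heapq.heappop(heap)
        configurations[name].extend(shard)
        heapq.heappush(heap, (load + len(shard), name))
    return configurations

# Hypothetical relation shards and scheduler names.
relation_shards = [
    ["router", "web", "db", "mail"],
    ["standalone"],
    ["fw", "vpn"],
    ["printer"],
]
configurations = build_configurations(relation_shards, ["scheduler-1", "scheduler-2"])
for name, hosts in configurations.items():
    print(name, hosts)
```

Because shards are moved as a unit, related hosts always stay on the same scheduler, whatever balancing rule is chosen.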
The high availability
The Shinken Enterprise architecture is a high availability one. Before looking at how this works, take a look at how the load balancing works if you have not already done so.
...
The administrator needs to send orders to the schedulers (like a new status for passive checks). In the Shinken Enterprise way of thinking, users only need to send orders to one daemon, which then dispatches them to all the others. In Nagios, the administrator needs to know where the hosts or checks are in order to send the order to the right node. In Shinken Enterprise, the administrator just sends the order to the Arbiter, that's all. External commands can be divided into two types:
...
For each command, Shinken Enterprise knows whether it is global or not. If it is global, it just sends the order to all schedulers. For specific commands, it looks up which scheduler manages the element referred to by the command (host/check) and sends the order to that scheduler. When schedulers receive an order, they just need to apply it.
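The routing rule above can be sketched as follows. This is a simplified illustration, not the Arbiter's implementation: the `GLOBAL_COMMANDS` table, the `route_command` function and the host-to-scheduler map are hypothetical, and the sketch assumes that the first argument of a specific command is the host name, as in Nagios-style `NAME;arg1;arg2` external commands.

```python
# Hypothetical table of commands that apply to every scheduler.
GLOBAL_COMMANDS = {"ENABLE_NOTIFICATIONS", "DISABLE_NOTIFICATIONS"}

def route_command(raw_command, host_to_scheduler, schedulers):
    """Return the names of the schedulers that should receive raw_command."""
    name, _, args = raw_command.partition(";")
    if name in GLOBAL_COMMANDS:
        # Global order: every scheduler gets a copy.
        return sorted(schedulers)
    # Specific order: the first argument is assumed to be the host name.
    host_name = args.split(";")[0]
    owner = host_to_scheduler.get(host_name)
    # Unknown host: no scheduler manages it, so nothing is sent.
    return [owner] if owner is not None else []

# Hypothetical dispatch result: which scheduler manages which host.
schedulers = {"scheduler-1", "scheduler-2"}
host_to_scheduler = {"srv-web": "scheduler-1", "srv-db": "scheduler-2"}

everyone = route_command("DISABLE_NOTIFICATIONS", host_to_scheduler, schedulers)
only_one = route_command("PROCESS_HOST_CHECK_RESULT;srv-db;0;OK", host_to_scheduler, schedulers)
nobody = route_command("PROCESS_HOST_CHECK_RESULT;ghost;0;OK", host_to_scheduler, schedulers)
print(everyone, only_one, nobody)
```

The sender never needs to know where a host lives; only the routing side holds the host-to-scheduler map.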
Different types of Pollers: poller_tag
The current Shinken Enterprise architecture is useful for someone who uses the same type of poller for all checks. But it can be useful to have different types of pollers, like GNU/Linux ones and Windows ones. We already saw that all pollers talk to all schedulers. In fact, pollers can be "tagged" so that they will execute only some checks.
...
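The effect of tagging can be sketched as a simple filter applied when a poller asks for jobs. The data layout, the `checks_for_poller` function and the `DEFAULT_TAG` sentinel are made up for this illustration; the sketch assumes that untagged checks and untagged pollers share one default tag, so an untagged poller only takes untagged checks.

```python
DEFAULT_TAG = "untagged"  # hypothetical sentinel for checks and pollers without a tag

def checks_for_poller(checks, poller_tags=None):
    """Keep only the checks whose tag is among the tags this poller accepts."""
    accepted = set(poller_tags) if poller_tags else {DEFAULT_TAG}
    return [
        check["name"]
        for check in checks
        if check.get("poller_tag", DEFAULT_TAG) in accepted
    ]

# Hypothetical checks waiting in a scheduler.
checks = [
    {"name": "check_ping"},
    {"name": "check_wmi_cpu", "poller_tag": "windows"},
    {"name": "check_ssh", "poller_tag": "linux"},
]
windows_jobs = checks_for_poller(checks, ["windows"])
default_jobs = checks_for_poller(checks)
print(windows_jobs, default_jobs)
```

With this rule a Windows poller never receives a check meant for a GNU/Linux poller, even though both talk to the same schedulers.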
Advanced architectures: Realms
Shinken Enterprise's architecture allows the administrator to have a unique point of administration with numerous schedulers, pollers, reactionners and brokers. Hosts are dispatched with their own checks to schedulers and the satellites (pollers/reactionners/brokers) get jobs from them. Everyone is happy.
Or almost everyone. Think about an administrator who has a distributed architecture around the world. With the current Shinken Enterprise architecture the administrator can put a couple of scheduler/poller daemons in Europe and another set in Asia, but he cannot "tag" hosts in Asia to be checked by the Asian scheduler. Also, trying to check an Asian server with a European scheduler can be very sub-optimal (read: very slow). The hosts are dispatched to all schedulers and satellites, so the administrator cannot be sure that Asian hosts will be checked by the Asian monitoring servers.
The normal Shinken Enterprise architecture is useful for load balancing with high availability on a single site.
Shinken Enterprise provides a way to manage different geographic or organizational sites.
...
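The idea of keeping each site's hosts on that site's schedulers can be sketched as below. It treats a realm as a named pool of schedulers and hosts, which is an assumption made for the example; the function `dispatch_by_realm`, the realm names and the round-robin rule inside a realm are all hypothetical.

```python
from collections import defaultdict
from itertools import cycle

def dispatch_by_realm(hosts, schedulers):
    """Map each scheduler to hosts of its own realm only (round-robin per realm)."""
    schedulers_by_realm = defaultdict(list)
    for name, realm in schedulers.items():
        schedulers_by_realm[realm].append(name)
    # One rotation per realm so hosts spread over that realm's schedulers.
    rotations = {
        realm: cycle(sorted(names))
        for realm, names in schedulers_by_realm.items()
    }
    assignment = {name: [] for name in schedulers}
    for host, realm in sorted(hosts.items()):
        if realm not in rotations:
            raise ValueError(f"no scheduler in realm {realm!r} for host {host!r}")
        assignment[next(rotations[realm])].append(host)
    return assignment

# Hypothetical hosts and schedulers, each attached to one realm.
hosts = {"srv-paris": "Europe", "srv-lyon": "Europe", "srv-tokyo": "Asia"}
schedulers = {"sched-eu-1": "Europe", "sched-eu-2": "Europe", "sched-asia-1": "Asia"}
assignment = dispatch_by_realm(hosts, schedulers)
print(assignment)
```

Here an Asian host can only land on an Asian scheduler, while load balancing still applies among the schedulers of a single realm.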