How does the Bloonix monitoring software work?

Overview of the framework

Components

The Bloonix monitoring software consists of 5 main components:

  • WebGUI
  • Server
  • Agent
  • Satellite
  • Plugins

Bloonix WebGUI

The Bloonix WebGUI is the user interface used to manage hosts and services, users, groups, contacts and much more. On the client side, the WebGUI runs as a JavaScript application in the browser. On the server side, it runs as a REST API behind a reverse proxy (Nginx), which makes it possible to query the WebGUI programmatically. The data format of the REST API is JSON.

A modern, HTML5-capable browser is required for the WebGUI. We recommend Chrome, Safari, Firefox or Opera. Internet Explorer should also work, but no version of it has been tested.

Bloonix Server

The Bloonix server is the core of the monitoring framework and the interface for the Bloonix agents and satellites. When the Bloonix server is started, several process pools are started, with each process pool having a specific task to perform. The following pools will be started:

  • Listener
  • DB Manager
  • Keepalived
  • Remote Scheduler
  • Remote Checker
  • WTRM Scheduler
  • WTRM Checker
  • Timeout Scheduler
  • Timeout Checker

WTRM stands for Web transaction manager.

Pool Listener

In the pool Listener, multiple processes are started that listen on a port for requests from the Bloonix agents. Each agent asks which checks should be performed for its host, and the Listener transmits them. The pool also receives the status and metrics from the agents, validates them, saves them in the database and checks whether an event needs to be triggered, such as sending an email to an administrator to report that a service is overloaded or no longer available.
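
The handling of an incoming result can be pictured roughly as follows. This is a minimal sketch in Python, not Bloonix's actual code: the function names, the field names, the statuses other than OK and CRITICAL, and the rule "trigger an event when the status changes to a non-OK value" are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed status set; only OK and CRITICAL appear in this document.
KNOWN_STATUSES = {"OK", "WARNING", "CRITICAL", "UNKNOWN"}


@dataclass
class Event:
    service_id: int
    old_status: str
    new_status: str
    message: str


def validate_result(result: dict) -> dict:
    """Reject results that lack required fields or carry an unknown status."""
    for key in ("service_id", "status", "message"):
        if key not in result:
            raise ValueError(f"missing field: {key}")
    if result["status"] not in KNOWN_STATUSES:
        raise ValueError(f"unknown status: {result['status']}")
    return result


def process_result(result: dict, last_status: dict) -> Optional[Event]:
    """Validate a result, store the new status, and return an Event
    when the status changed to a non-OK value (hypothetical rule)."""
    result = validate_result(result)
    service_id = result["service_id"]
    old = last_status.get(service_id, "OK")
    new = result["status"]
    last_status[service_id] = new  # stand-in for the database write
    if new != old and new != "OK":
        return Event(service_id, old, new, result["message"])
    return None
```

A caller would pass each received result to `process_result` and hand any returned `Event` to whatever sends the notification, for example the email mentioned above.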

Pool DB Manager

A single process is started in the pool DB Manager, which is responsible for managing the database and, for example, creates or deletes the required partitions for metrics and events.
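
Partition maintenance of this kind can be sketched as follows. The monthly granularity, the naming scheme `table_YYYY_MM`, the `keep_months` parameter and the table name used in the usage note are all hypothetical; this document does not say how Bloonix names or sizes its partitions.

```python
from datetime import date
from typing import Iterable, List, Tuple


def month_add(year: int, month: int, offset: int) -> Tuple[int, int]:
    """Return (year, month) shifted by `offset` months."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def partition_name(table: str, year: int, month: int) -> str:
    """Hypothetical naming scheme: <table>_<YYYY>_<MM>."""
    return f"{table}_{year:04d}_{month:02d}"


def plan_partitions(
    table: str, today: date, existing: Iterable[str], keep_months: int = 3
) -> Tuple[List[str], List[str]]:
    """Return (to_create, to_drop).

    Wanted partitions cover the last `keep_months` months, the current
    month and the next month; every other existing partition is dropped.
    """
    wanted = {
        partition_name(table, *month_add(today.year, today.month, offset))
        for offset in range(-keep_months, 2)
    }
    existing = set(existing)
    return sorted(wanted - existing), sorted(existing - wanted)
```

Called with `plan_partitions("stats", date(2024, 1, 15), [...], keep_months=1)`, the function keeps December, January and February and reports everything else for deletion.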

Pool Keepalived

In the pool Keepalived, a single process is started whose only job is to determine the master in a cluster setup with multiple servers. The master is selected via a Redis queue in which each Bloonix server registers.

Pool Remote Scheduler, Remote Checker

The Bloonix server can perform checks like the Bloonix agent, with the restriction that the services must be reachable via a TCP/IP or UDP/IP connection. This makes it possible to check routers and switches, on which no agent can be installed. Many other checks can be performed as well, e.g. checking websites via HTTP or servers via ping.

In addition, certain checks can be routed via satellites, so that, for example, HTTP checks are performed from any location worldwide. This allows you to check how fast access to a website is from other countries.

In the pool Remote Scheduler, only one process is started, which writes the available services into a Redis queue. The processes in the pool Remote Checker pull the checks from the Redis queue. The number of Remote Checker processes can be set in the Bloonix server configuration file.
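
The split between one scheduler and several checkers is a producer/consumer pattern. The sketch below illustrates it in Python with the standard library only. It is not how Bloonix is implemented: `queue.Queue` stands in for the Redis queue, threads stand in for the process pools, and all function names are hypothetical.

```python
import queue
import threading


def scheduler(services, work_queue, num_checkers):
    """Single producer: enqueue every service, then one stop marker per checker."""
    for service in services:
        work_queue.put(service)
    for _ in range(num_checkers):
        work_queue.put(None)


def checker(work_queue, run_check, results, lock):
    """Consumer: pull services from the queue until a stop marker arrives."""
    while True:
        service = work_queue.get()
        if service is None:
            break
        outcome = run_check(service)
        with lock:
            results.append((service, outcome))


def run_pool(services, run_check, num_checkers=4):
    """Start `num_checkers` checkers, run the scheduler once, collect results."""
    work_queue = queue.Queue()
    results = []
    lock = threading.Lock()
    workers = [
        threading.Thread(target=checker, args=(work_queue, run_check, results, lock))
        for _ in range(num_checkers)
    ]
    for worker in workers:
        worker.start()
    scheduler(services, work_queue, num_checkers)
    for worker in workers:
        worker.join()
    return results
```

The `num_checkers` argument plays the role of the configurable number of checker processes: it caps how many checks run at the same time, while the single scheduler guarantees that each service is queued only once. The WTRM and Timeout pools described below follow the same scheduler/checker split.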

Pool WTRM Scheduler, WTRM Checker

The execution of web transactions is controlled by these pools. Since web transactions require a large amount of CPU power, the pools limit how many web transactions can run in parallel. The pool WTRM Scheduler starts with one process and writes the available web transactions to a Redis queue, and the processes in the pool WTRM Checker pull the checks from the Redis queue. The number of processes in the pool WTRM Checker can be set in the Bloonix server configuration file.

Pool Timeout Scheduler, Timeout Checker

These pools are used to check all services that have not been checked within a certain period of time. If, for example, a server crashes and the agent running on it can no longer deliver data, a pseudo alarm with level CRITICAL is triggered.

The pool Timeout Scheduler starts with one process and writes expired services to a Redis queue, and the processes in the pool Timeout Checker pull the services from the Redis queue. The number of processes in the pool Timeout Checker can be set in the Bloonix server configuration file.
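
The notion of an "expired" service can be made concrete with a small sketch. The field names `interval` and `timeout`, the rule "expired when the last result is older than interval plus timeout", and the message text are assumptions for illustration; only the CRITICAL level comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Service:
    service_id: int
    last_check: float  # Unix timestamp of the last received result
    interval: int      # seconds between checks (assumed field)
    timeout: int       # extra grace period in seconds (assumed field)


def expired_services(services, now):
    """Return the services whose last result is older than interval + timeout."""
    return [s for s in services if now - s.last_check > s.interval + s.timeout]


def timeout_result(service):
    """Build the pseudo result that would be recorded for an expired service."""
    return {
        "service_id": service.service_id,
        "status": "CRITICAL",
        "message": "no data received within the expected period",
    }
```

In this picture the scheduler would call `expired_services` periodically and queue the matches, and each checker would turn a queued service into a `timeout_result`.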

Bloonix Agent

The Bloonix agent is installed on the system to be monitored and runs there as a daemon that continuously monitors the system. The agent connects to the Bloonix server, authenticates itself with a host ID and password and queries the services that are to be monitored.

The Bloonix agent can also be installed on a central server in order to monitor web services and network components such as routers, switches and load balancers.

The agent can likewise be installed on a central server inside networks that the Bloonix server cannot reach, in order to monitor routers, switches or other network services from there. This feature can be found in the WebGUI under Location Groups. Once it is set up, the agent's configuration file contains a location group ID with password instead of a host ID with password.

Bloonix Plugins

Bloonix plugins are installed together with the Bloonix agent, server and satellites. They are small scripts that check the status of one or more services and, at the same time, provide metrics for them.

On Linux systems, the plugins are usually installed under /usr/lib/bloonix/plugins. A plugin is executed like this, with the command followed by its output:

./check-http --stdin --pretty <<EOT
{"url":"https://www.bloonix.de/"}
EOT
{
    "message": "HTTP/2 200, Total time = 49.299ms, cert expires at 2024-12-03 11:27:54 GMT [OK]",
    "status": "OK",
    "stats": {
        "time_connect": 0.688,
        "time_first_byte": 2.377,
        "time_namelookup": 1.763,
        "time_overhead": 0.097,
        "time_ssl_handshake": 43.762,
        "time_total": 49.299,
        "time_transfer": 0.612
    }
}
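
Because a plugin reads its options as JSON on standard input and prints a JSON result, it is easy to drive from a script. The following Python sketch assumes only what the example above shows: the `--stdin` and `--pretty` flags and the `status`, `message` and `stats` fields. The helper names are hypothetical, and the sketch deliberately ignores the plugin's exit code because its meaning is not described here.

```python
import json
import subprocess


def parse_plugin_output(text):
    """Parse plugin output and check for the fields shown in the example."""
    result = json.loads(text)
    for key in ("status", "message"):
        if key not in result:
            raise ValueError(f"plugin output lacks '{key}'")
    result.setdefault("stats", {})
    return result


def run_plugin(plugin_path, options, timeout=30):
    """Run a plugin with its options as JSON on stdin; return the parsed result."""
    completed = subprocess.run(
        [plugin_path, "--stdin", "--pretty"],
        input=json.dumps(options),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return parse_plugin_output(completed.stdout)


def slowest_phase(stats):
    """Return the name of the largest time_* metric other than time_total."""
    phases = {
        name: value
        for name, value in stats.items()
        if name.startswith("time_") and name != "time_total"
    }
    return max(phases, key=phases.get) if phases else None
```

For the output shown above, `slowest_phase` would point at `time_ssl_handshake`, which accounts for most of the total request time. A call such as `run_plugin("./check-http", {"url": "https://www.bloonix.de/"})` mirrors the manual invocation.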

Bloonix Satellite

The Bloonix satellite makes it possible to monitor external services from multiple locations. It can be installed on any system around the globe. The Bloonix server connects to the Bloonix satellite and transmits the services, such as HTTP or ping checks, that need to be executed.

Once satellites have been set up via the WebGUI, they are available for selection in the service configuration form.