Uptime Kuma – Statuspage with possibilities

Goal

In this post we will create a statuspage with Uptime Kuma to display cloud as well as on-prem services for non-admin users.

Background

During my years in IT I have used many monitoring tools, some good and some bad. Some of them are Nagios, Zabbix, Check MK and OP5. To display data I have used tools like Grafana and Nagvis. If I remember correctly, Nagvis was the only tool I was able to use to create a public static info page for my users to get a clear picture of what was actually working. The result was ugly, and it was a challenge to make the Nagvis map public. In general Nagvis is crap.

I think it looked something like this. And the green dots were Nagvis icons from Nagios.

I have also used Grafana as a public dashboard. Grafana is great for presenting data but it is not a statuspage. I am now using Check MK and the Grafana datasource, but I am still not able to display up/down status. It can look something like this.

Solution

So recently I found this amazing project called Uptime Kuma. It is a self-hosted statuspage just like Sorry app or status.io. I deployed it as a Docker container, created an internal DNS name and made a virtual nginx host pointing to that container. The frontend of the statuspage looks like below. It is simple and clean. I won't explain in detail how to do this as it is so simple. Just log in and create your monitors.

In my case the container is running in our LAN, and this is what makes it so good. I am able to reach all my internal servers as well as the internet, and I do not need to expose servers as I would have to if I was using a service like Sorry app. So I decided to divide the page into two categories: Cloud and On Prem.

First of all I created monitors in the Kuma dashboard for my cloud services. The example below shows Zoom, taken from the Zoom statuspage. I was able to put all statuspages in one simple page, so from now on our users just have to take a look here.

There are some different options to use for a monitor. In my case I have used keywords.
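Conceptually, a keyword monitor fetches a page and treats the service as up only when a given text is found in the response. The sketch below illustrates that idea in Python. It is not Uptime Kuma's actual implementation, and the function names are made up for this example.

```python
from urllib.request import urlopen


def keyword_found(body: str, keyword: str) -> bool:
    """Return True when the keyword occurs anywhere in the response body."""
    return keyword in body


def check_keyword(url: str, keyword: str, timeout: float = 10.0) -> bool:
    """Fetch the URL and report up (True) only if the keyword is present."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
    except OSError:
        # Connection problems and HTTP error responses both count as "down".
        return False
    return keyword_found(body, keyword)
```

For a vendor statuspage you would call something like check_keyword(url, "All Systems Operational"), where both the URL and the keyword depend on what that vendor's page actually prints.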

Once the monitor is created, go to the dashboard, add it and place it under the group you want.

Polling from Check MK

I have read some requests for Uptime Kuma to implement various checks. I would like to point out that it is a statuspage displaying red or green. If you want something more detailed, use something else like Grafana. My main monitoring tool is Check MK. It can monitor almost anything and you will get details like CPU, memory, disk etc. See the image below. This is one Linux host. However, regular users are not really interested in kernel performance. They just want to know if things work.

In Check MK and other systems you can use something called Business Intelligence (BI) and aggregations. An aggregation creates one state that is up or down depending on a lot of other hosts and/or services. In the example below it depends on the cloud service Slack, some local file servers and some other API endpoints. If some parts are broken the whole system might stop. Make sure you have created the BI aggregation first.
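The idea of an aggregation can be sketched as a function that reduces many member states to one. The example below uses a "worst state wins" rule with the usual numeric codes (0 OK, 1 WARN, 2 CRIT, 3 UNKNOWN). The severity order is an assumption made for this sketch, and real BI rules in Check MK offer more options than this.

```python
# Checkmk/Nagios-style state codes.
OK, WARN, CRIT, UNKNOWN = 0, 1, 2, 3

# Assumed severity order for this sketch: OK < WARN < UNKNOWN < CRIT.
SEVERITY = {OK: 0, WARN: 1, UNKNOWN: 2, CRIT: 3}


def aggregate_worst(states):
    """Return the most severe state among the members (OK if there are none)."""
    return max(states, key=SEVERITY.__getitem__, default=OK)
```

With this rule a single critical file server is enough to turn the whole aggregation red, which is the behaviour you want for a red/green statuspage.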

So this is where the magic happens and what makes Uptime Kuma so fantastic. I can actually poll Check MK's API. However, there are some drawbacks. Kuma can only use basic authentication. The workaround is to put a small script on the Check MK server. It is not a perfect solution and it would not scale, and the script could be better, but it works.
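The mismatch is easier to see when the two Authorization header values are put side by side. This is a small illustrative sketch: the Basic format is the standard HTTP one, the Bearer format is the one used by the script further down, and the function names are invented for the example.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the HTTP Basic value: 'Basic ' + base64('username:password')."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return f"Basic {token}"


def checkmk_bearer_header(username: str, password: str) -> str:
    """Build the header value in the format the script in this post sends."""
    return f"Bearer {username} {password}"
```

A client that can only send the first form cannot talk to an endpoint that expects the second, which is why the script sits in between and writes its result to a plain text file.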

Start by adding a script on the CMK server.

cd omd/sites/white/local/bin
nano myservice_aggr.py

Add the code below and change sitename, service description and the output textfile.

#!/usr/bin/env python3
import pprint
import requests
from datetime import datetime

HOST_NAME = "localhost"
SITE_NAME = "mysite"
API_URL = f"http://{HOST_NAME}/{SITE_NAME}/check_mk/api/1.0"

USERNAME = "automation"
PASSWORD = "mypassword"
state = ""

session = requests.session()
session.headers['Authorization'] = f"Bearer {USERNAME} {PASSWORD}"
session.headers['Accept'] = 'application/json'

resp = session.get(
    f"{API_URL}/objects/host/MYHOST.MYDOMAIN.SE/actions/show_service/invoke",
    params={  # goes into query string
        "service_description": 'Aggr Test',  # The service description of the selected host
    },
)
if resp.status_code == 200:
    myresponse = resp.json()['extensions']['state']
    if myresponse == 0:
        state = state + "OK"
    else:
        state = state + "WARNING"

elif resp.status_code == 204:
    print("Done")
else:
    raise RuntimeError(pprint.pformat(resp.json()))

now = datetime.now()

with open("/omd/sites/white/var/www/result/aggr.txt", "w") as file1:
    file1.write(str(now) + "\n")
    file1.write(state)
Now run the script, then look at

https://yoursite.domain.se/mysite/result/aggr.txt

It should display a warning, or an error, because the service “Aggr Test” does not exist yet. To fix this we will use an existing dummy host in Check MK and add the aggregation as a service.

First create a rule in CMK

Other integrations->BI Aggregations

Make sure to use an exact match of the BI name.

Rescan the host and find the new service like below.

There is also a check called Script at the bottom. We need this in order to schedule the script we created first. To add it we need a rule like below. Enter the path to the command. Run the check and it should now find the service that we added.

We now have a service for the aggregation, and the state of this service is written to a text file. Time to finish it. Go back to your Uptime Kuma. Add a new monitor of the HTTP keyword type. Use the URL of the text file. Make sure to skip certificate verification and also add basic authentication at the bottom right. Save and test. Add the monitor to your dashboard.
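One thing to keep in mind is that a keyword monitor only looks at the text. If the script stops running, the file will keep saying OK. The first line of the file is a timestamp, so a freshness check is possible if you ever need one. Below is a minimal sketch of such a check. The function names and the ten minute limit are assumptions for the example and not something Kuma does for you.

```python
from datetime import datetime, timedelta


def parse_result(text: str):
    """Split the file content into (timestamp, state)."""
    lines = text.strip().splitlines()
    # The script writes str(datetime.now()), which fromisoformat can read back.
    written_at = datetime.fromisoformat(lines[0].strip())
    state = lines[1].strip() if len(lines) > 1 else ""
    return written_at, state


def is_healthy(text: str, now: datetime,
               max_age: timedelta = timedelta(minutes=10)) -> bool:
    """True only when the state is OK and the file was refreshed recently."""
    written_at, state = parse_result(text)
    return state == "OK" and now - written_at <= max_age
```

You could run something like this from a second small script, or simply glance at the timestamp in the file when the statuspage looks suspiciously green.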

Final thoughts

Uptime Kuma is great and it works out of the box as a statuspage for all your cloud services. In this example I have also added about 20 hosts and about 150 services from my monitoring system to the public statuspage. If the disk of one of my on-prem servers fails, it will display as down on the statuspage.

I will still use my normal monitoring. However, this is for our regular users as well as for more advanced users. You can create a special dashboard in Kuma for a specific department with more hosts and checks.