Monitoring (Server)
Tip
This page covers all aspects of monitoring at the server level. For information about monitoring individual websites, see Monitoring (Website).
Availability (external)
We closely monitor all aspects of your server. Depending on your service level, our on-call organisation will take appropriate action if required.
Availability (internal)
Monit, nginx and PHP FPM (if installed) status pages are available at http://localhost:2813/:
http://localhost:2813/monit/: Monit service manager displaying the status of all locally monitored processes
http://localhost:2813/nginx/: nginx stub status output
http://localhost:2813/fpm-<poolname>/: PHP FPM per pool status page
Tip
This status vhost listens on localhost only. To access it from your local computer, forward port 2813 through SSH: ssh <hostname> -L 2813:localhost:2813
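Once the port is forwarded, the status pages can also be checked from a script. The following is a minimal sketch, assuming curl is installed on your local computer; the function names and the pool name www are hypothetical and only serve as an illustration.

```shell
# Base URL of the status vhost, reachable once port 2813 is forwarded.
STATUS_BASE="http://localhost:2813"

# Print the status URLs for monit, nginx and each given PHP FPM pool.
status_urls() {
    printf '%s/monit/\n' "$STATUS_BASE"
    printf '%s/nginx/\n' "$STATUS_BASE"
    for pool in "$@"; do
        printf '%s/fpm-%s/\n' "$STATUS_BASE" "$pool"
    done
}

# Request every status page and report whether it answered successfully.
check_status_pages() {
    status_urls "$@" | while read -r url; do
        if curl -fsS -o /dev/null --max-time 5 "$url"; then
            echo "OK   $url"
        else
            echo "FAIL $url"
        fi
    done
}

# Example (with the SSH tunnel open, pool name is a placeholder):
#   check_status_pages www
```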
Tip
The monit status can also be checked in the terminal with monit-status as the devop user (see Generic Admin User).
Reboot
An automatic reboot is initiated to resolve certain high usage scenarios:
5 minute average load higher than CPU count * 10 for 5 minutes
memory usage higher than 95% for 5 minutes
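To see how close a server currently is to these thresholds, the two rules can be expressed as small shell helpers. This is only a sketch of the arithmetic: it takes a single snapshot rather than observing 5 minutes, the function names are hypothetical, and deriving memory usage from total and available memory is an assumption that may differ from how the monitoring calculates it.

```shell
# Load threshold from the rule above: CPU count * 10.
load_threshold() {
    echo $(( $1 * 10 ))
}

# Succeeds if the 5 minute load average ($1) is higher than the
# threshold for the given CPU count ($2).
load_exceeds() {
    awk -v load="$1" -v limit="$(load_threshold "$2")" \
        'BEGIN { exit !(load > limit) }'
}

# Memory usage in percent (rounded down), from total ($1) and
# available ($2) memory in kB.
# Assumption: usage = (total - available) / total.
mem_used_percent() {
    awk -v total="$1" -v avail="$2" \
        'BEGIN { printf "%d\n", (total - avail) * 100 / total }'
}

# Example on the server itself (Linux only):
#   load_exceeds "$(cut -d ' ' -f 2 /proc/loadavg)" "$(nproc)" \
#       && echo "5 minute load is above the reboot threshold"
```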
Tip
Always make sure that any required services start automatically after a reboot.
If your managed server is rebooted too many times within a certain period, our monitoring system detects this and notifies you by e-mail.
To understand what happened, you can use the following commands with the devop user (see Generic Admin User):
# see a list of the latest reboots of your server
last reboot
# check whether monit triggered a reboot
grep 'monit-reboot:' /var/log/syslog*
Out of Memory (OOM)
If the memory of your managed server is exhausted by the running processes, the Linux operating system protects itself by killing processes that consume large amounts of memory. This frees up system memory in order to keep the overall system running and responsive.
If the OOM killer is invoked too often, this is a sign that your managed server could be short of resources. Our monitoring will detect this and notify you by e-mail.
To troubleshoot memory exhaustion, you can use the following commands with the devop user (see Generic Admin User):
# list what caused the oom-killer to do something
cat /var/log/syslog* | grep oom-killer -A2
# see what processes got killed by the oom_reaper
cat /var/log/syslog* | grep oom_reaper
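To get a quick overview of which processes are affected most often, the oom_reaper lines can be summarised per process name. This is a minimal sketch, assuming the kernel logs lines of the form "oom_reaper: reaped process <pid> (<name>), ..."; the exact wording can vary between kernel versions, and count_reaped is a hypothetical helper name.

```shell
# Read log lines on stdin and print "<count> <process name>" for every
# process reaped by the oom_reaper, most frequent first.
count_reaped() {
    sed -n 's/.*oom_reaper: reaped process [0-9]* (\([^)]*\)).*/\1/p' |
        awk '{ n[$0]++ } END { for (p in n) print n[p], p }' |
        sort -rn
}

# Example, as the devop user:
#   cat /var/log/syslog* | count_reaped
```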
Utilization
Netdata
Netdata is a real-time, interactive web dashboard that collects data every second. Metrics are saved in memory
and kept for 1 hour only. You can reach its web interface at http://localhost:19999.
To connect from your local computer, either forward port 19999 through SSH (ssh <hostname> -L 19999:localhost:19999),
or add a reverse proxy website forwarding requests to http://localhost:19999.
Warning
When using the reverse proxy method, make sure to enable HTTPS and password protection.
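Because Netdata only keeps metrics for 1 hour, it can be handy to save a snapshot of a chart while investigating an incident. The sketch below assumes that the installed Netdata version exposes its REST API under /api/v1/data, that port 19999 is forwarded through SSH, and that curl is available locally; the function names and the chart system.cpu are illustrative choices, not part of this setup.

```shell
# Base URL of Netdata, reachable once port 19999 is forwarded.
NETDATA_BASE="http://localhost:19999"

# Build the API URL returning the last $2 seconds of chart $1 as CSV.
netdata_csv_url() {
    printf '%s/api/v1/data?chart=%s&after=-%s&format=csv\n' \
        "$NETDATA_BASE" "$1" "$2"
}

# Download the last $2 seconds of chart $1 into <chart>.csv.
save_chart() {
    curl -fsS -o "$1.csv" "$(netdata_csv_url "$1" "$2")"
}

# Example (with the SSH tunnel open): keep the last hour of CPU data
#   save_chart system.cpu 3600
```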
collectd
System statistics are collected every 10 seconds by collectd and written to RRD files in
/var/lib/collectd. For performance reasons, we don’t create graphs by default, so you have
to download the files and render them yourself with a tool of your choice.
Please select a rendering tool from the list of frontends
within the collectd wiki. We recommend collectd-web.
For Debian-based Linux distributions
Installation:
sudo apt-get install librrds-perl libjson-perl libhtml-parser-perl
git clone https://github.com/httpdss/collectd-web.git
echo 'datadir: "/tmp/rrd"' | sudo tee /etc/collectd/collection.conf
Fetch data and render graphs:
rsync -avz <server>:/var/lib/collectd/rrd/ /tmp/rrd/
cd /path/to/collectd-web
python runserver.py
Then open collectd-web at http://127.0.0.1:8888/.
collectd-web with Docker
A Docker image is also available.
rsync -avz <server>:/var/lib/collectd/rrd/ /tmp/rrd/
docker run -p 8888:80 --volume /tmp:/tmp -it registry.gitlab.com/opsone_ch/docker-collectd-web:latest
Then open collectd-web at http://127.0.0.1:8888/.