First version of borgbackup monitoring dashboard and alerts
Refs privatestorageops#429.
This adds monitoring and alerting for the borg backup jobs that copy our user ciphertext to borgbase.com.
It relies on the systemd unit status, which needs further investigation, at least for the "check repo" job: does borg reliably exit with an error code when it encounters an issue?
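As a starting point for the exit-code question: borg documents its classic return codes as 0 for success, 1 for warning, 2 for error, and 128+N when killed by signal N. A minimal Python sketch of how those would surface in the unit status follows. It assumes the unit runs borg directly and does not set `SuccessExitStatus=`; both function names are hypothetical and not part of this change.

```python
def classify_borg_exit(code: int) -> str:
    """Map a borg exit code to a coarse status (hypothetical helper)."""
    if code == 0:
        return "success"
    if code == 1:
        return "warning"
    if code == 2:
        return "error"
    if code > 128:
        # borg reports 128+N when it was killed by signal N.
        return f"killed by signal {code - 128}"
    return "unknown"


def systemd_marks_failed(code: int) -> bool:
    """Would systemd consider the unit failed for this exit code?

    Assumption: the unit has no SuccessExitStatus= override, so every
    non-zero exit code counts as a failure, including borg warnings.
    """
    return code != 0
```

Under these assumptions a borg warning (exit code 1) would already put the unit into the failed state, so the open question is mainly whether `borg check` returns non-zero for every kind of repository problem.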
I would also like to add statistics (backup size, run time, etc.) from the logs, but Loki currently has none of the borgbackup output or journal data. Why that is still needs investigating.
There are alerts for timers not firing and for jobs failing.
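The two alert conditions boil down to roughly the following logic, shown here as a Python sketch rather than the actual alert rules. The function names and the maximum-age threshold are hypothetical. The only systemd detail assumed is that a failed unit reports the `ActiveState` value `failed`.

```python
import time
from typing import Optional


def timer_is_stale(last_trigger: float, max_age_seconds: float,
                   now: Optional[float] = None) -> bool:
    """True if the timer last fired longer ago than max_age_seconds.

    last_trigger and now are Unix timestamps; now defaults to the
    current time. The threshold is an illustrative parameter only.
    """
    current = time.time() if now is None else now
    return current - last_trigger > max_age_seconds


def job_has_failed(active_state: str) -> bool:
    """True if the systemd unit's ActiveState is "failed"."""
    return active_state == "failed"
```

For a daily backup, a threshold somewhat above 24 hours would tolerate small scheduling delays while still flagging a timer that has stopped firing.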
The dashboard looks like this (on staging, only one running ciphertext backup):
Edited by Florian Sesser