
First version of borgbackup monitoring dashboard and alerts

Florian Sesser requested to merge 429.monitor-ciphertext-backup-dashboard into develop

Refs privatestorageops#429.

This adds monitoring and alerting for the borg backup jobs that back up our user ciphertext to borgbase.com.

It relies on the systemd unit status, which needs further investigation, at least for the "check repo" job: does borg reliably exit with a non-zero code when it encounters an issue?
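As a starting point for that investigation: the borg 1.x documentation describes return codes 0 (success), 1 (warning), 2 (error) and 128+N (killed by signal N). A minimal sketch of how a wrapper could classify them; the function name `classify_borg_exit` is hypothetical and not part of this MR:

```python
def classify_borg_exit(returncode: int) -> str:
    """Map a borg exit code to a coarse status label.

    Assumes the borg 1.x convention: 0 success, 1 warning,
    2 error, 128+N killed by signal N.
    """
    if returncode == 0:
        return "success"
    if returncode == 1:
        return "warning"
    if returncode == 2:
        return "error"
    if returncode >= 128:
        return f"killed by signal {returncode - 128}"
    return "unknown"
```

Note that systemd treats any non-zero exit status as a unit failure by default, so a borg warning (1) would also show up as a failed unit unless `SuccessExitStatus=` is set in the unit. Whether that is desirable for the "check repo" job is part of what needs checking.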

I would also like to add statistics (backup size, run time, etc.) from the logs, but Loki currently has none of the borgbackup output or journal data. To investigate: why is that?

There are alerts for timers not firing and for jobs failing.
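The logic behind those two alerts can be sketched roughly as follows. This is an illustration only, not the actual alert rules in this MR: the function names, the `max_age_seconds` threshold, and the idea of feeding in a last-trigger timestamp and a unit state string are all assumptions.

```python
import time
from typing import Optional


def timer_is_stale(last_trigger_ts: float,
                   max_age_seconds: float,
                   now: Optional[float] = None) -> bool:
    """Return True if the timer has not fired within the allowed window.

    last_trigger_ts: Unix timestamp of the timer's last trigger (assumed input).
    max_age_seconds: hypothetical threshold, e.g. the timer interval plus slack.
    """
    if now is None:
        now = time.time()
    return (now - last_trigger_ts) > max_age_seconds


def unit_has_failed(active_state: str) -> bool:
    """Return True if the systemd unit's ActiveState is 'failed'."""
    return active_state == "failed"
```

For a daily backup, a threshold somewhat above 24 hours (say 26 hours) would avoid alerting on small scheduling jitter; the exact value is a judgment call.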

The dashboard looks like this (on staging, only one running ciphertext backup):

(Screenshot: 2022-08-03_backups-dashboard)

Edited by Florian Sesser
