Loki/Grafana doesn't show borg-check job logs
https://privatestorage.slack.com/archives/C02FLDJ4T1C/p1676158758808339
monitoring.private.storage
[Alerting] Monthly check-repo run time alert
A borg check-repo job ran for more than five hours. After six hours it could collide with the daily backup job, depending on that job's "random" delay. If the backup set is large and this is expected to happen again, consider using borgbackup partial checks (--max-duration SECONDS parameter).
storage001
18352.934408903
Grafana v8.4.7
Flo
The borg check-repo job on storage001 ran for an exceptionally long time - over 5 hours instead of the usual ~1 hour.
I had set an alarm for jobs running longer than 5 hours because after 6 hours they could, in theory, collide with the daily backup job from the same machine.
For now I'd do nothing and see if this happens again - might well be something spurious (the network weather in LA maybe?).
I tried to find details in the collected journald logs via Grafana/Loki, but couldn't. I don't know whether there is nothing there (why?) or whether it just isn't displayed (Grafana is really buggy when querying Loki). Log aggregation not working well is an issue waiting to blow up, so I created a ticket for it.
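For reference, a rough sanity check of the alert value against the two thresholds mentioned above. This is only a sketch: it assumes the 18352.93 reported by the alert is the job run time in seconds, and the helper names and the repository path are hypothetical. The last helper just builds the argument list for a time-limited check. As far as I know, borg only accepts --max-duration together with --repository-only, and a partial check verifies less than a full one.

```python
# Thresholds taken from the alert text: alarm after 5 h, possible collision after 6 h.
ALERT_THRESHOLD_S = 5 * 3600
COLLISION_THRESHOLD_S = 6 * 3600


def summarize_runtime(seconds):
    """Compare a check-repo run time (assumed to be in seconds) with both thresholds."""
    return {
        "hours": round(seconds / 3600, 2),
        "alert": seconds > ALERT_THRESHOLD_S,
        "may_collide": seconds > COLLISION_THRESHOLD_S,
        # Seconds left before the run would reach the 6 h mark.
        "headroom_s": round(COLLISION_THRESHOLD_S - seconds),
    }


def partial_check_command(repo, max_duration_s):
    """Build (but do not run) a borg partial check limited to max_duration_s seconds."""
    return [
        "borg", "check",
        "--repository-only",                    # assumed to be required with --max-duration
        "--max-duration", str(max_duration_s),
        repo,                                   # hypothetical repository location
    ]


summary = summarize_runtime(18352.934408903)
print(summary)
print(" ".join(partial_check_command("/path/to/repo", 4 * 3600)))
```

With these assumptions the run was about 5.1 hours, so it tripped the alarm but stayed under the 6 hour mark.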