- Apr 21, 2023
-
-
Florian Sesser authored
-
- Mar 13, 2023
-
-
Florian Sesser authored
Recently this alarm fired a couple of times when the backup self-check ran. Set it to alert only when load is over 1 for longer than 2 hours.
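As a rough illustration of "over 1 for longer than 2 hours", the sketch below checks whether the load has stayed above a threshold continuously for a hold period. It is not the actual alert rule from this repository; the function name `fires` and the sample layout are assumptions made for the example.

```python
from datetime import datetime, timedelta

def fires(samples, threshold=1.0, hold=timedelta(hours=2)):
    """Return True if the value has stayed above `threshold` for at least `hold`.

    `samples` is a time-ordered list of (timestamp, value) pairs (hypothetical layout).
    """
    streak_start = None
    for ts, value in samples:
        if value > threshold:
            # Remember when the current uninterrupted streak began.
            if streak_start is None:
                streak_start = ts
        else:
            # Any sample at or below the threshold resets the streak.
            streak_start = None
    if streak_start is None:
        return False
    return samples[-1][0] - streak_start >= hold

start = datetime(2023, 3, 13, 0, 0)
# A 30-minute spike, e.g. during a backup self-check: should not alert.
spike = [(start + timedelta(minutes=10 * i), 1.5) for i in range(4)]
# Two full hours above the threshold: should alert.
sustained = [(start + timedelta(minutes=10 * i), 1.5) for i in range(13)]
```

With these inputs, `fires(spike)` is false and `fires(sustained)` is true, which is the behaviour the commit message describes.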
-
- Feb 06, 2023
-
-
Florian Sesser authored
-
- Nov 09, 2022
-
-
Florian Sesser authored
-
- Oct 12, 2022
-
-
Florian Sesser authored
Refs #129
-
- Sep 13, 2022
-
-
Florian Sesser authored
-
Florian Sesser authored
-
Florian Sesser authored
-
Florian Sesser authored
This comes from nixpkgs commit 81291cc793cf88bd6eff3fd8512e5eb9d037066c and will be included with NixOS 22.11.
-
Florian Sesser authored
-
- Sep 12, 2022
-
-
Florian Sesser authored
... and use a smarter Prometheus query to combine the two.
-
- Sep 10, 2022
-
-
Florian Sesser authored
-
- Sep 08, 2022
-
-
Florian Sesser authored
This should implement my actual intentions: Alert when backups run for longer than 3h, and the repo check for more than 6h.
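A minimal sketch of the two limits described here, assuming each job reports how long it has been active. The job names `backup` and `check-repo` and the helper `runs_too_long` are hypothetical and only serve the illustration.

```python
from datetime import timedelta

# Per-job duration limits taken from the commit message; the keys are hypothetical names.
LIMITS = {
    "backup": timedelta(hours=3),
    "check-repo": timedelta(hours=6),
}

def runs_too_long(job, active_for):
    """Return True if `job` has been active for longer than its limit."""
    return active_for > LIMITS[job]
```

For example, a repo check that has been active for four hours stays quiet, while a backup that has been active for the same time would alert.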
-
Florian Sesser authored
-
Florian Sesser authored
Fixes privatestorageops#287
-
Florian Sesser authored
Forgot to add a second alert when I added the workaround to not count the ZFS ARC into used memory :/
-
Florian Sesser authored
... which are both governed by our retention policy.
-
- Sep 07, 2022
-
-
Florian Sesser authored
-
-
- Sep 06, 2022
-
-
Florian Sesser authored
-
Florian Sesser authored
-
- Sep 05, 2022
-
-
Florian Sesser authored
-
Florian Sesser authored
-
- Aug 31, 2022
-
-
Florian Sesser authored
This is still a bit buggy in our version of Grafana, but it is already nice to look at and maybe useful. Refs privatestorageops#429
-
Florian Sesser authored
This adds alerting to the backup job duration graph: Grafana alerting works with systemd unit metrics, i.e. a backup job unit being "active" for too long. Use that fact for alerting on long-running backup jobs.
-
Florian Sesser authored
... instead of the default connected lines, and with a working label for the host. Refs privatestorageops#429
-
Florian Sesser authored
Refs privatestorageops#429
-
- Aug 29, 2022
-
-
Florian Sesser authored
Failed backups now have a filled red area instead of a thin yellow line. Refs privatestorageops#429.
-
- Aug 17, 2022
-
-
Florian Sesser authored
, a dashboard that "displays a lot of data about one single host". This is
-
Florian Sesser authored
-
Florian Sesser authored
One query for hosts with ZFS and one for those without.
-
- Aug 16, 2022
-
-
Florian Sesser authored
Since ZoL frees ARC under memory pressure, let's not count it as "used" but instead as "free" memory.
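The arithmetic behind this change can be sketched as follows, assuming the inputs come from node_exporter-style metrics such as `node_memory_MemTotal_bytes`, `node_memory_MemAvailable_bytes` and an ARC size metric. The exact metric names used in the dashboard are an assumption, and `used_fraction` is a hypothetical helper.

```python
GIB = 2**30

def used_fraction(total_bytes, available_bytes, arc_bytes=0):
    """Fraction of memory counted as "used", treating the ZFS ARC as free.

    Hosts without ZFS pass no `arc_bytes`, so the default of 0 applies.
    """
    used = total_bytes - available_bytes - arc_bytes
    return used / total_bytes

# Host with ZFS: 16 GiB total, 4 GiB available, 8 GiB held by the ARC.
with_zfs = used_fraction(16 * GIB, 4 * GIB, 8 * GIB)
# Same host if the ARC were counted as used memory.
without_arc_correction = used_fraction(16 * GIB, 4 * GIB)
```

In this example the host looks 75% full when the ARC is counted as used, and 25% full once the ARC is treated as free.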
-
- Aug 03, 2022
-
-
Florian Sesser authored
-
- Jul 11, 2022
-
-
Florian Sesser authored
-
- Jun 13, 2022
-
-
Florian Sesser authored
According to some Prometheus documentation, node_memory_MemAvailable_bytes is a better estimator than the sum I used before. It yields almost the same value, but reads nicer.
-
- Apr 29, 2022
-
-
Florian Sesser authored
-
- Apr 13, 2022
-
-
Florian Sesser authored
This is only semi-correct. We keep logs up to what's specified in the privacy policy. Currently, that is implemented as keeping them for up to 30 days on the individual nodes as well as on the central Loki server.
-
Florian Sesser authored
-
Florian Sesser authored
-
Florian Sesser authored
-