- Aug 16, 2022
-
-
Florian Sesser authored
Since ZoL frees ARC under memory pressure, let's not count it as "used" but instead as "free" memory.
-
- Aug 03, 2022
-
-
Florian Sesser authored
-
- Jul 11, 2022
-
-
Florian Sesser authored
-
- Jun 13, 2022
-
-
Florian Sesser authored
According to some Prometheus documentation, node_memory_MemAvailable_bytes is a better estimator than the sum I used before. It also yields almost the same value, but reads nicer.
-
- Apr 29, 2022
-
-
Florian Sesser authored
-
- Apr 13, 2022
-
-
Florian Sesser authored
-
- Mar 14, 2022
-
-
Florian Sesser authored
This should fix the current alerts for our RAID arrays. It's only "should" because I can't test it properly without said RAID arrays on the dev or staging machines.
-
- Feb 25, 2022
-
-
Florian Sesser authored
The newer "Time Series" panel does not support two axes.
-
- Feb 22, 2022
-
-
Florian Sesser authored
-
Florian Sesser authored
-
Florian Sesser authored
- Also alert on "negative" (== receiving) errors
- Use 'rate' so we can get out of the reporting if the situation normalizes
-
- Feb 17, 2022
-
-
Florian Sesser authored
-
Florian Sesser authored
Grafana 8 recommends these for added performance and capabilities.
-
- Feb 14, 2022
-
-
Florian Sesser authored
-
Florian Sesser authored
-
Florian Sesser authored
-
Florian Sesser authored
- (tryfix) Switch to Loki datasource
- Filter requests to metrics instead of all GET
- Show the last week by default
- Show log times
Interesting, and by the looks of it not quickly fixable: the GET ... line comes *after* the result. We might be logging this wrong to begin with, but probably this is some next-gen AI line-ordering intelligence.
-
- Feb 11, 2022
-
-
Florian Sesser authored
Or rather, let it listen only on localhost. I thought Grafana or Promtail needed it, but I don't remember clearly and the web doesn't say clearly either. On my local dev stack, Promtail/Loki seems to still work just fine without gRPC on the network.
-
Florian Sesser authored
-
- Feb 04, 2022
-
-
Florian Sesser authored
Besides migrating all charts to the faster and more capable Grafana 8 TimeSeries charting tool, this mainly introduces two panels to view the logs of PaymentServer. They have some naive and minimal filtering in place that hides lines caused by the metrics gathering itself or that I simply deemed to be OK (like "GET" requests), so we hopefully catch those dreaded "Unexpected Exception" lines. Fixes privatestorageops#207.
-
- Feb 02, 2022
-
-
Florian Sesser authored
-
Florian Sesser authored
This is my latest version of this, updated to work with the packages in NixOS 21.05.
-
- Jan 18, 2022
-
-
Florian Sesser authored
Thanks to @jcalderone for the suggestion!
-
- Jan 13, 2022
-
-
Florian Sesser authored
Fixes privatestorageops#408
-
- Jan 07, 2022
-
-
Tom Prince authored
Also strip that domain component from the labels collected.
-
- Nov 12, 2021
-
-
Florian Sesser authored
This adds a detailed dashboard for tahoe-lafs running on one node.
-
- Nov 03, 2021
-
-
Florian Sesser authored
-
Florian Sesser authored
-
Florian Sesser authored
Copy everything from how the issuer does it.
-
Florian Sesser authored
-
Florian Sesser authored
literalExample is deprecated and overused. See: https://whetstone.privatestorage.io/privatestorage/PrivateStorageio/-/merge_requests/201#note_16966
-
Florian Sesser authored
Thanks for the suggestion @jcalderone!
-
- Oct 25, 2021
-
-
Florian Sesser authored
-
- Oct 21, 2021
-
-
Florian Sesser authored
-
Florian Sesser authored
-
- Oct 15, 2021
-
-
Florian Sesser authored
Response time, probe fails and TLS expiry incl. alerts.
-
Florian Sesser authored
The Blackbox exporter can be used to check whether some services answer; we'll use it for our HTTPS endpoints. Especially handy is its check for TLS cert expiry.
-
- Oct 14, 2021
-
-
Florian Sesser authored
This change reflects my intention of having an alert fire when a node is not scrapable for five minutes or longer. I hadn't understood the default avg() function well enough before, which resulted in an alert that would fire too soon (i.e. any time a node was down, even for only one minute during a reboot).
-
- Oct 10, 2021
-
-
Florian Sesser authored
Fixes privatestorageops#358.
-
Florian Sesser authored
-