Monitoring: Available memory should include ZFS ARC buffer cache

The "available memory" graph in Grafana currently excludes the ZFS ARC buffer cache. Since ZFS shrinks that cache under memory pressure, AFAIK it should count as available rather than "unavailable" memory.

I wanted to add that in !304 (merged), but couldn't finish the job. This issue is a reminder to myself, in case I stumble upon a way later.

I couldn't come up with a Prometheus query that adds the ARC cache to the count of available memory. Nodes without ZFS don't return a value for node_zfs_arc_size, and the Prometheus query language makes it hard for me to just use 0 when that value is missing:

Unfortunately Prometheus doesn't provide an easy ability to fill gaps with zeroes if q returns multiple time series with distinct labelsets. Other Prometheus-like solutions such as VictoriaMetrics provide default operator for this case.
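One possible workaround, as an untested sketch: PromQL's `or` operator can fill in a zero-valued series for instances that lack node_zfs_arc_size. The query below assumes that the available-memory series is node_exporter's node_memory_MemAvailable_bytes (not confirmed by this issue), that every node reports it, and that both metrics carry the same labels apart from the metric name.

```promql
# Assumption: node_memory_MemAvailable_bytes is the series behind the
# "available memory" graph and exists on every node.
node_memory_MemAvailable_bytes
  + (
      node_zfs_arc_size
      # Fallback for instances without node_zfs_arc_size: a zero-valued
      # series derived from a metric that every node reports.
      or on(instance) (node_memory_MemAvailable_bytes * 0)
    )
```

The `or` keeps every node_zfs_arc_size series and adds the zero series only for instances that have no match on the left-hand side, so nodes without ZFS should end up with their plain available memory.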

Screenshot: [image]
After staging storage001 ran its first full backup and filled the ARC cache, the "RAM used" graph permanently stopped showing what I intended it to show: memory that is committed and cannot be freed.

Edited by Florian Sesser