Calls to `curl` now use the `--fail` option, in addition to `--silent`
and `--show-error`, so that server-side and client-side HTTP errors make
the call fail instead of passing unnoticed.
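
As an illustration of how the three options combine, here is a minimal
sketch. The wrapper name `fetch` is hypothetical and does not appear in
this change; only the `curl` options themselves are taken from it.

```shell
# Hypothetical helper (name assumed for illustration only).
fetch() {
    # --fail:       exit non-zero (code 22) on HTTP responses >= 400
    # --silent:     suppress the progress meter
    # --show-error: still print an error message on failure, despite --silent
    curl --fail --silent --show-error "$@"
}
```

A caller can then rely on the exit status, for example
`fetch "$url" || echo "request failed" >&2`, where `$url` stands for
whatever endpoint is being queried.
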
Containers for Prometheus and Grafana can take a long time to start, for
example while running migrations on large databases, which in turn can
cause systemd to kill them mid-execution.
This includes a complete overhaul of the Containerfile to move to a
self-contained build based on Bookworm, a switch to Quadlet, and the
inclusion of a node exporter for node metrics with a default Grafana
dashboard.
This is because the former does not behave exactly as its documentation
describes.
Also, we decrease the default scrape interval for Prometheus from 1m to
30s to improve the granularity of the collected data.
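
For reference, the scrape interval is set through the `scrape_interval`
key in the `global` section of the Prometheus configuration. The sketch
below prints such a fragment; the function name is hypothetical, and the
layout of the actual configuration file in this repository is not shown
here.

```shell
# Hypothetical helper: prints a minimal Prometheus configuration fragment
# with the scrape interval lowered from 1m to 30s.
prometheus_global_config() {
    cat <<'EOF'
global:
  scrape_interval: 30s  # previously 1m
EOF
}
```

The output could be redirected to a file and checked with
`promtool check config <file>` before use.
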
This commit adds two services, `grafana` and `prometheus`, and sets up
some existing services (`dovecot` and `prosody`) to expose metrics for
display in Grafana. In addition, systemd services have been added to
register service metrics with Prometheus and to automatically provision
Grafana dashboards from static JSON representations.
This work will continue to evolve as more services gain proper Grafana
dashboards and as Loki is integrated for access to the systemd journal.