I recently noticed that the Triton Inference Server reports metrics in the Prometheus format and decided to learn more about this tool. I picked up Prometheus: Up & Running, written by Brian Brazil, one of its core developers.
Prometheus is an infrastructure and application performance monitoring tool for use in datacenters running web applications or databases. Like classic Unix tools, it has been designed to focus on monitoring and nothing else. It ships as a single static binary and reads its configuration from YAML text files. It uses a pull model rather than a push model: Prometheus scrapes the applications or systems you want to monitor at a configured interval.
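To make the pull model concrete, here is a minimal sketch in Python using only the standard library. It is not how Prometheus itself is implemented: the handler, the `demo_requests_total` metric, and the `scrape_loop` helper are hypothetical names chosen for illustration. A toy application exposes its metrics over HTTP at `/metrics`, and a toy scraper fetches that page at a fixed interval.

```python
import threading
import time
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUESTS_SEEN = 0  # hypothetical application counter, for illustration only


class MetricsHandler(BaseHTTPRequestHandler):
    """Toy scrape target: serves one metric at /metrics."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = f"demo_requests_total {REQUESTS_SEEN}\n".encode()
        self.send_response(200)
        # Content type used by the Prometheus text exposition format.
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def start_target():
    """Start the toy target on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), MetricsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Opener that ignores proxy environment variables, so localhost is hit directly.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def scrape(url):
    """One pull: the monitor asks the target for its current metrics."""
    with _OPENER.open(url, timeout=5) as resp:
        return resp.read().decode()


def scrape_loop(url, interval_seconds, rounds):
    """Pull the target a fixed number of times, sleeping between scrapes."""
    results = []
    for _ in range(rounds):
        results.append(scrape(url))
        time.sleep(interval_seconds)
    return results
```

The point of the sketch is the direction of the traffic: the application only has to answer HTTP requests, while the monitoring side decides what to scrape and how often.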
Prometheus is popular, so most well-known applications and databases already support exporting their runtime performance metrics to Prometheus. For applications that don't, Prometheus provides client libraries for many popular languages, so that you can instrument your custom application with the Prometheus API.
Exposition is how Prometheus reads the metrics from your application. Again, most well-known applications already expose their metrics this way. For custom applications, the book demonstrates the simple text format of Prometheus metrics with examples. Labels can be attached to a metric to distinguish its subtypes or to record system information. While online and offline systems can work with the pull model of Prometheus, batch systems that run at preset times cannot, because they may not be running when a scrape happens. The book describes how to support such batch systems using the Pushgateway.
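As a rough illustration of that text format and of labels, the sketch below renders one metric family by hand. The function names and the `http_requests_total` metric are hypothetical, and in practice a client library would produce this output for you; the sketch only covers the basic `# HELP` and `# TYPE` lines and labeled samples.

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline in a label value."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_metric(name, metric_type, help_text, samples):
    """Render one metric family as text.

    samples is a list of (labels_dict, value) pairs; an empty dict
    means the sample has no labels.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(
        render_metric(
            "http_requests_total",
            "counter",
            "Total HTTP requests.",
            [({"method": "GET", "path": "/"}, 3), ({"method": "POST", "path": "/"}, 1)],
        ),
        end="",
    )
```

Each distinct combination of label values becomes its own line, which is how a single metric name can be broken down by, say, HTTP method or path.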
The author makes the singular vision and focus of Prometheus very clear. For example, Prometheus does not even try to do dashboards. If you need to look at the metrics on dashboards, the book shows how to connect Grafana to Prometheus to do that. Several later chapters delve into the details of the PromQL query language, how to set up alerts, and how to deploy Prometheus in a datacenter.
The book shows the steps to try out Prometheus with various simple custom applications and also with well-known applications (like Kubernetes). The examples can be run on a single machine, which makes it convenient to try things out without provisioning cloud nodes. Since the author is also a core developer of this tool, he succeeds in giving a good mental model of its architecture, how to think about it, and what it is and is not good at.