The idea was to run two Prometheus instances in a high-availability (HA) setup with Docker. We also wanted to introduce Thanos into this setup, both to have a single data source in Grafana for all Prometheus instances and to use long-term cloud storage for Prometheus data, which Thanos also enables.

Usually, a set of HA Prometheus instances has identical configuration files (prometheus.yml): the same targets, the same labels, and so on. If one of the instances fails, a switch to the second instance can easily be made at some level, because the second instance holds virtually the same data as the first.

Thanos, among other things, can deduplicate metrics coming from HA Prometheus instances. To do that, it relies on external_labels: at least one external label must differ between the instances in the same HA set, so each instance needs at least one unique entry under external_labels in its prometheus.yml.
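
To illustrate why the replica label matters, here is a minimal Python sketch of deduplication. It is a conceptual model only, not Thanos's actual algorithm; the series layout (a dict with labels and samples) and the function name deduplicate are assumptions made for this illustration. The label name prometheus_replica matches the one used in the template further down.

```python
from collections import defaultdict

REPLICA_LABEL = "prometheus_replica"  # the external label that differs per replica


def deduplicate(series_list, replica_label=REPLICA_LABEL):
    """Merge series that differ only in the replica label.

    Each series is a dict: {"labels": {...}, "samples": {timestamp: value}}.
    This layout is hypothetical and chosen only for the illustration.
    """
    merged = defaultdict(dict)
    for series in series_list:
        # Identity of a series once the replica label is ignored.
        key = tuple(sorted(
            (name, value)
            for name, value in series["labels"].items()
            if name != replica_label
        ))
        for timestamp, value in series["samples"].items():
            # Keep the first sample seen per timestamp; later replicas only fill gaps.
            merged[key].setdefault(timestamp, value)
    return {key: dict(sorted(samples.items())) for key, samples in merged.items()}


replica_a = {
    "labels": {"__name__": "up", "job": "prometheus", "prometheus_replica": "a"},
    "samples": {0: 1.0, 30: 1.0},  # this replica missed the scrape at t=60
}
replica_b = {
    "labels": {"__name__": "up", "job": "prometheus", "prometheus_replica": "b"},
    "samples": {0: 1.0, 30: 1.0, 60: 1.0},
}
result = deduplicate([replica_a, replica_b])
```

Because the two series differ only in prometheus_replica, they collapse into a single series once that label is ignored, and the sample that replica a missed is filled in from replica b.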

One could maintain a separate, static file for each instance, but that seemed too cumbersome. It seemed more convenient to have something in the container "fill in" a configuration file template with the correct values at startup. The template would be shared among all instances and deployed with the container.

To accomplish that, we used:

  1. the prom/prometheus container as the base for our customized Prometheus container
  2. consul-template to fill in the prometheus.yml template inside the container
  3. dumb-init to run consul-template inside the container; consul-template then starts Prometheus itself once the template has been filled in

consul-template

consul-template is a tool from HashiCorp, which:

queries a Consul or Vault cluster and updates any number of specified templates on the file system. As an added bonus, it can optionally run arbitrary commands when the update process completes.

We will not integrate with Consul or Vault just yet, but consul-template is useful nonetheless.

For our case we need 2 files:

  1. Configuration file for consul-template
  2. A prometheus.yml.ctpl template file to fill in.

For the consul-template configuration file (config.hcl), we used:

exec {
  command       = "/bin/prometheus --config.file=/etc/prometheus/prometheus.yml"
  reload_signal = "SIGHUP"
  kill_signal   = "SIGTERM"
  kill_timeout  = "15s"
}

template {
  source      = "/etc/prometheus/prometheus.yml.ctpl"
  destination = "/etc/prometheus/prometheus.yml"
  perms       = 0640
}

The important parts are command, source and destination.

source is the path of the template inside the container; consul-template renders it into the actual Prometheus configuration file at the destination path.

command is the command that consul-template runs once the configuration file has been generated.

dumb-init

dumb-init is a simple process supervisor and minimal init system. It runs as PID 1 in the container, forwards signals to its child process, and reaps orphaned processes, which lets us chain consul-template and Prometheus inside the same container. We use dumb-init as the ENTRYPOINT of our Prometheus container.

Besides the project description on GitHub, I also found Yelp's Engineering blog post on dumb-init useful to understand its role in this case.

prometheus.yml template

This is the initial template file (prometheus.yml.ctpl) we prepared for consul-template to fill in:

global:
  scrape_interval:     30s
  evaluation_interval: 30s
  external_labels:
    prometheus_replica: {{ or (env "PROMETHEUS_REPLICA") "none" }}

scrape_configs:
- job_name: prometheus
  honor_labels: true
  static_configs:
  - targets:
    - localhost:9090
  relabel_configs:
  - target_label: instance
    replacement: {{ or (env "PROMETHEUS_INSTANCE") "local" }}

With consul-template, we replace two parameters with values taken from specific environment variables defined on the Docker host. These values could come from other sources as well, e.g., Vault or Consul, as mentioned earlier. The template also sets default values in case the environment variables are not set for some reason, so that we do not end up with an invalid Prometheus configuration file.
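
The fallback behavior of the `or (env "...") "default"` expressions can be sketched in Python as follows. This is only an illustration of the semantics, not how consul-template is implemented; the helper names env_or_default and render_external_labels are hypothetical. It assumes, as with Go templates, that an unset variable reads as an empty string and that or returns its first non-empty argument.

```python
import os


def env_or_default(name, default, environ=None):
    """Mimic `or (env "NAME") "default"`: unset or empty variables yield the default."""
    environ = os.environ if environ is None else environ
    return environ.get(name, "") or default


def render_external_labels(environ=None):
    """Render only the external_labels fragment of the template (hypothetical helper)."""
    replica = env_or_default("PROMETHEUS_REPLICA", "none", environ)
    return "external_labels:\n  prometheus_replica: " + replica + "\n"


# With the variable set, its value is used; without it, the default applies.
with_value = render_external_labels({"PROMETHEUS_REPLICA": "a"})
without_value = render_external_labels({})
```

Either way the rendered fragment stays valid YAML, which is the point of providing defaults in the template.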

These environment variables are set on the Docker hosts that run the Prometheus containers and are passed to the containers at startup, either via the --env-file parameter of docker run or via the env_file option in a docker-compose file.
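
As a rough sketch of what this could look like per host, the Python snippet below builds the contents of an env file and the matching docker run arguments. The replica and instance values, the env file path, and the image name my-prometheus:latest are all hypothetical placeholders; only the variable names and the --env-file flag come from the setup described here.

```python
def env_file_text(replica, instance):
    """Contents of a Docker env file: one VAR=value pair per line."""
    return (
        f"PROMETHEUS_REPLICA={replica}\n"
        f"PROMETHEUS_INSTANCE={instance}\n"
    )


def docker_run_args(env_file, image):
    """Argument list for starting the container with the env file applied."""
    return ["docker", "run", "--detach", "--env-file", env_file, image]


# Hypothetical values for the first of the two HA hosts.
text = env_file_text("a", "prometheus-a")
args = docker_run_args("/etc/prometheus.env", "my-prometheus:latest")
print(text, end="")
print(" ".join(args))
```

The second host would use the same image and template, with only the values in its env file changed.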

The rest of the file is quite basic: Prometheus is configured to scrape itself for built-in metrics and to overwrite the instance label through relabeling.

This template, along with the consul-template configuration and binary, has to be inside the container, so we need a new Dockerfile and Docker image.

Dockerfile

FROM hashicorp/consul-template:0.20.0-scratch as consul-template
FROM prom/prometheus:v2.7.1

# Get dumb-init 
RUN wget --no-check-certificate -O /home/dumb-init \
    https://github.com/Yelp/dumb-init/releases/download/v1.2.2/dumb-init_1.2.2_amd64 && \
    chmod +x /home/dumb-init

# Copy consul-template from consul-template image
COPY --from=consul-template /consul-template /bin/consul-template

# Copy consul-template configuration
COPY ./config/consul-template/config.hcl /consul-template/config.hcl

# Copy Prometheus configuration file template
COPY ./config/prometheus/prometheus.yml.ctpl /etc/prometheus/prometheus.yml.ctpl

ENTRYPOINT ["/home/dumb-init", "--"]

CMD ["/bin/consul-template", "-config=/consul-template/config.hcl"]

The important part here is the ENTRYPOINT/CMD pair: we configure dumb-init to run consul-template as the container starts up. Based on the configuration we provided in config.hcl, consul-template then renders the prometheus.yml.ctpl template into prometheus.yml. Lastly, it starts Prometheus itself, as configured in config.hcl.

Summary

This setup enables external_labels to be set dynamically in the Prometheus configuration files of HA Prometheus instances running in Docker containers. Thanos can then use these labels to deduplicate metrics, which is crucial in HA setups like this one.