Configure Alertmanager

Calico Cloud uses Alertmanager to route alerts from Prometheus to administrators. It handles routing, deduplication, grouping, silencing, and inhibition of alerts.

More detailed information about Alertmanager is available in the upstream documentation.

Updating the Alertmanager config

  • Save the current Alertmanager secret, usually named alertmanager-<your-alertmanager-name>. The Calico Cloud manifests create a secret called alertmanager-calico-node-alertmanager.

    kubectl -n tigera-operator get secrets alertmanager-calico-node-alertmanager -o yaml > alertmanager-secret.yaml
  • The current alertmanager.yaml file is encoded and stored inside the alertmanager.yaml key under the data field. You can decode it by copying the value of alertmanager.yaml and using the base64 command.

    echo "<whatever-you-copied>" | base64 --decode > alertmanager-config.yaml
  • Make the necessary changes to alertmanager-config.yaml, then re-encode the file so that it can be saved back to alertmanager-secret.yaml. On Linux:

    cat alertmanager-config.yaml | base64 -w 0
  • Paste the output of the command above into alertmanager-secret.yaml, replacing the existing value of the alertmanager.yaml key under the data field. Then apply the updated manifest.

    kubectl -n tigera-operator apply -f alertmanager-secret.yaml
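
The decode and re-encode steps above can also be scripted. The following is a minimal sketch, assuming a shell with GNU coreutils and a kubectl context that points at the cluster. The function names (fetch_encoded_config, decode_config, encode_config) are hypothetical helpers introduced for illustration, not part of Calico Cloud or kubectl. It reads the alertmanager.yaml key with a JSONPath expression instead of copying the value by hand.

```shell
#!/usr/bin/env bash
# Namespace and secret name as used in the steps above; adjust for your cluster.
NAMESPACE="tigera-operator"
SECRET="alertmanager-calico-node-alertmanager"

# Print the base64-encoded value of the alertmanager.yaml key.
# The backslash escapes the dot in the key name for kubectl's JSONPath.
fetch_encoded_config() {
    kubectl -n "$NAMESPACE" get secret "$SECRET" \
        -o jsonpath='{.data.alertmanager\.yaml}'
}

# Decode a base64 string ($1) into a file ($2).
decode_config() {
    printf '%s' "$1" | base64 --decode > "$2"
}

# Print a file ($1) as a single unwrapped base64 line (GNU base64, Linux).
encode_config() {
    base64 -w 0 < "$1"
}
```

With these helpers, calling decode_config with the output of fetch_encoded_config and the target file name alertmanager-config.yaml writes the decoded file, and encode_config alertmanager-config.yaml prints the value to paste back into alertmanager-secret.yaml.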

The config-reloader container inside the Alertmanager pod (launched by the prometheus-operator and usually named alertmanager-<your-alertmanager-instance-name>) should apply your changes within a few seconds.

For more advice on writing Alertmanager configuration files, see the Alertmanager configuration documentation.
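
Before re-encoding an edited file, it can help to check it for syntax errors. The following is a minimal sketch, assuming the amtool binary that is distributed with Alertmanager is installed locally (nothing in this guide requires it). validate_config is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Validate an Alertmanager config file ($1) with amtool, if it is installed.
# Returns 0 when the file passes, 1 when it fails, 2 when amtool is missing.
validate_config() {
    if ! command -v amtool >/dev/null 2>&1; then
        echo "amtool not found; skipping validation" >&2
        return 2
    fi
    if amtool check-config "$1"; then
        return 0
    fi
    return 1
}
```

Running validate_config alertmanager-config.yaml before the base64 step catches malformed YAML and unknown fields early, rather than after the secret has been applied.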

Configure Inhibition Rules

Alertmanager has a feature to suppress certain notifications according to defined rules. A typical use case for defining inhibit rules is to suppress notifications from a lower priority alert when one with a higher priority is firing. These inhibition rules are defined in the alertmanager configuration file. You can define one by adding this configuration snippet to your alertmanager.yaml.

inhibit_rules:
- source_match:
    severity: 'critical'
  target_match:
    severity: 'info'
  # Apply inhibition for alerts generated by the same alerting rule
  # and on the same node.
  equal: ['alertname', 'instance']

Configure Grouping of Alerts

Alertmanager can also group alerts based on labels and control how often an alert is resent. In the case of Denied Packet metrics, simply defining a Prometheus alerting rule would mean that you get a page (if so defined in your Alertmanager configuration) for every policy on every node for every source IP. Grouping combines all of these alerts into a single notification. The Alertmanager configuration file provided with Calico Cloud groups alerts on a per-node basis by default. To group all alerts with the same name instead, edit (and apply) the Alertmanager configuration file like so:

global:
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 30s
  group_interval: 1m
  repeat_interval: 5m
  receiver: 'webhook'
receivers:
- name: 'webhook'
  webhook_configs:
  - url: 'http://calico-alertmanager-webhook:30501/'

More information, including descriptions of the various options, can be found under the route section of the Alertmanager Configuration guide.