Receive Prometheus Alertmanager events and open problems in Dynatrace
In Dynatrace, create a monitoring configuration for this extension. The extension should run on a single ActiveGate; Prometheus Alertmanager will later be configured to send events to that specific ActiveGate.
The parameters are:
In Alertmanager, create a webhook config whose url field points to the Dynatrace ActiveGate, for example: http://<my.ag.ip.address>:9393/webhook.
Example receiver configuration:
receivers:
  - name: 'dynatrace'
    webhook_configs:
      - url: 'http://100.94.253.97:9393/webhook'
The endpoint always ends in /webhook. Note that problems will only be created for alerts whose severity label matches one of the severities configured in Dynatrace.
All other alerting options and routes are to be configured in Alertmanager.
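For illustration, a minimal Alertmanager routing tree that forwards only warning and critical alerts to the dynatrace receiver could look like the sketch below. This is not part of the extension and makes a few assumptions: it uses the matcher syntax introduced in Alertmanager v0.22, and the default receiver name and severity values are placeholders to adapt to your own setup.

route:
  receiver: 'default'                      # placeholder: your existing catch-all receiver
  group_by: ['alertname', 'namespace']
  routes:
    - receiver: 'dynatrace'                # the webhook receiver defined above
      matchers:
        - severity =~ "warning|critical"   # forward only severities that Dynatrace is configured to accept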
You can troubleshoot the integration by sending a POST request directly to the ActiveGate:
curl -H 'Content-Type: application/json' \
  -d '{"receiver": "dynatrace", "status": "firing", "alerts": [{"status": "firing", "labels": {"alertname": "TargetDown", "namespace": "kube-system", "service": "kubelet", "severity": "warning"}, "annotations": {"message": "11.11% of the kubelet/kubelet targets in kube-system"}, "startsAt": "2021-03-19T01:35:45.72Z", "endsAt": "0001-01-01T00:00:00Z", "generatorURL": "http://openshift.com", "fingerprint": "e425bb91067b6c9e"}], "groupKey": "{}:{alertname=\"Test Alert\", cluster=\"Cluster 02\", service=\"kubelet\"}", "groupLabels": {"alertname": "Test Alert", "cluster": "Cluster 02", "service": "Service 02"}, "commonLabels": {"alertname": "Test Alert", "cluster": "Cluster 02", "service": "Service 02"}, "commonAnnotations": {"annotation_01": "annotation 01", "annotation_02": "annotation 03"}, "externalURL": "http://8598cebf58a1:9093"}' \
  http://<ACTIVEGATE_IP_ADDRESS>:9393/webhook
Version 2.1.1 uses the Events v2 API, which allows problems to be closed gracefully instead of manually.
This means that the API token must now have the Ingest events scope.
Problems closed this way can be reopened by Alertmanager, which fixes an issue where a problem resent with the same body was suppressed even though the alert was still active in Alertmanager.
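If you need a token with that scope, one way is the Dynatrace API tokens API. The command below is only a sketch: the environment URL, token name, and the prerequisite management token are placeholders for your own setup.

# Hypothetical example: create an API token limited to the Ingest events (events.ingest) scope
curl -X POST https://<YOUR_ENVIRONMENT>/api/v2/apiTokens \
  -H 'Authorization: Api-Token <TOKEN_WITH_TOKEN_MANAGEMENT_SCOPE>' \
  -H 'Content-Type: application/json' \
  -d '{"name": "alertmanager-webhook-extension", "scopes": ["events.ingest"]}'

To check that problems close gracefully, you can follow the firing test request above with a resolved notification for the same alert. This is a sketch based on the standard Alertmanager webhook payload; the fingerprint, labels, and timestamps are illustrative and should match the firing payload you sent earlier.

# Send a resolved notification for the previously fired TargetDown test alert
curl -H 'Content-Type: application/json' \
  -d '{"receiver": "dynatrace", "status": "resolved", "alerts": [{"status": "resolved", "labels": {"alertname": "TargetDown", "namespace": "kube-system", "service": "kubelet", "severity": "warning"}, "annotations": {"message": "11.11% of the kubelet/kubelet targets in kube-system"}, "startsAt": "2021-03-19T01:35:45.72Z", "endsAt": "2021-03-19T02:05:45.72Z", "generatorURL": "http://openshift.com", "fingerprint": "e425bb91067b6c9e"}], "groupKey": "{}:{alertname=\"TargetDown\"}", "groupLabels": {"alertname": "TargetDown"}, "commonLabels": {"alertname": "TargetDown", "severity": "warning"}, "commonAnnotations": {}, "externalURL": "http://8598cebf58a1:9093"}' \
  http://<ACTIVEGATE_IP_ADDRESS>:9393/webhook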