Automatic and intelligent observability with trace and metric insights.
With Dynatrace, you can get observability for Kafka without touching any code, thanks to automatic monitoring. Seamless end-to-end traces for connected producer and consumer clients allow you to diagnose anomalies and pinpoint the root cause of broken code before your customers are affected. Comprehensive health and performance metrics give you insight into your Kafka brokers, topics, producers, and consumers. Events point you to critical anomalies, reducing the mean time to repair.
To get log insight:
To get metric insight: Java Metric Extensions 2.0 (JMX).

With the Kafka extension, you can get additional insight into your Kafka server with metrics for brokers, topics, producers, consumers, and more. The extension also provides alerts for the most critical metrics. It creates a custom topology and entities for brokers, topics, producers, and consumers. It provides a dashboard for easier access to and configuration of the extension and its entities.
The extension gathers different metrics depending on whether it is monitoring a Kafka broker, producer, or consumer. Activate the extension on all of them to collect all metrics.
Java Metric Extensions 2.0 (JMX) activated. Select Add to environment to get started.
Below is a complete list of the feature sets provided in this version. To ensure a good fit for your needs, individual feature sets can be activated and deactivated by your administrator during configuration.
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Consumer - Requests | kafka.consumer.consumer-metrics.request-rate | The average number of requests sent per second for a node. | PerSecond |
Kafka Consumer - Request size | kafka.consumer.consumer-metrics.request-size-avg | The average size of all requests in the window. | Byte |
Kafka Consumer - Incoming byte rate | kafka.consumer.consumer-metrics.incoming-byte-rate | Bytes/second read off all sockets. | BytePerSecond |
Kafka Consumer - Outgoing byte rate | kafka.consumer.consumer-metrics.outgoing-byte-rate | The average number of outgoing bytes sent per second to all servers. | BytePerSecond |
Kafka Consumer - Request latency | kafka.consumer.consumer-metrics.request-latency-avg | The average request latency in ms for a node. | MilliSecond |
Kafka Consumer - Messages consumed rate | kafka.consumer.consumer-metrics.records-consumed-rate | The average number of records consumed per second. | PerSecond |
Kafka Consumer - Bytes consumed rate | kafka.consumer.consumer-metrics.bytes-consumed-rate | The average number of bytes consumed per second for a topic. | PerSecond |
Kafka Consumer - Fetch latency | kafka.consumer.consumer-metrics.fetch-latency-avg | The average time taken for a fetch request. | MilliSecond |
Kafka Consumer - Consumer lag | kafka.consumer.consumer-metrics.records-lag | The latest lag of the partition. | Count |
Kafka Consumer - Consumer lag average | kafka.consumer.consumer-metrics.records-lag-avg | The average lag of the partition. | Count |
Kafka Consumer - Consumer lag maximum | kafka.consumer.consumer-metrics.records-lag-max | The max lag of the partition. | Count |
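
As a rough illustration of how the three consumer lag metrics relate, lag on a partition can be understood as the distance between the partition's log end offset and the consumer's current position. The sketch below uses made-up offset samples and a hypothetical helper, `partition_lag`; neither is part of the extension or of the Kafka client API.

```python
from statistics import mean


def partition_lag(log_end_offset: int, consumer_position: int) -> int:
    """Return how many records the consumer is behind the end of the log."""
    return max(log_end_offset - consumer_position, 0)


# Hypothetical (log_end_offset, consumer_position) samples for one partition,
# ordered from oldest to newest.
samples = [(1000, 990), (1050, 1010), (1100, 1095)]
lags = [partition_lag(end, position) for end, position in samples]

summary = {
    "records-lag": lags[-1],        # latest observed lag
    "records-lag-avg": mean(lags),  # average lag over the samples
    "records-lag-max": max(lags),   # largest lag over the samples
}
print(summary)
```

A steadily growing maximum with a low latest value usually points to bursts that the consumer recovers from, whereas a growing latest value suggests the consumer is not keeping up.
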
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Server - Purgatory Produce Size | kafka.server.purgatory.produce-delay-size | Requests waiting in the produce purgatory. | Count |
Kafka Server - Purgatory Fetch Size | kafka.server.purgatory.fetch-delay-size | Requests waiting in the fetch purgatory. | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Producer - Incoming byte rate | kafka.producer.producer-metrics.incoming-byte-rate | Bytes/second read off all sockets. | BytePerSecond |
Kafka Producer - Outgoing byte rate | kafka.producer.producer-metrics.outgoing-byte-rate | The average number of outgoing bytes sent per second to all servers. | BytePerSecond |
Kafka Producer - I/O Wait time | kafka.producer.producer-metrics.io-wait-time-ns-avg | The average length of time the I/O thread spent waiting for a socket ready for reads or writes in nanoseconds. | NanoSecond |
Kafka Producer - Response rate | kafka.producer.producer-metrics.response-rate | Responses received per second. | PerSecond |
Kafka Producer - Request latency | kafka.producer.producer-metrics.request-latency-avg | The average request latency in ms. | MilliSecond |
Kafka Producer - Compression rate | kafka.producer.producer-metrics.compression-rate-avg | The average compression rate of record batches, defined as the average ratio of the compressed batch size over the uncompressed size. | PerSecond |
Kafka Producer - Request size | kafka.producer.producer-metrics.request-size-avg | The average size of all requests in the window. | Byte |
Kafka Producer - Requests | kafka.producer.producer-metrics.request-rate | The average number of requests sent per second. | PerSecond |
Kafka Producer - Byte rate | kafka.producer.producer-topic-metrics.byte-rate | The average number of bytes sent per second for a topic. | BytePerSecond |
Kafka Producer - Compression rate | kafka.producer.producer-topic-metrics.compression-rate | The average compression rate of record batches for a topic, defined as the average ratio of the compressed batch size over the uncompressed size. | BytePerSecond |
Kafka Producer - Failed Requests Rate | kafka.producer.producer-topic-metrics.record-error-rate | The average per-second number of record sends that resulted in errors for a topic. | PerSecond |
Kafka Producer - Requests Sent Rate | kafka.producer.producer-topic-metrics.record-send-rate | The average number of records sent per second for a topic. | PerSecond |
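
The compression rate metrics are defined above as the ratio of compressed batch size to uncompressed size, so a lower value means better compression. The following minimal sketch shows that arithmetic with made-up batch sizes; `compression_ratio` is a hypothetical helper for illustration only.

```python
def compression_ratio(compressed_bytes: int, uncompressed_bytes: int) -> float:
    """Ratio of compressed to uncompressed batch size (lower is better)."""
    if uncompressed_bytes <= 0:
        raise ValueError("uncompressed_bytes must be positive")
    return compressed_bytes / uncompressed_bytes


# Hypothetical (compressed, uncompressed) sizes in bytes for three batches.
batches = [(250, 1000), (400, 1000), (700, 1000)]
ratios = [compression_ratio(c, u) for c, u in batches]

# The average ratio across batches, analogous to compression-rate-avg.
average_ratio = sum(ratios) / len(ratios)
print(f"average compression ratio: {average_ratio:.2f}")
```
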
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Connector - Status | kafka.connector.status | Equals 1 if the status is running, 0 otherwise. | Count |
Kafka Connector - Task status | kafka.connector.task.status | Equals 1 if the status is running, 0 otherwise. | Count |
Kafka Connector - Task pause ratio | kafka.connector.task.pause-ratio | The fraction of time this task has spent in the pause state. | Count |
Kafka Connector - Task running ratio | kafka.connector.task.running-ratio | The fraction of time this task has spent in the running state. | Count |
Kafka Connector - Task success ratio | kafka.connector.task.offset-commit-success-percentage | The average percentage of this task's offset commit attempts that succeeded. | Percent |
Kafka Connector - Task commit time (max) | kafka.connector.task.offset-commit-max-time-ms | The maximum time in milliseconds taken by this task to commit offsets. | MilliSecond |
Kafka Connector - Task failure ratio | kafka.connector.task.offset-commit-failure-percentage | The average percentage of this task's offset commit attempts that failed. | Percent |
Kafka Connector - Task commit time (avg) | kafka.connector.task.offset-commit-avg-time-ms | The average time in milliseconds taken by this task to commit offsets. | MilliSecond |
Kafka Connector - Task batch size (max) | kafka.connector.task.batch-size-max | The maximum size of the batches processed by the connector. | Byte |
Kafka Connector - Task batch size (avg) | kafka.connector.task.batch-size-avg | The average size of the batches processed by the connector. | Byte |
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Controller - Offline partitions | kafka.controller.KafkaController.OfflinePartitionsCount | The number of partitions that don't have an active leader and are therefore not writable or readable. | Count |
Kafka Controller - Active cluster controllers | kafka.controller.KafkaController.ActiveControllerCount.Value | Indicates whether the broker is the controller broker. | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Server - Handler Pool Idle Percent Rate | kafka.server.handler.average-idle-percent.rate | The average fraction of time the request handler threads are idle. Values are between 0 meaning all resources are used and 1 meaning all resources are available. | PerSecond |
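
Because the idle metric runs from 0 (all request handler threads busy) to 1 (all threads available), it can be easier to read as a busy percentage. A minimal sketch of that conversion, using a hypothetical helper and sample values:

```python
def handler_busy_percent(idle_fraction: float) -> float:
    """Convert an idle fraction (0 = fully busy, 1 = fully idle) to percent busy."""
    if not 0.0 <= idle_fraction <= 1.0:
        raise ValueError("idle_fraction must be between 0 and 1")
    return (1.0 - idle_fraction) * 100.0


# Sample idle fractions as they might be reported by the metric.
for idle in (1.0, 0.75, 0.0):
    print(f"idle={idle:.2f} -> busy={handler_busy_percent(idle):.0f}%")
```
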
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Server - Disk Read Rate | kafka.server.disk.read-bytes | The total number of bytes read by the broker process, including reads from all disks. The total doesn't include reads from page cache. Available only on Linux-based systems. | PerSecond |
Kafka Server - Disk Write Rate | kafka.server.disk.write-bytes | The total number of bytes written by the broker process, including writes from all disks. Available only on Linux-based systems. | PerSecond |
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Broker - Incoming byte rate | kafka.server.BrokerTopicMetrics.BytesInPerSec.OneMinuteRate | The rate at which data sent from producers is consumed by the broker. | BytePerSecond |
Kafka Broker - Outgoing byte rate | kafka.server.BrokerTopicMetrics.BytesOutPerSec.OneMinuteRate | The rate at which data sent from other brokers is consumed by the follower broker. | BytePerSecond |
Kafka Broker - Messages in rate | kafka.server.BrokerTopicMetrics.MessagesInPerSec.OneMinuteRate | The rate at which individual messages are consumed by the broker. | PerSecond |
Kafka Broker - Follower fetch requests rate | kafka.server.BrokerTopicMetrics.TotalFollowerFetchRequestsPerSec.OneMinuteRate | The follower fetch request rate for the broker. | PerSecond |
Kafka Broker - Produce message conversions rate | kafka.server.BrokerTopicMetrics.ProduceMessageConversionsPerSec.OneMinuteRate | The rate at which produce messages are converted, by topic. | PerSecond |
Kafka Broker - Partitions | kafka.server.ReplicaManager.PartitionCount | The number of partitions in the broker. | Count |
Kafka Broker - Under replicated partitions | kafka.server.ReplicaManager.UnderReplicatedPartitions | The number of partitions that have not been fully replicated in the follower replicas. | Count |
Kafka Broker - Produce request rate | kafka.server.BrokerTopicMetrics.TotalProduceRequestsPerSec.OneMinuteRate | The produce request rate per second. | PerSecond |
Kafka Broker - Fetch request rate | kafka.server.BrokerTopicMetrics.TotalFetchRequestsPerSec.OneMinuteRate | The fetch request rate per second. | PerSecond |
Kafka Broker - Failed produce requests | kafka.server.BrokerTopicMetrics.FailedProduceRequestsPerSec.OneMinuteRate | The produce request rate for requests that failed. | PerSecond |
Kafka Broker - Failed fetch requests | kafka.server.BrokerTopicMetrics.FailedFetchRequestsPerSec.OneMinuteRate | The fetch request rate for requests that failed. | PerSecond |
Kafka Server - Max follower lag | kafka.server.ReplicaFetcherManager.MaxLag.Replica.Value | The maximum lag between the time that messages are received by the leader replica and by the follower replicas. | Count |
Kafka Server - Current follower lag | kafka.server.FetcherLagMetrics.ConsumerLag.Value | The lag in number of messages per follower replica. | Count |
Kafka Server - Fetch Conversions Rate | kafka.server.FetchConversionsRate.OneMinuteRate | - | Count |
Kafka Server - Produce Conversions Rate | kafka.server.ProduceConversionsRate.OneMinuteRate | - | Count |
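
To make the under-replicated partitions count more concrete: a partition is generally treated as under-replicated when fewer replicas are in sync than are assigned. The sketch below counts such partitions from hypothetical sample data; the `Partition` class and its field names are assumptions for illustration, not an API of Kafka or the extension.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Partition:
    topic: str
    index: int
    replicas: Tuple[int, ...]          # broker IDs assigned to the partition
    in_sync_replicas: Tuple[int, ...]  # broker IDs currently in sync


def is_under_replicated(partition: Partition) -> bool:
    """True when at least one assigned replica is not in sync."""
    return len(partition.in_sync_replicas) < len(partition.replicas)


# Hypothetical cluster state with one lagging replica on orders-1.
partitions = [
    Partition("orders", 0, (1, 2, 3), (1, 2, 3)),
    Partition("orders", 1, (1, 2, 3), (1, 3)),
    Partition("payments", 0, (2, 3), (2, 3)),
]

under_replicated = [p for p in partitions if is_under_replicated(p)]
print(len(under_replicated), [(p.topic, p.index) for p in under_replicated])
```
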
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Controller - Leader election rate | kafka.controller.ControllerStats.LeaderElectionRateAndTimeMs.OneMinuteRate | The broker leader election rate and latency in milliseconds. This is non-zero when there are broker failures. | MilliSecond |
Kafka Controller - Unclean election rate | kafka.controller.ControllerStats.UncleanLeaderElectionsPerSec.OneMinuteRate | The unclean broker leader election rate. Should be 0. | PerSecond |
Kafka Server - Leader count | kafka.server.ReplicaManager.LeaderCount.Value | The number of replicas for which this broker is the leader. | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Network - Produce requests per second | kafka.network.RequestMetrics.RequestsPerSec.Produce.OneMinuteRate | The total number of requests made for produce per second. | PerSecond |
Kafka Network - FetchConsumer requests per second | kafka.network.RequestMetrics.RequestsPerSec.FetchConsumer.OneMinuteRate | The total number of requests made for fetch consumer per second. | PerSecond |
Kafka Network - FetchFollower requests per second | kafka.network.RequestMetrics.RequestsPerSec.FetchFollower.OneMinuteRate | The total number of requests made for fetch follower per second. | PerSecond |
Kafka Network - Total time per Produce request | kafka.network.RequestMetrics.TotalTimeMs.Produce.Count | Total time, in milliseconds, spent processing requests, for produce. | MilliSecond |
Kafka Network - Total time per FetchConsumer request | kafka.network.RequestMetrics.TotalTimeMs.FetchConsumer.Count | Total time, in milliseconds, spent processing requests, for fetch consumer. | MilliSecond |
Kafka Network - Total time per FetchFollower request | kafka.network.RequestMetrics.TotalTimeMs.FetchFollower.Count | Total time, in milliseconds, spent processing requests, for fetch follower. | MilliSecond |
Kafka Network - Request queue size | kafka.network.RequestChannel.RequestQueueSize.Value | Size of the request queue. | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Server - ZooKeeper disconnects | kafka.server.SessionExpireListener.ZooKeeperDisconnectsPerSec.OneMinuteRate | A meter that provides the number of recent ZooKeeper client disconnects. | PerSecond |
Kafka Server - ZooKeeper expires | kafka.server.SessionExpireListener.ZooKeeperExpiresPerSec.OneMinuteRate | The number of ZooKeeper sessions that have expired. | PerSecond |
Kafka Server - Zookeeper Active Connections | kafka.server.active-connections | The number of currently open connections to the broker. | Count |
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Connect - Requests | kafka.connect.connect-metrics.request-rate | The average number of requests sent per second. | PerSecond |
Kafka Connect - Outgoing byte rate | kafka.connect.connect-metrics.outgoing-byte-rate | The average number of outgoing bytes sent per second to all servers. | BytePerSecond |
Kafka Connect - Request size | kafka.connect.connect-metrics.request-size-avg | The average size of all requests in the window. | Byte |
Kafka Connect - Incoming byte rate | kafka.connect.connect-metrics.incoming-byte-rate | Bytes/second read off all sockets. | BytePerSecond |
Metric name | Metric key | Description | Unit |
---|---|---|---|
Kafka Log - Log flush 95th percentile | kafka.log.LogFlushStats.LogFlushRateAndTimeMs.Percentile95th | Log flush rate and time in milliseconds. | MilliSecond |
Kafka Log - Log flush mean time | kafka.log.LogFlushStats.LogFlushRateAndTimeMs.Mean | Log flush rate and time in milliseconds. | MilliSecond |
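
The metric keys in the tables above can be used wherever Dynatrace accepts a metric selector. As a hedged sketch, the snippet below only builds a request URL for the Metrics API v2 query endpoint; the base URL, the token placeholder, and the helper name `build_metric_query_url` are assumptions, and no request is actually sent.

```python
from urllib.parse import urlencode


def build_metric_query_url(base_url: str, metric_key: str,
                           resolution: str = "1m",
                           time_from: str = "now-2h") -> str:
    """Build a Metrics API v2 query URL for one metric key."""
    params = {
        "metricSelector": metric_key,
        "resolution": resolution,
        "from": time_from,
    }
    return f"{base_url.rstrip('/')}/api/v2/metrics/query?{urlencode(params)}"


# Placeholder environment URL; replace it with your own environment's address.
url = build_metric_query_url(
    "https://example.invalid/e/ENVIRONMENT_ID",
    "kafka.server.ReplicaManager.UnderReplicatedPartitions",
)

# The API token is sent in the Authorization header (placeholder value here).
headers = {"Authorization": "Api-Token <YOUR_TOKEN>"}
print(url)
```

Keeping URL construction separate from the HTTP call makes the query easy to inspect before any token is involved.
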
`client-id` is no longer used to generate these entities; only the `process_group_instance` and the `host` are considered. This generates new entity IDs for all your existing consumers and producers, which will break any configuration (metric events, dashboards, etc.) that uses those individual IDs. The names of these entities will also change, as they were previously based on the `client-id`.

`client-id`, instead of per entity.

Because `client-id` is no longer used for generating consumer and producer entities, we no longer run into the entity dimension limitation that prevented new data from being ingested.