
Demo: Transform OpenTelemetry data into actionable insights with the Dynatrace Distributed Tracing app

The Dynatrace Distributed Tracing app redefines how teams work with OpenTelemetry data. By combining OTel's comprehensive data collection with the Dynatrace platform, you gain unparalleled visibility into your application's behavior. The user-friendly interface simplifies the complexity of distributed traces, allowing you to pinpoint and resolve performance issues quickly. With out-of-the-box contextual analysis and the flexibility to dive deep into your data, Dynatrace empowers you to maximize the value of your OpenTelemetry implementation.

In a recent blog post, we announced the new Distributed Tracing app and demonstrated how it provides effortless trace insights. In this blog post, we’ll walk you through a hands-on demo that showcases how the Distributed Tracing app transforms raw OpenTelemetry data into actionable insights.

Set up the demo

To run this demo yourself, you’ll need the following:

  • A Dynatrace tenant. If you don’t have one, you can use a trial account.
  • A Dynatrace API token with the following permissions:
    • Ingest OpenTelemetry traces (openTelemetryTrace.ingest)
    • Ingest metrics (metrics.ingest)
    • Ingest logs (logs.ingest)
    To set up the token, see Dynatrace API – Tokens and authentication in the Dynatrace documentation.
  • A Kubernetes cluster (we recommend kind; see the example command after this list)
  • Helm, to install the demo on your Kubernetes cluster.
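
If you opt for kind and already have it installed locally, creating a cluster for the demo is a single command (the cluster name is just an example):

kind create cluster --name otel-demo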

Once your Kubernetes cluster is up and running, the first step is to create a secret containing the Dynatrace API token and the OTLP endpoint of your tenant. The OpenTelemetry collector will use these to send data to your Dynatrace tenant. Create the secret using the following commands:

API_TOKEN="<your API token>"
DT_ENDPOINT=https://<your-tenant-id>.dynatrace.com/api/v2/otlp

kubectl create secret generic dynatrace --from-literal=API_TOKEN=${API_TOKEN} --from-literal=DT_ENDPOINT=${DT_ENDPOINT}

After successfully creating the secret, you can install the OpenTelemetry demo application using Helm. First, download the Helm values file from the Dynatrace snippets repo on GitHub.

This file configures the collector to send data to Dynatrace using the API token in the secret you created earlier. Then, use the following commands to install the demo application on your cluster:

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm install my-otel-demo open-telemetry/opentelemetry-demo --values otel-demo-helm-values.yaml

After running the helm install command, wait until all pods of the demo application are up and running; once they are, the OpenTelemetry collector starts sending data to your Dynatrace tenant.
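
To check on the rollout, you can watch the pods come up; this assumes the demo was installed into your current namespace:

kubectl get pods --watch
# all pods of the demo application should eventually reach the Running state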

Install the dashboard

In your Dynatrace tenant, navigate to Dashboards.

On the Dashboards page, you can import a dashboard configuration from a JSON file using the Upload button. To install the OpenTelemetry Demo application dashboard, download the JSON file here and upload it.

Once the dashboard is imported, you’ll see several charts representing the application’s overall health.

The Service Level Monitoring section contains the following charts:

  • Top Spans: An overview of the most frequent spans ingested into Dynatrace.
  • Response Time Per Service: An overview of the response times for each service within the demo application.
  • Error Rate per Span: An overview of how many errored spans are generated per service.
  • Failed Spans over Time: A time series of how each service’s error rate increases/decreases over time.
  • P95 Response time over Time: A time series of how each service’s response time develops.
  • Errored Spans with Logs: A table that joins errored spans with related log entries.

These charts give you a quick overview of the application’s health, allowing you to identify any services that aren’t currently behaving as expected. In combination with the time series charts, this helps you determine the point in time at which a service started to cause problems.

In addition to service-level monitoring, certain services within the OpenTelemetry demo application expose process-level metrics, such as CPU and memory consumption, number of threads, or heap size for services written in different languages.

Note that the developers of the respective services need to make these metrics available, for example, by exposing a Prometheus endpoint that the OpenTelemetry collector can scrape and forward to your Dynatrace tenant. Once the data is available in Dynatrace, DQL makes it easy to retrieve and visualize it on a dashboard.
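
As an illustration, a chart of JVM memory usage per pod could be built with a DQL query along the following lines; the metric key and the k8s.pod.name dimension are assumptions based on OpenTelemetry semantic conventions and may differ from what your collector actually ingests:

// metric key and dimension are assumptions; adjust to the metrics ingested in your tenant
timeseries memory_used = avg(process.runtime.jvm.memory.usage), by: { k8s.pod.name }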

Troubleshoot problems using the dashboard

Now, we’ll see how the dashboard can help you spot problems and find their root cause. For this purpose, we’ll use the built-in failure scenarios included in the OpenTelemetry demo. To enable a failure scenario, update the my-otel-demo-flagd-config ConfigMap, which contains the application’s feature flags: change the defaultVariant of the productCatalogFailure flag from off to on. After a couple of minutes, the effects of this change become noticeable in the service-level metrics as the number of failed spans starts to increase:
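
One way to flip the flag is to edit the ConfigMap directly; this is a minimal sketch, assuming the demo runs in your current namespace:

kubectl edit configmap my-otel-demo-flagd-config
# in the embedded flagd configuration, locate the productCatalogFailure flag
# and change its defaultVariant from "off" to "on", then save and exit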

Also, in the Errored Spans with Logs table, you’ll notice a lot of entries that seem to be related to the retrieval of products, as indicated by the related log messages. Since all requests from the load generator go through the frontend service, most logs related to failed spans are generated there. To pinpoint exactly where those requests are failing, use the trace.id field included in each table entry. Select a value in this column to open the related distributed trace in the Dynatrace web UI.

Within the Distributed traces view, you get an overview of which services are involved in the errored trace and which of the child spans of the trace caused errors.

Here, notice that the error seems to be caused by the product service, particularly instances of the GetProduct call. Select the failed span to go to a detailed overview of the failed GetProduct request, including all attributes attached to the span, as well as a status description.

Here, you see that the status message indicates that the failures are related to the feature flag we changed earlier. However, only some GetProduct spans are failing, not all of them. Therefore, we need to investigate further by adding a specialized tile to our dashboard to evaluate whether the product ID impacts the error rate. For this, we use a DQL query that fetches all spans generated by the product service with the name oteldemo.ProductCatalogService/GetProduct and summarizes the number of errored spans by product ID.
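
A minimal sketch of such a query is shown below; span.status_code and app.product.id are assumed attribute names and may need to be adjusted to match the attributes actually attached to your spans:

// span.status_code and app.product.id are assumptions; adjust to your span attributes
fetch spans
| filter span.name == "oteldemo.ProductCatalogService/GetProduct"
| summarize failed = countIf(span.status_code == "error"), total = count(), by: { app.product.id }
| sort failed desc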

This query confirms the suspicion that a particular product might be the culprit: all the errors seem to be caused by requests for one specific product ID, pointing to a faulty entry in the product database.

Of course, this example is somewhat easy to troubleshoot as it’s based on a built-in failure scenario. Still, it should give you an impression of how DQL enables you to investigate problems by analyzing how specific attributes attached to spans might affect the outcome of requests sent to a faulty service.

Conclusion

In this blog post, we explored how the Distributed Tracing app can be harnessed to visualize data ingested from the OpenTelemetry collector to get an overview of application health. This end-to-end tracing solution empowers you to swiftly and efficiently identify the root causes of issues. Enjoy unprecedented freedom in data exploration, ask questions, and receive tailored answers that precisely meet your needs.

This powerful synergy between OpenTelemetry and the Dynatrace platform creates a comprehensive ecosystem that enhances monitoring and troubleshooting capabilities for complex distributed systems, offering a robust solution for modern observability needs.

Get started

If you’re new to Dynatrace and want to try out the Distributed Tracing app, check out our free trial.

We’re rolling out this new functionality to all existing Dynatrace Platform Subscription (DPS) customers. As soon as the new Distributed Tracing Experience is available for your environment, you’ll see a teaser banner in your classic Distributed Traces app.

If you’re not yet a DPS customer, you can try out this functionality in the Dynatrace playground instead. You can even walk through the same example as shown above.

If you’re interested in learning more about the Dynatrace OTel Collector and its use cases, see the documentation.

This is just the beginning. So, stay tuned for more enhancements and features.

Make your voice heard after you’ve tried out this new experience. Provide feedback for Distributed Tracing in the Distributed Tracing feedback channel (Dynatrace Community).