Cloud complexity and data proliferation are two of the most significant challenges that IT teams face today. Computing environments are scaling to new heights, producing more data and making it even harder to pinpoint root causes and vulnerabilities. Modern cloud complexity is becoming nearly impossible for human beings to manage without AI and automation. Application observability helps IT teams gain visibility into their highly distributed systems, but what is developer observability and why is it important?
In a recent webinar, Dynatrace DevOps activist Andi Grabner and senior software engineer Yarden Laifenfeld explored developer observability. Grabner and Laifenfeld discussed how observability and dynamic debugging together empower developers to get the most out of the observability data that Dynatrace provides.
Why is developer observability important for engineers?
“Observability is about answering questions,” said Laifenfeld. “DevOps, SREs, developers… everyone will ask questions. Observability is about answering.”
Developers face myriad challenges in modern cloud environments. Their scale and highly distributed nature produce enormous amounts of data. When an incident occurs, developers need to know what data to look at, where the incident occurred, and which metrics are relevant. Manually sifting through data to answer these questions is slow and pulls developers away from innovation. These environments are also highly dynamic: “With environments constantly spinning up and down, a problem that occurred five minutes ago may no longer be relevant,” Laifenfeld said.
Observability is necessary for understanding complex environments and making sense of immense amounts of data. But not all teams use the same observability data in the same way. Laifenfeld described observability as an onion: each layer represents a different degree of granularity that different teams consider important. “The DevOps people looking end-to-end. They don’t care about individual services, but rather how they interact and go together,” she said. “They also care about infrastructure: SREs require system visibility and incident management. But developers need code-level visibility and code-level data.”
With traditional monitoring tools, the granular data that developers require typically involves manual preparation. This includes defining what classifies as an error, identifying the data from the error that will fix the issue, and prepping logs before deployment. “That’s not how I envision code-level observability,” Laifenfeld said. “Developer observability, as I see it, is pressing a button and getting observability right at your fingertips. I think Dynatrace and Rookout together are going to enable this future.”
Developer observability, Kubernetes, and expanding left
The shift left movement has changed developers’ scope of responsibility. As software delivery tasks traditionally performed at the end of the software delivery lifecycle – such as testing and deployment – have shifted to the beginning of the lifecycle, developers are now working with Kubernetes more directly. Laifenfeld argued that developers shouldn’t bear the burden of the additional workload when their focus is their code: “Learning Kubernetes as a developer is not easy,” she said. “Developers don’t need more responsibilities; they need more capabilities. More ways to understand their environments and what’s going on.”
Developer observability supports the idea of expanding left. Laifenfeld connects with the expand left concept because it makes unknown concepts more accessible to developers. “Observability should be easy, comfortable, and intuitive,” she said. “If developers are used to working in their integrated development environments, observability should bring unknown concepts to them and make it feel [accessible]. You won’t need to know Kubernetes to understand what’s going on with your code in Kubernetes.”
KubeCon North America is this week. Laifenfeld and Grabner are excited to attend and discuss all things observability. With topics ranging from best practices to cloud cost management and success stories, the conference will be a valuable resource for understanding observability and getting started.
Developer observability use cases
Dynatrace makes achieving a unified view of your environment easy. As Grabner put it, the “secret sauce” is AI-powered automatic data discovery. “Dynatrace [works by] ingesting all your data, understanding where it comes from, extracting metadata, analyzing it for you, and giving you the data you need whenever you need to fix a problem,” he said. By ingesting data from either cloud-native sources (such as OpenTelemetry or Kubernetes events) or what you already have instrumented (such as Open Standards), Dynatrace enriches data with context to provide a 360º view of your environment.
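To make the idea of enriching data with context concrete, here is a minimal sketch in Python. It is not how Dynatrace works internally; the `Entity` type, the `CATALOG` lookup table, and the cluster and namespace values are all hypothetical and exist only to illustrate attaching known metadata to an incoming record.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    # Hypothetical metadata describing where a workload runs.
    name: str
    cluster: str
    namespace: str


# Hypothetical entity catalog keyed by workload name (illustrative values).
CATALOG = {
    "adservice": Entity(name="adservice", cluster="eks-demo", namespace="shop"),
}


def enrich(record: dict, catalog: dict[str, Entity]) -> dict:
    """Return a copy of the record with entity context attached when known."""
    enriched = dict(record)
    entity = catalog.get(record.get("workload", ""))
    if entity is not None:
        enriched["context"] = {
            "cluster": entity.cluster,
            "namespace": entity.namespace,
        }
    return enriched
```

A record whose workload is unknown to the catalog is returned unchanged, so enrichment never discards incoming data.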
Opening the Dynatrace playground tenant, Grabner walked through some key observability use cases.
Data at your fingertips
Say you would like to focus on a specific Kubernetes workload called “adservice.” Dynatrace allows you to easily pull up that entity and understand key context such as metadata, where it’s running, when there was a deployment, and more. In this example, Grabner saw that the adservice workload was running on EKS and could see the relevant metrics, logs, services, events, error logs, and more.
Easily identify problems, receive context, and understand impact
With massive volumes of data to sift through, identifying and solving problems manually can feel like finding a needle in a haystack. Dynatrace simplifies this by identifying problems for you. When Davis AI detects a problem, teams can filter the data to pinpoint the issue and where it occurred.
In Grabner’s example, he could see there was a JavaScript error increase and could understand the impact of the error. “I could see that 116 users were affected and that there were 2,000 calls to my API,” he said. “[Dynatrace] tells me which app was affected, the root cause, and what led to the problem.” Enriching problem data with context allows teams to not only understand the scope and impact of the problem, but also drill down to understand the impact of a failing component on other dependencies.
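One way to picture how a failing component affects its dependencies is a walk over a service call graph. The sketch below is an assumption-laden illustration, not the Dynatrace implementation: the `CALLS` graph and the service names other than the general idea of a front end are invented, and `impacted_by` is a hypothetical helper.

```python
from collections import deque

# Hypothetical call graph: each service maps to the services it calls.
CALLS = {
    "frontend": ["checkout", "adservice"],
    "checkout": ["payment", "cart"],
    "adservice": [],
    "payment": [],
    "cart": [],
}


def impacted_by(failing: str, calls: dict[str, list[str]]) -> set[str]:
    """Return every service that directly or transitively calls `failing`."""
    # Invert the graph so we can walk from a service to its callers.
    callers: dict[str, list[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, []).append(caller)

    impacted: set[str] = set()
    queue = deque([failing])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, []):
            if caller not in impacted:
                impacted.add(caller)
                queue.append(caller)
    return impacted
```

In this made-up graph, a failure in `payment` reaches `checkout` and then `frontend`, while a failure in `frontend` affects no other service because nothing calls it.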
Detect anomalies automatically
The Dynatrace platform uses seasonal baselining to automatically detect performance spikes or degradations in your apps and services. Dynatrace learns your environment by understanding baseline performance on certain days of the week or times of the month. When an anomaly crops up, the platform automatically detects the performance change and notifies you immediately.
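As a rough illustration of seasonal baselining, the sketch below groups historical samples by weekday and hour, then flags a new value that strays too far from what is normal for that slot. This is a simplified stand-in rather than the algorithm Dynatrace uses; the function names, the weekday-and-hour slotting, and the default tolerance of three standard deviations are all assumptions.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, stdev


def build_baseline(history):
    """Group (timestamp, value) samples by (weekday, hour) and summarize each slot."""
    slots = defaultdict(list)
    for ts, value in history:
        slots[(ts.weekday(), ts.hour)].append(value)
    # Slots with fewer than two samples cannot yield a standard deviation.
    return {
        slot: (mean(vals), stdev(vals))
        for slot, vals in slots.items()
        if len(vals) >= 2
    }


def is_anomaly(baseline, ts, value, tolerance=3.0):
    """Flag a value more than `tolerance` standard deviations from its slot's mean."""
    slot = (ts.weekday(), ts.hour)
    if slot not in baseline:
        return False  # no history for this slot, so make no claim
    avg, spread = baseline[slot]
    return abs(value - avg) > tolerance * spread
```

Because each slot has its own mean and spread, a response time that is normal on a busy Monday morning can still be flagged if it shows up in a quiet slot, provided that slot has history of its own.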
In Grabner’s example, he saw an increased JavaScript error rate on the front end of the application. How do you know if this problem has business impact? This is where service-level objectives (SLOs) come in. Dynatrace enables teams to specify SLOs for measures such as latency, uptime, and availability. When a problem arises, teams can then easily identify whether it affects these objectives. “I call this pre-crime alerting,” said Grabner.
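A common way to reason about whether a problem threatens an SLO is to track how much of the error budget remains. The following sketch assumes a simple request-based availability SLO; `error_budget_remaining` is a hypothetical helper, and the numbers in the usage note are illustrative rather than taken from the webinar.

```python
def error_budget_remaining(slo_target: float, total: int, failed: int) -> float:
    """Return the fraction of the error budget still unspent (negative if overspent)."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be between 0 and 1, exclusive")
    if total == 0:
        return 1.0  # no traffic means nothing has been spent
    # The budget is the number of failures the target tolerates for this traffic.
    allowed_failures = (1 - slo_target) * total
    return 1 - failed / allowed_failures
```

With a 99% target and 2,000 requests, the budget allows about 20 failures, so 10 failed requests leave roughly half of the budget. Alerting while the budget is shrinking, rather than after it is gone, is one way to act before the objective is actually breached.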
Dynamic debugging
Developers can leverage Dynatrace to understand code-level problems and debug them without stopping a program from running. Laifenfeld used a to-do list app as an example. “I have a list of everything I have connected to the platform, and I write in what I want to debug,” she said. “Then, I add a breakpoint. A breakpoint won’t stop your program but will collect local variables, stack trace, process metrics, etc., and bring that to you while your program continues to run.”
“This gives us the whole picture of what happens in production,” Laifenfeld continued. “It’s dynamic, smartly detects problems, and once we add the breakpoint, it tells us what’s happening, giving us what we need and when we need it.” By delivering code-level context and easy, live debugging to developers, Dynatrace empowers developers with the capabilities they need to deliver high-quality software, faster.
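The idea of a breakpoint that collects data without pausing execution can be illustrated with a toy example built on Python's standard `sys.settrace` hook. This is only a sketch of the concept, not how Dynatrace or Rookout implement it, and tracing every line this way adds overhead that makes it unsuitable for production. The `collect_snapshots` helper and the `add_todo` function are hypothetical.

```python
import sys
import traceback


def collect_snapshots(func, line_offset, *args, **kwargs):
    """Run func, recording locals and the call stack each time the target line is reached.

    line_offset counts lines from the `def` line of an undecorated function.
    """
    target_code = func.__code__
    target_line = target_code.co_firstlineno + line_offset
    snapshots = []

    def local_tracer(frame, event, arg):
        if event == "line" and frame.f_lineno == target_line:
            snapshots.append({
                # repr() freezes each value so later mutation cannot change the snapshot
                "locals": {name: repr(value) for name, value in frame.f_locals.items()},
                "stack": [entry.name for entry in traceback.extract_stack(frame)],
            })
        return local_tracer  # keep tracing this frame; execution is never paused

    def global_tracer(frame, event, arg):
        # Only trace frames that execute the target function's code.
        return local_tracer if frame.f_code is target_code else None

    previous = sys.gettrace()
    sys.settrace(global_tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore whatever tracer was active before
    return result, snapshots


def add_todo(todos, title):
    item = {"id": len(todos) + 1, "title": title}
    todos.append(item)  # a snapshot at offset 2 captures state just before this runs
    return item
```

Calling `collect_snapshots(add_todo, 2, [], "buy milk")` returns the function's normal result together with one snapshot of its local variables and call stack, taken just before the `append` line executes. The function itself runs to completion as usual.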
Unlocking a 360º view with unified observability
As digital transformation continues picking up speed and multicloud environments become more complex, understanding your environment is more critical than ever. By bringing observability capabilities to developers, teams can unlock unprecedented insights from their services and applications.
To watch the full webinar, check out the on-demand recording here.