Seven criteria for unified observability and security
Summary
The world’s leading companies and public-sector organizations are building the digital services we rely on using hybrid-cloud and multicloud applications and infrastructure. While agile, these environments are increasingly complex and challenging to manage.
IT, development, and security teams typically rely on multiple monitoring tools, data repositories, and time-consuming manual processes to tame this complexity. However, this reactive approach is unsuitable for rapidly changing cloud environments and evolving security threats because it’s not scalable, can expose organizations to security risks, and steals time from already overstretched teams.
Observability and security in modern cloud environments have become mandatory. They should drive the optimization of those ecosystems to ensure flawless and secure digital services that scale. They should also create order from the chaos by enabling teams to have better control of their technology ecosystems.
Organizations should seek a unified observability and security platform rather than a collection of disparate tools. The platform they select needs to leverage the full value of the data in their clouds to deliver precise answers and extensive, intelligent automation of workloads and processes.
Unifying observability and security analytics and delivering intelligent automation will enable organizations to reduce toil and accelerate innovation—enhancing users’ and customers’ experiences and creating a better working environment for development, security, and IT professionals.
The essential criteria for unified observability and security are the focus of this paper.
The rise of multicloud and cloud-native computing
Digital transformation is accelerating. To keep pace, organizations have shifted from on-premises to hybrid and multicloud environments and microservices-based, Kubernetes-orchestrated, cloud-native architectures.
Regardless of where organizations are in their transformation journey, cloud computing is now integral to the services they deliver—and the ones the world depends on—for banking, retail, healthcare, government services, workplace productivity, and home entertainment.
The shift to hybrid, multicloud, and cloud-native architectures yields undeniable benefits, from greater and faster innovation to increased operational efficiencies. However, the dynamic nature and scale of these architectures and the explosion of data they produce make them too complex to manage with static dashboards, alerts, and manual troubleshooting.
The landscape of monitoring and analytics tools has also expanded. Indeed, for each new solution they adopt, IT, development, and security teams often implement another monitoring tool or take a “do it yourself” (DIY) approach to bring insights together. Fragmented or DIY-type tools and techniques result in disparate data analytics, increased manual effort, and misaligned teams.
The hours teams spend managing or stitching together tools and chasing problems are stolen from innovation and reduce operational efficiency. In addition, fragmented tools and DIY approaches make it difficult to meet data governance and privacy requirements, which highly regulated industries, such as financial services, manufacturing, and healthcare, increasingly require.
Looking forward, increased demand for digital innovation will lead to increased reliance on clouds, exacerbating these challenges for the teams tasked with managing them.
IT, development, and security teams need a new, more mature approach to monitor, manage, and maximize the value of their investments in cloud ecosystems. To tame the complexity and enormous amount of data emanating from their clouds, they need a modern observability and security platform that delivers predictable, trustworthy, and precise insights in real time.
Those capabilities are essential to teams’ ability to automate processes and workflows to help enhance and secure end users’ experiences and reduce manual workloads, freeing more time for innovation.
Not all observability platforms, however, are equal. It’s only possible to achieve the meaningful insights teams need to manage modern clouds when platforms meet specific criteria.
Seven criteria for unified observability and security
The solution that organizations select for their hybrid, multicloud, and cloud-native observability and security analytics and automation should deliver on each of these criteria:
- One unified analytics and automation platform for observability, security, and business data
- Ability to capture and process all data from all sources while retaining topological and dependency mapping context
- Ability to deliver cost-effective and scalable data analytics
- AI at the platform’s core, combining multiple techniques—predictive, causal, and generative AI
- Ability to deliver trustworthy automation of business, development, security, and operations workflows
- Ability to detect and mitigate security vulnerabilities in runtime environments, block attacks in real time, and conduct data-driven security analytics
- Extensibility to use observability, security, and business data to power custom digital business use cases
One unified analytics and automation platform for observability, security, and business data
Many organizations are converging their observability and security practices to help innovate quickly and without introducing unacceptable risks. They are adopting processes to shift security left into development and right into runtime or active production environments.
Organizations can further accelerate secure innovation and improve collaboration across their development, security, and IT operations teams by reducing their use of fragmented monitoring tools and point solutions. Instead, they should arm these teams with a unified platform that provides data-backed insights and drives trustworthy automation from the observability and security data emanating from their hybrid and multicloud ecosystems.
Organizations should look for a unified platform that automatically and continuously ingests, stores, and processes data in real time and at scale from all major cloud platforms, including Amazon Web Services (AWS), Google Cloud Platform (GCP), Microsoft Azure, and IBM Red Hat. The platform must also integrate with the other solutions across each organization’s technology ecosystem to enable extensive automation of custom, or organization-specific, development, security, and operations tasks without excessive manual intervention.
Capture and process all data from all sources while retaining topological and dependency mapping context
The data originating from cloud environments is a rich source of insights that can drive automation to optimize user experiences and increase operational efficiency.
However, organizations often struggle to operationalize their data because it is trapped in silos across the storage repositories and monitoring tools different teams use. This siloed data also lacks context reflecting the relationships and dependencies across hybrid and multicloud ecosystems. Without this context, it’s difficult to distinguish between the symptoms and cause of an issue, resulting in time wasted chasing false positives, duplicate alerts, or low-priority issues.
Getting the precise answers and context needed to drive real-time analytics and automation requires a platform that can automatically capture and process all data from cloud environments, regardless of source or format. This includes all logs, metrics, and traces, as well as data from the latest open source standards, including OpenTelemetry, and business events, such as conversions, cart abandonment, and sentiment analysis. Given the dynamic nature of modern cloud technology stacks with microservices and containers that come and go in seconds, the platform needs to be informed by a precise visualization, or map, that discovers and tracks the topology of each organization’s technology ecosystem as it changes in real time.
Sprawling and dynamic cloud-native and multicloud environments are an ecosystem of various technologies and services, and the composition changes by the second. This paradigm makes it critical for organizations to acquire a platform with advanced AI, analytics, and automation capabilities. The platform must be able to ingest all observability, security, and business data, put it in an accurate context in real time, and facilitate access to data-backed insights when needed.
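OpenTelemetry is one of the open standards named above. The sketch below shows, under stated assumptions, how a service might emit a trace that carries both operational and business context (conversions, cart value) so a unified platform can correlate it with topology. The service name, attribute names, collector endpoint, and packages (opentelemetry-sdk, opentelemetry-exporter-otlp) are illustrative choices, not requirements of any particular platform.

```python
# Minimal sketch: emitting a trace enriched with business attributes via the OpenTelemetry
# Python SDK. Assumes an OTLP-compatible collector or backend listens at the endpoint below.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Resource attributes give every span topological context (which service, which environment).
resource = Resource.create({
    "service.name": "checkout-service",        # illustrative service name
    "deployment.environment": "production",
})
provider = TracerProvider(resource=resource)
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://otel-collector:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

# One span carrying both operational and business-event data (attribute names are illustrative).
with tracer.start_as_current_span("checkout") as span:
    span.set_attribute("cart.value", 129.90)
    span.set_attribute("cart.items", 3)
    span.add_event("conversion", {"payment.method": "card"})
```

The sketch covers only the emitting side; correlating these spans with logs, metrics, and a live topology map is the job of the unified platform described in this section.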
Cost-effective and scalable data analytics
The cost constraints of conventional storage models often force organizations to be selective about which observability, security, and business data they keep, discard, or move to “cold storage,” where they can no longer access or query it easily. To extract value from this data using traditional solutions, ITOps, DevOps, SRE, and security teams rely on time-consuming manual processes. They try to structure their datasets to reflect the questions they expect to ask in the future, and they attempt to rehydrate or resurrect data that’s locked in cold storage. These procedures are time-consuming and don’t adapt quickly to rapidly changing application and cloud environments or evolving security threats.
Teams should not have to decide which data to keep or what to do with it before storing it. They should be empowered with a platform that enables them to ask any question about any data anytime, without rehydration.
Teams can achieve cost-effective and scalable data analytics only with a unified platform that has a data storage technology built specifically for analyzing observability, security, and business events data at a massive scale. A causational data lakehouse with a massively parallel processing (MPP) analytics engine delivers this by harnessing and unifying data from even the most complex multicloud architectures.
Platform with AI at its core that combines multiple techniques — predictive, causal, and generative AI
Artificial intelligence (AI) should help teams dramatically reduce manual work. But how can teams trust their AI is drawing the correct conclusions, making the right decisions, and implementing the right automation?
It’s important to remember that not all AI is created equal, and some types are better suited to certain tasks than others.
Traditional approaches to AI for observability and security use cases often require additional tools, which teams add or bolt onto their analytics and automation platform. These tools often use machine learning (ML) models that correlate data to produce probabilistic insights and alerts, which they display on dashboards. This approach can be difficult to scale because it relies heavily on human intervention. In addition to the challenge of managing “tool sprawl,” teams adhering to this approach must also train their correlation-based AI models. This training is the manual process of feeding the model or algorithm with data, testing the outcome, and adjusting to help ensure the model delivers the desired outcome. Moreover, correlation-based AI doesn’t provide continuous, real-time insight into an issue’s precise root cause. Instead, its insights reflect patterns from historical events, which may or may not reflect future outcomes.
Organizations should select a platform with AI at its core that combines multiple AI techniques, with each one excelling in specific capabilities:
- Forecasting — using predictive AI models that recommend future actions based on data from the past, including sales and customer experience trends, seasonality, cloud application health, and other metrics (see the sketch after this list).
- Determining — using causal AI to deliver fact-based, deterministic, and precise answers and intelligent automation based on analyzing dependencies across large sets of observability, security, and business data while retaining an accurate context that reflects each data point’s source.
- Creating — using generative AI that is automatically fueled by causal and predictive AI insights and humans’ natural language prompts to deliver precise recommendations for solving specific tasks in the context of each organization’s unique environment and situation.
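To make the forecasting item above concrete, here is a minimal, self-contained sketch of predictive trend extrapolation using Holt's linear (double exponential) smoothing. The metric values, smoothing factors, and horizon are illustrative assumptions; a production platform would apply far richer seasonal and probabilistic models.

```python
# Minimal sketch of predictive forecasting: Holt's linear (double exponential) smoothing
# applied to a hypothetical hourly request-rate metric. Values and parameters are illustrative.
import numpy as np

def holt_forecast(series, alpha=0.5, beta=0.3, horizon=4):
    """Smooth the level and trend of a series, then extrapolate `horizon` steps ahead."""
    level, trend = series[0], series[1] - series[0]
    for value in series[1:]:
        prev_level = level
        level = alpha * value + (1 - alpha) * (level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return np.array([level + (step + 1) * trend for step in range(horizon)])

requests_per_hour = np.array([1200.0, 1250, 1310, 1290, 1380, 1440, 1500, 1530])
print(holt_forecast(requests_per_hour))  # projected load for the next four hours
```

Causal AI goes further by determining which services, deployments, or dependencies actually drive such a trend, and generative AI can then turn those findings into recommendations, as described below.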
Generative AI is already proving useful in broadening the accessibility of operations insights to new personas and speeding workflows for users of observability solutions. However, when combined with other forms of AI, generative AI has the potential for additional notable impact. For instance, leveraging other forms of AI to feed generative AI with more than just user inputs can deliver more value for customers and help maximize the value of generative AI for business, development, security, and operations use cases.
Trustworthy automation of business, development, security, and operations workflows
Delivering precise and trustworthy answers from unified observability, security, and business data is only part of the journey. Ensuring flawless and secure digital experiences amid modern cloud ecosystems’ complexity and scale requires extensive and intelligent automation of error-prone manual tasks. This automation should include continuous discovery and instrumentation of applications and infrastructure, proactive vulnerability and anomaly detection, and optimization across the software lifecycle.
Organizations need a platform that uses causal AI to create and extend trustworthy automation across their ecosystems and processes, from software development to cloud operations and application security. The platform should also integrate with their business, development, security, and operations toolsets.
A platform with these capabilities empowers teams to automate workflows with confidence. For example, teams can automate workflows that react to changing user experiences or seasonality. They can also automate workflows to turn off features for security or quality reasons or to enhance software orchestration to reflect myriad external factors, from weather forecasts to energy consumption and supply chain delays. As a result, organizations can free their teams from many manual tasks and enable them to focus on driving innovation, growth, and customer impact.
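As a sketch of what one such automated workflow could look like, the fragment below reacts to a hypothetical problem event by rolling back a feature flag and opening a ticket. The event shape, thresholds, and helper functions are assumptions made for illustration; a real platform supplies its own event schema, SDK, and workflow triggers.

```python
# Minimal sketch of an event-driven remediation workflow. The event structure and the
# helper functions are hypothetical stand-ins for a platform's own integrations.

def disable_feature_flag(service: str, flag: str) -> None:
    print(f"Disabling flag '{flag}' on {service}")         # placeholder for a feature-flag API call

def open_incident_ticket(event: dict, queue: str) -> None:
    print(f"Opening ticket in {queue}: {event['title']}")   # placeholder for an ITSM integration

def handle_problem_event(event: dict) -> str:
    """Route an AI-detected problem to an automated remediation path."""
    if event["severity"] == "critical" and event["category"] == "error_rate_increase":
        disable_feature_flag(event["root_cause_entity"], flag="new-checkout-flow")
        open_incident_ticket(event, queue="sre-oncall")
        return "remediated"
    if event["severity"] in ("warning", "info"):
        return "acknowledged"    # lower-severity findings are only recorded, not acted on
    return "ignored"

# Example invocation with a hypothetical event payload.
print(handle_problem_event({
    "title": "Error rate spike on checkout-service",
    "severity": "critical",
    "category": "error_rate_increase",
    "root_cause_entity": "checkout-service",
}))
```

The value of causal AI in this pattern is trust: automation like this is only safe when the triggering event reflects a verified root cause rather than a correlated symptom.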
Cloud-native and multicloud technology adoption is accelerating, driving platform fragmentation and an explosion in observability and security data variety and volume. AI/ML and automation are increasingly useful in taming this complexity, enabling organizations to find the right answers and automate fixes.
Detect and mitigate security vulnerabilities in runtime environments, block attacks in real time, and conduct data-driven security analytics
The complexity of cloud-native applications, the growing use of open source libraries, and higher frequency code releases increase organizations’ exposure to security risks. To mitigate these risks, organizations need to institute DevSecOps practices and embrace shift-left and shift-right methodologies. These steps help establish a cross-team culture where security is a shared responsibility.
Teams also need a platform that provides end-to-end visibility into their security posture to detect vulnerabilities and facilitate automated, continuous, and real-time remediation across the software lifecycle. With an automatic, data-driven approach to security, teams can immediately understand whether a detected vulnerability has been exploited and trace the impacts of that exploit to determine which data and applications were affected.
Detecting and mitigating vulnerabilities and blocking attacks in real time requires a platform with the following capabilities:
- Runtime vulnerability analysis — Providing teams with a clear understanding of the most critical vulnerabilities to address and eliminating the time they spend chasing false positives.
- Runtime application protection — Continuously detecting and blocking critical Open Web Application Security Project (OWASP) threats, including SQL injections and command injections that target critical vulnerabilities (see the sketch after this list).
- Security automation — Orchestrating a response to security incidents or detected vulnerabilities across tools and teams through precise insights and automated workflows that act in real time.
- Advanced security analytics — Combining observability context with security events to enable teams to swiftly detect and respond to threats by uncovering advanced persistent threats (APTs) and analyzing tactics, techniques, and procedures (TTPs) for actionable insights that support proactive defense.
- Support for shift left and shift right — Integrating data from security tools across the entire software lifecycle to provide runtime observability context that enables better prioritization of vulnerabilities and targeted insights for faster remediation.
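To ground the runtime application protection item above, the snippet below illustrates the class of SQL injection such protection detects and blocks, contrasting a vulnerable concatenated query with a parameterized one. It uses Python's standard sqlite3 module purely for illustration and does not represent any platform's detection mechanism.

```python
# Minimal illustration of the SQL injection class that runtime application protection targets.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin'), ('bob', 'user')")

user_input = "bob' OR '1'='1"  # classic injection payload supplied by an attacker

# Vulnerable pattern: attacker-controlled input concatenated into the SQL string,
# so the injected OR clause matches every row instead of one user.
unsafe_rows = conn.execute(f"SELECT * FROM users WHERE name = '{user_input}'").fetchall()

# Safe pattern: a parameterized query treats the payload as a literal value, matching nothing.
safe_rows = conn.execute("SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()

print(len(unsafe_rows), len(safe_rows))  # 2 rows leaked vs. 0 rows matched
```

Runtime protection adds value precisely where such vulnerable patterns slip into production code: it observes the exploit attempt in context and can block it without waiting for a code fix.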
Extensibility to use observability, security, and business data to power custom digital business use cases
Every organization has unique requirements and technology stacks, resulting in many custom analytics and automation use cases. To address these unique needs and use cases, teams need an extensible platform with an easy-to-use, low-code approach for creating custom, compliant, data-driven, and AI-powered apps and automations. To be effective, the platform must provide automatic scalability, runtime application security, safe connections and integrations across hybrid and multicloud ecosystems, and full lifecycle support, including security and quality certifications.
Armed with these capabilities, organizations will be poised to unlock the wealth of insights available in the explosive amounts of observability, security, and business data their modern cloud ecosystems generate. They will also be able to extend precise answers and intelligent automation to numerous business, development, security, and operations use cases, reflecting their unique technology ecosystems, addressing their specific needs, and empowering more people across the organization to make data-backed decisions.
The path to “cloud done right”
While each organization’s journey with cloud computing is different, there is no doubt that modern hybrid and multicloud environments and cloud-native architectures are playing an increasingly important role in powering digital transformation. A unified approach to observability and security has become mandatory for organizations seeking to tame cloud complexity and accelerate digital transformation. Identifying a unified observability and security platform that delivers on each of the seven criteria featured in this paper will help ensure the long-term success of these transformation initiatives. Adhering to these criteria will enable organizations to ensure they are on the path to cloud done right.
A Leader in the 2024 Gartner® Magic Quadrant™ for Observability Platforms
See why Gartner positioned us furthest for Completeness of Vision and highest for Ability to Execute in the latest Magic Quadrant.