Evolving Observability to Keep Up with Continuous Delivery

What DevOps teams want and need from their observability tools today.

Last Updated: January 20, 2023

In a continuous delivery model, the volume and pace of software roll-outs increase dramatically. Ozan Unlu, CEO of Edge Delta, warns of serious risks when traditional observability cannot keep pace. How must observability evolve?

Continuous delivery is the practice of getting software changes of all types – new features, configuration changes, bug fixes, and more – into production or the hands of users while maintaining superior reliability and stability. It’s increasingly a pillar of modern software development, with high-performing continuous delivery teams shipping code an estimated 30 times faster, reporting 50 percent fewer failed deployments, and restoring service 12 times faster than their peers.

If it sounds too good to be true, remember that continuous delivery is not a cure-all, and it requires a deep commitment to its principles. It can also create challenges for processes that move at a different speed – observability among them. Observability is the practice of understanding the internal state or condition of a complex system from its external data outputs (logs, metrics, and traces). With observability insights, teams gain comprehensive visibility across the entire IT environment (applications and infrastructure) and can ensure its ongoing health, reliability, and performance.
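To make the three signals concrete, here is a minimal, vendor-neutral sketch of an application emitting a log line, a counter metric, and a trace span. The service name, metric name, and JSON formatting are illustrative assumptions; a real system would typically emit these through an observability SDK or agent rather than the standard library alone.

```python
# A minimal sketch of the three observability signals (logs, metrics, traces)
# emitted by an application. Uses only the Python standard library; names
# such as "checkout-service" and "checkout.requests" are hypothetical.
import json
import logging
import time
import uuid
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("checkout-service")

request_counter = {"checkout.requests": 0}  # toy in-memory metric


@contextmanager
def trace_span(name: str):
    """Emit a trace span: an operation name, a unique id, and its duration."""
    span_id = uuid.uuid4().hex[:16]
    start = time.time()
    try:
        yield span_id
    finally:
        duration_ms = (time.time() - start) * 1000
        log.info(json.dumps({"signal": "trace", "span": name,
                             "span_id": span_id,
                             "duration_ms": round(duration_ms, 2)}))


def handle_checkout(order_id: str) -> None:
    request_counter["checkout.requests"] += 1     # metric: a simple counter
    with trace_span("handle_checkout"):           # trace: timing for this operation
        log.info(json.dumps({"signal": "log", "level": "info",
                             "msg": "checkout started", "order_id": order_id}))


if __name__ == "__main__":
    handle_checkout("order-42")
    log.info(json.dumps({"signal": "metric", **request_counter}))
```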

In a continuous delivery model, the volume and pace of software roll-outs often increase by several orders of magnitude. But serious risks arise when traditional observability fails to keep up, remaining an overly time-consuming and expensive process. How must observability evolve to maintain a steady tempo with continuous delivery?

Automatic Discovery

Just as most rocket crashes occur during take-off, most system crashes happen within hours or even minutes of going live. Over the years, several highly publicized examples have borne this out: in 2019, Disney Plus famously crumpled on its launch day, and in 2013, the Obama Administration’s Healthcare.gov buckled within two hours of going live, to name a few. Certainly, a sudden onslaught of traffic played a role in these outages, but the point is that the first few minutes and hours of any system going live are critical, as they’re likely to be the most fraught with unanticipated problems.

In this context, development teams cannot afford any lag between the moment a production deployment goes live and the time it is incorporated into an observability initiative. Any such gap leaves them with a huge blind spot and a major Achilles heel. In continuous delivery, production environments are being spun up almost constantly. Observability tools must be able to automatically detect and onboard new deployments and begin surfacing anomalies immediately – even issues developers do not yet know to look for and for which alerts and dashboards have yet to be built. This makes it possible to detect “unknown unknowns” – the unanticipated issues that tend to surface at launch and cause the vast majority of outages – right out of the gate. Instant, real-time visibility into mission-critical production systems is vital, no matter the volume or speed at which they are being created.
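As a rough illustration of what automatic discovery might look like, the sketch below polls an inventory of live deployments and begins checking telemetry the moment a new one appears, with no pre-built dashboard or alert rule. The functions fetch_live_deployments and read_error_rate are hypothetical stand-ins for whatever APIs a given platform exposes (for example, a Kubernetes watch), and the spike threshold is arbitrary.

```python
# A hedged sketch of automatic discovery: poll for live deployments and,
# as soon as a new one appears, start streaming its telemetry through a
# simple anomaly check so detection begins in the first minutes of its life.
import time

monitored = {}  # deployment name -> list of recent error rates


def fetch_live_deployments() -> set:
    """Hypothetical: return names of deployments currently in production."""
    return {"payments-v2", "search-v7"}


def read_error_rate(deployment: str) -> float:
    """Hypothetical: latest error rate (errors per 1k requests) for a deployment."""
    return 0.4


def check_anomaly(deployment: str, value: float) -> None:
    history = monitored[deployment]
    history.append(value)
    baseline = sum(history) / len(history)
    # Flag a spike relative to whatever baseline has accumulated so far.
    if len(history) > 3 and value > 3 * baseline:
        print(f"[alert] {deployment}: error rate {value} vs baseline {baseline:.2f}")


def poll_once() -> None:
    for name in fetch_live_deployments():
        if name not in monitored:
            monitored[name] = []
            print(f"[discovery] onboarded new deployment: {name}")
        check_anomaly(name, read_error_rate(name))


if __name__ == "__main__":
    for _ in range(3):
        poll_once()
        time.sleep(1)
```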

Decentralized Data Management 

Many organizations adhere to a “store and explore” observability approach, whereby data is collected and integrated into a central repository for analysis. Not only is this centralization process slow in and of itself (taking hours instead of seconds), but the performance of these platforms often slows to a crawl as more data is ingested, requiring much longer wait times for queries. It’s also exorbitantly expensive, with all data uniformly routed to high-cost storage tiers. These costs drive many teams to “sample out” data sets indiscriminately – but what happens when a problem occurs and the discarded data is precisely what’s needed for troubleshooting?

A newer method changes this paradigm by applying distributed stream processing and machine learning at the source, so all datasets can be viewed and analyzed as they’re being created. When observability data is decentralized, developers are empowered in several ways. First, they always have full access to all the data they need to verify performance and health and to make fixes whenever a problem is detected. Data limits effectively disappear, since all data can be pre-processed inexpensively, and painful trade-offs between cost and having the entirety of one’s data at hand no longer have to be made. Second, when data is analyzed as it’s being created, you can see exactly where the problem is at the source, which is critical in highly ephemeral cloud environments where continuous delivery drives constant workload proliferation and shifts.
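The sketch below illustrates the general idea of processing at the source, not any particular product: each node keeps a rolling baseline of its own telemetry and forwards only anomalies and periodic summaries upstream, instead of shipping every raw record to a central store. The class name, window size, and threshold are illustrative assumptions.

```python
# A minimal "process at the source" sketch: maintain a rolling baseline of
# local telemetry and forward only anomalous events plus compact summaries.
from collections import deque
from statistics import mean, pstdev


class EdgeStreamProcessor:
    def __init__(self, window: int = 100, threshold: float = 3.0):
        self.window = deque(maxlen=window)   # rolling sample of recent values
        self.threshold = threshold           # std-devs from baseline deemed anomalous

    def observe(self, latency_ms: float) -> None:
        if len(self.window) >= 10:
            mu, sigma = mean(self.window), pstdev(self.window)
            if sigma > 0 and abs(latency_ms - mu) > self.threshold * sigma:
                # Only anomalous events (plus periodic summaries) leave the node.
                self.forward({"type": "anomaly", "latency_ms": latency_ms,
                              "baseline_ms": round(mu, 1)})
        self.window.append(latency_ms)

    def summarize(self) -> dict:
        return {"type": "summary", "count": len(self.window),
                "avg_ms": round(mean(self.window), 1) if self.window else None}

    def forward(self, event: dict) -> None:
        print("forwarding upstream:", event)   # stand-in for a network call


if __name__ == "__main__":
    proc = EdgeStreamProcessor()
    for latency in [12, 11, 13, 12, 14, 11, 12, 13, 12, 11, 95]:  # last value spikes
        proc.observe(latency)
    proc.forward(proc.summarize())
```

In this arrangement only the spike and a small summary cross the network, which is where the cost and query-latency savings of decentralization come from.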


Keeping the Developer Experience Top of Mind

A benefit of decentralizing data is that developers can access all their data whenever and wherever they need it. What can inhibit the developer experience, however, is that many observability platforms are complex and hard to master. Frequently, this expertise lives on the operations side of the house, making developers dependent on their ops counterparts to verify the health and performance of production applications and to provide data access when developers need to troubleshoot.

All the data generated can be useful, and development teams should be looking to tap it. But its utility is compromised if developers cannot easily harness and access it. Observability approaches must let developers access all their data in an easy, manageable way – without handing the task off to an ops team – so they can fix problems more quickly. This will only grow in importance as continuous delivery naturally increases the sheer number of environments they’re responsible for.

Seeing Beyond the Software Lifecycle

For years, there has been discussion about how various aspects of the software development lifecycle (testing and release automation, for example) must speed up to keep pace with newer, modern software development approaches. However, looking beyond the lifecycle, other operations also must adapt, and observability is surely one of them. 

Eliminating time lags between the deployment and discovery of new production environments; moving away from inefficient, costly “centralize and analyze” approaches toward decentralization; and empowering developers through easier, faster work experiences will be key to observability evolving in step with continuous delivery.

How do you think observability should evolve? Let us know on Facebook, Twitter, and LinkedIn. We’d love to hear from you!


Ozan Unlu
Ozan Unlu is the CEO and Founder of Edge Delta, an edge observability platform. Previously he served as a Senior Solutions Architect at Sumo Logic; a Software Development Lead and Program Manager at Microsoft; and a Data Engineer at Boeing. Ozan holds a BS in nanotechnology from the University of Washington.