Why visibility into Kubernetes environments is becoming a cornerstone for delivering exceptional digital experiences

By Gregg Ostrowski, Executive CTO, Cisco AppDynamics

Over the last few years, accelerated by the pandemic, we’ve seen a rapid increase in the adoption of cloud-native technologies. This has dramatically improved organisations’ ability to scale their applications at speed and deliver game-changing innovation.

But at the same time, this shift has also led to exponential growth in the complexity of their application topology, with thousands of microservices and containers now being deployed. This has left IT teams with gaps in visibility across the technology infrastructure that supports these cloud-native applications, which makes it extremely challenging for them to manage availability and performance.

As organisations in the United Arab Emirates (UAE), and across the globe for that matter, look to overcome these challenges and achieve visibility into these dynamic and distributed cloud-native environments, they are beginning to realise the value of full-stack observability. Indeed, an AppDynamics report, The Journey to Observability, reveals that more than half of businesses in the UAE (55%) have now started the transition to full-stack observability, and a further 38% plan to do so during 2022.

Technologists across the region are recognising that, in order to properly understand how their applications are performing, they need visibility at the application level, into the supporting digital services (such as Kubernetes), and into the underlying infrastructure services they’re leveraging from their cloud providers (such as compute, server, database and network), increasingly provisioned as infrastructure-as-code (IaC).

The big challenge currently is that the distributed and dynamic nature of cloud-native applications makes it extremely difficult for technologists to pinpoint the root cause of issues. Cloud-native technologies such as Kubernetes dynamically create and terminate thousands of microservices in containers, spawning a massive volume of metrics, events, logs and traces (MELT) telemetry every second. Many of these services have a short shelf life because they are scaled dynamically to meet demand, so when technologists attempt to diagnose an issue, they often find that the infrastructure and microservices elements involved no longer exist. And many monitoring solutions don’t collect the fine-grained telemetry data needed, making understanding and troubleshooting all but impossible.
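
For illustration only, here is a minimal sketch, using the official Kubernetes Python client, of just how ephemeral these workloads are: it watches pod lifecycle events for a minute, and in a busy cluster pods will have appeared and disappeared before the loop finishes. The cluster and kubeconfig are assumed to be whatever your own environment provides; this is not a monitoring solution, merely a demonstration of the churn such a solution has to keep up with.

```python
# Watch pod lifecycle events cluster-wide for 60 seconds to see how quickly
# workloads come and go. Assumes a reachable cluster via your kubeconfig.
from kubernetes import client, config, watch

config.load_kube_config()  # use config.load_incluster_config() when running inside a pod
v1 = client.CoreV1Api()

w = watch.Watch()
for event in w.stream(v1.list_pod_for_all_namespaces, timeout_seconds=60):
    pod = event["object"]
    # ADDED / MODIFIED / DELETED events arrive continuously as workloads scale;
    # a pod seen here may be gone by the time you return to troubleshoot it.
    print(event["type"], pod.metadata.namespace, pod.metadata.name,
          pod.status.phase, pod.metadata.creation_timestamp)
```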

The case for advanced Kubernetes observability

As noted, where organisations are leveraging Kubernetes, the footprint can expand exponentially, and traditional monitoring solutions struggle to cope with this dynamic scaling. Technologists therefore need a new-generation solution that can monitor and observe these dynamic ecosystems at scale and provide real-time insights into how the elements of their virtual infrastructure are actually operating and impacting one another.

Technologists should look to achieve full-stack visibility for managed Kubernetes workloads and containerised applications, combining telemetry from cloud providers for the underlying infrastructure (such as load balancers, storage and compute) and data from the managed Kubernetes layer, all grouped and analysed alongside application-level telemetry from OpenTelemetry.
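
As a rough sketch of what that looks like from the application side, the snippet below uses the OpenTelemetry Python SDK to tag traces with Kubernetes resource attributes so a backend can group them with cluster- and cloud-level data. The service, cluster and namespace names are placeholders, and a real deployment would export spans to an observability backend rather than to the console.

```python
# Emit application spans carrying k8s.* resource attributes (OpenTelemetry
# semantic conventions) so they can be correlated with pod, node and cloud metrics.
import os

from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

resource = Resource.create({
    "service.name": "payments-api",                         # hypothetical service name
    "k8s.cluster.name": "prod-cluster-1",                   # hypothetical cluster name
    "k8s.namespace.name": os.environ.get("POD_NAMESPACE", "default"),
    "k8s.pod.name": os.environ.get("HOSTNAME", "unknown"),  # pod name usually matches HOSTNAME
})

provider = TracerProvider(resource=resource)
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("checkout"):
    # Business logic would run here; the span inherits the k8s.* attributes above.
    pass
```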

And when it comes to troubleshooting, technologists have to be able to quickly alert on issues and identify the domain and root cause(s). In order to do this, they need a solution which is capable of navigating Kubernetes constructs, such as clusters, hosts, namespaces, workloads and pods, and understanding their impact on the containerised applications running on top. And they need a unified view of all MELT data, whether that is Kubernetes events, pod status or host metrics, infrastructure data, application data or data from other supporting services.
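
To make that concrete, the back-of-the-envelope sketch below uses the Kubernetes Python client to walk those constructs (namespaces, workloads and pods) and pull the kind of signals a troubleshooting view starts from: deployment readiness, pod phase, restart counts and recent warning events. The cluster is whatever your kubeconfig points at; a production observability platform would stream and correlate this data continuously rather than print it.

```python
# Walk namespaces -> deployments -> pods and surface basic troubleshooting signals.
from kubernetes import client, config

config.load_kube_config()
core, apps = client.CoreV1Api(), client.AppsV1Api()

for ns in core.list_namespace().items:
    name = ns.metadata.name
    for dep in apps.list_namespaced_deployment(name).items:
        print(f"{name}/{dep.metadata.name}: "
              f"{dep.status.ready_replicas or 0}/{dep.spec.replicas} replicas ready")
    for pod in core.list_namespaced_pod(name).items:
        restarts = sum(cs.restart_count for cs in (pod.status.container_statuses or []))
        print(f"  pod {pod.metadata.name}: phase={pod.status.phase}, restarts={restarts}")
    for ev in core.list_namespaced_event(name).items:
        if ev.type == "Warning":
            print(f"  event {ev.reason} on {ev.involved_object.name}: {ev.message}")
```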

Future-proofing innovation with cloud-native observability solutions

Recognising the need for technologists to get greater visibility into Kubernetes environments, technology vendors have rushed to market with propositions which promise cloud monitoring or observability. But technologists should think carefully about what they really require, both now and in the future.

Traditional approaches to availability and performance were often based on long-lived physical and virtualised infrastructure. Rewind 10 years, and IT departments operated a fixed number of servers and network wires – they were dealing with constants and fixed dashboards for each layer of the IT stack. The introduction of cloud computing added a new level of complexity and organisations found themselves continually scaling up and down their use of IT, based on real-time business needs.

While monitoring solutions have adapted to accommodate rising deployments of cloud alongside traditional on-premise environments, the reality is that most were not designed to efficiently handle the dynamic and highly volatile cloud-native environments that we increasingly see today.

It’s a question of scale: these highly distributed systems rely on thousands of containers and spawn a massive volume of MELT telemetry every second. And currently, most technologists simply don’t have a way to cut through this crippling data volume and noise when troubleshooting application availability and performance problems caused by infrastructure-related issues that span hybrid environments.

Technologists need to remember that traditional and future applications are built in completely different ways and they’re managed by different IT teams. This means they require a completely different kind of technology to monitor and analyse availability and performance data in order to be effective.

Instead, they should look to implement a new-generation, cloud-native observability solution built from the ground up to meet the needs of future applications and able to scale functionality at speed. This will allow them to cut through complexity and gain observability into cloud-native applications and technology stacks. They need a solution that can deliver the capabilities they will need not just next year, but in 10 years’ time as well.

