Containerisation in the enterprise - Sumo Logic: Observability is nice, clarity is better

As businesses continue to modernise their server estate and move towards cloud-native architectures, the elephant in the room is the monolithic core business application that cannot easily be rehosted without significant risk and disruption.

These days, it is more efficient to deploy an application in a container than use a virtual machine. Computer Weekly now examines the modern trends, dynamics and challenges faced by organisations migrating to the micro-engineered world of software containerisation.

As all good software architects know, a container is defined as a ‘logical’ computing environment where code is engineered to allow a guest application to run in a state where it is abstracted away from the underlying host system’s hardware and software infrastructure resources.

So, what do enterprises need to think about when it comes to architecting, developing, deploying and maintaining software containers?

This post is written by Iain Chidgey in his role as VP EMEA at Sumo Logic. The company provides a cloud-based service for log and metrics management that uses machine-generated data for real-time analytics. The petabyte-scale technology is built around a distributed ‘data retention’ architecture that keeps all log data available for instant analysis, eliminating the need for (and complexity of) data archiving.

Chidgey writes as follows…

Containerisation makes it easier to deploy modern applications faster, and easier to run them across multiple cloud services too. However, containers are also more challenging to manage at scale under the covers.

Tracking container services and applications is different to previous application deployment methods. Getting the right data from your containers requires some specific approaches, including a range of open source tools to gather that information.

As Gartner describes in its best practices document for containers, “The deployment of cloud-native applications shifts the focus to container-specific and service-oriented monitoring (from host-based) to ensure compliance with resiliency and performance service-level agreements.”

In practice, this means looking at observability.

The traditional definition of observability from control theory, put together by Rudolf Kalman, is a measure of how well the internal state of a system can be inferred from its outputs. For modern applications built on containers, observability involves looking at logs, metrics and application tracing alongside events and security data.

Looking for different viewpoints

One of the biggest challenges around enterprise container deployments is that older approaches to monitoring and observability are no longer fit for purpose. Rather than looking only at the server level, teams now have numerous vantage points for understanding what is happening within container deployments.

This means looking at each layer of a container deployment in order to see what is really going on. Looking at Kubernetes, the most popular container orchestration project, from the highest level to the most granular, this means looking at Clusters, Namespaces, Services, Nodes, Pods and individual Containers.

At the base level, containers are running instances of images, each of which packages a specific application or service with everything it needs in that wrapper. Each container gets its own runtime environment and will produce data over time to show what it is working on. Containers are then grouped together into Pods, which share storage and network resources and carry a specification for how to run the containers.

Iain Chidgey, VP for EMEA region at Sumo Logic.

Pods then run on Nodes, which are the virtual or physical machines used to host and run workloads. These Nodes then provide the resources necessary to run the application.

Lastly, a Service is an abstract way to expose an application running on a set of Pods as a network service. This makes it easier to manage and consume the results of that application over time as a single element. From an observability perspective, each of these can have a specific view that provides information to the team. By looking at different levels, it is easier to understand what is taking place over time.
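To make these layers concrete, here is a minimal Python sketch of how Containers, Pods, Nodes and Services relate to one another. It is an illustration only and does not use the Kubernetes API: the class names, Pod names, labels and image tags are all made up for the example, and the label matching is a simplified version of how a Service selects its Pods.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    name: str
    image: str  # hypothetical image tag, for illustration only


@dataclass
class Pod:
    name: str
    node: str     # name of the Node this Pod is scheduled on
    labels: dict  # key/value labels attached to the Pod
    containers: list = field(default_factory=list)


@dataclass
class Service:
    name: str
    selector: dict  # Pods whose labels match every pair sit behind this Service

    def pods(self, all_pods):
        """Return the Pods whose labels contain every selector key/value pair."""
        return [
            p for p in all_pods
            if all(p.labels.get(k) == v for k, v in self.selector.items())
        ]


pods = [
    Pod("checkout-1", "node-a", {"app": "checkout"}, [Container("web", "checkout:1.4")]),
    Pod("checkout-2", "node-b", {"app": "checkout"}, [Container("web", "checkout:1.4")]),
    Pod("search-1", "node-a", {"app": "search"}, [Container("api", "search:2.0")]),
]
checkout = Service("checkout", {"app": "checkout"})

backing = checkout.pods(pods)
print([p.name for p in backing])          # Pods behind the Service
print(sorted({p.node for p in backing}))  # Nodes those Pods run on
```

Walking from the Service to its Pods and then to their Nodes is the same path an observability view would follow when moving from one level to the next.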

Open source tools provide observability data

To develop observability for containers, you can draw on multiple open source projects. Log and event data can be collected with Fluent Bit, an open source log processor and forwarder, and Fluentd, an open source data collector. Meanwhile, metrics data can be gathered using Prometheus, which is an open source service monitoring system as well as a time-series database.
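As a rough illustration of what this data looks like on the wire, the sketch below renders a log event as a single JSON line, a shape that log forwarders commonly parse, and a counter in the Prometheus text exposition format. It is hand-rolled for clarity and skips label escaping; a real application would normally rely on a logging library and a Prometheus client library instead. The function names, metric name and label values are assumptions made for the example.

```python
import json


def log_line(level, message, **fields):
    """Render one log event as a single JSON line."""
    return json.dumps({"level": level, "message": message, **fields}, sort_keys=True)


def counter_lines(name, help_text, samples):
    """Render a counter in the Prometheus text exposition format.

    `samples` maps a tuple of (label, value) pairs to the counter's current value.
    Label values are not escaped here, so keep them to simple strings.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples.items():
        label_str = ",".join(f'{k}="{v}"' for k, v in labels)
        lines.append(f"{name}{{{label_str}}} {value}")
    return lines


# Hypothetical event and metric values, for illustration only.
print(log_line("error", "payment declined", pod="checkout-1", container="web"))
for line in counter_lines(
    "http_requests_total",
    "Total HTTP requests handled.",
    {(("code", "200"),): 1027, (("code", "500"),): 3},
):
    print(line)
```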

Tracing data can be gathered using OpenTelemetry, which provides robust and portable telemetry data and aims to make this a built-in feature of cloud-native software implementations. Lastly, security data for containers can be found using Falco, which is a cloud-native runtime security project and threat detection engine for Kubernetes.

All of these different open source projects provide data from container implementations. However, to make the most of this data, the challenge is how to use it over time.

This involves bringing these data sets together and combining them to show what is most important to look at. To make observability data more useful, this mix of data needs to be combined and enriched so that it can be used more quickly. This can then help enterprise teams to navigate from issues affecting a customer at the service level down to specific problems at the container level.
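One way to picture this enrichment step is as a join between raw telemetry records and metadata about where each Pod sits. The Python sketch below is a simplified illustration under that assumption: the record fields, the `POD_METADATA` table and the `enrich` helper are all hypothetical, and a production pipeline would typically do this inside the collection or analytics tooling.

```python
# Hypothetical metadata: which Service and Node each Pod belongs to.
POD_METADATA = {
    "checkout-1": {"service": "checkout", "node": "node-a"},
    "search-1": {"service": "search", "node": "node-a"},
}


def enrich(record, metadata=POD_METADATA):
    """Return a copy of a telemetry record with service/node context attached."""
    context = metadata.get(record.get("pod"), {})
    return {**record, **context}


# Logs and metrics arrive from different tools but share a Pod name.
raw = [
    {"kind": "log", "pod": "checkout-1", "message": "payment declined"},
    {"kind": "metric", "pod": "checkout-1", "name": "latency_ms", "value": 950},
    {"kind": "log", "pod": "search-1", "message": "index refreshed"},
]

enriched = [enrich(r) for r in raw]

# Group everything by Service so a team can start from the customer-facing view.
by_service = {}
for r in enriched:
    by_service.setdefault(r.get("service", "unknown"), []).append(r)

print(sorted(by_service))
print(len(by_service["checkout"]))
```

Once every record carries the same service and node fields, logs and metrics from separate tools can be viewed side by side for a single Service.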

Observability & continuous approaches

As you develop and deploy more applications on containers, the whole development process will involve more changes and expansions over time. Tracking changes across a Continuous Integration/Continuous Deployment (CI/CD) pipeline can be easier with containers, as the creation process can be automated with Kubernetes.

However, all these changes taking place need to be followed. Without the right approach to observability, it is more difficult to track and see what is taking place.

For example, if a problem occurs, it’s important to be able to go from the initial problem or performance spike to the data associated with the source application or infrastructure component. Once you can dive into that data, you can observe how healthy the service is, alongside any related infrastructure it sits on. Once you’ve identified the component you want to look at more closely, you can move into its raw logs, metrics and traces too.
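The drill-down described here can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than how any particular product works: the latency samples, log records, threshold and helper names are all invented for the example, and the "spike" check is just a simple average against a fixed threshold.

```python
from statistics import mean

# Hypothetical enriched telemetry, already tagged with service, pod and container.
samples = [
    {"service": "checkout", "pod": "checkout-1", "container": "web", "latency_ms": 120},
    {"service": "checkout", "pod": "checkout-2", "container": "web", "latency_ms": 980},
    {"service": "search", "pod": "search-1", "container": "api", "latency_ms": 90},
]
logs = [
    {"pod": "checkout-2", "container": "web", "message": "db connection timeout"},
    {"pod": "checkout-1", "container": "web", "message": "order placed"},
]

THRESHOLD_MS = 500  # illustrative alert threshold, not a recommendation


def slow_services(samples, threshold):
    """Step 1: find Services whose average latency exceeds the threshold."""
    by_service = {}
    for s in samples:
        by_service.setdefault(s["service"], []).append(s["latency_ms"])
    return [svc for svc, vals in by_service.items() if mean(vals) > threshold]


def worst_pod(samples, service):
    """Step 2: within a Service, pick the sample with the highest latency."""
    candidates = [s for s in samples if s["service"] == service]
    return max(candidates, key=lambda s: s["latency_ms"])


for service in slow_services(samples, THRESHOLD_MS):
    culprit = worst_pod(samples, service)
    # Step 3: pull the raw logs for that specific Pod.
    related = [entry["message"] for entry in logs if entry["pod"] == culprit["pod"]]
    print(service, culprit["pod"], related)
```

The three steps mirror the path in the text: start at the Service, narrow to the component responsible, then read its raw data.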

As containers become more important, being able to diagnose the root causes of any problem will matter more too. Using observability data will be essential to keep these services running smoothly within enterprise environments.