The Unchanging Nature of Observability and Monitoring

Looking back over 20 years of building application performance monitoring and management tooling, it seems that not much has changed or been achieved, beyond the fact that today's tooling collects data from far more sources and offers more attractive web-based interfaces and dashboards than before, all of which can be rendered in dark mode. That last one is not a joke; one application performance monitoring vendor, now turned observability platform vendor, just yesterday announced dark mode as a killer feature.

Why is observability so much like what monitoring was back in 2000, when I started designing and developing my first profiling tool, JDBInsight? I get that there is a repetitive cycle to much of what happens in fashion, but this is systems engineering. I suspect that not much has fundamentally changed, to the degree a past me had hoped, because everything else has been changing within the environment of the tooling. Tooling has changed, and yet nothing has changed. That does not make sense. Or does it? Much of the change touted by product marketing departments relates to engineering efforts to keep tooling applicable in the new environments of containers, cloud, and microservices. Vendors like Instana, Dynatrace, AppDynamics, and New Relic spend a considerable portion of their engineering budget simply maintaining instrumentation extensions for hundreds of platforms, products, projects, and programming languages. So when I say that not much has changed, I am referring to the positioning of tooling on a map of progress like the one shown below. Nearly all of the vendors listed above are stuck within the environment segment, unable to deliver real breakthroughs that would meaningfully change the operational monitoring and management landscape for themselves and their customers.

Cognition, control, and communication are still largely deferred and delegated to humans, outside of tooling. Application performance monitoring vendors can keep talking up “intelligence” without ever having to deliver on what many, outside of the computing industry, consider intelligence to be – (re)action appropriate to the context, stimulus, and goal setting. There can never be real human-like intelligence delivered as a software service without, at minimum, the ability to link past and predicted observation to controllability – an intervention following awareness and reasoning of a situation. Today, it is next to impossible to automate the linking of observability to controllability because the shared communication model, internal and external to tooling and humans, does not exist.

Cognition and control will never emerge from data and details alone. Traces, metrics, and logs are simply too low-level and noisy to serve as an effective and efficient model for tracking, predicting, and learning from human and machine interventions within a system. Regardless, such yesteryear approaches are not sustainable. In the end, observability and controllability need to be embedded directly within the application software itself. The imbuing of software with self-reflection and self-adaptability has not occurred because observability instrumentation rarely considers the need for local decision making and steering through control valves or other similar control-theoretic techniques. Instead of thinking about data, pipelines, and sinks, engineers need to refocus on the significance of signals and how they should be scored to infer a state; otherwise, the next 20 years will look much the same.
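To make the idea concrete, here is a minimal sketch of what scoring signals into a state and steering through a control valve could look like inside an application. All names, thresholds, and the latency-based scoring are hypothetical illustrations, not any vendor's implementation: the point is that the state is inferred locally and linked directly to an intervention, rather than raw samples being shipped off to a remote sink for a human to interpret.

```python
# Hypothetical sketch: score a signal stream into a coarse state, then let a
# local control valve decide how much work to admit based on that state.

from collections import deque
from statistics import fmean


class SignalScorer:
    """Scores recent latency samples (ms) into a coarse state signal."""

    def __init__(self, window=10, degraded_at=200.0, critical_at=500.0):
        self.samples = deque(maxlen=window)   # sliding window of observations
        self.degraded_at = degraded_at        # illustrative thresholds
        self.critical_at = critical_at

    def record(self, latency_ms):
        self.samples.append(latency_ms)

    def state(self):
        if not self.samples:
            return "UNKNOWN"
        score = fmean(self.samples)           # crude score: windowed mean
        if score >= self.critical_at:
            return "CRITICAL"
        if score >= self.degraded_at:
            return "DEGRADED"
        return "NORMAL"


class ControlValve:
    """Maps an inferred state to a local intervention: the fraction of
    incoming work to admit, shedding the rest before it causes harm."""

    ADMIT = {"NORMAL": 1.0, "DEGRADED": 0.5, "CRITICAL": 0.0, "UNKNOWN": 1.0}

    def admit_fraction(self, state):
        return self.ADMIT[state]


# Observation and control linked in-process, with no external pipeline.
scorer = SignalScorer()
for latency in (120, 150, 480, 520, 610):
    scorer.record(latency)

valve = ControlValve()
print(scorer.state(), valve.admit_fraction(scorer.state()))
```

A real system would use far richer scoring than a windowed mean, but even this toy shows the shape of the argument: the signal is reduced to a small, meaningful state vocabulary, and that state, not the raw data, is what drives the intervention.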