From Data to Dashboard – An Observability Anti-Pattern

Failing in Observability

Numerous initiatives around Observability, sometimes referred to as Visibility in the business domain, fail to meet expectations. Engineers naively assume that once data is being collected, all that remains is to put up a dashboard, sit back, and stare at large monitoring screens, hoping a signal will magically emerge from the rendered pixels. This is particularly common when teams blindly adopt the Grafana and Prometheus projects, letting data and charts replace or circumvent genuine understanding through patterns, structures, and models.

Data-Laden Dashboards

This anti-pattern repeats consistently at organizations with insufficient expertise and experience in systems dynamics, situation awareness, and resilience engineering. Once the first data-laden dashboard is rolled out to management for prominent display within an office, the work seems all but done, save for creating hundreds more derivatives of the same ineffective effort. Little regard is ever again given to the system, its dynamics, and the situations arising within it. Many projects fail by thinking, and worse, acting, as though they can leap from data to dashboard in one jump.

Fear of Unknowns

This is not helped by niche vendors talking up “unknown unknowns” and “deep systems,” which is akin to handing someone standing on the tip of an iceberg a shovel and asking them to dig away at the surface. When event capture is not guided by knowledge or wisdom, there is nothing profound or fulfilling in it: only detail after detail, and never the big picture of the system moving and changing below the visibility surface. The industry has gone from being dominated by blame to being dominated by fear, which shuts off all (re)consideration of effectiveness.

Data != Information

We suspect much of the continued failing in the Observability industry centers on the customary referencing, and somewhat confused understanding, of the Knowledge (DIKW) Hierarchy. Many “next-generation” application performance monitoring (and observability) product pitches and roadmaps roll out a pyramid graphic, explaining how the vendor will first collect all this data, lots of it, from numerous sources, and then whittle it down to knowledge over the company’s remaining evolution and product development.

Data Overload

What invariably happens is that the engineering teams get swamped by maintenance effort around data and pipelines, and by the never-ceasing battle to keep instrumentation kits and extensions up-to-date with changes in platforms, frameworks, and libraries. When some small window of stability opens up, the team has lost sight of the bigger picture and purpose. In a moment of panic, it slaps on a dashboard and advanced query capabilities, a declaration of defeat that delegates the effort to users. Naturally, this deflating defeat is marketed as a win for users.

Misunderstanding Understanding

This sad state of affairs comes from seeing the hierarchy as a one-way ladder of understanding: from data, information will emerge; from information, knowledge; and so on. All too often, instead of aiming for vision, it is data straight to visualizations. The confusion lies in treating this as a bottom-up process, whereas the layers above steer, condition, and constrain the layers below through a continuous adaptive and transforming process. Each layer frames the operational context of the layers beneath it – direct and indirect. A vision for an “intelligent” solution comes from values and beliefs; this contextualizes wisdom, which in turn defines the goals that frame knowledge exploration and acquisition.

Situation Representation

One or more mental models are chosen for knowledge to emerge from information – a selection aligned with the overarching goals. It is here that we firmly believe we have lost our way as an engineering profession. Our models, if we can call them that, are too far removed from purpose, goal, and context. We have confused a data storage model of trace trees, metrics, log records, and events with a model of understanding. In the context of Observability, an example of a goal in deriving wisdom would be to obtain intelligent, near-real-time situation awareness over a large, connected, complex, and continually changing landscape of distributed services. Here, understanding via a situation model must be compatible with, and conducive to, cooperative work performed by both machines and humans. Ask any vendor to demonstrate a situation’s representation, and all you will get is a dashboard with various jagged lines automatically scrolling. Nowhere to be found are signals and states, the essential components of a past, present, and unfolding situation.
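To make the distinction concrete, here is a minimal, hypothetical sketch of what signals and states might look like as first-class concepts. The signal names, thresholds, and assessment policy are all illustrative choices of ours, not a description of any vendor’s model:

```python
from collections import Counter
from enum import Enum

class Signal(Enum):
    """Discrete events a service emits about its own behavior."""
    SUCCEED = "succeed"
    FAIL = "fail"
    SLOW = "slow"

class State(Enum):
    """An assessed condition inferred from recent signals."""
    OK = "ok"
    DEGRADED = "degraded"
    DEFECTIVE = "defective"

def assess(signals: list[Signal]) -> State:
    """Collapse a window of signals into one state (illustrative thresholds)."""
    counts = Counter(signals)
    total = len(signals) or 1
    fail_ratio = counts[Signal.FAIL] / total
    slow_ratio = counts[Signal.SLOW] / total
    if fail_ratio > 0.5:
        return State.DEFECTIVE
    if fail_ratio > 0.1 or slow_ratio > 0.25:
        return State.DEGRADED
    return State.OK
```

The point is not the thresholds but the representation: the situation is described in terms of states derived from signals – something both a human and a machine can reason over – rather than raw series to be eyeballed.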

Downward Shaping of Sensemaking

Without a model acting as a lens and filter, there is never knowledge augmenting our senses and reasoning and defining importance – the utility and relevance of information in context. There is never information without rules, shaped by knowledge, that extract, collect, and categorize data. Data and information are not surrogates for a model. Likewise, a model is not a dashboard built lazily and naively on top of a lake of data and information. A dashboard, and the many metrics, traces, and logs that come with it, is not what constitutes a situation. A situation is formed and shaped by the changing signals and states of structures and processes within an environment of nested contexts (observation points of assessment) – past, present, and predicted.
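The nesting of contexts can likewise be sketched: an assessment at one observation point rolls up into the assessment of the enclosing context. The worst-of-children policy below is one illustrative choice among many, and the names are ours:

```python
from enum import IntEnum

class State(IntEnum):
    """Ordered so that a higher value means a worse condition."""
    OK = 0
    DEGRADED = 1
    DEFECTIVE = 2

def rollup(context: dict) -> State:
    """Assess a nested context as the worst state among its members."""
    states = []
    for value in context.values():
        if isinstance(value, dict):
            states.append(rollup(value))  # recurse into sub-contexts
        else:
            states.append(value)
    return max(states, default=State.OK)
```

A degraded payment service then surfaces as a degraded checkout context, without anyone having to eyeball the underlying series at every level.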

Models: Abstraction and Attention

Models are critical to grasping understanding in a world of increasing complexity. A model is a compact, abstract representation of a system under observation and control that facilitates conceptualization and communication about its structure and, more importantly, its dynamics. Modeling is a simplification process that focuses attention on what is significant for higher-level reasoning, problem-solving, and prediction. Suitable models (representations in structure and form) are designed and developed through abstraction and the need to view a system from multiple perspectives without creating a communication disconnect for those involved. Coherence is an essential characteristic of a model, as are conciseness and context.

Habitual Diehards

Unfortunately, introducing a model is not always as easy a task as it might look on paper, especially if the abstraction does not pay off in terms of significant simplification and a shift in focus to higher levels of value. For example, Instana, a recent client of ours, had some trouble convincing many of those from an OpenTelemetry background that their abstraction of a Call over a Span served a useful purpose.
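As we read it, the appeal of the Call abstraction is that it lifts two correlated spans – the client-side (exit) span and the server-side (entry) span it triggers – into a single interaction between services. The sketch below is our illustrative reading of that idea, not Instana’s implementation; the field names and pairing rule are assumptions:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Span:
    span_id: str
    parent_id: Optional[str]
    kind: str      # "entry" (server side) or "exit" (client side)
    service: str

@dataclass
class Call:
    caller: str
    callee: str

def derive_calls(spans: list[Span]) -> list[Call]:
    """Pair each exit span with the entry span it triggered
    (an entry span's parent is the exit span on the calling side)."""
    entry_by_parent = {s.parent_id: s for s in spans
                       if s.kind == "entry" and s.parent_id}
    return [Call(caller=s.service, callee=entry_by_parent[s.span_id].service)
            for s in spans
            if s.kind == "exit" and s.span_id in entry_by_parent]
```

Collapsing a pair of spans into a caller/callee interaction is exactly the kind of simplification and shift of attention that a model is supposed to buy.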

Conceptual Dissonance

This mismatch between what a developer conceptualizes at the level of instrumentation and what is presented within the tooling, visualizations, and interfaces is seen as an inconvenience – an inconvenient truth stemming from an industry that does far more selling of meme-like nonsense and yesteryear thinking and tooling than educating in theory and practice. A focus on systems and dynamics needs to win over data and details if we are to get back to designing and building agile, adaptive, and reliable enterprise systems.