Simplicity and Significance in Observability

The Rise of System Complexity

Over the last few years, complexity has been rising within computing infrastructure, driven in large part by the move to ever finer-grained deployment units. Some companies have adopted microservices so enthusiastically that what was once considered a monolith is now broken up into hundreds, even thousands, of execution units that remain, by and large, interconnected. One might naively expect the tools and approaches employed to have grown in complexity likewise. This, we believe, was, and still is, a grave misconception. We would argue that the opposite should have happened.

Wrestling with Data

As computing, and with it complexity, scaled up, the models and methods should have reduced and simplified the communication and control surface area between humans and machines. Instead, monitoring (passive) and management (reactive) solutions have lazily mirrored that complexity at a level devoid of simplicity and significance, polluted instead with noise. Today, engineering teams are far too busy wrestling with, and wandering around in, an ever-expanding data fog of metrics, logs, and distributed traces. Fearful of the complexity, many teams fret over their ability to collect, store, and analyze ever more data and detail, yet never stop to question how effective any of it is.

Firefighting with Fire

One could very well argue that complexity has merely been replaced with complication. We are not understanding or solving the complexity problem; we are attending to and acting on a different problem simply because it feels more familiar than today's changing world. There is seeing but no perceiving. There is doing but no direction. There is collection but no cognition. Application monitoring and management solutions are far more complicated than they need to be.

Single Pane (of Pain)

The single pane of glass that many vendors talk up in fact consists of hundreds of layers, tabs, views, charts, and navigation aids. It has become such a sorry tale that some vendors have built an onboarding experience in the form of a game, leading users along a path to some golden nugget of information. The problem is that the data is detached from the service domain and system dynamics; only a machine could be expected to bridge that gap unaided. This is a temporary band-aid for a far more troubling problem: data is valued over information and useful models.

[E|I]mitating Intelligence

With OpenSignals, the aim is to bring simplicity and significance back into the world of monitoring, observability, controllability, and management. The basic idea is simple, as many useful innovations are: see, perceive, model, and reason about the computing world of microservices much as humans do within societies and cultures consisting of multiple agents offering services.

[Figure: the OpenSignals conceptual model. A SERVICE creates a CONTEXT, its MODEL of the world; within it, the SELF and other SERVICES are represented, and SIGNALS recorded against each representation are condensed into a STATUS.]

Communication and Cooperation

We find signals and inferred states at the heart of all human (and animal) communication and cooperation. Signals are emitted or received. Signals indicate operations or outcomes: signs and traces of the past and a sliver of the present. Signals are used to influence others and, over time, to infer the state of others and, on reflection, of ourselves. A signal is a direct and meaningful unit of information within a (social) context, much like an emoji; it is not a message that must be introspected piece by piece and then interpreted. Humans emit and receive signals in every interaction, through body language and vocalization as much as through what is physically passed and contextually communicated. This processing and transmission of signals is paramount to effective cooperation and coordination. But signals are just a means to an end, and that end is the assessment of ourselves and others: state inference.
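
To make the notion concrete, here is a minimal sketch, in Python, of a signal as a unit of information rather than a message to be parsed. All names (Sign, Orientation, Signal) are hypothetical illustrations, not the OpenSignals API:

    # A sketch of a signal: what happened (sign), and whether it was
    # self-reported or observed by an interacting party (orientation).
    from dataclasses import dataclass
    from enum import Enum
    import time

    class Sign(Enum):
        CALL = "call"        # an operation was attempted
        SUCCEED = "succeed"  # an outcome was good
        FAIL = "fail"        # an outcome was bad

    class Orientation(Enum):
        EMIT = "emit"        # emitted by the service itself
        RECEIPT = "receipt"  # received/observed by another service

    @dataclass(frozen=True)
    class Signal:
        sign: Sign
        orientation: Orientation
        at: float  # wall-clock time of the observation

    # A failed call, as perceived by the caller rather than the callee:
    observed = Signal(Sign.FAIL, Orientation.RECEIPT, time.time())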

(Re)Framing Focus

When it comes to monitoring environments, the focus and frame of reference should always be the operational status of a service from the perspective of each other service that interacts with it. An assessment of service quality should not be based on what a service tells us via its published metrics; this misses the point that, within a network of high interconnectivity, no service exists in isolation anymore. Instead, an assessment should reflect how other services perceive a service, through signals and the inference of a state, which can differ depending on each observer's sensitivity to that service's signals. Sensitivity manifests in the different weighting of signals and in the decay rate of memories each service is configured with.
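
The following sketch illustrates how such sensitivity might be modeled, under assumed parameters: each signal carries a weight, and its influence halves with every elapsed half-life, so two observers configured differently can infer different states from the same history. The weights and half-lives here are illustrative, not prescribed by OpenSignals:

    import time

    WEIGHTS = {"succeed": +1.0, "fail": -1.5}  # assumed weighting per signal

    def score(signals, half_life_secs, now=None):
        """Sum signal weights, halving a signal's influence every half-life."""
        now = time.time() if now is None else now
        total = 0.0
        for sign, at in signals:
            age = max(0.0, now - at)
            total += WEIGHTS[sign] * 0.5 ** (age / half_life_secs)
        return total

    # Two successes followed by a fresh failure:
    history = [("succeed", 100.0), ("succeed", 130.0), ("fail", 160.0)]

    # A twitchy observer (short memory) is dominated by the fresh failure,
    # while a tolerant one still remembers enough past success to stay positive.
    print(round(score(history, half_life_secs=30, now=200.0), 2))   # -0.3
    print(round(score(history, half_life_secs=600, now=200.0), 2))  # 0.38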

Representing Reality

OpenSignals brings simplicity and sensibility by focusing on what is effective and significant for the vast majority of service management attention: what is the status of this service, this cluster (of services), or this system (of services)? The conceptual model and its vocabulary are small, and the sequence of processing is straightforward. A service creates a context, which is its representation of the world. Within this context, the service itself is represented alongside the other services it interacts with. In the course of interaction, the service owning the context, acting much like a mind or model, records signals against the representations of itself and the other services.
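
A minimal sketch of that sequence, with all names hypothetical: a service creates a context, the context holds a representation of itself and of each service it interacts with, and signals are recorded against those representations:

    from collections import defaultdict

    class Context:
        """A service's model of the world: itself plus the services it calls."""

        def __init__(self, owner):
            self.owner = owner                # the service this context belongs to
            self.signals = defaultdict(list)  # representation -> recorded signals

        def record(self, referent, sign, at):
            """Record a signal against the representation of a service (or self)."""
            self.signals[referent].append((sign, at))

    # The checkout service models itself and the services it interacts with:
    ctx = Context("checkout")
    ctx.record("checkout", "succeed", at=1.0)  # self-observation
    ctx.record("payments", "fail", at=2.0)     # how payments looked from here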

Scoring Signals, Synthesizing Status

The recorded signals are then scored, based on the configuration used to create the context, and mapped into a status bucket for each of the possible status values per represented service. The scoring card tallies each bucket and makes a generalized assessment of the service, with a decay mechanism in play much like that of human memory. The context can transmit status changes to other interested observers through a plugin mechanism, where collective intelligence can manifest in additional aggregation, ranking, weighting, and so on.
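
One way that final step might look, with the status names, thresholds, and subscription mechanism all assumed for illustration: scores are mapped into status buckets, and only changes in status are transmitted to subscribed observers:

    def to_status(score_value):
        """Map a decayed score into a status bucket (thresholds assumed)."""
        if score_value >= 0.5:
            return "OK"
        if score_value >= -0.5:
            return "DEVIATING"
        return "DEGRADED"

    class StatusBoard:
        def __init__(self):
            self.current = {}    # service -> last published status
            self.observers = []  # plugin-style callbacks

        def subscribe(self, callback):
            self.observers.append(callback)

        def assess(self, service, score_value):
            status = to_status(score_value)
            if self.current.get(service) != status:  # transmit changes only
                self.current[service] = status
                for observe in self.observers:
                    observe(service, status)

    board = StatusBoard()
    board.subscribe(lambda svc, st: print(f"{svc} -> {st}"))
    board.assess("payments", -0.30)  # prints: payments -> DEVIATING
    board.assess("payments", -0.30)  # no change, nothing transmitted

Publishing only status transitions, rather than every score, is what keeps the communication surface small: downstream observers see the few significant moments, not the raw signal stream.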