Author: autoletics

The Need for a New, More Modern Reliability Stack

Google SRE Book

In this post, we consider why it may be time to abandon the service-level management approach strongly advocated in the Google Site Reliability Engineering book series. But before proceeding, let us consider some excerpts from the O’Reilly book Implementing Service Level Objectives to set the stage for a …

The Unchanging of Observability and Monitoring

Darkest Days

Looking back over 20 years of building application performance monitoring and management tooling, it seems that not much has changed or been achieved, beyond the fact that today’s tooling collects data from far more sources and offers more attractive web-based interfaces and dashboards than before, all of which can be rendered in dark mode. That …

OpenSignals is the right kind of Domain-Oriented Observability

Pendulum of Instrumentation

For many engineers new to observability and application performance monitoring, domain-oriented observability looks very much like adding interceptors and callbacks to an application codebase to simplify calling out to a generic data collector library for distributed tracing, metrics, or even, as a last resort, logging. Unfortunately, the domain-oriented approach rarely …

The Origins of Observability Signals

Profiling a Profiler

Around 2012 and 2013, we started looking at ways to optimize the recording and playback of episodic machine memories, a challenge in tuning an instrument already used to profile and optimize many other low latency software systems. How do you profile the best profiler for a particular programming language without dropping down …

Observability – The Two Hemispheres

Two World Views

Two very distinct hemispheres seem to form within the application monitoring and observability space – one dominated by measurement, data collection, and decomposition, the other by meaning, system dynamics, and (re)construction of the whole. For now, it seems the left hemisphere, the data-centric side, is winning attention in the theater though failing …

Scaling Observability to Service Level Management

Scaling: Abstract • Aggregate • Compress

Scaling by way of abstraction, aggregation, and compression is critical to the effective and efficient service level management of large-scale and highly connected systems. Scaling here is not merely the storing of vast amounts of observability data, often of questionable value. We have seen that play out in the …

Humanizing Observability and Controllability

Humanism: Progress and Agency

Humanism is a philosophical stance at the heart of what OpenSignals aims to bring to the table for service-level management operations. It runs counter to the misguided trend of wanton and wasteful extensive data collection so heavily touted by those focused on selling a service rather than solving a problem, now …

The Missing “SLOW” Service Signal in OpenSignals

What exactly is Slow?

We might not be entirely right on this one, as we seem to keep revisiting it, but for now, we have decided not to include a SLOW signal in the set of signals that OpenSignals offers. While service status values are subjectively inferred, we are extremely reluctant to do so at the …
