The Missing “SLOW” Service Signal in OpenSignals

I might not be entirely right on this one, as I keep revisiting it, but for now I have decided not to include a SLOW signal in the set of signals that OpenSignals offers. While service status values are subjectively inferred, I am extremely reluctant to allow that kind of subjective inference at the signal level. Slow might seem a straightforward term to use as a signal, but it is fraught with issues: how it is measured by either service within a conversational exchange, and how reliable the signal can be over a medium such as a network. Can slowness be accurately attributed?

An overriding goal for OpenSignals is to move away from quantitative measures. SLOW seems to be a slight detour into this area, which is better served by another instrumentation technology I designed in 2008 and subsequently released as an Open API only – activity-based metering. If SLOW were introduced as a signal, the door would open wide to the question of how much slowness. Users of the instrumentation library would then request that additional data, such as timings, be attached to a signal and event, or worse, ask for different signals representing levels of slowness. The signal set would very quickly explode, degrading the integrity of the design. Not on my watch.
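To make the concern concrete, here is a minimal sketch of where a SLOW signal tends to lead. It is not the OpenSignals API: the `Signal` enum, the `classify_slowness` function, and every threshold value are hypothetical and invented purely for illustration.

```python
from enum import Enum
from typing import Optional


class Signal(Enum):
    # Hypothetical closed set of qualitative signals (illustrative names only).
    START = "start"
    SUCCEED = "succeed"
    FAIL = "fail"
    ELAPSE = "elapse"


# What admitting SLOW drags in: a threshold, then levels, then raw timings.
# The cut-offs below are arbitrary assumptions, which is exactly the problem.
SLOWNESS_LEVELS = [(5.0, "VERY_SLOW"), (1.0, "SLOW"), (0.25, "SLOWISH")]


def classify_slowness(elapsed_seconds: float) -> Optional[str]:
    """Map a measured duration to a slowness label, or None if below all cut-offs."""
    for threshold, label in SLOWNESS_LEVELS:
        if elapsed_seconds >= threshold:
            return label
    return None
```

Each label needs a number behind it, and each number invites an argument about whose clock measured it and whether the cut-off suits a given service. The qualitative `Signal` set needs none of that.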

My concept for a signal in the OpenSignals API is a phenomenon that is either an outcome or an operation. SLOW does not pass the smell test here. In fact, I strongly suspect that service timing is well past its use-by date when it comes to effectively monitoring and managing highly complex inner networks and systems of services – whether macro or micro, perceived or conceived. Signals can be emitted and received, so how would SLOW be received? A called (remote) service would need to send back a SLOW signal in its response payload – a receipt. It is extremely hard to imagine that realistically happening outside some sort of explicitly contracted deadline, such as a timeout!

In the early days of OpenSignals, in and around 2015, when it was a research project under a different name, Signify, I felt the ELAPSE signal was a far better fit, as it covered the expiration of service execution and call deadlines in the form of timeouts – an extreme form of slowness. It was not necessarily tied to the service request level, which was extremely important, as OpenSignals aims to place far more emphasis on aspects of service-to-service conversations (dialog-based observability). In the course of a conversation, it is easy to imagine time and tasks elapsing or expiring, as opposed to being slow. The measurement of time is not entirely absent from OpenSignals; it is just captured and communicated at a different and far more effective level of scale and semantics.
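The difference between measuring slowness and signalling an expired deadline can be sketched as follows. This is an illustration under stated assumptions, not the OpenSignals API: only ELAPSE is named in this post, while the other `Signal` members, the `call_with_deadline` helper, and the `emit` callback are hypothetical. The deadline is enforced with Python's standard `concurrent.futures` module.

```python
import concurrent.futures
import time
from enum import Enum


class Signal(Enum):
    # Hypothetical subset of signals; only ELAPSE is named in the text.
    CALL = "call"
    SUCCEED = "succeed"
    FAIL = "fail"
    ELAPSE = "elapse"


def call_with_deadline(service, deadline_seconds, emit):
    """Invoke `service` and emit qualitative signals; no timings are recorded."""
    emit(Signal.CALL)
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(service)
    try:
        result = future.result(timeout=deadline_seconds)
    except concurrent.futures.TimeoutError:
        # The contracted deadline expired: the one case where the caller
        # can attribute "too slow" without guessing at a threshold.
        emit(Signal.ELAPSE)
        return None
    except Exception:
        emit(Signal.FAIL)
        return None
    finally:
        # Do not block on a worker that may still be running.
        pool.shutdown(wait=False)
    emit(Signal.SUCCEED)
    return result


# Usage: one call that finishes in time, one that misses its deadline.
signals = []
call_with_deadline(lambda: "ok", 1.0, signals.append)
call_with_deadline(lambda: time.sleep(0.2), 0.05, signals.append)
```

After these two calls, `signals` holds CALL, SUCCEED, CALL, ELAPSE. The caller never records a duration; it only observes whether the agreed deadline was met, which is a fact it can establish on its own side of the network.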