Constructing Explanations

We seek an explanation of what a system does and why. Microservices complicate this aspiration when mixed technologies have been assembled over years by evolving teams for changing business purposes.

Parnas and Clements, in "A Rational Design Process: How and Why to Fake It", recognized that systems weren't rationally designed but suggested that one "fake it" for the sake of clear documentation.

Zane Bitter argues in a post that even this is going too far. Change; complexity; uncertainty: these things are the stuff of life itself. Tire of them and you are tired of living.

We've built an explainer that relishes the excruciating details present in today's systems while still telling stories that are compact and relevant enough for action.

# Facts

Observation. We favor direct measurement of running systems, for this provides truths that are automatically current. Microservices enable instrumentation of running software and the collection of these measurements over networking infrastructure.

Construction. Computerized launch and maintenance create additional metadata about what a system is and does. A self-healing system must observe itself so as to know when to intervene. This too can be usefully shared.

Intention. The work of programming has authors reasoning about the future behavior of their work. This is captured in code, notes, catalogs and documentation.

Reflection. As we become familiar with the true behavior of a system we can condense its complexity into a smaller network of properties judged relevant because they are essential and possibly tenuous.

# Scenarios

We seek to know a system so that we can adjust some aspect of it. We might ask some question like, How does information get from here to there? Or, What step is misbehaving and who should be called about it?

Such questions fall into categories in line with properties like method, performance, reliability and ownership. We'll call the consequent form of a particular question and answer a scenario.

A rational design, even one faked after the fact, falls short of answering these questions because the questions themselves must be anticipated. It is not enough to recognize scenarios, for without specifics no answer is forthcoming.

When the systems are widely known and the caring audience large, a good search engine is amazingly effective at finding a good answer among sites like Wikipedia or Stack Overflow. Yes, someone has had this particular question before and they have written about it.

The microservice explainer we desire has neither rationality nor community at its service. It must select and assemble facts given only vague queries and present these in comprehensible form, either including an answer or guiding the next question.

The explainer itself must be simpler than what it explains if it is to be a trusted source of information. We expect that each scenario can be encoded into one or more query templates with relevant rendering of resulting details.
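One way to picture a scenario encoded as a query template is sketched below. This is an illustration only, not the explainer's actual code: the `QueryTemplate` class, the node labels, the relationship type and the parameter names are all assumptions made for the example, and the Cypher text is merely stored, never executed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryTemplate:
    """A scenario reduced to a parameterized query plus the columns to render."""
    scenario: str   # property the question is about, e.g. "method" or "ownership"
    question: str   # the question in plain words
    cypher: str     # query text with $parameters (labels are hypothetical)
    params: tuple   # facts the query needs before it can run
    columns: tuple  # result columns to show the reader

    def bind(self, facts):
        """Return (query text, parameters) or raise if a needed fact is missing."""
        missing = [p for p in self.params if p not in facts]
        if missing:
            raise KeyError(f"missing facts: {missing}")
        return self.cypher, {p: facts[p] for p in self.params}


# Hypothetical template for the ownership scenario.
OWNER_OF = QueryTemplate(
    scenario="ownership",
    question="Who should be called about this service?",
    cypher="MATCH (t:Team)-[:OWNS]->(s:Service {name: $service}) RETURN t.name AS team",
    params=("service",),
    columns=("team",),
)

# Only the facts the template asks for are passed along as parameters.
query, parameters = OWNER_OF.bind({"service": "billing", "topic": "invoices"})
```

Keeping the template as plain data is one way to keep the explainer simpler than what it explains: the template says what it needs and what it shows, and nothing more.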

# Realization

We have built such an explainer and are now getting encouraging results.

We use the data warehouse method of extract-transform-load to assemble deployed observations with intention and construction details into a Neo4j graph database.
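The extract-transform-load step can be sketched in a few lines. This is a toy under stated assumptions: a dictionary stands in for the graph database, and the rows, field names and relation names (`CALLS`, `RUNS_ON`, `OWNED_BY`) are hypothetical. What it shows is the shape of the work: differently shaped sources become uniform triples, each remembering where it came from.

```python
from collections import defaultdict

# Hypothetical extracts, one per kind of fact.
observed = [{"from": "orders", "to": "billing"}]             # observation: traced calls
constructed = [{"service": "orders", "host": "node-7"}]      # construction: deploy metadata
intended = [{"service": "orders", "team": "checkout-team"}]  # intention: service catalog


def transform(rows, relation, start_key, end_key):
    """Turn source-specific rows into uniform (start, relation, end) triples."""
    return [(row[start_key], relation, row[end_key]) for row in rows]


def load(graph, triples, origin):
    """Add triples to the graph, remembering which source asserted each one."""
    for triple in triples:
        graph[triple].add(origin)


graph = defaultdict(set)  # in-memory stand-in for the graph database
load(graph, transform(observed, "CALLS", "from", "to"), "observation")
load(graph, transform(constructed, "RUNS_ON", "service", "host"), "construction")
load(graph, transform(intended, "OWNED_BY", "service", "team"), "intention")

for (start, relation, end), origins in sorted(graph.items()):
    print(start, relation, end, sorted(origins))
```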

We query the graph by navigating a landscape of scenarios, finding the nearest next query that makes use of accumulated facts chosen from previous results.
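Choosing the next query might be sketched as follows. The notion of "nearest" used here, preferring the runnable template that consumes the most accumulated facts, is our assumption for illustration, and the template names and parameters are hypothetical.

```python
# Hypothetical templates, each listing the facts it needs as parameters.
TEMPLATES = {
    "consumers_of": {"source"},
    "owner_of": {"service"},
    "path_between": {"source", "sink"},
}


def next_queries(templates, facts):
    """List templates that can run now, those using the most known facts first."""
    known = set(facts)
    ready = [(name, needs) for name, needs in templates.items() if needs <= known]
    ready.sort(key=lambda item: (-len(item[1]), item[0]))
    return [name for name, _ in ready]


facts = {"source": "orders"}
first = next_queries(TEMPLATES, facts)   # only consumers_of can run

facts["sink"] = "billing"                # a fact chosen from a previous result
second = next_queries(TEMPLATES, facts)  # path_between now ranks first
```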

Each query result set is presented in tabular form, with column values hyperlinked when their choice would advance to a recognized next query.
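A table renderer of this kind could look like the sketch below. The `NEXT_QUERY` mapping, the `/query/...` URL scheme and the column names are hypothetical; the point is only that a column is linked when, and only when, its value would parameterize a recognized next query.

```python
from html import escape
from urllib.parse import urlencode

# Hypothetical mapping from a result column to the query its value can feed.
NEXT_QUERY = {"service": "owner_of", "topic": "consumers_of"}


def render_cell(column, value):
    """Render one cell, hyperlinked if its column leads to a next query."""
    text = escape(str(value))
    query = NEXT_QUERY.get(column)
    if query is None:
        return f"<td>{text}</td>"
    href = "/query/" + query + "?" + urlencode({column: value})
    return f'<td><a href="{escape(href)}">{text}</a></td>'


def render_table(columns, rows):
    """Render a result set (a list of dicts) as an HTML table."""
    head = "".join(f"<th>{escape(c)}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(render_cell(c, row[c]) for c in columns) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


html = render_table(
    ["service", "latency_ms"],
    [{"service": "orders", "latency_ms": 12}],
)
```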

Where query results describe a network, with columns such as "source" and "sink", we generate a vector sketch of this network with clickable nodes and arcs leading to obvious next queries, often drilling into details.
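As a sketch of the drawing step, rows with "source" and "sink" columns can be turned into Graphviz DOT text, which Graphviz can render as SVG with clickable nodes and arcs by way of its `URL` attribute. Using Graphviz at all is our assumption, as is the `/query/...` URL scheme; the sketch also assumes names contain no double quotes.

```python
from urllib.parse import urlencode


def to_dot(rows):
    """Build DOT text for a network described by source/sink rows."""
    lines = ["digraph explainer {"]
    names = sorted({r["source"] for r in rows} | {r["sink"] for r in rows})
    for name in names:
        # Clicking a node drills into that service (hypothetical URL scheme).
        url = "/query/service?" + urlencode({"name": name})
        lines.append(f'  "{name}" [URL="{url}"];')
    for r in rows:
        # Clicking an arc drills into that particular connection.
        url = "/query/stream?" + urlencode({"source": r["source"], "sink": r["sink"]})
        lines.append(f'  "{r["source"]}" -> "{r["sink"]}" [URL="{url}"];')
    lines.append("}")
    return "\n".join(lines)


dot = to_dot([
    {"source": "orders", "sink": "billing"},
    {"source": "orders", "sink": "shipping"},
])
print(dot)
```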

# Practice

We've found our success depends on reasonably clean data and relatively consistent terminology over organizational spans where this is not guaranteed. Within the graph database we can ask which data sources line up and, even better, what doesn't line up where it should.
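The line-up question amounts to comparing the names two sources use for the same things. A minimal sketch, with made-up service names and a deliberately simple normalization rule that is our assumption rather than anything the explainer is known to do:

```python
def normalize(name):
    """Fold case and separators so 'Order_Service' and 'order-service' compare equal."""
    return name.strip().lower().replace("_", "-")


def line_up(left, right):
    """Report which names two data sources share and which appear in only one."""
    left_names = {normalize(n) for n in left}
    right_names = {normalize(n) for n in right}
    return {
        "both": sorted(left_names & right_names),
        "only_left": sorted(left_names - right_names),
        "only_right": sorted(right_names - left_names),
    }


# Hypothetical names: what was observed running versus what the catalog lists.
observed_services = ["Order_Service", "billing", "legacy-batch"]
catalog_services = ["order-service", "billing", "reporting"]
report = line_up(observed_services, catalog_services)
```

The two "only" lists are the interesting part: a service observed but never cataloged, or cataloged but never observed, is exactly what doesn't line up where it should.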

We've found we can raise the model's level of abstraction by querying for idiomatically arranged nodes and adding summarizing relations between them. For example, we might say A streams to B if a Kafka topic is produced by A and consumed by B.
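The streams-to example can be sketched over plain pairs. The service and topic names are made up, and the Cypher in the comment uses hypothetical labels and relationship types; it only suggests how the same summarizing relation might be written against a graph database.

```python
from collections import defaultdict

# Hypothetical (service, topic) facts.
produces = [("orders", "order-events"), ("billing", "invoices")]
consumes = [
    ("billing", "order-events"),
    ("shipping", "order-events"),
    ("reporting", "invoices"),
]


def streams_to(produces, consumes):
    """A streams to B if some topic is produced by A and consumed by B."""
    consumers = defaultdict(set)
    for service, topic in consumes:
        consumers[topic].add(service)
    return sorted({
        (producer, consumer)
        for producer, topic in produces
        for consumer in consumers.get(topic, ())
    })


# A comparable graph query might read (labels are assumptions):
#   MATCH (a)-[:PRODUCES]->(:Topic)<-[:CONSUMES]-(b) MERGE (a)-[:STREAMS_TO]->(b)
streams = streams_to(produces, consumes)
```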

We've found we can introduce and manage more abstract concepts still. For example, consider a subsystem defined as a collection of cooperating services. We need not encode important properties of subsystems, such as connectivity or ownership, since these can be queried from their members.

We can now draw a map where subsystems are shown to stream between each other and both nodes and edges parameterize more detailed queries. With this we have addressed one scenario, how information gets from here to there.
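Deriving subsystem properties from members can be sketched as a roll-up. All names here are hypothetical. Nothing is stored about a subsystem beyond its membership; ownership and connectivity are computed from the member services on demand.

```python
# Hypothetical facts about individual services.
membership = {"cart": "checkout", "orders": "checkout",
              "billing": "finance", "ledger": "finance"}
owners = {"cart": "team-a", "orders": "team-a",
          "billing": "team-b", "ledger": "team-c"}
service_streams = [("cart", "orders"), ("orders", "billing"), ("billing", "ledger")]


def subsystem_owners(subsystem):
    """Ownership of a subsystem is whoever owns its member services."""
    return sorted({owners[s] for s, sub in membership.items() if sub == subsystem})


def subsystem_streams():
    """Subsystems stream to each other when any of their members do."""
    return sorted({
        (membership[a], membership[b])
        for a, b in service_streams
        if membership[a] != membership[b]  # streams inside one subsystem are hidden
    })


edges = subsystem_streams()
finance_owners = subsystem_owners("finance")
```

Each derived edge still knows which subsystems it joins, so a click on it can parameterize the more detailed query that lists the member-level streams behind it.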

We expect many varied scenarios will guide the same construction method and thus provide sound and current explanations of modern systems.


These notes have been prepared in anticipation of a keynote at the Explore DDD conference, 2017.

See Microservices Visualized for a lighthearted but still seriously challenging visual system.

See Ontology v. Stigmergy for where this work and federated wiki meet.