Modifiability by example

The architectural quality of modifiability defines the cost of changing a system: how long the change will take and how many resources it will require.

Modifiability is often seen as a static quality of the system, one that only concerns changes to the code, but this is not necessarily so. A change can occur when the system is built, for instance through the parameterization of the build script, or at initialization time, for instance by using a configuration file to decide which concrete factories should be created. Ultimately, it can also occur at runtime, for instance when the end user decides to change the configuration of her dashboard.

In general, the later the change occurs in the system’s life cycle, from implementing it to running it, the lower the cost of change. An end user can change the configuration of her dashboard in a few seconds, but a developer may take several months to change the system code base in order to support a new functionality. However, a system that has a low cost of change may have a high cost of development, because the mechanisms that will allow a low cost of change in the future have to be implemented during the initial development of the system. We can say that a modifiable system has a low cost of change and a high cost of development.

Consider the Graphite system architecture, as described by Chris Davis. In the description we can read: “Making multiple Graphite servers appear to be a single system from a user perspective isn’t terribly difficult, at least for a naïve implementation.” How can we analyze this sentence from the modifiability viewpoint? When does the change occur? Was the change anticipated?

The fact that multiple Graphite servers will appear as a single system to the client applications gives us a hint that a mechanism for change should be in place. This mechanism allows the change from a single server to multiple servers without impacting the client applications, which reduces the cost of change. The impact of this change will be confined, which leads us to suppose that an interface hiding two concrete implementations, single server and multiple server, needs to be defined. This supposition is confirmed in the next sentences, where the author tells us that “The find and fetch operations of the webapp are tucked away in a library that abstracts their implementation from the rest of the codebase, and they are also exposed through HTTP request handlers for easy remote calls.”
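To make the supposed interface concrete, here is a minimal sketch in Python. It is not Graphite's actual code: the names MetricStore, SingleServerStore, MultiServerStore and render are hypothetical, and only the find and fetch operations come from the description. The point is that the client function depends on the interface alone, so either implementation can be passed in without changing it.

```python
from abc import ABC, abstractmethod
from fnmatch import fnmatchcase
from typing import Dict, List


class MetricStore(ABC):
    """Hypothetical interface; only find and fetch come from the description."""

    @abstractmethod
    def find(self, pattern: str) -> List[str]:
        """Return the names of the metrics that match a glob pattern."""

    @abstractmethod
    def fetch(self, metric: str) -> List[float]:
        """Return the data points stored for one metric."""


class SingleServerStore(MetricStore):
    """All metrics live on one server (here: in one dictionary)."""

    def __init__(self, data: Dict[str, List[float]]) -> None:
        self._data = data

    def find(self, pattern: str) -> List[str]:
        return sorted(n for n in self._data if fnmatchcase(n, pattern))

    def fetch(self, metric: str) -> List[float]:
        return list(self._data.get(metric, []))


class MultiServerStore(MetricStore):
    """Naive federation: ask every server and merge the answers."""

    def __init__(self, servers: List[MetricStore]) -> None:
        self._servers = servers

    def find(self, pattern: str) -> List[str]:
        names = set()
        for server in self._servers:
            names.update(server.find(pattern))
        return sorted(names)

    def fetch(self, metric: str) -> List[float]:
        # Assumption: each metric is stored on exactly one server.
        for server in self._servers:
            points = server.fetch(metric)
            if points:
                return points
        return []


def render(store: MetricStore, pattern: str) -> Dict[str, List[float]]:
    """Client code: it only knows the MetricStore interface."""
    return {name: store.fetch(name) for name in store.find(pattern)}
```

Under these assumptions, render returns the same result whether the metrics sit on one server or are spread over several, which is what confines the impact of the change.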

So, there is a mechanism that allows the capacity of the Graphite system to change from a single server to multiple servers without impacting the client applications, but can we infer when the change occurs? Since the Graphite system description does not explicitly address this issue, we can only conjecture about the different moments at which the change may occur.

The first possibility is that it occurs at design time, which means that it is necessary to write code in order to carry out the change. It could be that, due to a time-to-market requirement, the first implementation of Graphite defines the find and fetch API but only provides the single server implementation. Therefore, to support the multiple server execution of Graphite it will be necessary to write a new implementation of the API, which may take a few days.

Consider now that the library already provides both implementations. How can they be instantiated? If it is at build time, then there are actually two different libraries implementing the same API, and a build script chooses which one to include in the final Graphite executable. In this situation a change implies regenerating the executable file, which may take several minutes. However, the architect may instead decide to have a single library and to choose the implementation through a configuration file that Graphite reads when it starts. In that case, to accomplish the change it is enough to edit the configuration file and restart Graphite, and the whole process may take only a few minutes.
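The initialization-time option can be sketched as follows. This is an illustration only: the JSON format, the "mode" key, the BACKENDS registry and the two backend classes are assumptions made for the example and do not describe how Graphite is actually configured.

```python
import json
from typing import Callable, Dict


class SingleServerBackend:
    """Placeholder for the single server implementation (hypothetical)."""


class MultiServerBackend:
    """Placeholder for the multiple server implementation (hypothetical)."""


# Registry mapping a configuration value to the factory that builds it.
BACKENDS: Dict[str, Callable[[], object]] = {
    "single": SingleServerBackend,
    "multi": MultiServerBackend,
}


def load_backend(config_path: str) -> object:
    """Read the configuration once, at startup, and build the chosen backend."""
    with open(config_path) as config_file:
        config = json.load(config_file)
    mode = config.get("mode", "single")  # assumed default: single server
    if mode not in BACKENDS:
        raise ValueError(f"unknown mode: {mode!r}")
    return BACKENDS[mode]()
```

With this design, moving from one implementation to the other means changing the value of "mode" in the file and restarting the process; no code is rebuilt.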

A more demanding solution would be to allow the system to change at runtime from single to multiple server, and vice versa, so that the configuration of the cluster adapts to its current load. In this case the library would need to deal with the startup and shutdown of servers at runtime, and to synchronize these operations, since a server cannot be stopped abruptly: it has to process all its pending requests before being removed from service. This last solution has better modifiability, but it implies a higher initial development cost because of the complexity of the mechanism that supports it.
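The synchronization this requires can be illustrated with a small sketch. It is not taken from Graphite: ServerPool and its methods are hypothetical, and servers are represented only by their names. Removing a server first takes it out of rotation and then waits until its pending requests have finished.

```python
import threading
from typing import Callable, Dict, Optional, Set


class ServerPool:
    """Hypothetical pool whose membership can change while requests run."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._pending: Dict[str, int] = {}  # server name -> requests in progress
        self._draining: Set[str] = set()    # servers that accept no new requests

    def add(self, name: str) -> None:
        with self._cond:
            self._pending.setdefault(name, 0)

    def handle(self, work: Callable[[str], object]) -> object:
        """Run one request on the least loaded server that is not draining."""
        with self._cond:
            candidates = [n for n in self._pending if n not in self._draining]
            if not candidates:
                raise RuntimeError("no server available")
            name = min(candidates, key=lambda n: self._pending[n])
            self._pending[name] += 1
        try:
            return work(name)
        finally:
            with self._cond:
                self._pending[name] -= 1
                self._cond.notify_all()  # wake up a remove() that is waiting

    def remove(self, name: str, timeout: Optional[float] = None) -> bool:
        """Stop routing to a server, then wait for its pending requests.

        Returns True if the server was drained and removed, False if the
        timeout expired first (the server then goes back into rotation).
        """
        with self._cond:
            if name not in self._pending:
                return False
            self._draining.add(name)
            try:
                drained = self._cond.wait_for(
                    lambda: self._pending.get(name, 0) == 0, timeout
                )
                if drained:
                    self._pending.pop(name, None)
                return drained
            finally:
                self._draining.discard(name)
```

Even this reduced version needs a lock, a condition variable and a draining state, which gives an idea of why the runtime option has the highest initial development cost of the alternatives discussed.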