Software Configuration

Simultaneous changes to code by different developers may cause conflicts when developers change the same files or develop their code on top of an outdated version of the source code. These are typical transactional conflicts between reads and writes, which occur in the context of long-duration transactions. On the one hand, developers need to work on their code in isolation for long periods, during which others should not access it. On the other hand, developers need to work on top of each other’s contributions. Software configuration manages the need for Isolation and Collaboration in a Software Code Base.

Today, the flexibility of distributed software configuration systems, which simplify the management of multiple shared repositories, has brought the need for disciplined management of codelines. However, as early as 2002, Steve Berczuk and Brad Appleton defined the principles of Software Configuration Management[1] on top of two conflicting forces that drive software configuration: Stability and Progress of a Codeline.

Git is probably the most widely used software configuration tool. To the concepts defined by Berczuk and Appleton, it adds the possibility of versioning locally: Local Versioning in Git.

GitFlow and the Git Branching Model apply the principles of stability and progress to the Git world.

Isolation and Collaboration in a Software Code Base

Every new piece of code is written on top of existing code that developers need to read in order to contribute to the code base. For that reason, the code base cannot change while a developer is coding, because the contribution should be consistent with a particular view of the code. This property of isolation is required by the development team when interacting with the project code base. At the same time, the code base should be the place where developers collaborate by submitting their code and using others’ contributions. This property of collaboration allows developers’ work to progress on top of the most recently produced code.

Software configuration management systems support the interaction between developers in the context of a code base. They define two kinds of repositories: a local repository, where a developer changes the code base, and one or more shared repositories, where developers commit the changes they have made in their local repositories and obtain the contributions of their colleagues. Local repositories support the property of isolation and shared repositories the property of collaboration.

Software configuration is responsible for guaranteeing the consistency of the code base when two developers simultaneously change the same file in their local repositories and commit those changes to a shared repository. To do so, software configuration systems enforce a transactional behaviour by serializing the commits to the shared repository. Every commit to the shared repository is stored as a version of the software artefact, and it points to the version it was built on. In case of a conflict, the two versions in conflict cannot be serialized because they contain different variations of the same file. Therefore, it is necessary to create a new version of the file that integrates the differences. To do so, the last user to commit needs to merge the most recent version in the shared repository with her local repository in order to preserve the history of changes. Note that in the case of conflict, from the perspective of the shared repository, the last commit occurs after all the other commits. A versioned code base is called a codeline.
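The serialization described above can be illustrated with a minimal sketch. It is a toy model, not how any particular tool is implemented: the names `SharedRepository`, `Version`, `ConflictError` and `changed_paths` are hypothetical, and the automatic integration of commits that touch different files is an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

Snapshot = Dict[str, str]  # file path -> file content


@dataclass
class Version:
    id: int
    parent: Optional[int]  # the version this one was built on
    files: Snapshot


class ConflictError(Exception):
    """Raised when a commit changes files that also changed since its base."""

    def __init__(self, paths: List[str]) -> None:
        super().__init__("conflicting files: " + ", ".join(paths))
        self.paths = paths


def changed_paths(old: Snapshot, new: Snapshot) -> Set[str]:
    """Paths that were added, removed or modified between two snapshots."""
    return {p for p in old.keys() | new.keys() if old.get(p) != new.get(p)}


class SharedRepository:
    """Hypothetical shared repository that serializes commits into a codeline."""

    def __init__(self, initial: Snapshot) -> None:
        self.versions: List[Version] = [Version(0, None, dict(initial))]

    @property
    def head(self) -> Version:
        return self.versions[-1]

    def commit(self, base_id: int, files: Snapshot) -> Version:
        base = self.versions[base_id].files
        mine = changed_paths(base, files)
        theirs = changed_paths(base, self.head.files)
        overlap = sorted(mine & theirs)
        if overlap:
            # Same file changed on both sides: the committer has to merge.
            raise ConflictError(overlap)
        # Assumption of this sketch: disjoint changes are applied on top of head.
        merged = dict(self.head.files)
        for path in mine:
            if path in files:
                merged[path] = files[path]
            else:
                merged.pop(path, None)
        version = Version(len(self.versions), self.head.id, merged)
        self.versions.append(version)
        return version


repo = SharedRepository({"a.txt": "hello", "b.txt": "world"})
alice = {"a.txt": "hello from alice", "b.txt": "world"}
bob = {"a.txt": "hello from bob", "b.txt": "world"}

repo.commit(0, alice)  # accepted on top of version 0
try:
    repo.commit(0, bob)  # built on an outdated version of the same file
except ConflictError:
    # Bob merges the head with his local copy and commits on top of the head.
    merged = dict(repo.head.files)
    merged["a.txt"] = "hello from alice and bob"
    repo.commit(repo.head.id, merged)
```

In this sketch the last committer's version always ends up after all the others in the codeline, and each version records the one it was built on, which is what preserves the history of changes.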

The difference between the transactional behavior of configuration management systems (long transactions) and that of operational systems (short transactions) is that neither of the usual concurrency policies is acceptable. An optimistic policy, which aborts the conflicting transactions, is not possible because we cannot lose several hours of work. A pessimistic policy, which prevents developers from working simultaneously on the same files, is not possible either because it would impede their work. The transactional behaviour of software configuration systems therefore fits into the class of long-running transactions.

Today, some models of software configuration, e.g. Git, allow versioning in the local repository, so that developers can preserve the history of their local changes to the code. Previously, to preserve that history, developers were compelled to commit intermediate changes to the shared repository. With software configuration systems that support local versioning, it is possible to have two different granularities of commit: developer work granularity in the local repositories and project work granularity in the shared repositories. Modern software configuration systems have also simplified the creation and management of shared repositories, which has a large impact on the development of software, see Stability and Progress of a Codeline.
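The two granularities can be pictured with a small sketch. The classes `LocalRepository` and `SharedRepository` and the `publish` method are hypothetical, and the sketch assumes that only the final local state is published as one project-level version. In Git itself, local commits are transferred as they are unless the developer chooses to combine them first.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Snapshot = Dict[str, str]  # file path -> file content


@dataclass
class LocalRepository:
    """Fine-grained history that stays private to one developer."""
    history: List[Snapshot] = field(default_factory=list)

    def commit(self, files: Snapshot) -> None:
        self.history.append(dict(files))

    def latest(self) -> Snapshot:
        return dict(self.history[-1])


@dataclass
class SharedRepository:
    """Coarse-grained history that the whole project sees."""
    history: List[Snapshot] = field(default_factory=list)

    def publish(self, local: LocalRepository) -> None:
        # Assumption of this sketch: one project-level version per publish.
        self.history.append(local.latest())


local = LocalRepository()
shared = SharedRepository()

# Developer work granularity: several small, intermediate commits.
local.commit({"parser.py": "first draft"})
local.commit({"parser.py": "handles empty input"})
local.commit({"parser.py": "handles empty input, with tests"})

# Project work granularity: a single version reaches the shared repository.
shared.publish(local)
```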

Fortunately, the burden of resolving conflicts by merging files is usually small, because the code base is large and project managers tend to carefully assign developers to work on different parts of it.

Stability and Progress of a Codeline

Software configuration systems promote collaboration among developers, who are invited to commit to the codeline often and to check out the most recent code. However, it is necessary to define the level of quality of the codeline, so that low-quality commits do not disrupt other programmers’ work. Obviously, code that does not compile should not be committed, but this criterion may be too weak. Suppose, for instance, that most of the tests do not pass after the commit. Therefore, any codeline should have a policy that defines the quality criteria, which is enforced by a set of tests.

The two forces that the policy has to balance are stability and progress. A stable codeline should pass most of the tests, which means that developers will have to test their commits more thoroughly. This results in a codeline where the code is only shared when it is stable, which reduces collaboration. On the other hand, a codeline policy promoting progress fosters frequent, lower-quality commits to the codeline in order to allow developers to integrate with the most recent versions of the code.
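A codeline policy of this kind can be sketched as a threshold over a set of checks. The `CodelinePolicy` class, the `min_pass_ratio` parameter and the three toy checks are all hypothetical and only illustrate the trade-off. A real project would run its compiler and test suite instead.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Snapshot = Dict[str, str]  # file path -> file content
Check = Callable[[Snapshot], bool]


@dataclass
class CodelinePolicy:
    """Hypothetical quality gate: a commit needs a minimum ratio of passing checks."""
    name: str
    checks: List[Check]
    min_pass_ratio: float  # 1.0 means every check must pass

    def pass_ratio(self, snapshot: Snapshot) -> float:
        if not self.checks:
            return 1.0
        passed = sum(1 for check in self.checks if check(snapshot))
        return passed / len(self.checks)

    def accepts(self, snapshot: Snapshot) -> bool:
        return self.pass_ratio(snapshot) >= self.min_pass_ratio


# Toy checks standing in for compilation and tests (assumptions of the sketch).
def compiles(snapshot: Snapshot) -> bool:
    return all("SYNTAX ERROR" not in text for text in snapshot.values())


def has_tests(snapshot: Snapshot) -> bool:
    return any(path.startswith("tests/") for path in snapshot)


def has_readme(snapshot: Snapshot) -> bool:
    return "README" in snapshot


checks = [compiles, has_tests, has_readme]
stable = CodelinePolicy("stable", checks, min_pass_ratio=1.0)
progress = CodelinePolicy("progress", checks, min_pass_ratio=0.5)

# Compiles and has tests, but no README: two of the three checks pass.
commit = {"app.py": "print('hi')", "tests/test_app.py": "assert True"}

accepted_by_progress = progress.accepts(commit)
accepted_by_stable = stable.accepts(commit)
```

The same commit is shared immediately under the progress policy and held back under the stable one, which is the balance the policy has to strike.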

The active mainline is the codeline where progress is the predominant force. However, during the project lifetime its policy may need to change. One such situation occurs when it is necessary to release a version of the system. In this case, it is necessary to reduce the number of new functionalities being implemented and to increase the debugging activities. Therefore, the mainline will tend to become more stable. However, if only a small part of the team is needed to prepare the release, the other developers cannot implement new functionalities or, if they do, they cannot commit them to the mainline.

The creation of several codelines in a project, usually referred to as branches, allows the coexistence of progress and stability. In the situation above, a new codeline can be created where the release preparation tasks occur, while the rest of the team continues committing to the mainline, where the progress policy is supported.
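The release situation above can be sketched as two codelines that share a common history up to the branching point. The `Codeline` class and its `branch` method are hypothetical names used only for illustration, and versions are reduced to short descriptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Codeline:
    """Hypothetical codeline: an ordered list of version descriptions."""
    name: str
    versions: List[str] = field(default_factory=list)
    branched_from: Optional[str] = None

    def commit(self, description: str) -> None:
        self.versions.append(description)

    def branch(self, name: str) -> "Codeline":
        # The new codeline starts from a copy of the history so far.
        return Codeline(name, list(self.versions), branched_from=self.name)


mainline = Codeline("mainline")
mainline.commit("feature A")

# A small part of the team stabilizes the release on its own codeline.
release = mainline.branch("release")
release.commit("fix bug in feature A")

# The rest of the team keeps committing new functionality to the mainline.
mainline.commit("feature B")
```

After branching, commits to one codeline do not appear in the other, so each codeline can follow its own policy.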

There are situations in which a new codeline is created to allow more progress than in the mainline. This occurs when it is necessary to make a disruptive change to the code, for instance an architectural refactoring, which introduces a large number of bugs into the code. In this case the new codeline supports a higher level of progress. However, disruptive codelines should be created with care, because it may be hard to merge them back into the mainline.