DVD’s Catalog, a toy example of modularity

As an example of the modularization process, consider a simple personal DVD Catalog system that is responsible for storing a personal collection of DVDs, supporting search, tracking borrowing, and importing DVD information from external sources such as Amazon and IMDb.

Given this description of responsibilities, the initial decomposition is trivial. It follows the reduce size tactic and splits the system into four main modules, one for each responsibility: manage collection, search, borrowing, and import. A fifth module, for data access, may eventually be identified because the information must be kept persistent. Note that the need for an explicit data access module depends on the technology that is going to be used. For instance, in a traditional enterprise application that uses different technologies to store data and to execute business logic, it may be relevant to have a well-defined interface that hides the details of data access from the business logic (encapsulate tactic).
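The encapsulate tactic mentioned above can be illustrated with a minimal Python sketch. All names here (Dvd, DvdRepository, InMemoryDvdRepository, SearchModule) are hypothetical and chosen only for illustration; the in-memory class stands in for whatever storage technology is actually selected.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Dvd:
    """Minimal catalog entry; the field set is an assumption."""
    title: str
    year: int


class DvdRepository(ABC):
    """Data access interface seen by the business logic (hypothetical name)."""

    @abstractmethod
    def save(self, dvd: Dvd) -> None: ...

    @abstractmethod
    def find_by_title(self, text: str) -> List[Dvd]: ...


class InMemoryDvdRepository(DvdRepository):
    """One possible storage choice; a database-backed class could replace it."""

    def __init__(self) -> None:
        self._dvds: List[Dvd] = []

    def save(self, dvd: Dvd) -> None:
        self._dvds.append(dvd)

    def find_by_title(self, text: str) -> List[Dvd]:
        # Case-insensitive substring match on the title.
        needle = text.lower()
        return [d for d in self._dvds if needle in d.title.lower()]


class SearchModule:
    """Business logic that depends only on the interface, not on storage details."""

    def __init__(self, repository: DvdRepository) -> None:
        self._repository = repository

    def search(self, text: str) -> List[str]:
        return sorted(d.title for d in self._repository.find_by_title(text))
```

Because SearchModule only knows DvdRepository, swapping the storage technology would not require changes to the search code, which is the point of the tactic.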

This system looks fairly simple. Does the architect need to decompose it further before delivering the architecture to the developers? Complexity depends on the concrete requirements, and it is the architect's job to elicit them. The part that looks most complex is the import of information from external sources. The architect can therefore ask several questions, for instance:

– What kind of information will be imported? Will it change in the future?

– Is the set of external information sources predefined, or do we expect to add new sources of information after the system is in production?

– Can we assume that the external sources comply with a standard interface that can be queried for information?

Besides the answers from the stakeholders, it may also be necessary to do some research on the existing sources, their differences, and whether they provide stable interfaces, and even to do some prototyping to evaluate the complexity of interoperating with them.

Suppose that, after some discussion with the stakeholders, the following three scenarios were written:

(1) The DVD’s Catalog system should be able to import information from the Amazon.com web site through a query on their web pages and get up to 80% of the DVD’s information. (interoperability scenario)

(2) The DVD’s Catalog system should be able to import information through the plain text interface IMDb offers and get 100% of the DVD’s information. (interoperability scenario)

(3) It should be possible to add a new external data source and it should be fully functional for a pre-defined level of interoperability in less than 2 months / 2 days / 2 hours / 2 seconds. (modifiability scenario)

From the first two scenarios we can identify modules responsible for parsing the data returned by each source. In the Amazon case the module parses HTML pages and infers the required information, and it will probably have to deal with changes in the page structure. For the IMDb data source the parsing module should be simpler because the format is predefined.

The third scenario is a good example of how complexity lies in the details, which the different possible scenario responses make explicit: how long can the modification take? 2 months? 2 seconds?

This distinction has an impact on the architecture and on the level of detail that the decomposition model should have. Suppose that the stakeholders state that 2 months will be enough. In this case the architect may decide that it is not necessary to prepare the system for this modification because, given the research on the existing sources, the architect is convinced that a new source-specific import module can be implemented from scratch for each new external source within that time. However, if a new source is expected to be included in a shorter period, the import module must be decomposed in detail to separate the common parts, which can be reused for any external source, from the source-specific parts, which need to be implemented for each new source. The shorter the period allowed for the change, the smaller the source-specific modules should be.
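The split between common and source-specific parts could look roughly like the following Python sketch. It is only one possible decomposition, not the design of any actual system: the class names, the REQUIRED_FIELDS tuple, and the "key: value" text format are all assumptions made for illustration, and the format is not the real IMDb interface.

```python
from abc import ABC, abstractmethod
from typing import Dict

# Assumed set of fields the catalog wants for each DVD.
REQUIRED_FIELDS = ("title", "year", "director")


class ImportSource(ABC):
    """Common part: reused unchanged for every external source."""

    def import_dvd(self, query: str) -> Dict[str, str]:
        raw = self.fetch(query)
        fields = self.parse(raw)
        # Keep only the fields the catalog knows about.
        return {k: v for k, v in fields.items() if k in REQUIRED_FIELDS}

    def coverage(self, record: Dict[str, str]) -> float:
        """Fraction of required fields that the source managed to fill."""
        return len(record) / len(REQUIRED_FIELDS)

    # Source-specific part: the only code written for a new source.
    @abstractmethod
    def fetch(self, query: str) -> str: ...

    @abstractmethod
    def parse(self, raw: str) -> Dict[str, str]: ...


class KeyValueTextSource(ImportSource):
    """Hypothetical source that returns 'key: value' lines."""

    def __init__(self, canned: Dict[str, str]) -> None:
        self._canned = canned  # stands in for a real network call

    def fetch(self, query: str) -> str:
        return self._canned[query]

    def parse(self, raw: str) -> Dict[str, str]:
        fields: Dict[str, str] = {}
        for line in raw.splitlines():
            key, sep, value = line.partition(":")
            if sep:  # skip lines that do not contain a colon
                fields[key.strip().lower()] = value.strip()
        return fields
```

In this sketch a new source only has to supply fetch and parse. The more behavior that moves into ImportSource, the smaller each source-specific class becomes, which mirrors the trade-off described above. The coverage method hints at how the "up to 80%" and "100%" responses of the interoperability scenarios might be measured.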

Finally, what would be necessary to support a 2-second modification? A dynamic binding tactic, where a new source is integrated automatically, would require standards for both communication and data format that the external sources comply with and the DVD’s Catalog system can use. However, this is not the case, which makes a scenario with this response an unrealistic requirement.
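To make the dynamic binding tactic concrete, the following Python sketch shows what the catalog side might look like if such a standard existed. The standard assumed here (every source is a callable that takes a title and returns a dictionary of fields, or None) is purely hypothetical, as are the names Source and SourceRegistry.

```python
from typing import Callable, Dict, Optional

# Assumed standard: a source is a callable taking a title and returning
# a dict of fields, or None when the title is unknown.
Source = Callable[[str], Optional[Dict[str, str]]]


class SourceRegistry:
    """Binds sources at run time; no catalog code changes per new source."""

    def __init__(self) -> None:
        self._sources: Dict[str, Source] = {}

    def register(self, name: str, source: Source) -> None:
        self._sources[name] = source

    def lookup(self, title: str) -> Dict[str, str]:
        # Merge results in registration order; earlier sources win on conflicts.
        merged: Dict[str, str] = {}
        for source in self._sources.values():
            result = source(title) or {}
            for key, value in result.items():
                merged.setdefault(key, value)
        return merged
```

Registering a source is then a single call at run time. The sketch works only because every source is assumed to honor the same interface and data format, which is exactly the precondition that the real external sources do not meet.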

Note that even if the 2 months scenario is chosen, developers may build some common parts during implementation through normal refactoring of the two specific modules, Amazon Import and IMDb Import. As a consequence, the introduction of a new external source won’t take as long as planned. This is a good example of the interplay between top-down upfront architectural design and a more bottom-up agile design. On the other hand, if the 2 days scenario is chosen, identifying the common and specific parts will require a fair amount of prototyping, which is actually bottom-up design.