On Code Libraries

Libraries of common code have a bad habit of growing wild. I currently maintain one such library; at just under 25k lines of code it is getting a little unwieldy, and every change risks unintended consequences.

All the while the library keeps gobbling up more code, adding complexity and further headaches.

The Problem

In an ideal world the current version of the library would only contain code that is actually used by more than one application. The central purpose of a code library is to avoid duplicated effort.

In reality what has happened is that any variation on any theme that already exists in the library gets sucked into the library as well, regardless of whether there will ever be a second use for said variation.

The problem lies in the structure of the library. It wasn’t written to be modular in the Dependency-Injection sense.

Let’s say the library contains a carefully crafted component to deal with file IO, from monitoring an input directory, to file locking, to archiving and clean-up. Then an application comes around and it needs exactly that functionality… exactly… except, the scanning algorithm is all wrong; the files need to be returned sorted on that name between those underscores at the end there, see?

Now there are two possible approaches.

Either the application copies all the code of the component and adjusts the scanning algorithm in its local copy, which leaves two versions to maintain. Yes, I can hear you cringing at the mere thought of branching the code in this way.

The other solution is not much better though, because it involves implementing a feature in the library that only makes sense in the context of that single application. That means an extra set of branches at every extension point, and keeping fingers crossed that nothing was overlooked and no bad interactions with pre-existing code were introduced.
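To make that second option concrete, here is a rough sketch in Python of what a library function tends to look like once it has absorbed an application-specific variation. The function name, the flag, and the exact sort key (the last underscore-separated field of the file name) are all made up for illustration.

```python
from pathlib import Path


def scan_input_directory(directory, sort_by_name_suffix=False):
    """Hypothetical library function that has grown an application-specific flag."""
    files = [p for p in Path(directory).iterdir() if p.is_file()]
    if sort_by_name_suffix:
        # Assumed variation: order by the last underscore-separated field
        # of the file name. Only one application wants this, yet every
        # maintainer now has to reason about this branch.
        return sorted(files, key=lambda p: p.stem.rsplit("_", 1)[-1])
    # Default behaviour shared by every other application.
    return sorted(files)
```

One flag looks harmless. The trouble starts when each new application adds its own flag and the branches begin to interact.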

The Solution

I already gave it away above really. Writing the library with a foundation in Dependency Injection gives applications the ability to use a whole component whilst replacing just a single part of it with their own implementation.

Discussions on why Dependency Injection is a good idea tend to focus on testability, or auto-wiring of the dependencies, or component lifetime management.

In reality, the killer feature is the ability to allow a code library to define the common-scenario assembly of code parts into reusable components, whilst allowing individual applications to replace any one or more constituent code parts with their own as needed.
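As a minimal sketch of that idea, assuming Python and plain constructor injection rather than any particular DI container: the library ships a file-intake component with a default scanner, and the application swaps in its own scanner without touching the rest. `FileIntake`, `default_scanner` and `scan_by_name_suffix` are hypothetical names, and the sort key is an assumed stand-in for the underscore-based ordering described earlier.

```python
from pathlib import Path
from typing import Callable, List

# A "code part": any callable that turns a directory into an ordered file list.
Scanner = Callable[[Path], List[Path]]


def default_scanner(directory: Path) -> List[Path]:
    """Library default: plain files in lexicographic path order."""
    return sorted(p for p in directory.iterdir() if p.is_file())


class FileIntake:
    """Hypothetical library component assembled from replaceable parts."""

    def __init__(self, directory, scanner: Scanner = default_scanner):
        self.directory = Path(directory)
        self._scanner = scanner  # injected dependency

    def pending_files(self) -> List[Path]:
        # Locking, archiving and clean-up would also live in this component
        # and stay shared; only the scanning part is swapped out.
        return self._scanner(self.directory)


# Application side: private code that never enters the library.
def scan_by_name_suffix(directory: Path) -> List[Path]:
    """Order files by the last underscore-separated field of the name."""
    files = [p for p in directory.iterdir() if p.is_file()]
    return sorted(files, key=lambda p: p.stem.rsplit("_", 1)[-1])
```

The odd application would construct `FileIntake("inbox", scanner=scan_by_name_suffix)`, while everyone else keeps calling `FileIntake("inbox")`. The library itself gains no new branches.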

With DI in place it becomes possible to enforce a strict rule: unless a code variation, component or code part is actively used by more than a single application, it will not be let into the library. When the first application uses some code, that code stays private to the application. Only when a second use actually eventuates does it get considered for lifting into the library.

I plan to make that the new hard-and-fast rule as DI gets introduced into the code base of the library.

I’ll do a more in-depth post about DI and my opinions on its correct use sometime soon.