Saturday, July 30, 2011

Intellectually Tractable Code

The task of writing software consists in large part of minimizing the intellectual burden necessary to completely understand a problem and its solution. In fact, I would argue that our intuition about how "clean" a piece of software is is, in essence, a measure of how closely our understanding of a solution matches the "essential" complexity of an ideal solution. A "clean" module will have a low intellectual burden for complete understanding.

As an example, consider two alternative library implementations of a singly-linked-list data structure:
  • In implementation A, list operations include semaphore locks, to ensure that the list is always in the appropriate state.
  • In implementation B, there are no locks - it is assumed that the caller will always take care to ensure that list manipulation / use is appropriately protected for concurrent access.

It is my contention that implementation B is cleaner, for two major reasons:
  1. There is no one design decision for the locking semantics that will be universally applicable. For example, there may be state-variable relationships to be maintained that cross the locking boundary for that entry on the list (like "links 1 and 2 must be added and removed from two different lists at the same time"); the type of locking is set in stone at compile-time, though it may be better to be different for different scenarios (e.g. lock using semaphores for this list; lock using critical sections for this other list; no locks necessary at all for this third list). This means that the developer must _still_ reason about locking in order to use this module effectively - only now it isn't as obvious that she needs to do so, or (to another developer) it isn't as obvious if she _has_ reasoned about locking to protect the relationships between her state variables. Run-time use is less intellectually tractable.

  2. It adds unnecessary dependencies for the linked-list module to work. Linked-list maintenance is well-suited to use in a single-threaded environment, but embedding locks within the linked-list maintenance code prevents deploying the module to environments that lack appropriate locking primitives. Unnecessary dependencies can also lead to layering problems - as when the linked-list maintenance depends upon the scheduler's locking primitives, what happens if the scheduler depends upon the linked-list module? What happens if the scheduler depends upon the linked-list module during initialization, prior to the locking primitives becoming available for use? Instantiation is less intellectually tractable.

There exist general guidelines for making code intellectually tractable, which I will try to expand upon in later posts.

No comments: