Alan MacCormack
Editor’s note: Alan MacCormack is a member of the Technological
Innovation and Entrepreneurship Group at the MIT Sloan School of Management. He
teaches Sloan’s core class in innovation and has served as thesis advisor for
several students in MIT’s System Design and Management Program (SDM), including
Andrei Akaikine, SDM ’09 (see story, page 1).
Over the last 10 to 15 years,
even the most traditional of industries have come to rely on software for
everything from inventory control to vehicle navigation. The average automobile
today has more software than the first Apollo moon rocket. Your garden variety
microwave may even have an algorithm for cooking popcorn to fit your specific
tastes. Despite this dramatic increase in the pervasiveness and importance of
software, however, many companies lack a fundamental understanding of the
architecture underlying their code. This problem costs firms millions of
dollars per year.
Ask systems designers at any
major commercial software company to describe the architecture of their product
on a whiteboard. They’ll typically draw a diagram showing a number of boxes (modules)
that perform highly specific functions, with a few neat connections between
them.
My research shows, however, that
if you actually measure the interactions between boxes at the code-level,
you’ll find the architecture is much more tightly coupled than anyone would
think. Coupling has its virtues—tight interactions between different pieces of
code can lead to increased performance in areas such as speed or memory
footprint. But coupling also has major drawbacks: it makes software harder to
correct and to adapt to meet future needs.
Virtual systems are
fundamentally different from other kinds of systems. As an information-based
product, software appears to be easy and quick to change—which can be an
advantage and a disadvantage. There are no physical changes to be made, yet the
complexity of modern software is such that even small modifications can ripple
through a system with unintended consequences. Software appears to be
malleable, but in practice, the architecture of many systems is opaque. A
developer dare not change them too much for fear of creating a tangled web of
dependencies and changes to upstream files.
Furthermore, unlike industries
such as automobiles and airplanes, which create new platforms from the ground
up every few years, modern software development efforts rarely start with a
clean slate. Most systems have a significant legacy, on top of which new
features and functionality are built. Unfortunately, it’s not obvious from
looking at the older code which pieces are connected to which others. It’s not
like working with a mechanical system, where you can see connections simply by
inspecting the product or reverse-engineering its design. Worse, this
hard-to-understand legacy code often embeds assumptions and design decisions
that are no longer optimal for the system.
Why are initial design
decisions often so out-of-whack with the current requirements for a software
system?
One reason is that the original
design may have been built quickly, by a small company or startup more focused
on releasing its first product rapidly than on building a framework to last for
many years and multiple product evolutions. Software engineers design programs
to meet their immediate needs, and in a startup, there is no guarantee that you
will be around in 12 months. Speed is of the essence, and any performance edge
is pivotal, no matter how you achieve it. Ten years later, however, when the
war for market share is over, the needs of a user might be better served by a
much more modular, maintainable, and adaptable system. In essence, early design
decisions create a “technical debt” that must be paid by all those that follow.
Let me provide a micro-level
example of these dynamics. Alice might decide to use a piece of functionality that
Robert has already designed in his module, so she writes some code to “call”
his function from her module. This saves time, but creates a dependency between
Alice’s module and Robert’s that may not be transparent to the system
architect. Five years down the line, when Robert and Alice have both retired to
Tenerife, that dependency may be a complete surprise to a programmer needing to
make a change. Changing code in Robert’s module may well cause Alice’s module
to cease functioning.
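To make the Alice and Robert scenario concrete, here is a minimal sketch in Python. Every name in it is hypothetical and invented for illustration. It assumes Robert's helper originally returned a price in whole cents and was later rewritten to return dollars as a floating-point number. The parser is passed in as an argument only so that both versions can be shown side by side; in a real codebase the call would be buried inside Alice's module, which is exactly why the dependency is easy to miss.

```python
# Hypothetical illustration of a hidden dependency; all names are invented.

def robert_parse_price(text):
    """Robert's original helper: returns the price in whole cents."""
    dollars, cents = text.strip("$").split(".")
    return int(dollars) * 100 + int(cents)


def robert_parse_price_v2(text):
    """Robert's later rewrite: returns the price in dollars as a float."""
    return float(text.strip("$"))


def alice_total(prices, parse):
    """Alice's code: silently assumes that parse() returns integer cents."""
    total_cents = sum(parse(p) for p in prices)
    # The ':02d' format only accepts integers, so a float total fails here.
    return f"${total_cents // 100}.{total_cents % 100:02d}"


if __name__ == "__main__":
    prices = ["$1.50", "$2.25"]
    print(alice_total(prices, robert_parse_price))
    try:
        alice_total(prices, robert_parse_price_v2)
    except ValueError as err:
        print("Alice's module broke after Robert's change:", err)
```

Nothing in Robert's module records that Alice relies on the unit and type of its return value, so a change that looks purely local to Robert stops Alice's code from working.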
The work that Andrei Akaikine,
SDM ’09, did in the thesis I supervised provides a great example of the costs
that arise from an architecture that is overly complex. In his thesis, he
examined a software system with a long history, which generated significant
maintenance costs each year. Every change could create unexpected problems and
require additional fixes to other parts of the system. The owner of this
system—a large commercial software firm—decided to redesign the software with
the goal of adding new features to the system, while simultaneously reducing
its complexity (by reducing the coupling between elements). Akaikine showed
that the result of this redesign was a significant reduction in maintenance
effort, as captured by the time it takes to fix defects.
Of course, any major redesign
involves significant costs of its own—management has to decide if these costs
are warranted. Unfortunately, many businesses make these decisions based on
gut-feel and intuition, rather than a rigorous analysis of the likely payoffs.
We need much better data to make informed decisions, and the software industry
is woefully lacking in such data. Ultimately, this is why the work I have done
with Akaikine and other ESD students—including Daniel Sturtevant, SDM ’07, who
is working on his PhD—is important. We are among the first research teams to
visualize and measure the extent of technical debt in legacy software systems.
To achieve this goal, we have
developed pioneering methods for visualizing and measuring attributes of a
software architecture that can help us assess its underlying structure.
Consider a well-known example from a recent paper, in which we look at the
Mozilla web browser. After its release as open-source software in 1998, a major
redesign effort was undertaken on the system, with the aim of making the
codebase more modular, and hence easier to contribute to. The design structure
matrices (DSMs) from before and after this redesign (see Figure 1) illustrate
what happened. The modular architecture that resulted facilitated contributions
to the code by creating fewer unintended interactions between components.
Before the redesign, each component was, on average, connected to 18 percent of
other components. Afterward, this figure dropped to below 3 percent.
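A rough sketch of how a figure like "connected to X percent of other components" can be derived from a DSM follows. It is an illustration under stated assumptions, not the procedure used in the paper: the component names and dependencies are made up, and the article does not say whether the percentages count only direct links or indirect ones as well, so the sketch computes both. Indirect links are found with a standard transitive closure (Warshall's algorithm).

```python
# Illustrative only: toy components and dependencies, not data from the study.

def build_dsm(components, deps):
    """Return a square boolean matrix; dsm[i][j] is True if i depends on j."""
    index = {name: i for i, name in enumerate(components)}
    n = len(components)
    dsm = [[False] * n for _ in range(n)]
    for src, targets in deps.items():
        for dst in targets:
            dsm[index[src]][index[dst]] = True
    return dsm


def transitive_closure(dsm):
    """Add indirect dependencies (Warshall's algorithm) to a copy of the DSM."""
    n = len(dsm)
    reach = [row[:] for row in dsm]
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def average_connectivity(matrix):
    """Fraction of the other components each component depends on, on average."""
    n = len(matrix)
    if n < 2:
        return 0.0
    links = sum(1 for i in range(n) for j in range(n) if i != j and matrix[i][j])
    return links / (n * (n - 1))


if __name__ == "__main__":
    components = ["ui", "layout", "network", "storage"]
    deps = {"ui": ["layout"], "layout": ["network"], "network": ["storage"]}
    dsm = build_dsm(components, deps)
    print("direct:  ", average_connectivity(dsm))
    print("indirect:", average_connectivity(transitive_closure(dsm)))
```

In this toy chain the direct figure is 25 percent, but it doubles to 50 percent once indirect links are counted, which is one reason a whiteboard diagram can understate how tightly a system is really coupled.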
Ultimately, different designs
will have different performance characteristics along a variety of important
dimensions, making techniques like ours valuable for exploring design
trade-offs. A highly integrated design is likely to be faster, while a highly
modular design may be more reliable. A designer must consider carefully what
the product needs to do to arrive at the optimal design for her objectives. For
example, if a system has to last 10 years, and you have no idea what it will
need to do at the end of that time, the software must be designed to be
extremely flexible and evolvable. Unfortunately, very few software companies
practice such forward-looking “systems thinking.”
How should a firm begin? Nobody
should rush headlong into full-blown refactoring of a major system, given we
are still in the infancy of understanding how these efforts work. Indeed, our
research reveals that a manager’s intuition about where to start such an effort
is frequently wrong, because perceptions of an architecture and the realities
embedded in its source code are often in conflict. Software companies first
need to generate data on measures of architecture, and begin to link these
measures to performance outcomes that they care about. Most firms tinker with
and redesign their software all the time—in effect they run hundreds of small
experiments every year. Armed with a careful assessment of this data, they will
be better placed to assess what works and what doesn’t. Ultimately, we know
complexity hurts. But reducing it is also a complex endeavor.


