Friday, May 04, 2007

Hiding Assumption Lists & Parnas Partitioning

I'm posting about this both because I think it is interesting and because "hiding assumption lists" is a really poor name, such that I can never remember what it refers to. I don't think you have to know anything about programming to get the gist of this post.

When designing software, one idea that everyone basically subscribes to is making the software "modular." That means instead of having one giant clump of code, you have many smaller clumps. Ideally, each clump kind of does one thing, and doesn't contain too much code. The smallest level of clumps can clump together to form larger modules.

This modularity can mean a lot of specific things in terms of how code is organized, because it takes place on a number of levels and partly depends on what type of programming language you're using. But the more important general question is: how do you decide what things to lump together (or, alternatively, what things to separate out)?

A general design principle is that you want to have high cohesion and low coupling. "High cohesion" means that each module handles one thing and not several things. "Low coupling" means that the modules depend on each other as little as possible, rather than being interwoven.

For instance, we can compare the modules of a program to a group of people making dinner. Cohesion (which is good) would be something like one person making the salad, one person making the stir-fry, and another person making the bread. You might get even higher cohesion if you had one person doing all of the vegetable prep (cutting things up), one person running the stove, and another person managing the oven: this way each person's task would be quite specific. Coupling (which is bad) is a matter of how much the people have to interact. It's OK if the vegetable prep person interacts once with each person - taking orders for vegetables and filling the orders. But it would be annoying if two people were working on different aspects of the same dish at the same time, so that they constantly had to interact with each other, passing information and ingredients back and forth. What you want is for each person to have a single, coherent task (high cohesion) and also for each person to interact with the others a minimum amount (low coupling).

But "hiding assumption lists" give you a different way to think about design. What you do here is think about what might need to be changed about your program later. Might it be ported to a different type of machine? Might the type of database it uses change? Might the user interface be improved? Your ideas about what changes might occur are the "hiding assumptions" (you assume such-and-such might need to change).

You then divide up your modules (this is the "Parnas Partitioning") such that the changes you assume might be needed will be as easy as possible. For instance, if you think that the form of output might change (maybe right now your program puts information on the monitor, but you think later it might just print everything out to a printer), you might respond by making sure all of your output functions are grouped together, rather than interleaved into all of the other code. That way you'd only have to go to one place to make those changes.
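
For the programmers reading, here is a minimal sketch of that output example in Python. All of the names (ScreenOutput, PrinterOutput, report_totals) are made up for illustration, and the "printer" just collects lines in a list rather than talking to real hardware. The point is only that the rest of the code calls emit() and never needs to know where the text ends up.

```python
import sys


class ScreenOutput:
    """Hypothetical output module: writes each line to a text stream (the screen by default)."""

    def __init__(self, stream=None):
        self.stream = stream if stream is not None else sys.stdout

    def emit(self, text):
        self.stream.write(text + "\n")


class PrinterOutput:
    """Hypothetical stand-in for a printer: it only collects the lines it is given."""

    def __init__(self):
        self.pages = []

    def emit(self, text):
        self.pages.append(text)


def report_totals(orders, output):
    """Program logic: it decides what to say, but not where the text goes."""
    for name, count in sorted(orders.items()):
        output.emit(f"{name}: {count}")
    output.emit(f"total: {sum(orders.values())}")


if __name__ == "__main__":
    # Switching from screen to printer means changing only this one line.
    report_totals({"salad": 2, "bread": 1}, ScreenOutput())
```

If the form of output changes later, only the output classes (and the one line that picks between them) should need to be touched; report_totals stays as it is.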

Going back to our dinner-makers, let's guess that we might want to change what type of salad is made. If we originally had the salad-making functions divided up between the veggie-prep person and the salad-maker, it could be annoying to have to tell them both that we now want a caesar salad. (This isn't so bad with humans, because they program themselves, but with code you have to write out - and thus later modify - exact instructions.)

We might have originally had a "make salad" function in both our veggie-prep person (it could say "chop up a head of iceberg lettuce, three tomatoes, and an avocado") and our salad-chef ("when veggie-prep gets you the vegetables, put the lettuce on the bottom, tomatoes and avocado layered on top like so"), and then we'd have to change both in order to change the type of salad. It would make more sense to have the veggie-prep person simply take orders ("5 cups of chopped romaine") and deliver the prepared veggies. That way we'd only have to change our instructions to the salad maker ("order romaine from veggie-prep", etc.), and the salad maker would just give a different order. We'd never have to worry about the veggie-prep person at all in making the change.
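
The improved kitchen could be sketched in Python roughly like this. The class and method names (VeggiePrep, SaladMaker, fill_order, make_salad) are my own inventions for the sake of the metaphor, not anything standard. The veggie-prep person fills whatever order comes in, and the salad maker is the only one who knows which salad is being made.

```python
class VeggiePrep:
    """Takes orders and delivers chopped vegetables; knows nothing about salads."""

    def fill_order(self, vegetable, cups):
        return f"{cups} cups of chopped {vegetable}"


class SaladMaker:
    """The only place that knows what kind of salad we are making."""

    def __init__(self, prep):
        self.prep = prep

    def make_salad(self):
        # Changing the salad means changing only this method.
        romaine = self.prep.fill_order("romaine", 5)
        return f"caesar salad made from {romaine}"


if __name__ == "__main__":
    print(SaladMaker(VeggiePrep()).make_salad())
```

To switch to a different salad, we would edit make_salad and nothing else; VeggiePrep just gets a different order.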

The design change I described also results in lower coupling, but the idea is that by thinking in terms of future changes, it's easier to see where the problems are in the design and which ones are most important to fix.

When we first learned about this, I realized that I think about my own code this way all the time. I've hardly done any design ever - I tend to design by coding, which works fine for tiny projects but very poorly for larger ones - but I do think about how easy it will be to change certain things without messing everything else up. I'm always pleased when a formal technique turns out to correspond to something I instinctively do, though it would make more sense to be pleased at the opposite - a new technique that gives a perspective I never thought about.

(Yes, I realize the way I talked about people making dinner together is kind of an idiotic way to think about people - "you want a minimum of interaction", etc. - but I only meant it as a highly imperfect metaphor. I am really not like the Borg or anything. "Number 8, slice 2 carrots.")
