Alethiography: April 2007

Monday, April 30, 2007

Review for the Quiz

Via Sally, in case you've been goofing off, and not reading my blog with the proper level of detail, here is a tag cloud via TagCrowd of the last ~40 days of my blog postings:

created at TagCrowd.com

Guess I've been a bit focused on school lately, eh?

Thursday, April 26, 2007

The Software Engineering Principles course I'm currently taking is meant to prepare us for Software Engineering Practices, the "Senior Experience" course for CS majors (and for my made-up degree). The Practices course involves working on a big group project for the entire semester, doing all phases of the software development cycle (requirements, design, coding, testing, etc.). So to help us prepare for that, we've been doing a small group project in our current class.

It hasn't been very good. Our class only had five students, and one dropped out fairly early in the project after having an argument with me (a very civil argument over a mild question; I'm not sure why he got as angry as he appeared to, and I'm not at all sure this is why he dropped out).

Of the four we have left, there is me (smart, moderately diligent), W (smart, moderately diligent, extremely busy and with almost no time to put into school right now), H (smart, a total fuck-off), and B (learning-disabled, may have skills that have not yet been revealed, has not done any work to date). And Dr. P has been contributing to the project himself.

Nevertheless, it's been my best group project for school ever. When we talk about design issues in class, and really hammer them out, it's incredibly fun and feels really productive. I never thought working on a team could be like this. (I can only imagine what it would be like if I had teammates who ever did any work.)

Our project itself currently resides in a file repository managed by CVS. Basically, CVS lets multiple people work on the various files by checking them out and then checking them back in with changes. It stores all of the previous versions of the files, so if you need to revert to one that you didn't yet screw up, you can. When people make changes, they upload their changed files with a comment about what they did.

It's fun checking the repository every day for new stuff, and reading the history logs (the comments people made when they checked things in) on the rare occasions that someone has done something. (Actually, Dr. P does something fairly often, and I usually do some work a couple of times a week, but otherwise, almost nothing ever happens.)

Here are a few things I have learned about doing this kind of group work using a repository.

Donut offenses - Don't break stuff that other people need to do their work. In some workplaces, doing so is known as a "donut offense" and you owe one donut to each person whose work you interrupted.

Submit early - You shouldn't wait until you get your stuff (whatever you're working on) completely finished to put it on the repository. Just because you created a file doesn't mean it's "yours" - it's better to go ahead and upload it when you're at a stopping point so other people can work on it in the meantime. Relinquish ownership.

Group decisions - Design issues need to be ratified by the group. It's fine to go implement stuff on your own, or make mock-ups or prototypes of anything you think might be interesting, but you can't change the project design without consulting with your teammates. For instance, we have a database schema that has been fairly well hammered out in a lot of group discussions. I've written all of the code for the schema, but I can't just go adding stuff without discussing the design with the group.

No ego - A lot of the above relies on not having a lot of ego-attachment to your work. One way you can be productive, as I mentioned, is to make mock-ups or prototypes of things you think might be neat. But this only works if you understand that your prototype is probably not going to become the product, and your mock-ups may be totally trashed (gently) by your teammates. You have to do everything from the perspective of trying to be useful to the collective effort, rather than with the idea that your stuff is the best and will naturally be adopted. Or you have to at least fake it pretty well.

Avoid parallel play - Some people in our group always just do their own stuff without seeming to even check the repository or message boards first. Sometimes what they do has already been done by someone else, or someone else has made remarks that might have influenced the decisions they made. So even though it's OK to do some stuff totally on your own and then submit it to the group, it's good to stay in the loop about what's going on in the project as a whole rather than just totally doing your own thing. ("Parallel play" is what they call it when tiny children, who aren't old enough for social interactions, play together. They basically just play alongside each other.)

Groups aren't useless - A few times now, we've had a design task (mostly concerning the database our product is built on), and I've spent time in advance thinking about it a lot, and have been pretty sure I had it nailed down. One time I wrote up a whole document with two approaches to something, and arguments about why one approach was superior. In that case, trying to explain it to the group, and discussing why one design or the other might be better, yielded a third design that was better than either of my original two. In every case, group discussion has resulted in better designs than anyone had come up with on their own. (A note about design: it's best IME to do design on your own and then hash it out in a group. You can't really design things "on the fly" - you get much better ideas if you think about it for a couple of days by yourself. So I'm talking about refining the design or choosing between alternatives.) This kind of goes against our common sense that groups just muddle things up and accomplish nothing.

Be a self-starter - Sometimes people in the group don't do anything because they feel like nobody told them what they were supposed to do. When you're working in a group of peers, you can't expect someone to tell you what to do. You know what's going on with the project - see what's needed and go do some of it. If you can't think of any defined tasks you can do, go do something you think might be helpful (remember: no ego attachment!) even if you're not sure it's "wanted."

It's sad that this crummy group has been my best group work experience ever, but the parts that have gone well (and even badly) have been real eye-openers.

Thursday, April 19, 2007

The Nano-Date

In our software engineering class, we've been reading Waltzing with Bears (by DeMarco and Lister), a book about risk management. It's geared towards software development, but a lot of the ideas would be the same no matter what kind of project you're running.

The basic idea about risk management is that the odds of nothing "unexpected" occurring to delay your (non-trivial) project are very small, and by not acknowledging that, you are lying to yourself and other stakeholders and unjustifiably relying on luck to let you finish within the time and budget you've allotted.

When most people are given a project to accomplish, and asked how long it will take, they either guess (this is bad) or estimate (this is better) using an assumption that nothing will go wrong. In the software industry, there are tools like COCOMO for estimating how long a project should take. In your own life of doing projects, you probably have a sense of how long things should take you under normal circumstances. This is the date most people report as their answer.

DeMarco and Lister call this "the nano-date" because it's basically the earliest date you might finish: the odds of finishing before that date are nil, and the odds of finishing exactly on that date are very, very small. According to their research, the distribution of completion dates looks like this:

I wasn't able to label the vertical axis, but it represents the probability of finishing on a particular date. The area under the curve adds up to 100% - you're certain to finish overall, but the exact date is uncertain. N is the nano-date, and from the graph you can see why the odds of finishing on that date are so small. The right skew (the way it's higher on the left side and then slopes more gently on the right) is because projects that finish before their "most likely date" (the top of the hump) are likely to finish only a little before, while late projects can drag out forever.

According to their research, the range of very likely completion dates goes from about 250% to 300% of N. That is to say, if your "best case" says your project will take 10 months, it will probably in reality take 25 to 30 months.
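To make that arithmetic concrete, here is a minimal Python sketch. The function name and the default factors are my own; the 2.5 and 3.0 multipliers are just the 250% and 300% figures quoted above, not anything more precise from the book.

```python
def likely_completion_range(nano_months, low_factor=2.5, high_factor=3.0):
    """Scale a best-case ("nano-date") duration into a likely range.

    low_factor and high_factor are assumptions taken from the
    250%-300% rule of thumb quoted in the post.
    """
    return (nano_months * low_factor, nano_months * high_factor)


# A 10-month best case, as in the example above.
low, high = likely_completion_range(10)
print(f"Best case 10 months -> likely {low:.0f} to {high:.0f} months")
```

Swap in your own factors if your organization's history suggests a different spread.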

That sounds terrible! But if you turn it around, it means that, given a realistic estimate of when your project will probably be done, it might be done in as little as 1/3 to 2/5 of that time. Your project could be way early if you do a good job estimating the probable effects of things that might go wrong and use that estimate to pad your original distribution curve properly. And if you apply proper risk management, you can also avoid, mitigate, or contain your risks, but that's for another post.

Meanwhile, here are the top five risks DeMarco and Lister have identified for software projects, in order from worst to uh...best? (They studied how much harm these factors usually do to a project, and this is based on that average damage.)

schedule flaw: the original estimate of the schedule (from whence N and the rest of the unrisked distribution should derive) was totally bankrupt to begin with, and not because other things went wrong later, but because it was just an incompetent estimate of how long a project should take in your organization.

requirements creep: the client (internal or external) keeps adding new things they want the product to do. The U.S. Department of Defense did a study in which they estimated that the size of a software project grows by about 1% per month.

turnover: important people leave your organization and this messes up your schedule.

specification breakdown: negotiations with your client totally break down, and the whole project is cancelled. DeMarco and Lister have found this happens to about 1 in 7 projects. Obviously this isn't an incremental risk like the others - it's just a flat risk. Once you get past a certain point - where everything important has been absolutely nailed down and signed by all parties - this risk disappears.

underperformance: the people working on your project don't work as effectively as they reasonably would be expected to. This risk actually breaks even - sometimes you get overperformance instead (as you'd expect, given the nature of estimation).
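To get a feel for the requirements creep figure in the list above, here is a rough Python sketch of what "about 1% per month" does over the life of a project. I'm assuming the growth compounds month over month; the study as quoted doesn't say, and the function name and the starting size of 100 are made up for illustration.

```python
def grown_size(initial_size, months, monthly_growth=0.01):
    """Project size after `months` of creep.

    Assumes compound growth at `monthly_growth` per month, which is
    one reading of the "about 1% per month" figure, not a stated fact.
    """
    return initial_size * (1 + monthly_growth) ** months


# Start from an arbitrary size of 100 units (function points, features, whatever).
for months in (6, 12, 24):
    print(months, "months:", round(grown_size(100, months), 1))
```

Under that assumption, a two-year project ends up more than a quarter bigger than what was originally estimated.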

The two takeaway ideas I have from this so far are...

1. Don't give the nano-date as your estimate of when a project will finish, no matter what the size of the project is. Recognize that there is a distribution of when you might finish, and use a more likely point in that distribution as your commit date. (A useful heuristic: given your estimate, is there a good chance you'll finish early? If not, your estimate is a lie.)

2. You can get a list of risks (perhaps from problems encountered on previous projects) and use estimation of their average effects to pad your estimates.
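Here is one way the second idea might look in Python: multiply each risk's probability by the delay it would cause, add those up, and tack the total onto the nano-date. This is a sketch of plain expected-value padding, not DeMarco and Lister's actual method, and every probability and delay in the example list is invented for illustration.

```python
def padded_estimate(nano_months, risks):
    """Add the expected delay from each risk to the best-case duration.

    `risks` is a list of (name, probability, delay_months) tuples.
    Expected delay for one risk = probability * delay_months.
    """
    expected_delay = sum(
        probability * delay_months for _, probability, delay_months in risks
    )
    return nano_months + expected_delay


# Hypothetical numbers only; real ones would come from your past projects.
example_risks = [
    ("schedule flaw", 0.5, 4.0),
    ("requirements creep", 0.75, 2.0),
    ("turnover", 0.25, 4.0),
]

print(padded_estimate(10, example_risks), "months")
```

The result is an average, so you could still finish earlier or later than it says, which fits the heuristic in the first idea.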

Soon I'll try to post about strategies for dealing with the risks themselves.

Wednesday, April 04, 2007

Good Teacher, Bad Teacher

Sometimes they are the same teacher.

Last week we got an assignment in Geometry that has been taking everyone hours and hours. First of all, some of the algebra is taking an extremely long time. The first problem out of seven took me about 5 pages of (not very densely written) algebra to get through, though I left the vast majority of it off of my homework paper because I know he doesn't care that we show that we can do a bunch of dumb algebra. And second of all, the concepts behind the homework are a bit challenging and require thought.

The way Dr. T gives homework is that it never includes solving problems of a type we have been shown in class or anything like that - it's always an extension or a related sidebar. This is appropriate but makes the homework and tests (which are the same way) pretty challenging.

Anyway, when he asked for questions at the beginning of class, I asked if the homework due date could be pushed back because it was taking a really long time for me and, I imagined, for other people too. Instead of hanging me out to dry, the class agreed. So he agreed to move the due date to Tuesday. Good teacher!

Then he told us we shouldn't be doing all this algebra by hand, but should be using Mathematica. I was already planning to do this, but a lot of the class was astonished to find out that it would be OK and not cheating. My take on it (having known for a few weeks that it was OK) is that using Mathematica to do algebra in this class is like using a calculator for arithmetic in a Calculus class - kind of a no-brainer. (It's assumed that we know algebra.)

You could, of course, program Mathematica to solve the whole problem, rather than just using it for fill-in algebra, but you'd be demonstrating total mastery of the problem in so doing, so it ought to be fine.

Anyway, then it came up that most of us don't know how to use Mathematica. So he actually spent about 10 minutes showing us how (writing the commands on the board). That was neat since I had just taught myself the same stuff earlier in the day.

This led to more questions about the homework, and I got into a little argument with Dr. T over a point I had misunderstood, and some other folks argued over different things, and he eventually said, "The concepts in this course are not difficult. If you think they are, you might be barking up the wrong tree."

Well, fuck you too. (Bad teacher!)

There are times when it's appropriate to suggest that having unusual difficulty with the material of a course might suggest an ill-chosen field of study, but this wasn't one of them. For one thing, every student in this class besides me is a secondary math education major. You need to know a lot of math to pass the tests to get certified, and of course you should know a lot of math to teach high school math, but (a) none of the material for this course is needed for teaching, and (b) you don't need to be a math genius for it either.

And for another thing, even a person good enough to potentially pursue a Math PhD will probably occasionally have conceptual difficulty with a math topic, and it's impossible for someone very familiar with the math to judge how hard it is. I'm not a math genius, but I'm not unfit for advanced study either, and I find this material pretty challenging at times.

Anyway, I got Mathematica in the mail yesterday (joy!) and I've decided to do my entire homework in it - write-up, graphs, drawings, and all. I've been working on that all day (I took the day off work) and it's going to take me hours and hours but be incredibly fun and result in a beautiful paper to turn in. What could be better?

My wall now sports a banner-shaped poster that proclaims "MATHEMATICA SPOKEN HERE" and I hope that will soon be true.