As a software architect, team leader, enterprise IT consultant, and expert witness, I have been studying and explaining why IT projects succeed or (more commonly) fail for nearly 20 years.
The factors for failure are remarkably consistent, as is the path that failing IT projects typically take. In one sense, that makes my job easier, since a relatively simple checklist can usually uncover the fundamental issues. On the other hand, I find that clients -- when I review projects that are still alive (vs. those already in litigation) -- often don't want to hear what I have to say. Gets a bit depressing at times, but there you go.
So, let's look at all the failure factors that have been revealed so far in the Healthcare.gov debacle.
I could provide links to all these, but lots of people have been doing that for us, so I'll just summarize:
[0] A badly-written law, based on unsustainable economic assumptions, multiplied into 11,000+ pages of regulations (Ashton's Law).
[1] The wrong people and organizations.
[2] The wrong organizational structure (Conway's Law).
[3] Changing and politically deferred requirements.
[4] A big bang approach (instead of an evolutionary one) (Gall's Law).
[5] Trying to fit "five years of development into two years".
[6] Doing an incredibly poor and late job on software quality assurance, including (but not limited to) testing.
[7] Going live before the system was ready.
[8] Trying to fix the system while keeping it (mostly) up and (mostly) running.
[9] Adding manpower to a late project (Brooks' Law; see the sketch just below).
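Why does adding people to a late project make it later? Part of Brooks' classic argument is simple arithmetic: every pair of team members is a potential communication channel, and the number of pairs grows quadratically, as n(n-1)/2, while each new hire also soaks up the veterans' time during ramp-up. Here is a minimal sketch of that arithmetic -- the team sizes are purely illustrative, not figures from the Healthcare.gov project:

```python
# The arithmetic behind Brooks' Law: pairwise communication channels
# in a team of n people number n*(n-1)/2, so coordination overhead
# grows quadratically while added output grows (at best) linearly.
# Team sizes here are illustrative, not Healthcare.gov figures.

def communication_paths(n: int) -> int:
    """Pairwise communication channels in a team of n people."""
    return n * (n - 1) // 2

for team_size in (5, 10, 20, 40):
    print(f"{team_size:3d} people -> {communication_paths(team_size):4d} channels")

# Prints:
#   5 people ->   10 channels
#  10 people ->   45 channels
#  20 people ->  190 channels
#  40 people ->  780 channels
```

Double the team and you quadruple the channels; that, plus the ramp-up cost of every new arrival, is why the late-project instinct to throw bodies at the problem so reliably backfires.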
It is hard under the best of circumstances to bring a massive IT project to a successful launch (mostly on-time, mostly on-budget); making all these errors along the way made failure inevitable. I'm sure there are plenty more, but these alone are more than enough to kill a project.
Most of the failed IT projects I've examined (between 150 and 200) have been after-the-fact, when the matter has gone into litigation. I have reviewed hundreds of thousands of pages of documents (and done directed searches on millions of pages), tracing the history of the projects and attempting to identify where things went wrong.
So, what can you expect next?
There are two basic courses that a project such as this takes at this point: oscillation or off-line.
Oscillation is what happens when the developer continues to push ahead, attempting to fix/stabilize a live system. The system may improve marginally, but then new problems are uncovered and/or introduced by the on-going changes. I have seen projects go like this for months, with the essential fixes perpetually 4 to 8 weeks away, and the system never really behaving acceptably.
Off-line is the wiser move. The system is taken down for the time being, and a focused effort is made to bring it up to an acceptable level before relaunch.
At this point, those who are wise will insist on more, not less, time for development and more, not less, time and resources for software quality assurance, including testing. They may even recommend pulling the plug altogether, or at least going back to something close to the start.
Such people are rarely heeded, because their news is invariably bad and usually unacceptable to the Powers That Be. The temptation will instead be to use short-cuts and quick fixes to rush the system back into production -- and at that point, you are almost always back to the oscillation/off-line decision again.
Now, Sen. Dianne Feinstein (D-CA) and Rep. Mike Rogers (R-MI) gave Obama lots of cover on the Sunday talk shows by recommending that Healthcare.gov be taken down for an extended period (Rogers said six months) until at least the security problems are worked out.
I'm not sure that's going to happen, at least not yet. At the least, I think that the Healthcare.gov effort will push ahead until Thanksgiving weekend (end of November, remember?), and then they will go off-line, burying the news on the ultimate Friday-for-news-dumps.
At that point, they will have two choices: try to fix the existing systems (or major sections thereof), or start a new, highly simplified system from scratch, with manual support, slowly grafting it into the necessary back-end systems. This will take months; frankly, it could take a year or two.
If, however, they do not pull the plug then (or before then), expect to see the oscillation continue: some modest improvements, accompanied by a rash of new problems (or old ones resurfacing). Usage numbers for the website will steadily drop -- actual non-Medicaid enrollment will continue to be very low -- and the Administration with its enablers and flacks will continue to try to find a way to blame this disaster on anyone but themselves. Ultimately, the site will either persist in low functionality or will be halted altogether.
The key lesson is this: there is no royal road to software. Good intentions, noble causes, and political fervor count for nothing; if anything, they may undermine the project by causing those involved to accept or even demand short-cuts and shoddy workmanship "in a good cause" -- which is how the Obama Administration got into this mess in the first place.