The Lost Boys: An Agile Software Development Case Study
In my last article, “6 Reasons Agile Results in Better Software,” I talked about how Scrum solves many problems that commonly plague software projects. In this article, I would like to share a real-life scenario in which we corrected a troubled project using Agile methodologies. Hopefully it will provide you with some insights on how you might do the same for your own wayward project*.
The Lost Boys
I worked on a project with Rich Musser not too long ago involving a customer (I’ll refer to them as “The Lost Boys” in order to protect the guilty) that was struggling to get a .NET application out the door. In fact, they had been working on what was supposed to be a 1-2 year effort for nearly 10 years – with no end in sight. Morale was low, and management looked to us for a fresh perspective on the project. They were hoping we could get them back on track. In the end, that’s exactly what happened, but it took a lot of analysis and careful application of Agile principles to get there.
The application that The Lost Boys were working on was nothing especially groundbreaking from a software engineering standpoint. It mostly just moved bits from a database to a screen and back. It used a pretty straightforward three-tiered distributed architecture with a WinForms front end, a set of web services, and SQL Server on the back end. Everything seemed straightforward enough…
The tricky part about this project was the complexity of the problem domain. It had to do with taxation of public works and bookkeeping at the government level. It was pretty cryptic stuff, the kind of thing you need to have a career in to fully wrap your head around. Of course, this created a knowledge gap between the domain experts (who happened to be management) and the developers, which led to requirement misunderstandings. The Lost Boys attempted to bridge this gap the way a lot of companies do, by creating a rigorous requirements spec. The resulting 650-page document detailed every single field on every screen in the system. It contained pictures, business rules, and requirement ID numbers that went out to 9 decimal places (126.96.36.199.188.8.131.52.4)**. It was a valiant effort, but unfortunately, it wasn’t nearly worth the time and manpower put into it. After watching them talk amongst themselves for a bit, it was clear that the spec, despite its epic grandeur, was not sufficient to convey the complex and subtle accounting rules embodied by the subject matter.
In addition to (and largely because of) the requirements issues, bugs had become a major problem in the system. There was a long list of them and more were being found every day. To make matters worse, the poor quality of many aspects of the code made it difficult for them to fix bugs without creating more. This had a significant negative impact on the progress of the project as well as the morale of the team. All these bugs were being kept in a document but, unfortunately, nobody had put the time in to maintain it, and they weren’t really sure which items were valid bugs anymore.
When it came to the team structure, the Lost Boys had a strict hierarchy in place. The lead developer was the primary interface between the dev team and management. In addition, the lead would perform estimates on behalf of the other devs. This was done with the intention of relieving the other developers from having to worry about interacting with management, but it tended to produce inaccurate estimates because the person estimating the work was not always the person actually doing the work. It also disconnected most of the team from management, which created more communication problems.
Management had very little insight into the day-to-day status of the project and absolutely no insight into the overall schedule. To them, the development process was a black box. They asked for a feature and were never sure if or when it would be done. They talked to the lead developer on a continual basis, but they knew his estimates tended to be inaccurate, so it didn’t help much. This made it impossible to make decisions, communicate anything accurately to customers, or determine how much manpower to throw at the project. There was very little reliable data to back up any decision they might make.
Lastly, there had been a litany of contractors that had been hired on and off throughout the years. They were brought on in an attempt to ramp up development speed, but the team agreed that most of them did more harm than good. I didn’t know most of these contractors, but my guess is that much of the problem was related to the learning curve associated with the application and the dysfunctional atmosphere surrounding the project. Whatever the reason, the code they produced tended to be of poor quality and too often required complete rebuilds from the ground up.
Fragile to Agile
The interesting thing about this project was that most of their problems were process-oriented in nature. They had a talented, dedicated team, and supportive management. Everyone wanted this project to succeed. They simply didn’t have a process in place that was conducive to handling a large, complex system like theirs. Fortunately, Scrum provides solutions to many of their problems.
The first major change we made was to adjust their perspective on requirements. The ginormous spec, well intentioned as it was, could serve as a secondary reference point, but we needed to shift to user stories as the primary source of requirement information. We explained to them that simple stories, backed up by a healthy amount of verbal communication, would work much better than documents. This is especially true for non-intuitive problems like the one they worked on. Since they had already come to the conclusion that the spec wasn’t really that helpful, this was not a hard sell. At first it was fairly obvious that the team was not used to a group discussion format, but after a short feeling-out period, everyone was quite pleased with how productive these discussions were.
The next problem was the rampant bugs. The code quality was poor enough that controlled, predictable development was no longer possible. We decided new features would have to be put on hold until a substantial number of bugs were fixed and the codebase was stable enough to be reliably extended. We sat down and groomed the bug list, talking about each one, cleaning up the definitions and reproduction steps, throwing out the invalid ones, and adding new ones as necessary. It was a bit painful to be sure***, but once this was done, the group had a much better understanding of where the software stood as a whole. And more importantly, we had our initial set of stories for the backlog, which is a critical first step towards a stable development process.
One hurdle that we had to overcome with regard to the bugs was the difficulty in estimating the time required to fix them without understanding exactly what the problem was. Many bug fixes simply couldn’t be estimated without some diagnostic work first. So we decided to allocate a day between sprints for bug research and analysis. Then when it came time to do sprint planning, people usually had all the information they needed to make accurate estimates. It required us to burn a day, but it was well worth the added stability that was gained during the sprint.
As is the usual practice for sprint planning, each story was broken down into detailed tasks, and hourly estimates were assigned to each task. This had several key benefits: the first being that estimates became much more accurate. Estimating as a group tended to uncover misunderstood requirements and overlooked implementation details that had created so many problems in the past. And since estimates were no longer being done in isolation by the team lead, they better reflected each developer’s individual abilities. Another key benefit was that these discussions naturally spread knowledge out among the developers. Important information that had previously been the domain of the “front end guy” or the “database guy” was now part of the group consciousness, giving everyone (including the product owner) a better understanding of the software.
It would be hard to overstate how beneficial these group discussions were to The Lost Boys. You could really see a lot of new connections being made about how the system was intended to function by its visionaries and how it was implemented by its developers. Keeping the parity between these two worlds was absolutely critical to the success of this project, as it is for every project.
As mentioned, a principal concern for management was the lack of insight into the status of the project. Fortunately, this is one of the areas where Scrum really shines. Having a sprint burndown chart gave them some visibility into the progress being made during sprints, but what they were really interested in was the long-term outlook. We let them know that Scrum handles this through the concept of velocity. If you are unfamiliar with velocity, the idea is that you measure how much work you are completing each sprint on average (usually how many story points you finished) and use that to predict how many more sprints the remaining work on the project will take. It takes a few sprints to get a stable velocity established, but it can be extremely valuable because it gives you a schedule based on actual data rather than guesses. We were careful to make it clear to them that velocity does not give you a hard stop date; it is simply the best estimate you have based on the data currently available. Still, it was better than anything they had up until that time, so they were happy to have it.
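For readers who like to see the arithmetic, here is a minimal sketch of a velocity-based forecast. The function name, the sprint totals, and the backlog size are all made up for illustration; they are not figures from The Lost Boys' project, and real teams often track a range of velocities rather than a single average.

```python
import math
from statistics import mean


def forecast_sprints(completed_points, remaining_points):
    """Estimate how many sprints remain, based on average velocity.

    completed_points: story points finished in each past sprint (hypothetical data).
    remaining_points: story points still in the backlog (hypothetical data).
    Returns (average velocity, estimated sprints remaining).
    """
    if not completed_points:
        raise ValueError("need at least one completed sprint")
    velocity = mean(completed_points)
    if velocity <= 0:
        raise ValueError("velocity must be positive to forecast")
    # Round up: a partially filled sprint is still a sprint on the calendar.
    return velocity, math.ceil(remaining_points / velocity)


# Illustrative numbers only: three past sprints and a 200-point backlog.
velocity, sprints_left = forecast_sprints([18, 22, 20], 200)
print(f"Average velocity: {velocity} points per sprint")
print(f"Estimated sprints remaining: {sprints_left}")
```

As the paragraph above notes, the result is an estimate, not a hard stop date. Rerunning the calculation after every sprint keeps the forecast tied to the latest data.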
With this plan in place, the team set out to get their project back under control. We spoke with them off and on through the first two or three sprints just to resolve any process-related questions and provide support while they were getting their footing under the new Agile approach. The first several sprints went very well, and you could see they were starting to get some momentum back. At that point, we let them continue on their own, feeling fairly confident that they had the tools they needed to be successful.
Lost and Found
So how did it work out? I have to say the team embraced the Agile strategy with enthusiasm. Their commitment to the process was commendable, as was their dedication to seeing the project through to a successful ending. After about a year of working under Agile methods (specifically Scrum), they got there. The first major release in 10 years went out and was well received by customers. Shortly afterwards, they followed that release up with a web-based product.
This project had a positive outcome, but it was not my intention to make it seem like applying Agile methodologies to a problem is easy or even appropriate in every case. In fact, this project was not easy, and it took a lot of thought, introspection, and courage for The Lost Boys to break out of their old habits and find a new way of doing business. Agile isn’t about cracking open a book or watching a few videos. No software project is going to work that way. It’s about making honest assessments of where you are failing as a team and coming up with smart solutions to address them. It’s as easy as that****.
* Even if your project is not experiencing issues, I think you’ll still find this article useful.
** Not an exaggeration – their spec really was over 650 pages and contained requirements nested 9 levels deep.
*** As in super painful
**** Which is to say, not that easy