This is a story about my first software management success. It’s also a story about my first software management failure. It was a success, because the work got completed, and without any nasty surprises. It was a failure, because I could have made the team more efficient, and I didn’t. Both of these are good things.
You might think it’s strange for me to call a failure a good thing. But I count it solidly a good thing. Because it means I was stretching myself, doing something I had not already mastered. And the fact that I recognize that there was some part that I can improve, some part I would do differently, means that I’m using the opportunity to grow. That’s the most important thing you can get out of a failure, and that’s why we need to fail. Actually, we shouldn’t even call them failures. Like Phil McKinney did recently on the Killer Innovations podcast, we should call them opportunities to learn.
So, how did I encounter this opportunity to learn? First you have to understand something about our system. It’s a legacy enterprise web-app written mostly in C++. Numerous consultants wrote much of it back in the 20th century. Money was tight, and the most important goal was getting new features delivered. So a consultant would be hired to add features as fast as he could and pad his resume at the same time. For example, one of these consultants apparently wanted to be able to say that he developed a web server. So he did. And we’re still saddled with it.
Yes, our enterprise web-app uses a web server that isn’t a real web server. But that’s only the beginning. It also uses an app server that isn’t recognizable as an app server. And interwoven with these are the application model and domain model. And I do mean interwoven. There is duplicated code everywhere, and each piece of code has at least five different responsibilities. Before working on this system, I thought I had seen unmaintainable legacy code. Ha! If you’ve ever read Working Effectively With Legacy Code, you know the terrifying stories Michael Feathers recounts. I routinely face real-life problems as bad as the worst of them. Whatever code you’re wrestling with, take heart that it probably isn’t as bad as this. Remember the basic rules every first-year comp-sci student learns, high cohesion and loose coupling? Well, our system has no trace of them anywhere.
A good design is like an egg. If you crack an egg into a bowl, you’ll see the yolk in the middle and the white around it. Each of these parts has its own form and function within the egg. Each is highly cohesive. The two parts have a well-defined relationship with each other. Each never intrudes on the other’s space, but they work together to form the whole egg. They are loosely coupled.
Take a fork and whip it through the egg over and over again. Now you have a scrambled egg. That’s our system. My job is to unscramble the egg.
I wanted this responsibility. I’ve been pushing for it. You might think this is an instance of be careful what you wish for. But I’m not afraid, and I’m not overwhelmed. Because I know the secret to unscrambling an egg. The secret is to do it with tweezers, one pinch at a time. Pick up a tiny bit of egg. Is it yolk or white? Yolk? Okay, put it over here. Next tiny bit. Yolk or white? White? Okay, put it over there. The secret to refactoring code is to do it bit by bit.
So the way it worked out, I was in charge. No, it wasn’t on a grand scale. It was just a mini sub-project. I didn’t make a Gantt chart. I did end up posting a burn chart, which I’ll get to in a moment. I was not officially a manager. But I was, for a few weeks anyhow, in charge. And I discovered something fantastic: I liked it.
The first thing I did was to prepare a small presentation for the rest of the team. I went over how the bits of our system fit into a proper enterprise architecture. And I identified a first step: Refactor our response code to use an IHttpResponse interface, instead of typing in HTTP response text and pushing it at the open TCP socket. Yes, that’s really what the old code did. I boiled this down to a set of refactoring techniques we could use. I pulled the general process from Working Effectively With Legacy Code. But then I applied it to the specific problem we were facing at that moment. I provided refactoring templates that applied to most of the code that we needed to refactor. Yes, there were special cases. But we could handle them as they arose.
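To make the idea concrete, here is a minimal sketch of what an interface like IHttpResponse might look like. The real interface isn't reproduced in this post, so the method names, the BufferedHttpResponse class, and the RenderGreeting function are all hypothetical, chosen only to illustrate the shape of the refactoring.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>

// Hypothetical interface: the method names are assumptions for illustration.
class IHttpResponse {
public:
    virtual ~IHttpResponse() = default;
    virtual void SetStatus(int code, const std::string& reason) = 0;
    virtual void SetHeader(const std::string& name, const std::string& value) = 0;
    virtual void Write(const std::string& bodyChunk) = 0;
};

// Hypothetical in-memory implementation; a production one would own the socket.
class BufferedHttpResponse : public IHttpResponse {
public:
    void SetStatus(int code, const std::string& reason) override {
        assert(code >= 100 && code <= 599);  // valid HTTP status range
        status_ = code;
        reason_ = reason;
    }
    void SetHeader(const std::string& name, const std::string& value) override {
        headers_[name] = value;
    }
    void Write(const std::string& bodyChunk) override { body_ += bodyChunk; }

    // The only place that knows the HTTP wire format.
    std::string ToWireFormat() const {
        std::ostringstream out;
        out << "HTTP/1.1 " << status_ << ' ' << reason_ << "\r\n";
        for (const auto& header : headers_) {
            out << header.first << ": " << header.second << "\r\n";
        }
        out << "Content-Length: " << body_.size() << "\r\n\r\n" << body_;
        return out.str();
    }

private:
    int status_ = 200;
    std::string reason_ = "OK";
    std::map<std::string, std::string> headers_;
    std::string body_;
};

// Refactored page code produces HTML and never touches the socket or raw HTTP.
void RenderGreeting(IHttpResponse& response, const std::string& userName) {
    response.SetHeader("Content-Type", "text/html");
    response.Write("<html><body>Hello, " + userName + "</body></html>");
}
```

The point of a shape like this is that page code only talks to the interface. The status line, headers, and line endings live in exactly one place, and a test can inspect the buffered response without opening a socket.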
When I gave the presentation, a junior engineer was already working with me. He was refactoring the parts of the system he was intimate with. Afterward, another senior engineer was asked to help out. She picked a module with some pretty hairy refactorings that were all interrelated and had been bugging her. I worked on the rest. I picked a module to refactor and went at it. Then I picked another module.
But how did we know what code we needed to refactor? We searched through the code for a particular function call: the Send() function. Send() was the function that pushed response data up the open TCP socket. We could just search for instances of Send(). Then instead of generating HTTP and calling Send(), now we wanted to generate HTML and use IHttpResponse.
This also made it very easy to chart our progress. Just search for instances of Send(), and find out how many instances we’ve eliminated. I threw together a semi-automated process and did this every day. Then I updated a burn chart, which I posted in our shared hallway. Everybody appreciated seeing the progress, especially my manager.
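The counting step is simple enough to sketch. This is not the semi-automated process I actually used, just an illustration of the idea: a hypothetical CountCalls helper that counts call-like occurrences of a function name in a chunk of source text.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: counts occurrences of `functionName(` in source text.
// Matches that are the tail of a longer identifier (e.g. "ReSend(") are skipped.
// It is a plain text search, so it also counts the definition itself and any
// mentions in comments, and it misses calls written as "Send (".
std::size_t CountCalls(const std::string& source, const std::string& functionName) {
    assert(!functionName.empty());
    const std::string needle = functionName + "(";
    std::size_t count = 0;
    std::size_t pos = source.find(needle);
    while (pos != std::string::npos) {
        bool partOfLongerName = false;
        if (pos > 0) {
            const unsigned char previous = static_cast<unsigned char>(source[pos - 1]);
            partOfLongerName = std::isalnum(previous) != 0 || previous == '_';
        }
        if (!partOfLongerName) {
            ++count;
        }
        pos = source.find(needle, pos + needle.size());
    }
    return count;
}
```

Run something like this over every source file once a day, add up the results, and you have the next point on the burn chart. When the total reaches the one remaining hit for the definition itself, the function is ready to delete.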
Can you imagine my elation the day I actually deleted the definition of Send() and rebuilt the project? I told everyone what I’d just done. How cool is that? All that excitement over one little function that had its tentacles woven throughout the whole system. And now it is gone, hopefully forever.
There’s one big thing I would do differently. I would only take on personally those parts that required my insight into the big picture. As soon as other developers were on-board, I would start asking them to work on any part of the project they could. In other words, I would delegate more. As it was, I spent a lot of time doing by-the-numbers refactorings. Then at the end, there was one more function, which had a couple of unique twists. To refactor this monster, I first had to change the inheritance hierarchy and add features to the supporting classes. It took me days. If I had been thinking more effectively, I would have avoided all the by-the-numbers work. A junior engineer could have done that, and at much less cost to the company. I would have been working on this code much earlier, and we probably would have finished sooner.
And that’s what I plan to do with the next phase. Right now I’m implementing IHttpRequest. We finished the response side of things; now we’ll do the request side. Of course, the function that needs to instantiate the request object currently does things from all layers of the system, including the domain model. It’s 465 lines long. And it’s full of bugs and, er, undocumented features. So I need to deal with that, again using the techniques Michael Feathers talks about.
But once that’s taken care of, we can start refactoring the application-model code to use IHttpRequest instead of the global variables and duplicated logic it uses now. Most of this code will follow straightforward patterns. But some of it will hold surprises. I plan to look for the surprises first and save the rote stuff for later. We’ll see how that works out.
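As a rough sketch of where this is headed, here is what application-model code might look like once it reads from a request object instead of globals. Everything here is hypothetical: the IHttpRequest methods, the FakeHttpRequest test double, and the WantsPrintableView function are invented for illustration and are not the project's code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical interface: the method names are assumptions for illustration.
class IHttpRequest {
public:
    virtual ~IHttpRequest() = default;
    virtual std::string Method() const = 0;
    virtual std::string Path() const = 0;
    virtual std::string QueryParam(const std::string& name,
                                   const std::string& fallback) const = 0;
};

// Hypothetical test double, so application code can be exercised without a socket.
class FakeHttpRequest : public IHttpRequest {
public:
    FakeHttpRequest(const std::string& method, const std::string& path)
        : method_(method), path_(path) {
        assert(!method_.empty());
    }
    void SetQueryParam(const std::string& name, const std::string& value) {
        params_[name] = value;
    }
    std::string Method() const override { return method_; }
    std::string Path() const override { return path_; }
    std::string QueryParam(const std::string& name,
                           const std::string& fallback) const override {
        const auto found = params_.find(name);
        return found == params_.end() ? fallback : found->second;
    }

private:
    std::string method_;
    std::string path_;
    std::map<std::string, std::string> params_;
};

// Application-model code takes the request as a parameter instead of
// reading global variables or re-parsing the query string itself.
bool WantsPrintableView(const IHttpRequest& request) {
    return request.QueryParam("view", "normal") == "print";
}
```

With the request passed in explicitly, each piece of application code can be unit-tested against a fake, and the parsing logic that is duplicated today gets written once behind the interface.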
-TimK
UPDATE (March 2010): A recent conversation reminded me of this old post. Before I left that job, I did successfully refactor the 465-line monster-function into a proper web-server/app-server architecture, and I did it by writing unit tests first, and then refactoring functionality piece by piece, until I could delete the original (now unused) function.
A year or so ago, I reconnected with some of the people on this project. I found out that they still hadn’t integrated these refactorings into the baseline. And they no doubt still haven’t, even today. Oh well, it was still an exciting and rewarding experience at the time. Maybe sometimes it’s prudent not to ask what they actually do with your work-product: you may not want to find out.
Hey. Nice story. I’m glad it turned out positive and not another bitch session/horror story like most I read (think dailywtf.com). Good for you!
Gav
Ah, the thrill of driving progress at the lead position. I remember a very similar engagement when converting old VBA apps to a Java environment with a small team. I think being a good IT manager really does require some sort of insight from experiences that had both the successes and the failures.
Great story! One word of warning, though. Reducing costs isn’t as easy as it seems. It isn’t just a matter of getting the lower-cost person on the things that look mechanical. The way to really reduce costs is to get the guy with the most understanding to pair with everyone else. It doesn’t matter if he’s the most expensive guy. I think that the thing that raises costs more than anything else in software is lack of understanding. And this seems to be true for all sorts of understanding: low-level understanding of techniques, APIs, the big picture, etc.
I love the egg metaphor, btw.
Thanks for the kind words, Gavin, Retrospector, and Michael. I’m still learning, but it’s a good feeling.
Michael, that’s an epiphany I wish for all software development organizations. Pair programming increases effectiveness and thus reduces cost.
I did get to pair at times with each of my teammates, though I didn’t mention it in the story. I wish I had more opportunity to do so. The primary goal of this pairing was to share knowledge about the code. That’s an easy sell. I don’t know how to sell continuous pairing to the organization, much less my team-lead, who once even refused to let me pair with him… But that’s another story.
-TimK