If it ain't broken, don't fix it
Recently, I was involved in trying to solve a blocking issue at work. It took us quite some time to figure out was what going on, and it ended up being one of those cases of "If it ain't broken, don't fix it". So, even though everybody knows about this line, I want to give it some extra attention.
The issue was rather nasty, because the action that led to it was actually quite logical and made sense to the developer. I might well have done the same thing, had I encountered the code that was changed. Let me explain the issue.
Our application makes heavy use of external API's, usually SOAP webservices, to configure systems, fetch data, etc. In this specific case, our application was calling an external webservice to push a configuration into it. The developer causing the issue found a small typo in this code: It said "Emegrency" instead of "Emergency". Funny enough though, in earlier tests the system appeared to be working. He fixed the issue, committed it to Subversion, and went on to do his job.
Little did he know that his quick spelling fix would cause a lot of searching later on. After the sprint, we deployed to our test server, and our test users started testing the system. Quickly they found out that the configuration system would not work anymore, and let us know about this blocking issue. I started digging into the code, amongst other things by reading the diff of the sprint to find out what was changed. I found the spelling fix but quickly discarded this as the cause of the problem, since, after all, it was just a spelling fix.
It took me ages to find the problem. I faked SOAP calls, sniffed the SOAP connection to see if there was anything going wrong, checked the WSDL of the SOAP server to see if there were any obvious issues. There was nothing strange to be found! Or was there? On a closer inspection of the WSDL I noticed an inconsistency in it: The inline documentation was mentioning the element "Emergency" while the actual definition was mentioning "Emegrency". The SOAP server was expecting the element with the typo!
Apparently, the original developer of this code had actually spotted this inconsistency (since there is hardly any documentation on this SOAP server aside from the inline docs of the WSDL) and developed to accomodate the spelling mistake. Unfortunately this was not spotted by the fixing developer. And not by me while trying to find the cause of this issue, at least not before spending a lot of time trying to find the cause of the issue.
So what to learn from this? Don't fix things that are actually working! Even though there is a spelling mistake somewhere, be very cautious and don't just go about fixing the mistake unless you are absolutely sure you are not breaking anything, especially when working with external APIs.
October 13, 2009 - tags: php, symfony, common sense, best practices
Nicolas Martin: You fixed a damn bug ! Unfortunately, fixing this bug took you more time than expected.
Next time the same kind of bug happens, you'll fix it in a few minutes.
More bug you fixes, more robust code you make.
What to learn from this ? Well, always improve your code even if it's not broken !
left: Nicolas: What you're forgetting here is that the bug I fixed had never existed if the spelling mistake was not fixed. I fixed the bug basically by reverting to the previous state.
Andrew Thompson: Would you have been able to rollback to states since the test servers last passed, checking what revision the failure emerged? That might have solved it quicker.
I would have thought the lesson learned would be that the developer should have tested their change if they thought they were correcting an error.
Justin Dalton: Seems to me the problem wasn't fixing the spelling error, rather the problem was that the developer didn't recognize that changing the spelling of a parameter was going to cause an issue. It would only have taken a couple more seconds to check out the related scripts and make the change.
Of course I did the exact same thing a couple of months ago so I guess I would be looking for it.
Matthew Weier O'Phinney: Actually, I'm hoping that you added a comment explaining the purposeful typo, as that's the real bugfix.
left: Andrew: A rollback was hard, since there were so many changes in our project from the whole sprint (even in this specific file) that this would've taken some effort. Besides, just rolling back would not have given us the cause of the problem, so we would not learn from it. And yes, the change should have been tested, I wholeheartedly agree with that.
Justin: Yes, there is more to this problem than just fixing the spelling error, and we're looking into how to prevent these kinds of errors without paralysing development by processes.
Matthew: Actually, my fix involved slightly more than just reverting the change. The solution in the code is now a much clearer (and better documented) indication that special care should be taken

Seva Lapsha: The conclusion is very faulty and dangerous.
This is what should you learn from this:
1. The code should be documented.
2. The code should be covered by unit tests.
3. The code should be fixed consciously.
Alan Knowles: sounds like you need to remind the original developer of the importance of inline comment..
// Emergency is deliberatly spelt wrong, due to some id**t @ the other end c**king this up.....

Nate Abele: When I first clicked on the title of this post, I thought it was gonna be a response to Fabien about Twig.
Just sayin'...
Herman Radtke: I think the real lesson here is to make sure each change is properly tested. Had the fixing developer done this, they would have realize they broke something and resolved the issue right there.
http://www.Vork.us: Although this would not have helped with a SOAP call, I always make it a habit to grep the entire application for the misspelled word when fixing typeos [sic] like that.
Seva Lapsha: 2Vorks.us:
http://en.wikipedia.org/wiki/Typographical_error
Fabian Lange: Indeed, one of the rare occurrences where an inline comment is a must.
It is actually not soo easy to get such typo bugs fixed in interfaces