If it ain't broken, don't fix it

Recently, I was involved in trying to solve a blocking issue at work. It took us quite some time to figure out was what going on, and it ended up being one of those cases of "If it ain't broken, don't fix it". So, even though everybody knows about this line, I want to give it some extra attention.

The issue was rather nasty, because the action that led to it was actually quite logical and made sense to the developer. I might well have done the same thing, had I encountered the code that was changed. Let me explain the issue.

Our application makes heavy use of external API's, usually SOAP webservices, to configure systems, fetch data, etc. In this specific case, our application was calling an external webservice to push a configuration into it. The developer causing the issue found a small typo in this code: It said "Emegrency" instead of "Emergency". Funny enough though, in earlier tests the system appeared to be working. He fixed the issue, committed it to Subversion, and went on to do his job.

Little did he know that his quick spelling fix would cause a lot of searching later on. After the sprint, we deployed to our test server, and our test users started testing the system. Quickly they found out that the configuration system would not work anymore, and let us know about this blocking issue. I started digging into the code, amongst other things by reading the diff of the sprint to find out what was changed. I found the spelling fix but quickly discarded this as the cause of the problem, since, after all, it was just a spelling fix.

It took me ages to find the problem. I faked SOAP calls, sniffed the SOAP connection to see if there was anything going wrong, checked the WSDL of the SOAP server to see if there were any obvious issues. There was nothing strange to be found! Or was there? On a closer inspection of the WSDL I noticed an inconsistency in it: The inline documentation was mentioning the element "Emergency" while the actual definition was mentioning "Emegrency". The SOAP server was expecting the element with the typo!

Apparently, the original developer of this code had actually spotted this inconsistency (since there is hardly any documentation on this SOAP server aside from the inline docs of the WSDL) and developed to accomodate the spelling mistake. Unfortunately this was not spotted by the fixing developer. And not by me while trying to find the cause of this issue, at least not before spending a lot of time trying to find the cause of the issue.

So what to learn from this? Don't fix things that are actually working! Even though there is a spelling mistake somewhere, be very cautious and don't just go about fixing the mistake unless you are absolutely sure you are not breaking anything, especially when working with external APIs.
Add comment

Comments

gravatar Nicolas Martin: You fixed a damn bug ! Unfortunately, fixing this bug took you more time than expected.
Next time the same kind of bug happens, you'll fix it in a few minutes.
More bug you fixes, more robust code you make.
What to learn from this ? Well, always improve your code even if it's not broken !
October 13, 2009
gravatar left: Nicolas: What you're forgetting here is that the bug I fixed had never existed if the spelling mistake was not fixed. I fixed the bug basically by reverting to the previous state.
October 13, 2009
gravatar Andrew Thompson: Would you have been able to rollback to states since the test servers last passed, checking what revision the failure emerged? That might have solved it quicker.

I would have thought the lesson learned would be that the developer should have tested their change if they thought they were correcting an error.


October 13, 2009
gravatar Justin Dalton: Seems to me the problem wasn't fixing the spelling error, rather the problem was that the developer didn't recognize that changing the spelling of a parameter was going to cause an issue. It would only have taken a couple more seconds to check out the related scripts and make the change.

Of course I did the exact same thing a couple of months ago so I guess I would be looking for it. ;)
October 13, 2009
gravatar Matthew Weier O'Phinney: Actually, I'm hoping that you added a comment explaining the purposeful typo, as that's the real bugfix. ;)
October 13, 2009
gravatar left: Andrew: A rollback was hard, since there were so many changes in our project from the whole sprint (even in this specific file) that this would've taken some effort. Besides, just rolling back would not have given us the cause of the problem, so we would not learn from it. And yes, the change should have been tested, I wholeheartedly agree with that.

Justin: Yes, there is more to this problem than just fixing the spelling error, and we're looking into how to prevent these kinds of errors without paralysing development by processes.

Matthew: Actually, my fix involved slightly more than just reverting the change. The solution in the code is now a much clearer (and better documented) indication that special care should be taken ;)


October 13, 2009
gravatar Seva Lapsha: The conclusion is very faulty and dangerous.

This is what should you learn from this:
1. The code should be documented.
2. The code should be covered by unit tests.
3. The code should be fixed consciously.
October 13, 2009
gravatar Alan Knowles: sounds like you need to remind the original developer of the importance of inline comment..
// Emergency is deliberatly spelt wrong, due to some id**t @ the other end c**king this up..... ;)
October 14, 2009
gravatar Nate Abele: When I first clicked on the title of this post, I thought it was gonna be a response to Fabien about Twig.

Just sayin'...
October 14, 2009
gravatar Herman Radtke: I think the real lesson here is to make sure each change is properly tested. Had the fixing developer done this, they would have realize they broke something and resolved the issue right there.
October 14, 2009
gravatar http://www.Vork.us: Although this would not have helped with a SOAP call, I always make it a habit to grep the entire application for the misspelled word when fixing typeos [sic] like that.
October 15, 2009
gravatar Seva Lapsha: 2Vorks.us:
http://en.wikipedia.org/wiki/Typographical_error
October 15, 2009
gravatar Fabian Lange: Indeed, one of the rare occurrences where an inline comment is a must.
It is actually not soo easy to get such typo bugs fixed in interfaces :(
October 18, 2009

Php5_zce_logo

not tested in IE


Upcoming events

I will be speaking 08-10-2010: Symfony Day Cologne 2010
I will be speaking 09-10-2010: Symfony workshop

Tags

1337 2008 2010 4developers access modifiers accessibility AdaLovelaceDay09 advent agavi agile amsterdam apache apple article articles atk atkMetaNode audioscrobbler azure backwards compatibility barcelona bbc bbq beatstad belgium best practices bittorrent boards of canada book books bughuntday caching cake cal evans career cat cerf certificate cfp clear cms cologne common sense communities community conference conferences continuous integration contribute crisis css custom datetime DbFinderPlugin decorator decorators deployment devdays development directoryindex documentation download dpc dpc09 dpc10 DPC2008 dreamhost dv7 eclipse ed efficiency enterprise errors event events expertise ezcomponents facebook flickr framework frameworks freelance freeze frontend fun games germany getting real google googletalk graceful degradation hack hackers hidden gem hiphop howto hp html http ibuildings icann ide idm imovie indy ingewikkeld internet IPC ipc ipc08 javascript job jobeet john peel joomla kubuntu left on the web lighttpd lime linux live london loudblog m2ts mac malware mambo marjolein mediterra meeting meme meta methodology microsoft movie music mysql namespace namespaces netbeans netherlands nllgg odmarco open source opinion ORM osx paradiso pavilion pear performance personal pfc10 pfcongres pfcongrez photo php phpabstract phpazure phpBB phpbb phpbelgium phpbenelux phpbnl10 phpgg phpitalia phpnw phpnw08 phptek phptek09 phpuk2009 phpUnderControl phpunit php|architect php|tek podcast politics portability postcrossing presentation presentations private projects protected public qa recruiting refactoring review rewrite ruby on rails schedule scifi script security seven things sfdaycgn simplexml slides smfony software sogeti solar sound standard standards star trek static steer strings subversion symfony Symfony2 symfonycamp symfonyday symfonyUnderControlPlugin talk talks technology techportal tek09 telecommuting terratec terrorism testfest testing textpattern tips tld tomas training twig uncon unet usability usergroup validation vhost video vinyl virus warp weblogging wiki windows winphp women work workshop world world of warcraft wpi writing xml xpath xsd yara year youtube ZCE zemanta zend zend framework zend server zend studio Zend_Form
© 2004 - 2010 Stefan Koopmanschap + Powered by Symfony, photos powered by Flickr, links powered by Delicious, Shanghai smilies by Iconbuffet. Feeds: rss / atom. Left on the Web v4.4.0.1