Wednesday 3 December 2008

Don't ignore that error!

Settle yourself down for an apocryphal bedtime story. A programmer's parable, if you will...
I was walking down the street one evening to meet some friends in a bar. We hadn't shared a beer in some time and I was looking forward to seeing them again. In my haste, I wasn't looking where I was going. I tripped over the edge of a curb and ended up flat on my face. Well, it serves me right for not paying attention, I guess.

It hurt my leg, but I was in a hurry to meet my friends. So I pulled myself up and carried on. As I walked further the pain was getting worse. Although I'd initially dismissed it as shock, I rapidly realised there was something wrong.

But, I hurried on to the bar regardless. I was in agony by the time I arrived. I didn't have a great night out, because I was terribly distracted. In the morning I went to the doctor and found out I'd fractured my shinbone. Had I stopped when I felt the pain, I'd've prevented a lot of extra damage that I caused by walking on it. Probably the worst morning after of my life.
Too many programmers write code like my disastrous night out.

Error, what error? It won't be serious. Honestly. I can ignore it. This is not a winning strategy for solid code. In fact, it's just plain laziness. (The bad sort.) No matter how unlikely you think an error is in your code, you should always check for it, and always handle it. Every time. If you don't, you're not saving time, you're storing up potential problems for the future.

We report errors in our code in a number of ways, including:
  • Return codes. A function returns a value. Some of which mean "it didn't work". Error return codes are far too easy to ignore. You won't see anything in the code to highlight the problem. Indeed, it's become standard practice to ignore some standard C functions' return values. How often do you check the return value from printf?
  • errno This is a curious C language aberration, a separate global variable set to signal error. It's easy to ignore, hard to use, and leads to all sorts of nasty problems - for example, what happens when you have multiple threads calling the same function?
  • Exceptions are a more structured language-supported way of signaling and handling errors. And you can't possibly ignore them. Or can you? I've seen lots of code like this:
try {
/// ...do something...
}
catch (...) {} // ignore errors
  • The saving grace of this awful construct is that it highlights the fact you're doing something morally dubious.

Not handling errors leads to:
  • Brittle code, full of hard-to-find crashes.
  • Insecure code. Crackers often exploit poor error handling to break into software systems.
  • Bad structure. If there are errors from your code that are tedious to deal with continually, you have probably have a bad interface. Express it better, so the errors are not so onerous.
Just as you should check all potential errors in your code, you must expose all potentially erroneous conditions in your interfaces. Do not hide them, and pretend that your services will always work.

Why don't we check for errors? There are a number of common excuses. Which of these do you agree with? How you would you counter each of them?
  • Error handling clutters up the flow of the code, making harder to read, and harder to spot the "normal" flow of execution.
  • It's extra work and I have a deadline looming.
  • I know that this function call will never return an error (printf always works, malloc always returns new memory, and if it fails we have bigger problems...)
  • It's only a toy program, and needn't be written to a production-worthy level.

2 comments:

Unknown said...

"Error handling clutters up the flow of the code, making harder to read, and harder to spot the "normal" flow of execution."

Yes, yes it does, but code includes error handling. It's part of the logic, not secondary to it. Having said this, it's possible to clean up your code by thinking about your design. Is there a design pattern you should be using? Are you being strict enough with what you accept, so that you don't have to perform pointless checks? Refactor mercilessly until your code is clean.

"It's extra work and I have a deadline looming."

Such short-sightedness causes pain after the deadline when the necessary checking is missing and the bugs bite. It's always harder to fix code than write it properly in the first place.

"I know that this function call will never return an error (printf always works, malloc always returns new memory, and if it fails we have bigger problems...)"

It will come back and bite you. You will spend hours trying to get enough information to find problems, tearing your hair out.

"It's only a toy program, and needn't be written to a production-worthy level."

Many pieces of production code started life as toy programs, sometimes called 'spikes'. The easiest way to avoid re-using such code is to throw it away as soon as possible - simply don't allow it to get near production.

None of these pieces of advice are any use, of course. Developers are often cocksure and most will repeat the same mistakes many, many times before either they learn from them, give up on development or their irate colleagues will release a leopard into their office.

the_exile said...

Well of course no-one should or could say anything about this until someone had said what Rik said. I agree with him totally. However, at the risk of starting an unholy war, but in the interests of avoiding a religious dogma, I think that there are some specific instances in which some of these arguments might be partially valid.

If you are short of time, it may be reasonable to shortcut on error checking - but not to do 'on error carry-on regardless'. I can't think of any time that is OK. If the default behaviour is 'die ugly' and your desired behaviour is 'die pretty' (e.g. a memory allocation error in C++ application software), cutting corners on this could be justifiable. Yes you will have more difficulty tracking down an error when it occurs, but this is where your domain experience and your customer prioritisation process comes into play. What is the result for the user, the support engineer or the business of 'die ugly' and how does the effort involved in your user-story 'die pretty' stack up against getting your code out on time and/or missing out some other feature.