People have asked why I don’t like programming with exceptions. In both Java and C++, my policy is:
- Never throw an exception of my own
- Always catch any exception that a library I’m using might throw, right at the call that throws it, and deal with it immediately.
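As a rough sketch of that second rule in C++: std::stoi is a real standard library function that throws on bad input, and the wrapper below catches at the call itself and turns the failure into a return value. The names parseInt, Status, OK and ERROR are made up for this illustration.

```cpp
#include <stdexcept>
#include <string>

enum Status { OK, ERROR };

// Hypothetical wrapper: parse text as an int and report failure through
// the return value instead of letting an exception escape.
Status parseInt(const std::string& text, int& out) {
    try {
        // std::stoi may throw std::invalid_argument or std::out_of_range.
        out = std::stoi(text);
    } catch (const std::invalid_argument&) {
        return ERROR;  // no digits to parse
    } catch (const std::out_of_range&) {
        return ERROR;  // value does not fit in an int
    }
    return OK;
}
```

A caller then writes something like if (ERROR == parseInt(input, n)) and handles the problem on the spot, so no exception travels any further than the line that raised it.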
The reasoning is that I consider exceptions to be no better than “goto’s”, considered harmful since the 1960s, in that they create an abrupt jump from one point of code to another. In fact they are significantly worse than goto’s:
- They are invisible in the source code. Looking at a block of code, including functions which may or may not throw exceptions, there is no way to see which exceptions might be thrown and from where. This means that even careful code inspection doesn’t reveal potential bugs.
- They create too many possible exit points for a function. To write correct code, you really have to think about every possible code path through your function. Every time you call a function that can raise an exception and don’t catch it on the spot, you create opportunities for surprise bugs caused by functions that terminated abruptly, leaving data in an inconsistent state, or other code paths that you didn’t think about.
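Both points can be seen in a small C++ sketch. Everything here (Account, logTransfer, transfer, and the 1000 threshold) is invented for the illustration; the only point is that nothing at the call site of logTransfer shows that it can exit the function.

```cpp
#include <stdexcept>

struct Account { int balance; };

// Hypothetical library call. Its signature gives no hint that it can throw.
void logTransfer(int amount) {
    if (amount > 1000)
        throw std::runtime_error("audit log unavailable");
}

// Looks like straight-line code with a single exit at the bottom.
void transfer(Account& from, Account& to, int amount) {
    from.balance -= amount;
    logTransfer(amount);   // hidden exit point: may leave by exception here
    to.balance += amount;  // skipped whenever logTransfer throws
}
```

If logTransfer throws, the money has already left the first account and never arrives in the second, and reading transfer by itself gives no sign that this path exists.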
A better alternative is to have your functions return error values when things go wrong, and to deal with these explicitly, no matter how verbose it might be. It is true that what should be a simple 3-line program often blossoms to 48 lines when you put in good error checking, but that’s life, and papering it over with exceptions does not make your program more robust.
I think the reason programmers in C/C++/Java style languages have been attracted to exceptions is simply that the syntax does not have a concise way to call a function that returns multiple values, so it’s hard to write a function that either produces a return value or returns an error. (The only languages I have used extensively that do let you return multiple values nicely are ML and Haskell.) In C/C++/Java style languages, one way you can handle errors is to use the real return value for a result status and, if you have anything you want to return, use an OUT parameter for it. This has the unfortunate side effect of making it impossible to nest function calls, so result = f(g(x)) must become:
if (ERROR == g(x, tmp)) { /* handle the error from g */ }
if (ERROR == f(tmp, result)) { /* handle the error from f */ }
This is ugly and annoying but it’s better than getting magic unexpected gotos sprinkled throughout your code at unpredictable places.
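For a fuller picture, here is a minimal C++ sketch of the status-plus-OUT-parameter style. The bodies of f and g are invented purely so there is something that can fail, and compose is a hypothetical name for the un-nested form of result = f(g(x)).

```cpp
enum Status { OK, ERROR };

// Hypothetical g: halves x, failing when x is odd.
Status g(int x, int& out) {
    if (x % 2 != 0)
        return ERROR;
    out = x / 2;
    return OK;
}

// Hypothetical f: divides 100 by x, failing when x is zero.
Status f(int x, int& out) {
    if (x == 0)
        return ERROR;
    out = 100 / x;
    return OK;
}

// The un-nested form of result = f(g(x)): every failure is checked
// right where it can happen, and every exit is a visible return.
Status compose(int x, int& result) {
    int tmp = 0;
    if (ERROR == g(x, tmp))
        return ERROR;
    if (ERROR == f(tmp, result))
        return ERROR;
    return OK;
}
```

Every way out of compose is a return statement you can point at, which is the whole argument: the code is longer, but each code path is there on the page.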
If someone wants to write up a nice article about how to develop multilingual Unicode applications with PHP, or point me to an existing article on the subject, I will link to it here. Right now both the PHP documentation and a Google search for “PHP Unicode” make it look like you’re pretty screwed if you really want to do Unicode in PHP. People have pointed me to some existing documentation of the mb_ functions, which is badly written and confusing, and which appears to support only a handful of encodings, not Unicode in general. It also seems to be an extension that you have to turn on, which means, I think, that the average PHP installation does not support this out of the box.