Thanks to everyone who came to the open house last night. If you have pictures, send me a link!

We had an interesting conversation about the impedance mismatch between contemporary high-level programming languages (Java, C#, Python, VB) and relational databases. Since a huge percentage of code requires access to databases, the glue (a.k.a. the connecticazoint) between the RDBMS layer and the application code is very important, yet virtually every modern programming language assumes that RDBMS access is something that can be left to libraries. In other words, language designers never bother to put database integration features into their languages. As a tiny example of this, the syntax for “where” clauses is never identical to the syntax for “if” statements. And don’t get me started about data type mismatches: just the fact that columns of any type might be “null” leads to an incompatibility between almost every native data type and the database data types.

The trouble with this is that the libraries (think ADO, DAO, ODBC, JDBC, embedded SQL, and a thousand others) need to be general purpose to be reusable, and yet what you really want is a mapping between a native data structure and a table row or query result row. Inevitably, you have to hand roll this mapping and wire it up manually, which is error prone and frustrating.
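To make the hand-rolled mapping concrete, here is a minimal sketch in Python using the standard `sqlite3` module. The `customers` table, the `Customer` class, and the helper names are all hypothetical, invented for illustration. Note how the nullable `age` column forces the native type to become `Optional[int]`, and how the mapping silently depends on the column order in the SELECT list.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Customer:
    id: int
    name: str
    # The column is nullable, so a plain int is not enough on the native side.
    age: Optional[int]


def row_to_customer(row: tuple) -> Customer:
    # Hand-wired mapping: the indexes must match the SELECT list below.
    return Customer(id=row[0], name=row[1], age=row[2])


def load_customers(conn: sqlite3.Connection) -> List[Customer]:
    cur = conn.execute("SELECT id, name, age FROM customers ORDER BY id")
    return [row_to_customer(r) for r in cur.fetchall()]


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL, age INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (id, name, age) VALUES (?, ?, ?)",
    [(1, "Ada", 36), (2, "Bob", None)],
)
customers = load_customers(conn)
```

Reorder the columns in the query without touching `row_to_customer` and nothing complains until the data comes out wrong, which is the kind of error-prone wiring described above.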

I think this is a fatal flaw in language design, akin to the bad decision by the designers of C++ that it was not necessary to support a native string type. “Let a thousand CString/TString/String/string<char> types flourish,” they said, and then spent more than a decade adding new features to the language until it was marginally, but not completely, possible to implement a non-awful string class. And now we have a thousand string types (most large C++ bodies of code I’ve seen use three or four) and a bunch of really good books by Scott Meyers about why your personal hand-rolled string class is inadequate. It’s about time that a language designer admitted that RDBMS access is intrinsic to modern application implementation and supported it in a first-class way syntactically.

Now for all the disclaimers to prevent “but what about” emails. (1) In functional languages like Lisp the syntax layer is so light that you could probably implement very good RDBMS shims in ways that feel almost native. Especially if you have lazy evaluation of function parameters, it’s easy to see how you could build a “where” clause generator that used the same syntax as your “if” predicates. (2) Access Basic, later Access VBA, had a couple of features to make database access slicker, specifically the [exp] syntax and the rs!field syntax, but that really only gets you 10% of the way there. There are probably other niche languages or languages by RDBMS vendors that do a nice job. (3) Attempts to solve this problem in the past have fallen into two broad groups: the people who want to make the embedded SQL programming languages better (PL/SQL, T-SQL, et al.), and the people who want to persist objects magically using RDBMS backends (OODBMSes and object persistence libraries). Neither one fully bridges the gap: I don’t know of anyone who builds user interfaces in SQL or its derivatives, and the object persistence implementations I’ve seen never have a particularly good implementation of SELECT.
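As a rough illustration of disclaimer (1), here is a sketch of a “where” clause generator whose predicates read like ordinary comparisons. It is written in Python rather than Lisp and leans on operator overloading instead of lazy evaluation, so it is a stand-in for the idea, not the approach itself. Every class and name here (`Column`, `Comparison`, `And`, the `people` table) is hypothetical, and column names are assumed to be trusted since they are spliced into the SQL text.

```python
import operator
import sqlite3

_OPS = {">": operator.gt, "<": operator.lt, "=": operator.eq}


class Predicate:
    def __and__(self, other):
        return And(self, other)


class Comparison(Predicate):
    def __init__(self, column, op, value):
        self.column, self.op, self.value = column, op, value

    def matches(self, row):
        # Evaluate like an "if" test against a dict-shaped row.
        actual = row[self.column]
        # In SQL a comparison against NULL is never true, so mirror that here.
        return actual is not None and _OPS[self.op](actual, self.value)

    def to_sql(self):
        # Render the same test as a parameterized WHERE fragment.
        return f"{self.column} {self.op} ?", [self.value]


class And(Predicate):
    def __init__(self, left, right):
        self.left, self.right = left, right

    def matches(self, row):
        return self.left.matches(row) and self.right.matches(row)

    def to_sql(self):
        lsql, lparams = self.left.to_sql()
        rsql, rparams = self.right.to_sql()
        return f"({lsql} AND {rsql})", lparams + rparams


class Column:
    def __init__(self, name):
        self.name = name

    def __gt__(self, value):
        return Comparison(self.name, ">", value)

    def __lt__(self, value):
        return Comparison(self.name, "<", value)

    def __eq__(self, value):
        return Comparison(self.name, "=", value)


age, name = Column("age"), Column("name")
adult_ada = (age > 17) & (name == "Ada")

# Use 1: an in-memory "if" over native data.
rows = [
    {"name": "Ada", "age": 36},
    {"name": "Ada", "age": None},
    {"name": "Bob", "age": 40},
]
in_memory = [r["name"] for r in rows if adult_ada.matches(r)]

# Use 2: the same predicate as a WHERE clause.
where_sql, params = adult_ada.to_sql()
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO people (name, age) VALUES (?, ?)",
    [(r["name"], r["age"]) for r in rows],
)
from_db = [r[0] for r in conn.execute(f"SELECT name FROM people WHERE {where_sql}", params)]
```

Even in this toy, the null problem leaks through: `matches` needs a special case for `None` to agree with what the database does, which is exactly the type mismatch complained about earlier.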


Save the date: Fog Creek Software will host an open house at our new office on March 24th, 2004, at 6:00 PM.

535 8th Ave. (bet. 36th and 37th), 18th Floor, New York

Top Twelve Tips for Running a Beta Test

Here are a few tips for running a beta test of a software product intended for large audiences, what I call “shrinkwrap.” These apply to commercial or open source projects; I don’t care whether you get paid in cash, eyeballs, or peer recognition, but I’m focused on products for lots of users, not internal IT projects.

  1. Open betas don’t work. You get either too many testers (think Netscape), in which case you can’t get good data from them, or too few reports from the testers you do have.
  2. The best way to get a beta tester to send you feedback is to appeal to their psychological need to be consistent. You need to get them to say that they will send you feedback, or, even better, apply to be in the beta testing program. Once they have taken some positive action such as filling out an application and checking the box that says “I agree to send feedback and bug reports promptly,” many more people will do so in order to be consistent.
  3. Don’t think you can get through a full beta cycle in less than eight to ten weeks. I’ve tried; lord help me, it just can’t be done.
  4. Don’t expect to release new builds to beta testers more than once every two weeks. I’ve tried; lord help me, it just can’t be done.
  5. Don’t plan a beta with fewer than four releases. I haven’t tried that because it was so obviously not going to work!
  6. If you add a feature, even a small one, during the beta process, the clock goes back to the beginning of the eight weeks and you need another three or four releases. One of the biggest mistakes I ever made was adding some whitespace-preserving code to CityDesk 2.0 towards the end of the beta cycle, which had some, shall we say, unexpected side effects that a longer beta would have flushed out.
  7. Even if you have an application process, only about one in five people will send you feedback anyway.
  8. We have a policy of giving a free copy of the software to anyone who sends any feedback, positive, negative, whatever. But people who don’t send us anything don’t get a free copy at the end of the beta.
  9. The minimum number of serious testers you need (i.e., people who send you three page summaries of their experience) is probably about 100. If you’re a one-person shop, that’s all the feedback you can handle. If you have a team of testers or beta managers, try to get 100 serious testers for every employee that is available to handle feedback.
  10. Even if you have an application process, only one out of five testers is really going to try the product and send you feedback. So, for example, if you have a QA department with 3 testers, you should approve 1500 beta applications to get 300 serious testers. Fewer than this and you won’t hear everything. More than this and you’ll be deluged with repeated feedback.
  11. Most beta testers will try out the program when they first get it, and then lose interest. They are not going to be interested in retesting it every time you drop them another build unless they really start using the program every day, which is unlikely for most people. Therefore, stagger the releases. Split your beta population into four groups and each new release, add another group that gets the software, so there are new beta testers for each milestone.
  12. Don’t confuse a technical beta with a marketing beta. I’ve been talking about technical betas here, in which the goal is to find bugs and get last-minute feedback. Marketing betas are prerelease versions of the software given to the press, to big customers, and to the guy who is going to write the Dummies book that has to appear on the same day as the product. With marketing betas you don’t expect to get feedback (although the people who write the books are likely to give you copious feedback no matter what you do, and if you ignore it, it will be cut and pasted into their book).
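
The arithmetic in tips 9 through 11 can be sketched in a few lines of Python. The numbers come straight from the tips (100 serious testers per person handling feedback, one serious tester per five approved applicants, four staggered groups). The function names and the round-robin way of splitting testers into groups are assumptions made for illustration only.

```python
from typing import List

SERIOUS_TESTERS_PER_HANDLER = 100  # tip 9
APPLICANTS_PER_SERIOUS_TESTER = 5  # tips 7 and 10: one in five sends feedback
RELEASE_GROUPS = 4                 # tip 11


def serious_testers_needed(handlers: int) -> int:
    # How many serious testers the feedback handlers can keep up with.
    return handlers * SERIOUS_TESTERS_PER_HANDLER


def applications_to_approve(handlers: int) -> int:
    # Approve five applicants for every serious tester you hope to get.
    return serious_testers_needed(handlers) * APPLICANTS_PER_SERIOUS_TESTER


def stagger(testers: List[str], groups: int = RELEASE_GROUPS) -> List[List[str]]:
    # Round-robin split (an arbitrary choice) into one group per release.
    return [testers[i::groups] for i in range(groups)]


def recipients(groups: List[List[str]], release: int) -> List[str]:
    # Release n (1-based) goes to groups 1..n, so each build adds fresh eyes.
    return [t for group in groups[:release] for t in group]


# The example from tip 10: a QA department with 3 testers.
approved_count = applications_to_approve(3)
approved = [f"tester{i}" for i in range(approved_count)]  # hypothetical names
beta_groups = stagger(approved)
first_release = recipients(beta_groups, 1)
final_release = recipients(beta_groups, 4)
```

With three people handling feedback, this works out to 300 serious testers and 1500 approved applications, with 375 new testers joining at each of the four releases.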