New Server at Peer 1 Network

I’ve moved Joel on Software to a new server, at a colocation facility operated by Peer 1 Network. In the process of finding a new home and getting it up and running I’ve learned quite a bit about how web hosting works, so I thought I’d describe a bit of it here and in the process provide a glimpse Behind The Scenes.

The Fog Creek office is connected to the Internet via a single T1, which provides 1.5 Mbps of bandwidth. That’s 1.5 megabits per second, the equivalent of 24 phone lines (a T1 carries 24 channels of 64 kbps each, which together with framing overhead comes to 1.544 Mbps).

The T1 was provided by Savvis. The price fluctuates depending on usage, but it generally comes to something just over $1000 a month.

We had originally decided to use a T1 instead of the much cheaper alternatives (like DSL or cable modems) because T1 lines had a reputation for reliability. We ran all our servers (Fog Creek, Joel on Software, TechInterview.org) from the office because we needed a fast Internet connection to the office anyway, so why pay again for hosting? That was the theory, at least.

But the T1 turned out to be not as reliable as we thought. We had a lot of short outages, from seconds to an hour, and twice in the last year we had outages of a day or more. In both cases, the ISP did not fix them nearly as quickly as we thought it should have been able to. Although our ISP was Savvis Communications, the segment of the T1 from our office to the nearest Savvis location (called the “local loop”) was actually provided by Worldcom. And in fact Worldcom didn’t actually provide it, they just resold a Verizon T1 under their own name. So when something went wrong with the Verizon line, we had to convince Savvis to convince Worldcom to convince Verizon to fix it. Somehow this was like pushing on a string. Outsourcing is problematic that way: when everything works perfectly, you don’t notice, but when things go wrong, it’s much harder to get them fixed.

You may be wondering if a single server connected to a T1 was enough for all the traffic this site gets, about 100,000 hits a day, more like 500,000 when I write a new article or get mentioned on slashdot. The short answer is, it was fine. Since I publish the site using CityDesk which simply generates a bunch of plain HTML files, the server doesn’t have to do anything tricky like generate each page from a database on the fly.

Anyway, we’re about to move to a new office, so a decision had to be made about whether to take the T1 with us, or do something else. The long outages convinced me we needed a real colocation facility. “Colocation” means that you pay someone who has a room with extremely high-speed Internet access, usually redundant, to let you put your own computer on a shelf in that room. A good colocation facility, or “colo” as they’re called, also has great air conditioning, a big generator to provide power if the lights go out, and huge batteries to keep the whole thing running during the few minutes it takes to get the generator started up. It also has a lot better security than average, all kinds of exciting fire extinguishing equipment, and locked cages to keep one company’s servers from being tampered with by another company that uses the same colo.

“Colocation” is not the same as “web hosting.” Typically a web host will host your web site on their own servers. They administer the servers; you just put your files up and they serve them. Whereas colo means you buy and install and administer your own server hardware. We chose colocation for a number of reasons, but primarily because it gives us the flexibility to run any software we want.

If you’re in the market for colocation, especially in a major metropolitan area, you probably have quite a few choices. By the way, there’s no reason you have to get colocation in the same city where you’re located — you can just build a server and then ship it to the colo. The colo’s engineers put it on the rack for you, plug it into power and the net, and turn it on.

A quick Google search came up with a list of about eight companies that offered colocation in New York, so I called them all for price quotes. For the most part, the quote is based on how much space you need, how much bandwidth you need, and possibly how much power you need. There may be some other charges for things like IP addresses, but those are nickels ‘n’ dimes. Space is measured in racks. A single rack is about 6 or 7 feet high and always exactly 19 inches wide. You might rent a half rack or quarter rack if you have less equipment. If you only have one or two machines, you’ll see things quoted in U’s. One U is about 1.75 inches and represents the vertical distance from one set of screw holes to the next set on a rack; virtually all equipment that is intended to be mounted on a rack is a whole multiple of 1U in height. The thinnest servers are 1U pizza-boxes. Our new server, a Dell 2650, is 2U in height; I chose this model because it had room for 5 hot-swappable hard drives. Hot-swappable means you can plug them in and pull them out while the computer is running, so we can add more storage without turning off the server and messing around with screwdrivers.

There are different schemes for paying for bandwidth. The most common, called capped bandwidth, is simply to pay a flat rate for a fixed amount of available bandwidth. For example, you might get 1 Mbps of bandwidth capped, and you can use all of it or none of it and pay the same amount. The network administrator will program the router to cap your usage at that amount.

Another common system is one where they set you up with a ton of bandwidth, but you only pay for the 95th percentile of the actual bandwidth used. That means that they sample how much bandwidth you’re using every ten minutes over the course of a month. At the end of the month, they throw away the top 5% of all the samples, and charge you based on the maximum bandwidth you used during the other 95% of the time. This method means that if you have a really big spurt in usage for a few minutes, you won’t have to pay for it. This is what’s called “burstable bandwidth.”
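Here’s that billing arithmetic as a quick sketch (Python, purely for illustration; the ten-minute sampling matches the description above, but the traffic numbers and the per-Mbps rate are made up):

    # Burstable ("95th percentile") billing: sample usage every ten minutes,
    # throw away the top 5% of samples, bill at the highest remaining sample.
    def monthly_bill(samples_mbps, dollars_per_mbps):
        ordered = sorted(samples_mbps)
        cutoff = int(len(ordered) * 0.95)    # index just past the lowest 95%
        billable_mbps = ordered[cutoff - 1]  # highest sample among the lowest 95%
        return billable_mbps * dollars_per_mbps

    # A month of light traffic (4,320 ten-minute samples) plus a 20-hour spike:
    samples = [0.2] * 4200 + [1.5] * 120
    print(monthly_bill(samples, 300))        # spike is under 5% of samples -> 60.0

Because the spike fits entirely inside the discarded 5%, the bill comes out as if it never happened.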

Anyway, I’ve been running this server for a while, so I had a pretty good idea of how much bandwidth it was going to need, and I had picked out a nice 2U server, so I took this info to the eight colocation providers I found for quotes. What I found clustered into three groups:

  • high end snobs, who told me they never do less than full racks, and I couldn’t afford their services. These were usually companies from the Dot Com days that built ridiculously fancy offices and have marble bathrooms in their colos and named gigantic sports stadiums after themselves, and are now in bankruptcy court. When they would quote me a price, it was in the range of $1000 to $1500 a month.
  • reasonable small companies who had grown slowly and carefully, with slightly grimy colo facilities because they didn’t waste money on trivial stuff that doesn’t contribute to the bottom line like sweeping the floors or being polite to potential customers. They usually wanted $400 or $500. But luckily there was a third group, consisting of,
  • Peer 1 Network, a super friendly Canadian company that took one look at Joel on Software and said, “hey, nice site, we’ll host it for free for ya.” OK, I like free.

I liked just about everything I saw about Peer 1. They had just built a spiffy new facility in New York which connects to their cross-Canada backbone. A Google search convinced me that their customers were all happy campers. And if you ever get a choice of whether to deal with Canadians or New Yorkers, I know it’s unpatriotic of me to say this, but, go with the Canadians. Peer 1 is a small enough company that they care about your business, something which the giant bankrupt telcos don’t. And they had a really smart network engineer working out of the New York colo who took the time to teach me a lot of the things I would know if I were a real network administrator as opposed to a software developer pretending to be a network administrator.

I got the Dell delivered straight to Peer 1’s facility, which is down the block from the New York Stock Exchange. When it arrived, Michael and I grabbed the Win2K CD-ROM and took the subway downtown to set it up. Most servers run “headless,” that is, without a keyboard or monitor, so Peer 1 had a keyboard, monitor, and mouse on a rolling trolley available for our use so we could set up the system. (You can see the temporary wires in the picture at right.) We installed Windows 2000 and got it completely up to date with patches before connecting it to the live Internet. The server has two separate power supplies and two power cables in case one fails or falls out. The dual power cables also mean that if you need to move the server or plug it in somewhere else, you can do this using extension cords without ever turning the machine off.

The server also has three separate Ethernet jacks; one of them is used for remote administration. The remote administration jack is really quite neat: it’s a little computer with its own network jack which lives inside the big computer. Even if the main computer is off or hung, I can connect to the administration computer over the Internet using a web browser and tell it to reboot or turn on the main computer. Slick!

Peer 1 only gave us a single Internet drop in our cage — just a regular Cat 5 ethernet cable — so I ran out to J&R Computer World and bought a $35 Linksys 5-port switch so we could hook up all three network jacks.

One last step before connecting to the Internet: we turned on packet filters to block all incoming connections except for traffic from the Fog Creek office.

Once we got the computer on the net, we tried www.google.com, which redirected immediately to www.google.ca. Cool! They think we’re in Canada! Our work done, we locked up the cage and came back to the Fog Creek office. Since then, everything we need to do with this machine can be done over the net, using Windows’ built-in Terminal Services. This feels exactly like sitting at the machine. With any luck, it will be years until I visit this computer again.

Rub a dub dub

One reason people are tempted to rewrite their entire code base from scratch is that the original code base wasn’t designed for what it’s doing. It was designed as a prototype, an experiment, a learning exercise, a way to go from zero to IPO in nine months, or a one-off demo. And now it has grown into a big mess that’s fragile and impossible to add code to, and everybody’s whiny, and the old programmers quit in despair and the new ones that are brought in can’t make head or tail of the code so they somehow convince management to give up and start over while Microsoft takes over their business. Today let me tell you a story about what they could have done instead.

FogBUGZ started out six years ago as an experiment to teach myself ASP programming. Pretty soon it became an in-house bug tracking system. It got embellished almost daily with features that people needed until it was good enough that it no longer justified any more work.

Various friends asked me if they could use FogBUGZ at their companies. The trouble was, it had too many hardcoded things that made it a pain to run anywhere other than on the original computer where it was deployed. I had used a bunch of SQL Server stored procedures, which meant that you needed SQL Server to run FogBUGZ, which was expensive and overkill for some of the two person teams that wanted to run it. And so on. So I would tell my friends, “gosh, for $5000 in consulting fees, I’ll spend a couple of days and clean up the code so you can run it on your server using Access instead of SQL Server.” Generally, my friends thought this was too expensive.

After this happened a few times I had a revelation — if I could sell the same program to, say, three people, I could charge $2000 and come out ahead. Or thirty people for $200. Software is neat like that. So in late 2000, Michael sat down, ported the code so that it worked on Access or SQL Server, pulled all the site-specific stuff out into a header file, and we started selling the thing. I didn’t really expect much to come of it.

In those days, I thought, golly, there are zillions of bug tracking packages out there. Every programmer has written a dinky bug tracking package. Why would anyone buy ours? I knew one thing: programmers who start businesses often have the bad habit of thinking everybody else is a programmer just like them and wants the same stuff as them, and so they have an unhealthy tendency to start businesses that sell programming tools. That’s why you see so many scrawny companies hawking source-code-generating geegaws, error catching and emailing geegaws, debugging geegaws, syntax-coloring editing tchotchkes, ftping baubles, and, ahem, bug tracking packages. All kinds of stuff that only a programmer could love. I had no intention of falling into that trap!

Of course, nothing ever works out exactly as planned. FogBUGZ was popular. Really popular. It accounts for a significant chunk of Fog Creek’s revenue and sales are growing steadily. The People won’t stop buying it.

So we did version 2.0. This was an attempt to add some of the most obviously needed features. While David worked on version 2.0 we honestly didn’t think it was worth that much effort, so he tended to do things in what you might call an “expedient” fashion rather than, say, an “elegant” fashion. Certain, ahem, design issues in the original code were allowed to fester. There were two complete sets of nearly identical code for drawing the main bug-editing page. SQL statements were scattered throughout the HTML hither and yon, to and fro, pho and ton. Our HTML was creaky and designed for those ancient browsers that were so buggy they could crash loading about:blank.

Yeah, it worked brilliantly, we’ve been at zero known bugs for a while now. But inside, the code was, to use the technical term, a “big mess.” Adding new features was a hemorrhoid. To add one field to the central bug table would probably require 50 modifications, and you’d still be finding places you forgot to modify long after you bought your first family carplane for those weekend trips to your beach house on Mars.

A lesser company, perhaps one run by an executive from the express-package-delivery business, might have decided to scrap the code and start over.

Did I mention that I don’t believe in starting from scratch? I guess I talk about that a lot.

Anyway, instead of starting from scratch, I decided it was worth three weeks of my life to completely scrub the code. Rub a dub dub. In the spirit of refactoring, I set out a few rules for this exercise.

  1. No New Features, not even small ones, would be added.
  2. At any time, with any check in, the code would still work perfectly.
  3. All I would do is logical transformations — the kinds of things that are almost mechanical and which you can convince yourself immediately will not change the behavior of the code.

I went through each source file, one at a time, top to bottom, looking at code, thinking about how it could be better structured, and making simple changes. Here are some of the kinds of things I did during these three weeks:

  • Changed all HTML to XHTML. For example, <BR> became <br />, all attributes were quoted, all nested tags were matched up, and all pages were validated.
  • Removed all formatting (<FONT> tags etc.) and put everything in a CSS style sheet.
  • Removed all SQL statements from the presentation code and indeed all program logic (what the marketing types like to call “business rules”). This stuff went into classes which were not really designed — I simply added methods lazily as I discovered a need for them; there’s a sketch of this transformation right after the list. (Somewhere, someone with a big stack of 4×6 cards is sharpening their pencil to poke my eyes out. What do you mean you didn’t design your classes?)
  • Found repeated blocks of code and created classes, functions, or methods to eliminate the repetition. Broke up large functions into multiple smaller ones.
  • Removed all remaining English language text from the main code and isolated that in a single file for easy internationalization.
  • Restructured the ASP site so there is a single entry point instead of lots of different files. This makes it very easy to do things that used to be hard. For example, now we can display input error messages in the very form where the invalid input occurred, something that should be easy if you lay things out right, but I hadn’t laid things out right when I was learning ASP programming way back when.
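Here’s the sketch I promised for that third bullet: the shape of moving SQL out of page code and into a lazily grown class. It’s in Python with SQLite for brevity (our real code was VBScript/ASP, and the table and field names here are invented):

    import sqlite3

    # Before: SQL embedded right in the presentation code, on every page:
    #     rs = conn.execute("SELECT sTitle FROM Bug WHERE ixBug = " + str(ix))
    #     print("<h1>" + rs.fetchone()[0] + "</h1>")

    # After: the query lives in a class; methods get added lazily, one at a
    # time, as pages turn out to need them.
    class BugStore:
        def __init__(self, conn):
            self.conn = conn

        def title(self, ix):
            row = self.conn.execute(
                "SELECT sTitle FROM Bug WHERE ixBug = ?", (ix,)).fetchone()
            return row[0] if row else None

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Bug (ixBug INTEGER PRIMARY KEY, sTitle TEXT)")
    conn.execute("INSERT INTO Bug VALUES (1, 'Crash on Save As')")

    bugs = BugStore(conn)
    print("<h1>%s</h1>" % bugs.title(1))   # the page code no longer contains SQL

Each step like this is mechanical enough that you can convince yourself on the spot that the behavior didn’t change.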

Over three weeks the code got better and better internally. To the end user, not too much changed. Some fonts are a little nicer thanks to CSS. I could have stopped at any point, because at any given time I had 100% working code (and I uploaded every checkin to our live internal FogBUGZ server to make sure). And in fact I never really had to think very hard, and I never had to design a thing, because all I was doing was simple, logical transformations. Occasionally I would encounter a weird nugget in the code. These nuggets were usually bug fixes that had been implemented over the years. Luckily I could keep the bug fixes intact. In many of these cases, I realized that had I started from scratch, I would have introduced the same bugs all over again, and might not have noticed them for months or years.

I’m basically done now. It took, as planned, three weeks. Almost every line of code is different now. Yep, I looked at every line of code, and changed most of them. The structure is completely different. All the bug-tracking functionality is completely separate from the HTML UI functionality.

Here are all the good things about my code scrubbing activity:

  • It took vastly less time than a complete rewrite. Let’s assume (based on how long it took us to get this far with FogBUGZ) that a complete rewrite would have taken a year. Well, that means I saved 49 weeks of work. Those 49 weeks represent knowledge in the design of the code that I preserved intact. I never had to think, “oh, I need a new line here.” I just had to change <BR> to <br /> mindlessly and move on. I didn’t have to figure out how to get multipart working for file uploads. It works. Just tidy it up a bit.
  • I introduced almost no new bugs. A couple of tiny ones, of course, probably got through, but I was never doing the types of things that cause bugs.
  • I could have stopped and shipped at any time if necessary.
  • The schedule was entirely predictable. After a week of this, you can calculate exactly how many lines of code you clean in an hour and get a darn good estimate for the rest of the project (the arithmetic is sketched after this list). Try that, Mozilla river-drawing people.
  • The code is now much more amenable to new features. We’ll probably earn back the three weeks with the first new major feature we implement.
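That schedule arithmetic is nothing fancier than this (the numbers here are invented):

    # After a week of scrubbing you know your pace; the rest is division.
    lines_cleaned, hours_spent = 9000, 45            # measured during week one
    lines_per_hour = lines_cleaned / hours_spent     # 200 lines an hour

    lines_remaining = 18000
    weeks_remaining = lines_remaining / lines_per_hour / hours_spent
    print("about %.0f more weeks" % weeks_remaining) # -> about 2 more weeks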

Much of the literature on refactoring is attributable to Martin Fowler, although of course the principles of code cleanups have been well known to programmers for years. An interesting new area is refactoring tools, which is just a fancy word for programs that do some of this stuff automatically. We’re a long way from having all the good tools we need — in most programming environments you can’t even do a simple transformation like changing the name of a variable (and having all the references to it change automatically). But it’s getting better, and if you want to start one of those scrawny companies hawking programming tool geegaws, or make a useful contribution to open source, the field is wide open.

Working on CityDesk, Part Five

As promised I’ve been writing a series of articles about the creation of CityDesk. The primary purpose of these articles is to share the reasoning behind some of the decisions we made as a software company, and the techniques we used in developing our flagship software product. Today I want to talk about the five guiding principles that shaped the architecture.

One of the biggest decisions we made about CityDesk was to make a standalone Windows executable. With all the hubbub about web services, not to mention the noisy Internet dotcom boom we just lived through, this was a non-obvious decision. Why is CityDesk one of those old-fashioned programs you download and run, instead of, say, a web site you go to, in the style of Blogger or Atomz? Or a web server you install, like the Big Iron content management systems (Interwoven, Vignette StoryServer, etc.)?

An even more non-obvious decision was not to use XML. People who were a little bit too close to Silicon Valley groupthink looked at us like we were crazy because CityDesk is not based on XML and is not delivered through a web browser.

Now we’re thanking our lucky stars that we didn’t have a bunch of stupid venture capitalists forcing us to copy all the other content management companies, and we’re grateful that we’re not in Silicon Valley where everyone meets at Bucks and Stanford University seminars and copies each other’s bad ideas, because the one thing we’ve heard from everybody who’s tried CityDesk, consistently, is that CityDesk is the easiest content management software they’ve ever seen, full stop. And we got this ease-of-use because we believed certain things about software.

1. You can’t do good UIs in a web browser.

When we started building CityDesk we kept hearing one thing about Big Iron content management systems, which were all web-browser-based. Despite the fact that these systems had web-based editing interfaces and advanced workflow, we kept hearing that in real life, reporters created their content in Microsoft Word, did the workflow the old way (email), and only at the very last minute, a secretary opened the Word file and cut and pasted it into the content management system.

Why?

Because nobody wants to compose in a big TEXTAREA on an HTML page.

There are too many things that are impossible to deliver properly in a web browser. First of all, there’s the latency. Even if the web server is running on your own machine, round trips still take a certain amount of time, so all web applications feel somewhat sluggish. Second, there is no decent text editing widget for web browsers. With CityDesk as a Windows program, I can give you a feature where you can drag a picture from the desktop into an article. There is no way to create that kind of user interface through the web. I can keep a word count in the corner of the screen and update it in the background whenever you stop typing. You can use Ctrl+S to save without losing your place in the document — something many writers have learned to do regularly. (Good luck creating a word processor inside a web browser that doesn’t instantly lose everything, without prompting, if the user closes the browser.) CityDesk has menus. Remember menus? And they work exactly like you expect them to, because the web browser doesn’t have its own menu, with a bunch of irrelevant commands, that wants to eat the Alt key. We don’t have to waste any screen real estate on browser geegaws (like “back” buttons and spinning e-globes) that have no meaning for CityDesk. We get our own icon in the taskbar instead of looking like all the other web browsers you have open. I could go on for days about the nice UI things we can do in a Windows application that just can’t be done with a web browser. In fact there’s a whole chapter about it in the printed edition of my book.

“Sure,” you say, “you can’t get as slick an app through the web, but you get ubiquity! People can go to an Internet cafe in Phuket and …”  Yeah, OK, that sounds great for Hotmail. And in the future we will build a typical compromise-laden web interface to CityDesk for those times when you’re at the beach in Thailand. In the meantime, my number one priority is to make an application that’s easy to use so that people like it, so that they tell their friends about it, so that everybody buys it and we have to hire full time employees just to slit open the envelopes and take out the checks.

2. XML is a Dumb Format For Storing Data

I’m not sure why XML got so sexy. It has its advantages; it’s sure a good idea for data interchange or for all those little files you need to store settings. But for real work it just can’t do what a solid, multiuser, relational database can do. The next time some uninformed analyst at Gartner or Giga or Forrester tells you “in the future, everything will be XML,” ask them how to do “SELECT author FROM books” fast with XML. Hint: you can’t. It has to be slow. XML is not the way to store a lot of data. Now tell me how to insert a new book at the beginning of the table without massive bitblts. Of course, I doubt if there is an analyst in one of those companies who would even understand that sentence, but that’s life. In the meantime, CityDesk uses a relational database. Over and out.
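For the skeptical, here’s the difference in miniature (a Python sketch with a one-row, invented schema). The relational engine can answer the query from a B-tree index; the XML version has no index, so every query means parsing and walking the entire document:

    import sqlite3
    import xml.etree.ElementTree as ET

    # Relational: with an index on author, the lookup is logarithmic, not linear.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE books (author TEXT, title TEXT)")
    conn.execute("CREATE INDEX ix_author ON books (author)")
    conn.execute("INSERT INTO books VALUES ('Fowler', 'Refactoring')")
    print(conn.execute(
        "SELECT title FROM books WHERE author = 'Fowler'").fetchall())

    # XML: no index exists, so you scan every <book> element every time.
    doc = ET.fromstring("<books><book><author>Fowler</author>"
                        "<title>Refactoring</title></book></books>")
    print([b.findtext("title") for b in doc.findall("book")
           if b.findtext("author") == "Fowler"])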

3. Most People Don’t Control Their Web Servers

Of all the people in the world who need to put data up on the web, only a tiny percentage of them have any control whatsoever over what software runs on their web server. We decided that CityDesk has to be an all-client application because probably 95% of the people who maintain web sites simply do not have the ability to install a content management system on their web server, even if they wanted to. We reduced the server requirements of CityDesk to (a) any web server that knows how to serve static files and (b) ftp or file copy access to that server. We think this will open up content management to a whole new universe of potential users.

4. If You Can Only Do One Platform, Do Windows.

I’d love to have a Mac version and a Linux version, but they are not good uses of limited resources. Every dollar I invest in CityDesk Windows will earn me 20 times as many sales as a dollar invested in a hypothetical Mac version. Even if you assume that Mac has a higher percentage of creative and home users, I’m still going to sell a heck of a lot more copies on Windows than I could on Mac. And that means that to do a Mac version, the cost had better be under 10% of the cost of a Windows version. Unfortunately, that’s nowhere near true for CityDesk. We benefit from using libraries that are freely available on Windows (like the Jet multiuser ACID database engine and the DHTML edit control) for which there are no equivalents on the Macintosh. So if anything, a Mac port would cost more than the original Windows version. Until somebody does something about this fundamental economic truth, it’s hard to justify Mac versions from a business perspective. (Incidentally, I have said time and time again, if Apple wants to save the Mac, they have to change this equation.)

And don’t get me started about Linux. I don’t know of anyone making money off of Linux desktop software, and without making money, I can’t pay programmers and rent and buy computers and T1s. Despite romantic rhetoric, I really do need to pay the rent, so for now, you’re going to have to rely on college kids and the occasional charitable big company for your Linux software.

5. Content Management Can’t Require Programmers

One of the reasons Big Iron content management costs so much is the team of programmers it takes to figure out the thing and set it up. Even the free and low-cost content management systems are designed by geeks for geeks. People have said to me that there’s no market for CityDesk because “with XML and XSL it’s a solved problem.”

Uh huh. Thanks. Even I can’t figure out XML and XSL, and I’m pretty sharp.

We decided that everything in CityDesk has to be simple enough for web designers and HTMLers, who are not necessarily programmers. And for the end-user, even HTML is too much. To the end-user CityDesk can’t be any more complicated than a WYSIWYG word processor.

The rule of thumb with ease-of-use is that if you make your program 10% easier, you’ll double the potential number of users of your product. In CityDesk we made absolutely no compromises on ease of use. Sure, it’s not the most powerful system out there. We’ll compromise on that. And you can’t update your blog from an Internet Cafe in Bali. Another compromise. But I guarantee you that any HTMLer can create CityDesk sites, and anyone who can use a word processor can manage a site that was created for them by an HTMLer, and you’ll never need a programmer.

Summary

All in all we made a bunch of decisions based on a world view that, frankly, was not very popular during the Internet boom. There was a time when you couldn’t get funding for a company that wasn’t a web pure play. And the lemming VCs thought that they had to do what all the other VCs were doing: pure web interfaces. To which I say, Thank you! Thank you for keeping your dumb overfunded companies out of my way, because now, over here in GUI land, I can’t find any real competition, and now you’re not funding anyone, so it’s smooth sailing for Fog Creek.


Working on CityDesk, Part Four

Boy, what a terrible weekend. I’ve come down with a cold. Fever. Runny nose. General malaise. And WININET.DLL.

WININET.DLL, you see, is a software file provided by Microsoft. It comes with Internet Explorer and with all versions of Windows since about 1996, and it was the bane of my weekend. And it illustrates a fascinating fact about how much software developers’ day-to-day lives have changed in the last decade.

Here’s what happened. Late on Friday afternoon I was setting up some new web servers designed to handle the increased load we’re expecting when we ship CityDesk. One of the first things I did was try to use CityDesk to copy files to the new servers using the FTP protocol. We had a firewall set up that prevented FTP from going through. CityDesk froze up.

“No problem,” I thought, “I’ll use passive-mode FTP, which can get through that firewall.”

That’s when I noticed CityDesk doesn’t support passive-mode FTP.

“OK, how hard can that be to implement? It’s probably available as an option on the file transfer library we’re using.” One checkbox and I’m done.

But … where’s the checkbox?

No mention of it in the documentation.

Searching the object file itself didn’t turn up anything likely.

I checked Google Groups, formerly known as DejaNews. It’s a complete archive of every UseNet discussion. This is where programmers ask each other questions about the most arcane topics. As I told Babak once: it’s a big world out there. You’re never the first person to have this problem.

After about five minutes searching, the conclusion was inescapable. Our file transfer library, which Microsoft gives away for free with the Visual Basic compiler, can’t do passive-mode FTP.

OK, back to the drawing board. What are my choices? I wrote up a list:

  1. Do without Passive FTP. This would make CityDesk useless to a large number of people, something I wasn’t willing to do.
  2. Purchase a commercial FTP library. Honestly, I’ve always had bad luck with commercial libraries, having discovered one too many times that their code quality is rarely up to the meticulous standards we set for Fog Creek. When I looked around the discussion groups where developers were discussing other FTP libraries, they always seemed to have scary bugs that I couldn’t live with.
  • Use Microsoft’s other file transfer library, the infamous WININET.DLL. This is actually what Microsoft Internet Explorer uses to transfer files; despite its dismal reputation, it is used so widely that I figured it had to be reasonably bug free (by now, at least). Anyway, lots of programmers use WININET.DLL, and if I run into trouble I’m sure to find ample discussion of the problems on DejaNews, er, Google Groups.

I thought that #3 seemed painless enough. In fact I was already using WININET.DLL somewhere else in our code to import web pages, using the HTTP protocol.

A search of Microsoft’s online knowledge base revealed that you can’t really do FTP with WININET.DLL from Visual Basic code; it does some complicated stuff with threads that means you have to call it from C or C++. I thought the easiest thing to do would be to create my own custom FTP control written in C++, which talked to WININET. I chose to use Microsoft’s ATL library to create the custom control, because it makes the smallest files. ATL is the most complicated programming environment in the world, requiring a brain the size of Colorado and 10 years of solid experience to understand what’s going on. I have studied ATL in depth three, maybe four times in my career and I can never remember all the bizarre template crap that’s going on in there. Nobody can.
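For the curious, here’s roughly what the passive-mode connection looks like through WININET.DLL. This is a Python/ctypes sketch for illustration only, not the ATL control we actually built; the constants come from wininet.h, the host and credentials are made up, and error handling is omitted:

    # Windows-only sketch: open a passive-mode FTP session via WININET.DLL.
    import ctypes

    wininet = ctypes.windll.wininet

    INTERNET_OPEN_TYPE_DIRECT = 1
    INTERNET_SERVICE_FTP = 1
    INTERNET_DEFAULT_FTP_PORT = 21
    INTERNET_FLAG_PASSIVE = 0x08000000  # the flag this whole weekend was about

    # HINTERNET handles are pointers; declare that, or 64-bit builds truncate them.
    wininet.InternetOpenW.restype = ctypes.c_void_p
    wininet.InternetConnectW.restype = ctypes.c_void_p
    wininet.InternetConnectW.argtypes = [
        ctypes.c_void_p, ctypes.c_wchar_p, ctypes.c_uint, ctypes.c_wchar_p,
        ctypes.c_wchar_p, ctypes.c_ulong, ctypes.c_ulong, ctypes.c_void_p]
    wininet.InternetCloseHandle.argtypes = [ctypes.c_void_p]

    hInet = wininet.InternetOpenW("sketch", INTERNET_OPEN_TYPE_DIRECT,
                                  None, None, 0)
    hFtp = wininet.InternetConnectW(
        hInet, "ftp.example.com", INTERNET_DEFAULT_FTP_PORT,
        "user", "secret", INTERNET_SERVICE_FTP,
        INTERNET_FLAG_PASSIVE,          # passive mode: data connection opens outbound
        None)
    # ... FtpPutFileW / FtpOpenFileW calls would go here ...
    wininet.InternetCloseHandle(hFtp)
    wininet.InternetCloseHandle(hInet)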

Yes, Virginia, it is possible to create a software development environment which is so difficult to use that no human being can do it. ATL and COM+ are my two favorite examples (the latter is so complicated that only one man on Earth, Don Box, actually understands everything that’s going on). C++ itself comes pretty darn close. But most programmers are too macho to admit this.

Luckily, Microsoft provides some wizards with ATL which write all the hard code for you. If you want to do something unusual, you’re on your own, so my motto was Nothing Unusual. No sudden moves. Just add some simple methods and events and get the hell outta there, hopefully with my brains non-exploded.

At one point, after I had written almost half the code, I discovered that one of the checkboxes that I had checked on the wizard which wrote the inscrutable ATL code for me was wrong. But once the code is written, it’s written. For the life of me, I couldn’t figure out what the checkbox had done and where to change it in the code. I searched MSDN (a gigabyte of documentation on programming Windows which I keep on my hard drive), the online knowledge base, and finally the entire Internet using Google, and didn’t find an easy way to change it. So I created two entirely new projects using the wizard, checking the box in one case and not checking it in the other, and then ran WinDiff, which compares two entire directories and lists all the differences, to find what the checkbox really changed so I could change it in my code. (Somewhere, I had to change a hardcoded number to 131473, because I wanted my controls to be visible at runtime. A classic example of why COM programming is not for humans.)

The Microsoft documentation for WinInet was pretty decent, as these things go, but not decent enough. In the page documenting FtpOpenFile, you find both of these mutually contradictory quotes:

  1. “No file handle is returned”
  2. “Return value: Returns a handle if successful”

Well, which is it? Empirically, it wasn’t returning a file handle.

The next thing I discovered is that if there is a packet filter (a simple form of firewall) somewhere between you and the server, the code hangs trying to copy files. That’s normal; it’s the way of the Internet; there’s nothing you can do about it. After a minute, WinInet will realize that packets are not getting through and will time out. But the user is likely to get impatient long before a minute is up and hit the Cancel button.

When you hit the Cancel button, my code tells WinInet to give up and close down the connection. But as I discovered, if you do this in one of these packet-filter situations, WinInet will simply crash, bringing your program down with it. It’s clearly a bug in Microsoft’s code. An exhaustive search of all my Internet sources found a couple of people reporting the same crashing behavior, but nobody had a workaround.

How can that be? I thought. Since Internet Explorer uses the exact same code, wouldn’t Internet Explorer crash in the same situation?

I tried it. What I discovered is that Internet Explorer doesn’t crash in this situation — it shows an hourglass and freezes for a couple of minutes, waiting for the time-out it knows it will get. This proves that Microsoft’s programmers knew about this bug in WinInet and worked around it, instead of just fixing the code in the first place. Stupid stupid stupid. For the umpteenth time, I found myself dependent on a code library which had a crashing bug that was unacceptable in code I shipped. What are you supposed to do if you’re the chef at Les Halles and your fishmonger is giving you smelly fish?

Another two hours of investigation and experimentation. Finally I decided that in this case, when the user hits the Cancel button, instead of freezing like Internet Explorer, I will simply hide the file transfer so it looks like the operation has been cancelled. In the background, invisible to the user, I’ll wait around for the time-out to happen.
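The shape of that workaround, as a Python sketch (the real code is C++/ATL, and the function names here are stand-ins): on Cancel, never touch the wedged connection; just hide the UI and let the worker die of natural causes:

    import threading, time

    cancelled = threading.Event()

    def do_blocking_ftp_copy():
        time.sleep(60)    # stand-in for a WinInet call stuck behind a packet filter

    def transfer():
        do_blocking_ftp_copy()          # eventually WinInet's own time-out fires
        if not cancelled.is_set():
            print("transfer finished")  # only report back if nobody cancelled

    worker = threading.Thread(target=transfer, daemon=True)
    worker.start()

    def on_cancel_clicked():
        # Closing the connection mid-hang is what crashes WinInet, so don't.
        # Hide the progress UI and abandon the worker; it times out quietly.
        cancelled.set()
        print("progress dialog hidden")

    on_cancel_clicked()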

So, as I said, developers’ lives have changed. All weekend I couldn’t sleep. Tossing and turning in sweat-drenched sheets, I had feverish ATL nightmares. Sunday morning I got up at 3 am and coded for 4 hours just to avoid the bad dreams about Structured Exception Handling.

Ten years ago, to write code, you needed to know a programming language, and you needed to know a library of maybe 50 functions that you used regularly. And those functions worked, every time, although some of them (gets) could not be used without creating security bugs.

Today, you need to know how to work with libraries of thousands of functions, representing buggy code written by other people. You can’t possibly learn them all, and the documentation is never good enough to write solid code, so you learn to use online resources like Google, DejaNews, MSDN. (I became much more productive after a coworker showed me that you’re better off using Google to search Microsoft’s knowledge base rather than the pathetic search engine Microsoft supplies.) In this new world, you’re better off using common languages like Visual Basic and common libraries like WinInet, because so many other people are using them that it’s easier to find bug fixes and sample code on the Web. Last week, Michael added a feature to CityDesk to check the installed version of Internet Explorer. It’s not hard code to write, but why bother? It only took him a few seconds to find the code, in VB, on the Web and cut and paste it.

We used to write algorithms. Now we call APIs.

Nowadays a good programmer spends a lot of time doing defensive coding, working around other people’s bugs. It’s not uncommon to set up an exception handler to prevent your code from crashing when your crap library crashes.

Times have changed. Welcome to a world where the programmer who knows how to tap into other people’s brains and experience using the Internet has a decisive advantage.


Working on CityDesk, Part Three

UNM Humanities Building, where my dad had his office

Like all good hackers, I have been programming since junior high, using my dad’s account on the University of New Mexico IBM 360 mainframe. But the first real computer science course I took was during the first year of college. The course covered C and assembler. It had hundreds of students, most of whom, like me, had been fooling around in Pascal or Basic since they were toddlers.

The course proceeded merrily until one day, the professor introduced pointers.

A rock in Bandelier Park, New Mexico

Dread.

Suddenly, the majority of the students in the class were in deep trouble. For some reason, some people just do not seem to be able to write code with pointers in it. They were born without the part of the brain that does indirection, I guess.

Since then, I’ve mentally divided the world into three groups. The largest group of people can’t program at all. There’s another, smaller group of people who can program, but not with pointers. And there’s a tiny group of people who can program, even with pointers. Those elite few can even understand what it means to write CString*& in C++ (a reference to a pointer to a CString).

My first job at Microsoft was putting a decent scripting language into Excel. Although I was given free rein to implement whatever language I saw fit, we went with Basic for several reasons: (1) we had a Basic compiler team in house; (2) Basic doesn’t require pointers; and so (3) more people are comfortable using Basic than any other language. Indeed Visual Basic is the best-selling language product of all time.

Visual Basic is an extremely productive way to write code, especially GUI code. Want bold text on a dialog box? It’s one click in VB. Now try doing it in MFC: you have to create a subclassed control, it’s a big mess, you have to know all about LOGFONTs and Windows window subclassing and a bunch of other things, and you still need about three lines of code once you have the magic class.

But many VB programs are spaghetti, either because they’re done as quick and dirty one-offs, or because they’re written by hack programmers without training in object oriented programming, or even structured programming.

What I wondered was: what happens if you take top-notch C++ programmers who dream in pointers, and let them code in VB? What I discovered at Fog Creek was that they become super-efficient coding machines. The code looks pretty good, it’s object-oriented and robust, but you don’t waste time using tools that are at a lower level than you need. I’ve spent years writing code in C++/MFC and years writing code in Visual Basic, and let me tell you, VB is just much, much more productive. Michael and I had a good laugh today when we discovered somebody selling a beta crash-reporting product at $5000 for three months that Michael implemented in CityDesk in two days. (And we actually implemented a good part of ours in C++/ATL.) And I also guarantee you that our Visual Basic code in CityDesk looks a lot better than most of the code you find written in macho languages like C++, because we’re good programmers, and we write comments, and our variable names are well-chosen, and we do things the simple way, not the clever way, and so forth.

I’ll go out on a limb here. In my years of experience, I have seen many language and programming fads come and go. But there’s only ONE, that’s right, ONE language feature I’ve ever seen that actually improves your productivity significantly. No, it’s not object oriented programming; no, it’s not intentional programming or assertions or programming by example or CASE or UML or XML or Java. The only thing that improves your programming productivity is using managed code – that is, using a language in which memory management is automatic. Java and .NET languages do this with garbage collection; VB does this with reference counting; I don’t care how you do it, just let me concatenate strings without thinking about where the new bigger string will go and I’ll be happy.

One of the things about Visual Basic is that it doesn’t always give you access to the full repertoire of Windows goodies that you need to make a polished application. But what it does do, better than almost any other programming environment, is let you drop into C++ code (or call C APIs) when you’re desperate or when you need that extra speed. For example, you can always get the HWND of a control and do native stuff to it, which is not the case in Java. As another example, a lot of the non-GUI, time-sensitive inner loops, like the word counter, in CityDesk are actually implemented in C or C++ for speed. This ability gave us the confidence to use Visual Basic even though it can’t do everything and it tends to do string processing slowly. But since we’re all C++ programmers, we have no fear of creating a DLL or OCX to count words, or parse script, or call a Windows API. So about 5% of CityDesk is actually in C or C++, and we’ll probably move a little bit more of the code to C++ to speed up a few more inner loops.

Now, Visual Basic is not the perfect programming language. It’s fairly object oriented but there are little things that you can’t do with it, like have base classes with implementation reuse. It’s limited to Windows. And the worst part about coding in VB is that people think you’re not cool because your code doesn’t have {‘s and }’s. I can live with the shame if it means I’m more productive.

Philosophically, I think that C# has a bright future in the Windows GUI programming world. It’s not as embarrassing as VB, and it uses the type of syntax which C/C++/Java programmers have come to love. VB programmers looking to upgrade can’t upgrade painlessly to VB.NET, because there are so many major differences in the programming environment. Even Microsoft admits that you can’t port from VB to VB.NET, you have to rewrite. And that’s enough of a pain that many VB programmers will use this opportunity to look around at what else is out there. I think many will choose C#, because it’s virtually the same language as VB.NET with slightly different syntax and vastly less stigma attached to it.

What about Java? Yes, I’ve used Java extensively, but unfortunately the language, the code libraries, and especially the GUI libraries are just too primitive for a commercial desktop application. I like the language and I appreciate the benefit of write-once-run-anywhere, but frankly not a lot of desktop software is sold for Sun Solaris, and I think that WORA benefits Sun more than it benefits software developers. I’m not willing to write an app that behaves in an inferior way on 95% of my customers’ computers to benefit the 5% with alternate platforms. Every Java app LOOKS like a Java app, takes forever to launch, and just doesn’t feel completely native. Since CityDesk’s competitive advantage comes from having an excellent GUI, that’s one area where I refuse to skimp.

I am virtually certain that I will now receive a million emails from lovers of tcl/tk, or Delphi, or C++ Builder, or NextStep, or Cocoa, or perl, or Python, or RealBASIC, or some other programming environment which may or may not be suitable for creating a professional Windows GUI. That’s nice. I don’t really want to get into a debate about language features or programming environment features — at some point, you have to stop debating and write code! I’m just trying to explain why we chose to use VB for the GUI part and C++ for time sensitive bits of code. If you do want to have a fun religious war over programming languages, please do so on the discussion group!


Working on CityDesk, Part Two

I’ve been trying to get a discussion board running for three days now, without much success. Installing server software is just much harder than installing Windows software. There are always multiple complicated steps that involve permissions, accounts, database servers, dependencies (do you have the absolute latest version of perl?), superuser, and web servers. If you’re using free software, nobody wants to volunteer to make a decent setup program (or even documentation, half the time), so you’re generally on your own there. And when you’re using commercial software, the vendor would usually like to sell you a three-week integration consulting project so they can make another $50,000 copying some files and editing some configuration files. (We have a double-click SETUP program for FogBUGZ that works pretty well, but there are still people with funny configurations where it doesn’t set all the permissions correctly.)

Untitled, Steven Harvey, Oil on Canvas

I’m also a bit particular about what discussion software we use. The purpose of the board will be a place for CityDesk beta testers (and later, users) to ask questions, provide feedback, and share ideas. In designing a UI for anything, the very first question you always want to ask is: who is the user? Specifically: are most users casual, occasional users, or are these users who will be spending all of their time using your program? For casual users, learnability and simplicity are more important than usability and power. In that sentence, by “learnability” I mean the ability for novices to figure out how to get tasks done rapidly. By “usability” I mean only the ability to do tasks in a convenient and ergonomic way without making mistakes and without needing to do repetitive tasks. A data entry system that minimizes keystrokes by prefilling things and automatically jumping from field to field is more usable for experienced users, but it’s harder to learn because it behaves unexpectedly to a novice.

Most people using the Fog Creek bulletin board will be going there to ask a question or raise an issue. Certainly in the early days, learnability and simplicity are our priority. When I looked at a bunch of discussion boards, I found that most of them have their heritage in the BBS world, where the same people log on to chat for 4 hours every night. Those people love features like a geegaw that lets you put a graphical smiley in your posting, and the ability to upload snapshots of their ugly mugs to appear next to their postings, and the ability to click a button and never again see postings by the blithering idiot who wrote this one. All of these features are neat for power users but they just clutter up the interface for novices.

Juno's Read and Write tabs(Longtime Juno email users may have noticed that as time went on, the big Read and Write tabs got smaller and smaller. In the early days, where almost every Juno user was a new Juno user, it was nice to have big giant buttons for reading and writing email, the most common task. As time went on, a larger proportion of our users were experienced users, who know how to “read” and “write” and would rather have more screen real estate available for other features. It’s not uncommon for a program to start out simple and evolve to be more complicated, and you can do this without hurting “average” usability because your users are getting more experienced on average.)

I spent Wednesday and Friday playing around, installing various buggy BBS systems, some of which required a Linux server, others Windows 2000, playing with DNS to move discuss.fogcreek.com hither and yon, figuring out why DNS caches weren’t flushing, installing and reinstalling database servers, and generally getting frustrated. One of the most popular packages, Discus, actually hardcoded its own URL in so many places that it needed to be reinstalled from scratch just to change the URL. (In fact, it wasn’t enough to reinstall it… you had to redownload it. The download package already had your personal URL hardcoded throughout.) And it had a perfectly terrible UI in which there was a little treeview showing folders (so far so good)… but each folder was actually a command, not a container. A broken metaphor, worse than no metaphor at all. That had to go. Then I tried IdealBB, a decidedly beta package. I wouldn’t ordinarily run software that is advertised as beta, but this thing seemed to have been through many releases and it was alleged to be very close to shipping. Michael and I actually had to roll up our sleeves and debug the thing ourselves to get it sort of running, but then there were too many ASP errors and it had a tendency to crash the server. (Too bad, because it is one of the finest looking discussion boards, visually.) Finally I flirted with Manila, since we have it running anyway for our weblogs, and we’ve already written a little daemon which watches Manila and restarts it when it crashes (about once every two days). But (as far as we could tell) Manila requires membership to post a message, and in my experience that is enough to turn away 90% of the casual visitors who might otherwise use the discussion board. It would be great for small elite communities of people who all post all the time, but I don’t want anything to get in the way of a beta tester casually reporting a bug.

The system I like best, believe it or not, is Lusenet, by Philip Greenspun, because it’s just super simple. That’s what I’ve been using for the Joel on Software discussion group. There are a few reasons we couldn’t use it here. First, it is really not ready for prime time. There are actually things you can do in Lusenet that still show you the Oracle statement they just executed, as if Philip left in some debug printfs. Second, it’s not hosted on our own servers, so we run the risk of it going away, taking our valuable data with it. Third, to host it on our own servers requires AOLServer and Oracle, and we don’t have the former, and can’t afford the latter.

When I got home, grumpy from all the time I had wasted, I realized that the software I want consists of exactly one table, and I could write the thing myself in less time than it took me to install some of these packages. Which I did. Two hours of work (ASP, Microsoft Access, and VBScript) and I had banged out a system that did pretty much everything I wanted (which is not much!). Check it out at http://discuss.fogcreek.com/joelonsoftware. (If you have trouble reaching it, that’s because of all my DNS messing around. You’ll have to wait a couple of days for caches to flush.)
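For flavor, here’s the one table as a sketch. I’ve used Python and SQLite here for brevity (the real thing was ASP talking to an Access database, and these field names are invented):

    import sqlite3

    conn = sqlite3.connect("discuss.db")
    conn.execute("""CREATE TABLE IF NOT EXISTS Post (
        ixPost   INTEGER PRIMARY KEY,  -- post number
        ixParent INTEGER,              -- 0 for a new topic, else the topic's ixPost
        sAuthor  TEXT,
        sSubject TEXT,
        sBody    TEXT,
        dtPosted TIMESTAMP DEFAULT CURRENT_TIMESTAMP)""")

    # Posting is one INSERT; the topic list is one SELECT. That's the whole board.
    conn.execute("INSERT INTO Post (ixParent, sAuthor, sSubject, sBody)"
                 " VALUES (0, ?, ?, ?)",
                 ("Joel", "Welcome", "Ask your CityDesk questions here."))
    for author, subject in conn.execute(
            "SELECT sAuthor, sSubject FROM Post WHERE ixParent = 0"):
        print(author, subject)
    conn.commit()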

There’s a lesson in here somewhere, but I’m already well past 1000 words. In the past I wouldn’t have cared about word counts because I didn’t know what they were, but now I’m using CityDesk which keeps a running word count in the corner of the screen. So we’ll talk about the lesson tomorrow!


Working on CityDesk, Part One

I’ve been rather quiet lately on this weblog — mainly because we’ve been working so hard at Fog Creek getting ready for the beta of CityDesk, our flagship product. But I’d like to spend some time now talking about the design and development of CityDesk, because it’s a great case study for the kind of software development practices that I’ve been advocating here for more than a year. Over the next few columns, I’ll focus on “The Story of CityDesk,” with a look at some of the behind-the-scenes, day-to-day stories of a real software development project.

We’re launching the beta on October 15th — just three days away! This is officially On Time. We made a schedule for shipping CityDesk way back in June. We figured out estimates for all the remaining tasks and bug fixes, added them up, and got October 15th.

Every couple of weeks, we checked through our task list, revised estimates, and so forth. We’ve found a lot of bugs over that time and added a lot of small features, but we’ve also killed a lot of features that we just won’t have time for. And lo and behold, now we are almost completely done. Most of what we’re doing these days is getting set up to run the beta. Michael added a dialog box to CityDesk for sending feedback. He also wrote some very spiffy code that catches any unhandled exception on any copy of CityDesk running anywhere in the world, prevents the app from crashing, and pushes the exception info up to our FogBUGZ bug tracking database here in New York City. I’ve been fixing bugs, writing the help file, and writing some pages for our corporate web site that will explain what CityDesk is and why you would buy it.

Michael’s cool crash handler actually enters diagnostics right into FogBUGZ.
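Michael’s handler is VB inside CityDesk talking to FogBUGZ; purely to show the general shape of such a thing, here’s a small Python sketch with an invented endpoint and field names:

    import sys, traceback, urllib.parse, urllib.request

    BUG_TRACKER_URL = "https://bugs.example.com/new"  # invented; stands in for FogBUGZ

    def report_crash(exc_type, exc_value, tb):
        # Package up the diagnostics and push them to the bug database
        # instead of letting the app die with a generic crash dialog.
        details = "".join(traceback.format_exception(exc_type, exc_value, tb))
        form = urllib.parse.urlencode({"sTitle": str(exc_value),
                                       "sDetails": details})
        try:
            urllib.request.urlopen(BUG_TRACKER_URL, form.encode()).read()
        except OSError:
            pass  # the crash reporter must never crash the app itself

    sys.excepthook = report_crash  # catches every otherwise-unhandled exception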

A common misconception, I assume popularized by Hollywood, is that as you get closer to shipping software, activity becomes frenetic as everybody scrambles to finish all the things that need to be done in time for the deadline. In the typical crappy movie, there’s a mad rush of typing in a room full of cool alterna-dressed programmers with found-object earrings and jeans jackets. Somebody stands up and shouts to the room in general “I need the Jiff subroutine! Somebody give me the Jiff subroutine!” A good looking young woman in Vivienne Tam urbanwear throws a floppy disk at him. “Thanks!”

Yeah, most programmers are as cute as Ryan Phillipe. That’s why I’m in this field.

As the second hand swoops towards the :00, the whole team waits breathlessly around Ryan Phillipe’s computer and watches the “copy” progress indicator as the final bits are put onto a floppy disk with less than a second to spare before the VC cuts off funding.

I suppose some software shops have last-minute coding frenzies like this. If so, their software is probably marked by incredibly poor quality. No code works the way you expected it to work the first time, and if you’re writing code up until the last minute, it’s going to suck. The first Netscape open source release, documented in the excellent movie Code Rush, demonstrates this.

On good teams, the days before shipping just get quieter and quieter as programmers literally run out of things to do one at a time. (Yesterday I took the day off to explore New York City with my wee niece and nephews.)

Times Square

The number of new bugs being found has decreased to a point at which we feel confident about releasing the beta. It is crucial to get to zero known bugs (what Netscape famously called “Zarro Boogs”) before releasing a beta. If you don’t, you’ll waste a lot of time during the beta reading 200 emails about a bug that you already knew about. And you’ve just used up the time and goodwill of those 200 beta testers, so they may not bother telling you about the next bug they find, something you didn’t know about. Or the bug may stop them from trying other parts of the program that need some pounding. This seems self-evident, but almost every time I’ve been on a real product, everybody starts to think that releasing the beta on time is more important than releasing a Zero Known Bugs beta. (After all, it’s OK to have bugs in the beta, they say. And I agree: it is OK to have bugs in the beta, just not known bugs.)

I’ll keep posting The Story of CityDesk over the next few days; keep an eye out for frequent updates.


What is the Work of Dogs in this Country?

How naive were we?

We had assumed that Bezos was just reinvesting the profits; that’s why they weren’t showing up on the bottom line.

Last year, about this time, the first big dotcom failures started to hit the news. Boo.com. Toysmart.com. The Get Big Fast mentality was not working. Five hundred 31-year-olds in Dockers discovered that just copying Jeff Bezos wasn’t a business plan.

The past few weeks have felt oddly quiet at Fog Creek. We’re finishing up CityDesk. I’d like to tell you all about CityDesk, but that will have to wait. I need to tell you about dog food.

Dog food?

Last month Sara Corbett told us about the Lost Boys, Sudanese refugees between 8 and 18 years old separated from their families and forced on a thousand-mile march from Sudan to Ethiopia, back to Sudan, and on to Kenya. Half died on that trip, of hunger, thirst, and crocodiles. A few of them were rescued and delivered to places like Fargo, North Dakota, in the middle of winter. “Are there lions in this bush?” one asked, riding in a car from the airport to his new home.

Peter touched my shoulder. He was holding a can of Purina dog food. “Excuse me, Sara, but can you tell me what this is?” Behind him, the pet food was stacked practically floor to ceiling. “Um, that’s food for our dogs,” I answered, cringing at what that must sound like to a man who had spent the last eight years eating porridge. “Ah, I see,” Peter said, replacing the can on the shelf and appearing satisfied. He pushed his grocery cart a few more steps and then turned again to face me, looking quizzical. “Tell me,” he said, “what is the work of dogs in this country?” [New York Times Magazine, April 1, 2001]

Dogs. Yes, Peter. Fargo has enough food, even for dogs.

It’s been a depressing year.

Oh, it started out so amusing, we all piled into B2B and B2C and P2P like a happy family getting in the Suburban for a Sunday outing to the Krispy Kreme Donut Shop. But wait, that’s not even the amusing part, the amusing part was watching the worst business plans fail, as their stock went from 316 to 3/16. Take that, new economy blabbermouths! Ah, the schadenfreude. Ah, the glee, when once again, Wired Magazine proves that as soon as it puts something on the cover, that thing will be proven to be stupid and wrong within a few short months.

Ooh, sorry, did you buy the Wired Index?

And with this New Economy thing, Wired really blew it, because they should have known by then what a death kiss their cover was for any technology or company or meme, after years of touting smell-o-rama and doomed game companies and how PointCast was going to replace the web, no, wait, PointCast already replaced the web, in March 1997. But they tempted fate anyway, and they didn’t just put the New Economy on the cover; they devoted the whole goddamn issue to the New Economy, thus condemning the NASDAQ to plummet like a sheep learning to fly.

[Image: small goats]

But joy at others’ misfortune can only entertain us for so long. Now it’s just getting depressing, and I know the economy is not officially in a depression, but I’m depressed, not because so many stupid startups went away, but because the zeitgeist is depressing. And now we have to eat dog food instead of Krispy Kremes.

Which is what we’re doing, because life goes on. Even though everybody’s walking around with their chins glued to their chests, mourning the hours they devoted, and the health and love lives they ruined, for the sake of stock options in SockPuppet.com, life goes on. And the product development cycle must go on, and we at Fog Creek are getting towards the part in the product development cycle where you have to eat your own dog food. So for a while we’re Dog Creek Software.

Eating your own dog food is the quaint name that we in the computer industry give to the process of actually using your own product. I had forgotten how well it worked until, a month ago, I took home a build of CityDesk (thinking it was about 3 weeks from shipping) and tried to build a site with it.

Phew! There were a few bugs that literally made it impossible for me to proceed, so I had to fix those before I could even continue. All the testing we did, meticulously pulling down every menu and seeing if it worked right, didn’t uncover the showstoppers that made it impossible to do the very things the product was designed for. Trying to use the product, as a customer would, found these showstoppers in a minute.

And not just those. As I worked, not even exercising the features, just quietly trying to build a simple site, I found 45 bugs on one Sunday afternoon. And I am a lazy man; I couldn’t have spent more than 2 hours on this. I didn’t even try anything but the most basic functionality of the product.

Monday morning, when I got in to work, I gathered the team in the kitchen. I told them about the 45 bugs. (To be fair, many of these bugs weren’t actual defects but simply things that were not as convenient as they should have been). Then I suggested that everybody build at least one serious site using CityDesk to smoke out more bugs. That’s what is meant by eating your own dog food.

Here’s one example of the kind of things you find.

I expect that a lot of people will try to import existing web pages into CityDesk by copying and pasting HTML code. That works fine. But when I tried to import a real live page from the New York Times, I spent a whole day patiently editing the HTML, finding all the IMG links (referring to outside pictures), downloading the pictures from the web, importing those pictures into CityDesk, and adjusting the IMG links to refer to the internal pictures. It’s hard to believe, but one article on that web site contains about 65 IMG links referring to 35 different pictures, some of which are 1-pixel spacers that are very difficult to download using a web browser. And CityDesk has a funny compulsion to change the name of imported pictures internally into a canonical number, and it doesn’t even have a way to find out what that number is, so the long and the short of it was that it took me one full day to import a page into CityDesk.

It was getting a bit frustrating, so I went and weeded the garden for a while. (I don’t know what we’ll do to relieve stress when it’s all cleaned up. Thank God we can’t afford a landscaping service.) And that’s when it hit me. Hey, I’m a programmer! In the time it took me to import one page and adjust the pictures, I could write a subroutine that does it automatically! In fact, it probably took me less time to write the subroutine. Now importing a page takes about half a minute, instead of one day, and is basically error-free.
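If you’re curious what such a subroutine looks like, it boils down to four steps: fetch the page, find every IMG link, download each referenced picture once, and rewrite the links to point at the local copies. A rough sketch of the idea in Python (illustrative only, not the actual CityDesk code; a real importer would use an HTML parser rather than a regex):

    import os
    import re
    import urllib.request
    from urllib.parse import urljoin, urlparse

    def import_page(url, dest_dir="imported"):
        """Fetch one page, download every picture it references, and
        rewrite its IMG links to point at the local copies."""
        os.makedirs(dest_dir, exist_ok=True)
        html = urllib.request.urlopen(url).read().decode("utf-8", "replace")

        def localize(match):
            prefix, src, suffix = match.groups()
            absolute = urljoin(url, src)  # resolve relative links
            name = os.path.basename(urlparse(absolute).path) or "unnamed.img"
            local = os.path.join(dest_dir, name)
            if not os.path.exists(local):
                # 65 IMG links may point at only 35 pictures; fetch each once
                urllib.request.urlretrieve(absolute, local)
            return prefix + name + suffix

        # A naive pattern is enough for a sketch; it only handles
        # double-quoted src attributes.
        html = re.sub(r'(<img[^>]*\bsrc=")([^"]*)(")', localize, html,
                      flags=re.IGNORECASE)
        with open(os.path.join(dest_dir, "page.html"), "w",
                  encoding="utf-8") as out:
            out.write(html)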

Wow.

That’s why you eat your own dog food.

And when Michael started importing some sites himself, he found about 10 bugs I had baked in by mistake. For example, we found web sites that use complicated names for pictures that cannot be converted to file names when you import them because they contain question marks, which are legal in URLs but not legal in file names.
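The standard fix for that class of bug is to sanitize every imported name before using it as a file name. Something like this sketch (illustrative, not CityDesk’s actual rule):

    import re

    def safe_filename(name):
        """Replace characters that are legal in a URL but not in a
        Windows file name with underscores."""
        return re.sub(r'[?*:"<>|/\\]', "_", name)

    # e.g. safe_filename('chart.gif?width=1') -> 'chart.gif_width=1'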

Sometimes you download software and you just can’t believe how bad it is, or how hard it is to accomplish the very simple tasks that the software tries to accomplish. Chances are, it’s because the developers of the software don’t use it.

I have an even more amusing example of failing to eat dog food. Guess what email product was used internally at Juno Online Services? [If you’re just tuning in, I worked on the Juno client team for a few years].

Hmm, did you guess Juno? Since that was, uh, our product?

No. A couple of people, including the president, used Juno at home. The other 175 of us used Microsoft Outlook.

And for good reasons! The Juno client was just not such a great email client; for two years the only thing we worked on was better ways to show ads. A lot of us thought that if we had to use the product, we would have to make it better, if only to stop our own pain. The president was very insistent that we show popup ads at six different points in time, until he got home and got six popup ads, and said, “You know what? Maybe just two popups.”

AOL was signing up members at a furious rate, in part because it provided a better user experience than Juno, and we didn’t understand that, because we didn’t eat our own dog food. And we didn’t eat our own dog food, because it was disgusting, and management was sufficiently dysfunctional that we simply were not allowed to fix it, and at least make it tolerable to eat.

Anyway. CityDesk is starting to look a lot better. We’ve fixed all those bugs, found some more, and fixed them, too. We’re adding features we forgot about that became obviously necessary. And we’re getting closer to shipping! Hurrah! And thankfully, we no longer have to contend with 37 companies, each with $25 million in VC, competing against us by giving away their product for free in exchange for your agreeing to have a big advertisement tattooed on your forehead. In the post-new economy, everybody is trying to figure out how much they can get away with charging. There’s nothing wrong with the post-new economy, if you’re smart. But all the endless news about the “dot coma” says more about the lack of creativity of business press editors than anything else. Sorry, fuckedcompany.com, it was funny for a month or so; now it’s just pathetic. We’ll focus on improving our product, and we’ll focus on staying in business, by listening to our customers and eating our own dog food, instead of flying all over the country trying to raise more venture capital.

Spring in Cambridge

Even though the weather is in the 40s, and the wind chill is wintry, today I woke up and went outside and felt like spring was finally in the air. Maybe I started to notice this last Friday when I went back to the Fog Creek garden in a t-shirt and wasn’t actually cold. 

Spring conjures up a funny set of associations for me. I think about the fact that it’s still daylight when you go outside after dinner. I think about living in a nice old messy house with real wood floors, lots and lots of books, and flowers in the garden. I think about two of my favorite places in the spring, Berkeley and Boston, cities that revolve around learning and education and the unbridled enthusiasm of youth and the belief that anything is possible. As a first-year student I arrived at university without even a basic notion of the shape my life would take, but compared to the unspeakably horrible time I had doing compulsory military duty, any shape of life sounded utopian to me. I could be in theatre! I could write for the school newspaper! Politics! Hacking! Teaching! Art! I could become a competitive swimmer! Play piano! All possible!

Around 1993, I think, I finally got access to the World Wide Web at home. $35 a month for SLIP from Panix. One of the first things that captivated me was Travels with Samantha, by Philip Greenspun. The pictures were almost impossible to make out in 16 VGA colors; the connection was a 14.4 kbps modem; the screen was 640×480. But Greenspun’s style of personal journal was captivating. He told us how lonely it was to live in Los Alamos for three months. He told us how to build an online community. He told us about his $5000 Viking stove.

Greenspun has been a key inspiration to many of us. He started a company called arsDigita, which was really just another consulting company for the Internet, but it had two things that set it apart.

It had a personal voice. When you went to the home page, there was a happy little note about the new offices or the latest course offerings; the style was not boring corporate/marcomm happy-talk, it was Philip telling you that if you’re poor, there are some free services, but if you’re rich, head on over here and we’ll build you a nice service of your own, but bring a lot of money please, because the future’s so bright you gotta wear shades.

And it had education. The web was new and exciting and offered unlimited possibilities. Everybody wanted to learn about it and arsDigita would teach you, for free, starting with a book that cost about $15 more than it should because it was stuffed with completely irrelevant full-color photographs that seemed to be there solely to indulge the author’s photography hobby, but which really were so bright and colorful and optimistic that nobody could read that book (or any of the photography-filled pages of Greenspun’s websites) without becoming optimistic and excited about the future of the Internet, and when you did, there was a whole chapter on how to start a consulting company just like Philip and get rich off of the Internet, complete with suggested prices and detailed calculations of how you will make $250,000 a year and be able to buy $5000 Viking stoves and a beautiful condo in Harvard Square, in the spring, with flowers budding and optimistic young students everywhere practically popping out of their tanktops. Stream of consciousness, random, and interspersed with bogus snippets of half-baked Oracle SQL statements, but goddamn it, there was a personal voice there.

(I know, anybody at arsDigita will tell you “no, Joel! It’s open source that made arsDigita different.” Or community. Or content management software. Sorry, all that stuff bores me. What excited me was the way arsDigita had a personal voice on the Internet that made it possible to relate to it in a human way, which is what people want to do, since we’re humans.)

Fog Creek Software was inspired in no small part by arsDigita; the code name for the company was “PaxDigita” and we took inspiration from Greenspun’s tongue-in-cheek corporate slogan “We don’t have venture capital; we have money.” ArsDigita thought that it was enough to be profitable. 

There were two things I thought that arsDigita did wrong. First of all, they didn’t understand that consulting doesn’t scale so well; if you really want to make an impact, you need to license software. A consultant working on a project makes ten or twenty percent profit. If you license software, each marginal unit you sell is 100% profit. That’s why software companies command valuations something like 25 times those of consulting companies. Consulting is important and keeps us in touch with customers, and it funds the software development, but our goal is to be a software company. (ArsDigita seems to have learned this lesson. Allen Shaheen, the CEO, writes that “we are also considering developing and marketing additional software products using a different type of licensing arrangement. This investment is in addition to our investment in ACS as an open source product. We are considering this approach because it would allow us to accelerate our development of new and expanded features.” That’s happy talk for “the new features are going to cost you from now on.”)

The other thing I didn’t like about arsDigita was an almost religious aversion to Microsoft. There was a deep belief that arsDigita “eliminated risk” by never using a Microsoft product. The truth is, they just didn’t know anything about Microsoft products. Their religious bigotry struck me as just that: bigotry. In the words of Lazarus Long, “Belief gets in the way of learning.” If you say, “we know Unix better, so we get better results with Unix,” you’re a scientist. If you say, “we don’t use any Microsoft crap, it all sucks,” you’re just a fanatic, and it’s getting in the way of doing good work. (Greenspun has learned this lesson too; almost all of his bitterness toward Microsoft seems to have dissipated, and he finally admits, “I’m not motivated to fight a religious war for any particular database management system or Web server.” It’s nice to watch smart people learning.)

But there’s one thing that arsDigita did very, very, right, and that was the personal voice thing. The biggest thing I think about with Fog Creek is how to maintain that personal voice, and if we can do it, I’ll owe a big, big debt to Philip Greenspun.

For ArsDigita, the spring is over. ArsDigita found it impossible to hire engineers during the dotcom bubble without the promises of IPO riches, and took $38 million in venture capital. They grew from 5 to 250 people.

And then the excited exuberance gave way to a completely collapsed market for Internet consulting services. 

The VCs promptly installed boring old-school management who insisted on a happy-talk marcomm website with complete bullshit like “Fully integrated through a single data model to ensure consistency across all business processes, ACS empowers organizations to create Web services that maximize value for the end-user.”

Philip left.

The voice is gone.

ArsDigita will always remind me of those silly irrelevant pictures of pretty girls rollerblading and clam stands and boats in the Charles and sunsets in New Mexico and flowers and spring, and if you build it, they will come, and if you have a voice on your website, they will care. When I go to the website today and read that they “maximize value for the end-user,” I think of that dazzling, exciting kid you knew freshman year, who jumped from varsity squash to journalism to drama to campus politics, and was all fired up to change the world. But he got married, has 2 1/2 kids, took a job as an insurance agent, and wears a gray suit to work every day.

Maturity is kind of sad.

But spring is in the air, and that makes up for it.

Command and Conquer and the Herd of Coconuts

Recently I’ve been thinking a lot about what made Microsoft a great company to work for, and why working at Juno was so frustrating. At first I attributed the difference to the east coast/west coast thing, but I think it’s a bit more subtle than that.

One of the most important things that made Microsoft successful was Bill Gates’ devotion to hiring the best people. If you hire all A people, he said, they’ll also hire A people. But if you hire B people, they’ll hire the C people and then it’s all over. This was certainly true at Microsoft. There were huge branches of the Microsoft tree filled with great people; these businesses were perennially successful (Office, Windows, and the developer products). But there were also branches that were just not as successful: MSN failed again and again and again; Microsoft Money took forever to get going, and Microsoft Consulting Services is full of airheads. In each of these cases it’s pretty clear that a B leader built up a business unit full of C players and it just didn’t work.

So: hire smart people who get things done. (See The Guerrilla Guide to Interviewing.) It’s so important there’s a web page all about it, and it’s one of PaxDigita’s most important goals. And when it worked, it worked quite well at Microsoft.

When I first went to interview at Juno, I wasn’t sure that this was the company for me. The software looked a little bit goofy. The whole company smelled like a bunch of Wall Street Jocks who heard that the Internet was the New New Thing and they wanted in. So I called them up and said, “well, I’m not sure if I want to work there, but I’d like to learn more about your company.”

In typical arrogant manner, they proceeded to schedule me for a whole battery of interviews and made me prove that I was good enough to work there. At first I was a little bit miffed, because I hadn’t really decided that I was interested in the job in the first place. But then I realized that any company with a hiring process as rigorous as Juno’s has got to be full of smart people. This alone impressed me enough to take the job. Sure enough, Juno (and D. E. Shaw, the corporate parent) were full of geniuses. The receptionists had PhDs. The place was full of people like Dr. Eric Caplan, a former associate professor from U. Chicago who was writing technical documents for the intranet, for heaven’s sake. Talk about overqualified.

For a couple of years, I was very happy at Juno. It seemed like a dream job. (I even got interviewed for a website called dreamjobs.com talking about how dreamy Juno was.) One thing was a little bit strange: after about a year, I was promoted to be in charge of exactly 2 people on my little team of 6. In fact, the average manager at Juno had about 2 reports, a stunningly vertical organization by any standard.

The next strange thing I noticed was that a lot of people were leaving. In the wired, wired world of Silicon Alley that’s not uncommon. What was weird was how consistent their reason for leaving was: none of them felt any possibility of moving up in the company.

So what we had here was a company that was already as vertical and hierarchical as imaginable, with about 50 managers for a total of 150 employees, and people were complaining because there was no hope of moving up.

What’s going on here?

At Microsoft, for some reason, lots of people at the lowest rung of the hierarchy were completely satisfied with their jobs and were happy to go on doing them forever. Whereas at Juno, people in the same positions rapidly got frustrated and wanted to leave because they couldn’t get promoted.

I think the crucial difference here was in the corporate culture, specifically, in the way management worked.

At Microsoft, management was extremely hands-off. In general, everybody was given some area to own, and they owned it. Managers made no attempt to tell people what to do; in fact, they saw their role as running around, moving the furniture out of the way, so their reports could get things done.

There were some great examples of this. Managers always refused to resolve conflicts. Typically what would happen is that a designer would get into an argument with a developer over what a feature should look like. They would argue back and forth, discussing the issue for an hour, and eventually, failing to reach agreement, they would stomp into some manager’s office hoping for a resolution. Now you’ve got three people in the room: a designer, a developer, and a manager. Who’s the person who knows least about the problem? Obviously, it’s the manager — who was just hauled in at the last minute for Conflict Resolution. At Microsoft, the manager would usually refuse to make the decision. After all, they have the least information about the problem. The manager would generally force the designer and developer to work it out on their own, which, eventually, they did.

At Juno, quite the opposite was the case. Nobody at Juno owned anything; they just worked on it, and different layers of management happily stuck their fingers into every pie, giving orders left and right in a style I started calling hit and run management, because managers tended to pop up unannounced, give some silly order for exactly how they wanted something done, dammit, without giving any thought to the matter, and leave the room for everyone else to pick up the pieces. The most egregious example was the CEO and president of the company, who would regularly demand printouts of every screen, take them home, and edit them using a red pen. His edits were sometimes helpful (spelling and grammar corrections), but usually they demonstrated a complete lack of understanding of what went into the screens and why they said what they said. For months afterward, we would have meetings where people would say things like “Charles [the CEO] doesn’t like dropdown list boxes,” because of something he had edited without any thought, and that was supposed to end the discussion. You couldn’t argue against this fictional Charles because he wasn’t there; he didn’t participate in the design except for hit and run purposes. Ouch.

Hit and run management is but one symptom of what I would call Command and Control Management… something right out of the General Motors 1953 operations manual. In a particularly political company, it even becomes worse — more like Command and Conquer management. It’s completely inappropriate because it makes people unhappy, it causes the person with the least information to make the decisions, and it doesn’t allow a corporation to take advantage of all the talents of the people it hired. If, like Juno, the corporation had been careful only to hire the brightest, most talented people, then it squandered an incredible resource and made those talented people frustrated as all hell.

When is Command and Conquer management acceptable? Well, it might work where your company is a Herd of Coconuts — a bunch of underqualified morons who need herding. Perhaps something like the workfare corps that does the raking in Central Park. But software companies aren’t Herds of Coconuts, and Command and Conquer doesn’t cut it.

PaxDigita Culture

So this is why I’m concerned with creating the right culture of hands-off management at PaxDigita. In general:

  • everybody owns some area. When they own it, they own it. If a manager, or anybody else, wants to provide input into how that area is managed, they have to convince the owner. The owner has final say.
  • every decision is made by the person with the most information.
  • management is extremely flat. Ideally, managers just don’t have time to get their fingers in the pies of their reports. You may be interested to read about a GE plant in North Carolina that has 170 employees who all report directly to the plant manager.

As a result, we will build a democratic culture where everybody runs the company. On a day-to-day basis, PaxDigita people have no boss; they run themselves.