SysAdmin Week

After a bit of a scare discovering that a few of our critical files were not getting backed up, and with various system administration problems starting to cross from merely annoying into downright emergencies, I am going to spend a few days focused on improving our network infrastructure.

All of our backups are done to hard drives, not tapes. It’s not that much more expensive than tape, and it’s a lot more convenient. For example, all our workstations and laptops are backed up using Veritas NetBackup Pro, which creates hard-drive-based backups on a server. Anyone can browse the last 5 versions of any file on their hard drive and instantly restore it; if a complete system is lost, NetBackup does a “bare metal restore”; and, the part I like best, if two people have the same file, it is only stored once. This saves gigs and gigs of space because almost every machine here has the same OS files, the same development environment, the same full text of MSDN, etc.

Servers are backed up over the Internet using Dantz Retrospect, also to a hard drive, at a different location. Retrospect has the advantage of supporting “open file backup” on SQL Server databases, backing them up while they’re running. As far as I can tell, this relies on an underlying feature of Windows 2000 which allows you to make virtually instantaneous, atomic copies of any open file. (Windows does this using “Copy on Write”: the file is simply marked as being “copied,” and the copy itself doesn’t take place until one copy is written to, and then only on a sector-by-sector basis.)

Dantz has the disadvantage of some architectural decisions that reflect its Macintosh heritage and do not really make sense. For example, rather than the traditional Windows server model of having two apps (an invisible service and a management console which controls that service), there’s just one app. This means you can only run one management console, and if you lose it (e.g. someone else is running it in a different session) you can’t get in, requiring drastic process killing or rebooting. And the number of new concepts you need to learn to set up simple server backups is astonishing. It took me way too long to get things set up, and then it took several weeks of occasional tinkering to get it to work. Even then it seems to get flaky and decide it doesn’t want to back up and doesn’t want to tell anyone that it doesn’t want to back up, so I have a weekly scheduled task to kick the sucker. Somewhat frustrating, but I have no experience with other server backup products and suspect the others are just as bad.
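
To make the “stored only once” idea from the workstation backups concrete, here is a minimal Python sketch of single-instance storage. It is not how NetBackup is actually implemented (I don’t know its internals); it just assumes the simplest approach, which is to key every file by a hash of its contents. The class name, machine names, and file paths are all made up for illustration.

```python
import hashlib


class SingleInstanceStore:
    """Toy backup store that keeps each distinct file content only once.

    Hypothetical illustration only; not any real product's design.
    """

    def __init__(self):
        self.blobs = {}    # content hash -> file bytes (stored once)
        self.catalog = {}  # (machine, path) -> content hash

    def backup(self, machine, path, data):
        digest = hashlib.sha256(data).hexdigest()
        # Keep the bytes only if no machine has backed up this content yet.
        self.blobs.setdefault(digest, data)
        self.catalog[(machine, path)] = digest

    def restore(self, machine, path):
        # Look up which content this machine's file pointed at.
        return self.blobs[self.catalog[(machine, path)]]

    def stored_bytes(self):
        return sum(len(blob) for blob in self.blobs.values())


# Two machines with an identical OS file, plus one file unique to one machine.
store = SingleInstanceStore()
os_file = b"identical OS file contents" * 1000
unique_file = b"unique to this machine"
store.backup("workstation-1", r"C:\WINNT\system32\example.dll", os_file)
store.backup("workstation-2", r"C:\WINNT\system32\example.dll", os_file)
store.backup("workstation-2", r"C:\notes.txt", unique_file)
```

Three files were backed up, but only two blobs are stored, and either workstation can still restore its own copy of the shared file. With dozens of nearly identical machines, that is where the gigs and gigs of savings would come from.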

I just woke up to the fact that we were paying about $6/GB for disk storage on Dell SCSI RAID arrays. For backup media I don’t need SCSI and I don’t need RAID, so I’m going to try a LaCie Big Disk drive, which costs about $1.20/GB, connected to the backup server over USB 2.0.


So far there are 136 people registered at Meetup.com. London, Toronto, and Dublin have passed the threshold of 5 members for meetings to actually be held. I was thinking it might be fun to pick the city with the most people on this list for my next vacation.


About the author.

In 2000 I co-founded Fog Creek Software, where we created lots of cool things like the FogBugz bug tracker, Trello, and Glitch. I also worked with Jeff Atwood to create Stack Overflow and served as CEO of Stack Overflow from 2010 to 2019. Today I serve as the chairman of the board for Stack Overflow, Glitch, and HASH.