JOB POST DATE: Jan 10th, 2003
EMAIL RESUME TO: Brad Fitzpatrick (or questions)

=================
Short description
=================

Linux sysadmin to manage servers for LiveJournal.com, an online
community/"blogging" website with 800,000 users and millions of
page views/day. Also, management of the upcoming launch of a photo
hosting site, PicPix.com.

================
Necessary skills
================

-- mysql
   - replication
   - tuning
   - monitoring
-- apache
-- ntp
-- backup strategies
-- linux (debian, redhat)
-- monitoring/graphing tools (netsaint(nagios), mrtg, rrdtool, cricket, etc)
   will expect you to graph and monitor tons of things.
-- mail (we use postfix, but we're flexible if you prefer otherwise)
-- dns
-- nfs

==========================
Hardware/software involved
==========================

[as of Jan 2003, but new hardware is almost always on order]

2 load balancers (BIG-ip)
15 database servers (MySQL, mix of RedHat and Debian)
   -- some inactive, used only as occasional stand-ins
   -- some incredibly light, replicating small subset of tables
      for isolation. like directory search (uses InnoDB, not MyISAM)
      or mail (postfix)
20+ web servers (Debian (mostly netbooting), mod_perl and/or
   apache+mod_proxy+lingerd)
3 100 Mbps switches... 1 public, two internal (linked with gigabit fiber)
1 Gigabit switch (for backup network)
misc machines, disk arrays, ... mail machines

Offsite, for static content (cheaper bandwidth):
2 LVS boxes (locked down, "managed") + 2 of our machines (TUX + mod_perl)

===========
Programming
===========

Only enough to get your job done. We're not looking for a sysadmin +
programmer. If everything's running smoothly and all your work is done,
we want you to relax. Experience has shown us that most programmer +
sysadmins prefer to just program. So, we're looking for somebody who
perhaps knows how to program, but doesn't necessarily enjoy it.
:)

=====
Other
=====

-- good communication skills, *especially* if you're remote
-- regular status updates
-- proactive investigation into how to improve things.
-- proactive configuration of fail-over services. if it's not
   necessary until something else breaks, it's necessary right away.