What happens when the Offbeat Empire's server goes down

It was a happy Monday here at the Offbeat Empire. A wedding we posted last week got picked up first by The Mary Sue and then by Neatorama, and this morning ThinkGeek posted it on their Facebook wall. According to my beloved Chartbeat, traffic spiked at about 580 concurrent visits (meaning 580 people were viewing Offbeat Bride simultaneously) this morning, and then settled down into a high but not spiked day.

The Offbeat Empire has two servers: one for all the blogs, and then a separate smaller server for the Offbeat Bride Tribe. At exactly 1:02pm PST today, the blog server went down. I caught it the minute it happened (I was editing Stephanie's pitches for next week's Offbeat Mama posts), and here's where things went from there:

1:02: immediately fire up a chat window with my web host, Liquid Web.
1:03: send email to staff letting them know the site is down
1:05: get irritated that it's taking so long to validate my account with Liquid Web
1:08: server tech confirms that yes, the site is down
1:10: post on all sites' Facebook pages that yes: the blogs are down
1:12: server tech reboots server
1:20: start getting irritated — why is the site still down?! helloooooo?
1:25: ask for status update
1:30: server tech reports that the site rebooted, but then went back down
1:35: get too irritated to deal, and fire up separate IM window to get in touch with my preferred server tech, a wizard named TravisZ.
1:37: Call to cancel my 2pm bang trim appointment at Vain (OK NOW YOU KNOW IT'S SERIOUS)
1:40: TravisZ diagnoses
1:42: Field comments from freaking out editors who can't access sites
1:45: TravisZ reboots again and fiddles with some firewalls
1:50: Site goes back up! DANCING AND REJOICING!
1:55: Post "we're back up!" posts on all Facebook pages
2:00: Go to ThinkGeek to purchase special thank you gift for Travis (it feels appropriate, right? They're the ones who sent the traffic that made the site go down!)
2:05: Give positive feedback on Liquid Web support ticket that involves comparing TravisZ to an alligator wrestler.


We lost approximately 2000 visits in the 48 minutes of downtime, which is a very real and significant dent. For a business that ultimately is all about pageviews, every minute of downtime is basically dollars out of pocket. This Monday started awesome (TRAFFIC SPIKE YAY!) and has now reduced me to a bundle of nerves. And I still need to review Stephanie's pitches!

Back to the digital trenches…

  1. I saw that ThinkGeek post and I was like "Yay for the Empire!" And then I thought about it a second, and I was like "OSHIT FOR THE EMPIRE."

    • Our server is SUPER beefy, so it should have been able to handle the spike… we're dealing with another little bug with duplicating header images, so it may have been a collision of processes that took it down.

      The worst part of server downtime is how hard postmortems are. Most of the time no one knows why a server goes down, other than "the traffic load was high and the server got tired."

      Servers: they're just like us.

      • I heard it was a DNS "Band-Aid" that was put up to avert a complete shutdown of the internet, and that the remedies come from your server first and then from cleaning off your own computer. Amazing virus.

        • I'm not quite sure what you're referring to here, Caren — our server issues last week didn't have anything to do with DNS issues or viruses.
