We’ll all grow up eventually

I suppose it is fitting that this week I really felt like a grownup in my job.  After all, yesterday marked two years in my position.  I’ve gone from being a green, inexperienced admin to the point where my position is starting to become too small for me.  So what happened this week?  On Tuesday evening, I happened to have my e-mail client open when I received a message from our mail file server complaining that a hard drive had failed.  The timing was awful: I had just settled in with my Law & Order and my martini.  I drove in anyway and yanked the drive.  I put the shelf spare in, but that’s where the routine fell apart: our spare didn’t work!

My colleague Randy had gotten my voicemail by that point and had hopped online to help me out.  We tried a few times to get the spare recognized, but the controller never saw it.  At that point, we decided to throw the old disk back into the array to see what would happen.  The controller recognized it and said it was good.  Confused by this, we decided to add it back in as the spare.  Except we gave it the wrong command and ended up expanding the array by one disk, meaning we no longer had a hot spare.  Bad news.  Fortunately we have another array, so we just changed the spare in that array to be a global spare.  So that gives us one spare for 23 disks.

Fortunately, nothing bad came from this mess, and it may actually turn out to be a good opportunity.  On Wednesday morning, once we got everything settled, I started working on a plan.  The current disks in our first array are 400GB.  If we upgrade them all to 500GB, we get an extra terabyte of storage, and the disks in both arrays will be identical.  That means we can have two global spares, so we could lose three disks and still not have any data loss.  So what caused me to feel all grown up?  Randy and I figured out a plan and covered all the bases.  When we presented it to the rest of our team, they couldn’t come up with anything we hadn’t already considered.

Anyone can come up with a plan for things going well.  A true mark of the sysadmin is the ability to plan for massive failures.  And also paranoia and social ineptitude. 🙂
