BOFH ?

I’ve just come off of a very difficult working weekend. We consolidated approximately 2.5 terabytes of data from two file servers into one. For the uninitiated, that’s an awful lot of disk. Fortunately, one of our other sites loaned us a shelf of disks that greatly assisted the process, and with ndmpcopy (newly available in the latest OS release for these file servers) we were able to do the majority of the data moving with the old hardware still in place and the new hardware stacked, neatly, on a couple of nearby wire storage shelves. At some point Sunday night (it’s all a blur) we rearranged the hardware, placing the new equipment in the racks and stacking the one system with remaining data on the floor, cabled and running.

Unfortunately, not everything went according to plan. The old file servers just couldn’t move the data fast enough, so the move took MUCH longer than originally planned. Monday morning, and all the impatient users, found us with just over a half-terabyte of data still to move. Not all were upset at getting to go home early.

Fortunately, with the new hardware configuration, I was able to get the job finished by this morning, although I spent the rest of today fighting the expected “Where the hell did you put my data!?!” fires.

One thing I discovered through all this was that nobody reads my emails. In the past, I’d heard complaints that I wasn’t communicating enough with the users when I was planning changes. So for this change, I tried to ensure EVERYONE was informed. First, I worked with the management of each group I support to select a date (weekend) everyone was okay with. Then, every other day, for the week and a half before our scheduled shutdown time, I sent out emails to EVERYONE, informing them of the outage date & time, how long we were expecting to be down, what the impact to them would be, and that they needed to be logged off of their systems NO LATER THAN 5pm on Friday. In addition, I separately emailed the managers and their office administrators, asking that they forward the information on to the rest of their groups.

Friday afternoon rolls around and, confident that everyone would be logged off at 5pm, I push a job out to all workstations and servers to cause them to shut down and power off at 5:30pm. It worked beautifully. At 5:30, the computer room and surrounding offices became enveloped in calming and serene silence, which was all too soon disturbed by sounds of people running back and forth, screaming “The server crashed!” “My workstation just died!” “AaaaIIIIeeeeee!!” “What happened to my $#%@ job!?!” including the poor soul off in a distant cubicle who simply screamed “Noooooooo!!”.

After explaining for the third time what was happening, and that no, they wouldn’t be able to do anything until Monday (or so we thought at the time), I started asking why they were still logged in. Didn’t they get my emails? Nearly to a man, they all responded with “Yeah, but I rarely read those.”

Interestingly, management had left early.
