[FQuest Alert] HUGO Server - Emergency Drive Replacement


sheila
02-05-2008, 10:43 PM
HUGO is suffering from a failing hard drive.

A technician has been dispatched to the data center to perform a hard drive replacement. We are shutting down the server as a preventative measure to preserve the data on the drives.

Additional information will be posted as it becomes available.

Our apologies for this inconvenience,

Sheila

tknterry
02-05-2008, 11:08 PM
That explains why my website is currently inaccessible. :sad:

Terra
02-05-2008, 11:23 PM
We are having a difficult time trying to get the server to boot...

No ETA at this time... :(

Terra
02-05-2008, 11:44 PM
It is official, HUGO has suffered a dual drive failure, and will need to be rebuilt from last night's backup...

SDA was the drive that RAID kicked out, and SDB was still running fine... Unfortunately, upon reboot, it was found that SDB had media failures in the critical boot sections of the drive preventing it from being used any further...

This restoration process could take a few hours to complete, and will be the 2nd time in our history that we've needed to perform this merciless act... :(

tknterry
02-05-2008, 11:56 PM
Is there some way to detect impending hard drive failures a bit earlier to prevent this from occurring, which in turn causes my web site to be down for several hours? I was just about to update my website. I guess it's a good thing I didn't since you are using last night's backup and so the updates would have been lost.

Randall
02-06-2008, 12:08 AM
Is there some way to detect impending hard drive failures a bit earlier to prevent this from occurring, which in turn causes my web site to be down for several hours?
Generally they do -- which is why this is only the second time it's happened. I forget which server went belly-up the last time.

Will this have an impact on IMAP accounts on Hugo?

Randall

Jeff
02-06-2008, 12:13 AM
If you search this forum for drive replacement you'll see how many times the raid system worked right to detect problems and replace a drive without any downtime other than the time to swap the drive itself.

sheila
02-06-2008, 12:13 AM
The last time this happened was in May 2005 on SAMSON (http://www.aota.net/forums/showthread.php?postid=132221#post132221). That's a total of 2 times since June 1998. Generally, yes, we do detect drive failures in advance and replace drives often with only a few minutes downtime.

The only impact this will have on email services, whether POP, IMAP or SMTP, is that the services are temporarily unavailable to you, as authentication goes through the CS (Community Server), in this case HUGO. Since HUGO is down, you can't authenticate, so you can't access the mail services.

Once HUGO is back up, all email will be available again. None will be lost or refused during this time, as email is received and stored on other servers.

Kevin
02-06-2008, 12:21 AM
It is sometimes possible to detect disks before they fail. In many cases a disk will start throwing out SMART error messages or even have a few IO errors before it gets bad enough for the RAID engine to kick it out.

Unfortunately there are also cases where a disk fails completely and suddenly.
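
As a rough illustration of the kind of early warning described above, the sketch below scans a SMART attribute table for nonzero counts on a few attributes that are commonly watched on failing disks. The sample text is invented, laid out the way `smartctl -A` prints ATA attributes; the watched-attribute list and the "anything above zero" rule are assumptions for illustration, not FutureQuest's actual monitoring (and SCSI disks report health in a different format). A clean report does not rule out a sudden failure like the one in this thread.

```python
# Hypothetical sample in the column layout printed by `smartctl -A`;
# the values are invented for illustration.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
  9 Power_On_Hours          0x0032   072   072   000    Old_age   Always       -       24611
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
"""

# Assumption: attributes whose raw count should normally stay at zero.
WATCHED = {"Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable"}


def smart_warnings(report: str) -> dict[str, int]:
    """Return the watched attributes whose raw value is above zero."""
    warnings = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns and start with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        name, raw = fields[1], fields[9]
        if name in WATCHED and raw.isdigit() and int(raw) > 0:
            warnings[name] = int(raw)
    return warnings


if __name__ == "__main__":
    for name, count in smart_warnings(SAMPLE).items():
        print(f"warning: {name} = {count}")
```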

In this case the first disk failed completely and suddenly. The second disk also started throwing out IO errors within a few minutes. We had hoped that the errors on the second disk were caused by some kind of SCSI bus issue created by the really dead first disk and that it would just start working again when the first disk was replaced.

When I got here I could hear the first disk grinding over the noise of the HVAC, all the other servers, and even my music-playing headphones. There was no bringing it back.

When I replaced that disk and attempted to boot up with the second disk and a new one the boot failed. The symptoms kept shifting back and forth between the old second disk not being detected at all or its partition table not being readable. Either way it is essentially dead.

The HUGO server is currently sitting on the floor with its lid open because it has no internal CDROM drive and I had to add one to have a bootable device. It is currently booted up in the wonderful recovery tool known as Knoppix Linux and Terra is logged in from his house doing the restore from backups as I am typing this.

Randall
02-06-2008, 12:37 AM
The HUGO server is currently sitting on the floor with its lid open because it has no internal CDROM drive and I had to add one to have a bootable device.
Ooo, ooo -- which camera (http://www.visionquest.net/Video.php)?

Randall

tknterry
02-06-2008, 12:37 AM
Kevin, thanks for the comprehensive and quick reply explaining your situation. I appreciate your expeditious work to solve this rare failure of dual drives. It's also great to know my web site is now sitting on the floor of the data center with Hugo's lid open ;). It's great to know it's being restored as we speak. Thank you everyone at FQ for an efficient resolution.

tknterry
02-06-2008, 12:40 AM
Ooo, ooo -- which camera (http://www.visionquest.net/Video.php)?
Randall

I think it is Surveillance Camera #22. There is a guy sitting there working on something.
Kevin, wave if it's you.

Kevin
02-06-2008, 12:45 AM
Camera #22 is me sitting at one of the camera computers typing in here because the camera servers are the only boxes at the data center with GUI browser capability.

HUGO is just off the bottom of camera #21 and behind the open door on camera #23

Randall
02-06-2008, 01:10 AM
HUGO is just off the bottom of camera #21 and behind the open door on camera #23
Aww. :sad:

I saw Kevin walk from 22 to 23...

Randall

Kevin
02-06-2008, 01:11 AM
Aww. :sad:

I saw Kevin walk from 22 to 23...

Randall

And back again just to see what you typed :P

Terra
02-06-2008, 01:45 AM
A couple things to be aware of:
1) Any prior STATS history is lost as we don't back that up due to the enormous disk space requirements
2) Any prior web logs, same issue as #1

What we guarantee with backups is server critical files and your data files, anything ephemeral is not backed up...

The good news is, HUGO will be defragged... :)

tknterry
02-06-2008, 01:51 AM
A couple things to be aware of:
1) Any prior STATS history is lost as we don't back that up due to the enormous disk space requirements
2) Any prior web logs, same issue as #1
What we guarantee with backups is server critical files and your data files, anything ephemeral is not backed up...
The good news is, HUGO will be defragged... :)
I haven't looked at the stats very often anyway.
Defragging is always a good thing.
I saw Kevin hard at work. What a way to spend a Tuesday night.
Thanks Terra

Wassercrats
02-06-2008, 02:58 AM
They make utilities that defrag a bit at a time in the background. It doesn't sound like a good idea to keep a server down for it. Maybe Kevin should have a cot in the datacenter so he doesn't have to sit there for hours waiting for the defragging to finish.

There's probably more to it, but that's how I picture it.

Jeff
02-06-2008, 03:00 AM
I think he means it will be defragged because all the data is being moved to the drive freshly ...

Kevin
02-06-2008, 03:00 AM
Actually, I wouldn't mind a cot right now. Unfortunately there is so much white noise here I doubt my pager would wake me up when the restore finishes and it is time to put the server back together.

Jeff
02-06-2008, 03:03 AM
...sleeping on a jet plane...

(sorry, that's probably not helpful!)

Kevin
02-06-2008, 03:06 AM
...sleeping on a jet plane...

(sorry, that's probably not helpful!)
Not my kind of tunes. I currently have Crimson from Edge of Sanity playing. For some reason playing excessively long songs makes the time pass faster for me. It also makes my iPod battery last longer. This one is 40 minutes long :yeah:

Terra
02-06-2008, 03:09 AM
I think he means it will be defragged because all the data is being moved to the drive freshly ...
That is correct...

Wassercrats
02-06-2008, 03:11 AM
Yeah. Still a wait either way though. I'd hate a job like that. I need my sleep.

What's on the back of Kevin's shirt?

Kevin
02-06-2008, 03:13 AM
I am actually glad it wasn't a little bit earlier. I had just finished eating when the first alert went off. If I hadn't had time to eat I would be really hungry by now.

Terra
02-06-2008, 03:14 AM
There really aren't any defrag utils for Linux, the filesystems themselves do a pretty good job at minimizing it...

Wassercrats
02-06-2008, 03:24 AM
What's on the back of Kevin's shirt?
Guess he got self conscious. He put on his jacket.

I wanted to try Linux mainly for its file systems, but the one thing I didn't like was that it sounded like you can't stop certain file operations. So if I write a bad file transfer script, I guess I'd have to let it finish its bad stuff, or if I'm transferring a 10 GB file by accident, I'd have to wait. I think.

Kevin
02-06-2008, 03:29 AM
Guess he got self conscious. He put on his jacket.
It is a Biketoberfest 2004 shirt with a wolf on the back. I put my jacket on because the computer room gets cold if you are sitting still.

I wanted to try Linux mainly for its file systems, but the one thing I didn't like was that it sounded like you can't stop certain file operations. So if I write a bad file transfer script, I guess I'd have to let it finish its bad stuff, or if I'm transferring a 10 GB file by accident, I'd have to wait. I think.

I am not sure what you mean by that. A process can be killed. Once that happens the file would either be truncated and left as is or it would go away depending on how it was being written. There is a case where if you simply delete an open file the disk space will stay allocated until the file is closed.
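
The last point, that deleting an open file leaves its space allocated until the file is closed, can be seen with a few lines of Python. This is a minimal sketch of POSIX behaviour (Linux included); the helper name is made up for illustration, and Windows normally refuses to delete a file that is still open.

```python
import os
import tempfile


def read_after_unlink(payload: bytes) -> tuple[bool, int, bytes]:
    """Delete a file while it is still open, then read it via the open handle.

    Returns (path_still_exists, link_count, data_read_back).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, payload)
        os.unlink(path)                    # the name is gone from the directory
        exists = os.path.exists(path)
        links = os.fstat(fd).st_nlink      # no directory entry points at it now
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, len(payload))   # the data blocks are still readable
    finally:
        os.close(fd)                       # only now can the space be released
    return exists, links, data


if __name__ == "__main__":
    print(read_after_unlink(b"still here"))
```

Killing the process that holds the file open closes the handle, which is when the disk space actually comes back.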

DogAndPony
02-06-2008, 03:37 AM
I'm so easily amused. I have Cameras 21 and 23 in two separate Firefox tabs. When Kevin walks around, I can cut between them by clicking tabs. :kewl:

Hi, Kev!

:wave:

I only have one client's domain on Hugo AFAIK, and I don't think it's very active. Lucky me... :safegrin:

I think I'll just sit here and watch you catch a cold.

:ytpopcrn:

DogAndPony
02-06-2008, 03:53 AM
Okay, I've had enough popcorn.

Thanks for burning the 3:51am oil, Kevin!

Time for me to...

:sleep:

Terra
02-06-2008, 04:07 AM
Uggggh...

Slothing through 47000+ temp files in: x....../cgi-bin/partsSQL/admin/tmp/*

And all are in a single directory - double nasty negative hit... :(

You know who you are, please clean those up and keep temporary files to a minimum...

Things like this can easily add an extra 30+ minutes to a restore process just for that *ONE* directory!
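
For site owners wondering how to keep a temp directory from growing like that, here is a minimal sketch of an age-based cleanup. The function name, the age limit, and the idea of running it periodically are all assumptions for illustration; point it at your own script's temp directory and do a dry run first.

```python
import os
import time


def purge_old_files(directory: str, max_age_days: float, dry_run: bool = True) -> list[str]:
    """Select regular files in `directory` not modified for `max_age_days` days.

    Returns the sorted names selected. With dry_run=True nothing is deleted.
    """
    cutoff = time.time() - max_age_days * 86400
    selected = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip subdirectories and symlinks; only plain files are candidates.
            if not entry.is_file(follow_symlinks=False):
                continue
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                selected.append(entry.name)
    if not dry_run:
        for name in selected:
            os.unlink(os.path.join(directory, name))
    return sorted(selected)
```

Calling `purge_old_files(some_dir, 7)` only lists what would go; pass `dry_run=False` once the list looks right.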

Wassercrats
02-06-2008, 04:12 AM
I am not sure what you mean by that. A process can be killed. Once that happens the file would either be truncated and left as is or it would go away depending on how it was being written. There is a case where if you simply delete an open file the disk space will stay allocated until the file is closed.
I was probably thinking of this (http://linuxplanet.com/linuxplanet/tutorials/2926/3/).

If the crash happens during the actual disk update, you still have the journal entry showing what was supposed to have happened. So when the system reboots, it can simply replay the journal entries and complete the update that was interrupted, or it can back out a partially completed update to restore the file's previous state.

When I originally read that, I think I was imagining trying to abort something by turning off the computer, which would just make it continue when I turned it back on. But now it sounds like I'd have a choice of whether it continues.

DogAndPony
02-06-2008, 04:14 AM
:ytcoffee:

:shocked:

:ytrubeye:

:ytguido:

There oughtta be a clause...

:sprint:

Kevin
02-06-2008, 04:15 AM
Nah, it doesn't really work that way. The point of journaling is to get the file system into a consistent state quickly during bootup. If you were in the middle of writing a file when the computer was shut off then that file would be incomplete but it would be there. Without journaling, a file being written to would often become an "orphan", meaning that its name and location were lost, leaving it as a numbered file in a directory called lost+found.

Wassercrats
02-06-2008, 04:30 AM
Uggggh...

Slothing through 47000+ temp files...
Try saving 89 directories with 300,000 files in each, on NTFS. That's why I wanted to try ReiserFS. XP hardly reclaimed any disk space when I deleted them (took days). I assume my disk got corrupted either from saving or deleting the files.

Terra
02-06-2008, 04:43 AM
All files have been restored, I am now finalizing the server...

manfred
02-06-2008, 05:11 AM
First of all, thank you for the very good job!

I have asked this question before (http://www.aota.net/forums/showthread.php?t=23554) but I didn't get an answer ;)

Is it possible to add the servername or a "monitoring smiley" in the "Active Services List" - Overview? That would be helpful to determine if a package is affected?

Pleeeeeease :flowrpwr:

Manfred

tknterry
02-06-2008, 05:12 AM
All files have been restored, I am now finalizing the server...
Thank you Terra. Mission accomplished :yeah: My site is alive once more.

Terra
02-06-2008, 05:13 AM
Not just yet, both Kevin and I have a few things to do...

HUGO is down for racking...

tknterry
02-06-2008, 05:21 AM
Not just yet, both Kevin and I have a few things to do...
HUGO is down for racking...
Oh well, it was up just long enough for me to update my page. So for me it was mission accomplished :)

Terra
02-06-2008, 05:21 AM
HUGO is up, however a couple things are still needing repair:
1) bulkqmail for mailing lists
2) ApacheSSL engines (SCNC is online)

Everything else should be OK...

Terra
02-06-2008, 05:25 AM
bulkqmail is back online...

Terra
02-06-2008, 06:52 AM
ApacheSSL is fully operational

HUGO is now back in full production....

Please let us know if you encounter any problems with the restoration...

Our sincerest apologies for any inconvenience this dual drive failure has caused!

Bob
02-06-2008, 08:08 AM
First of all, thank you for the very good job!

I have asked this question before (http://www.aota.net/forums/showthread.php?t=23554) but I didn't get an answer ;)

Is it possible to add the servername or a "monitoring smiley" in the "Active Services List" - Overview? That would be helpful to determine if a package is affected?

Pleeeeeease :flowrpwr:

Manfred

A similar request to this is located here (http://www.aota.net/forums/showthread.php?postid=145546#post145546) and I have added your comments to the Feature Request that was generated from that post.

Thanks,
Bob

pvechi
02-06-2008, 09:01 AM
Hello,

I am wondering if you actually restored the server with the latest files. I am missing a lot of work that I uploaded onto the server and it's not there.

Please advise

++Tx
Peter

manfred
02-06-2008, 09:12 AM
A similar request to this is located here (http://www.aota.net/forums/showthread.php?postid=145546#post145546) and I have added your comments to the Feature Request that was generated from that post.

Thanks,
Bob

Thank you :clapper:

Bob
02-06-2008, 09:14 AM
Peter,

The backup that was used would have been from a little over 24 hours ago, approximately between 2 AM ET Feb 5, 2008 - 7 AM ET Feb 5, 2008.

-Bob

pvechi
02-06-2008, 09:31 AM
Ouch!

pvechi
02-06-2008, 09:35 AM
Is that the most recent copy of the files on that server? Is there any way of retrieving files from the failed hard disks?

SteveYoung
02-06-2008, 09:52 AM
We were informed by a user that some of the audio files we posted yesterday were not working. All we did to fix the problem was upload the files again.

Pvechi, you should be able to upload the files you did yesterday and it will resolve the problem. Unless I am missing something like data that was generated on the server.

pvechi
02-06-2008, 10:13 AM
I had actually edited some files over the server using the File Manager in CNC and uploaded some photos. The files that I edited are the concern. Is there any way we can get those from the fried hard disk?

Randall
02-06-2008, 10:13 AM
Oh well, it was up just long enough for me to update my page. So for me it was mission accomplished :)
Cheater! :rasberry:

Nothing seems to be missing, aside from yesterday's GPS data. :yeah:

Randall

hobbes
02-06-2008, 10:28 AM
I had actually edited some files over the server using the File Manager in CNC and uploaded some photos. The files that I edited are the concern. Is there any way we can get those from the fried hard disk?

This is probably a worthwhile question for FQ ... Would FQ allow a client to cover the cost to recover files from a failed hard drive through a 3rd-party service?

SteveYoung
02-06-2008, 10:40 AM
I had actually edited some files over the server using the File Manager in CNC and uploaded some photos.

I see, I too will sometimes (more than I should) edit files using WinSCP on the server, with the intent of making a backup at some point. I fear your data would be lost, but FQ will respond shortly. The photos you could re-upload.

We all have become very used to how rarely FQ goes down, and when it does happen it generally (not this time) is not catastrophic in nature.

I slept while our website (HUGO) oldhomeday.com was down, thanks FQ.

Kevin
02-06-2008, 02:23 PM
This is probably a worthwhile question for FQ ... Would FQ allow a client to cover the cost to recover files from a failed hard drive through a 3rd-party service?

Unfortunately you are probably talking about thousands of dollars of work there. Those data recovery places make good money. If you are really serious about it we would consider it. Of course we would have to receive the repaired disk to protect the data owned by other site owners.

I hate to sound preachy but you really should always maintain a backup of your entire site somewhere outside of FutureQuest. In fact I would go as far as to say you should have a backup copy of your site running somewhere else for development purposes. We do the best we can but we only run backups once per day. The backups are a significant performance hit to the servers and it is very doubtful that we could run a backup in the middle of the day without triggering overload conditions.
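
For anyone who wants to act on that advice, a minimal sketch of a dated archive follows. It assumes you already have a local copy of your site (for example, downloaded over FTP or SCP) in a directory; the function name, the `site-` file prefix, and the destination folder are hypothetical.

```python
import os
import tarfile
import time


def archive_site(site_dir: str, dest_dir: str) -> str:
    """Write a timestamped .tar.gz of `site_dir` into `dest_dir`; return its path."""
    os.makedirs(dest_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive_path = os.path.join(dest_dir, f"site-{stamp}.tar.gz")
    with tarfile.open(archive_path, "w:gz") as tar:
        # Store everything under a fixed top-level folder name inside the archive.
        tar.add(site_dir, arcname="site")
    return archive_path
```

Running this after each editing session, and keeping the archives somewhere other than the web server, covers the gap between nightly backups for files you changed yourself.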

We run RAID1 mirroring to protect against disk failures and we run nightly backups to protect against rare catastrophic hardware failures like this one and file corruption.

This problem started at about 10:40pm. The backups of the shared web servers start running at around 2am. That means it was almost time for another backup to run which means we were about as far as we get from the previous backup. In my experience this is a common disadvantage of backups. Disks like to fail during the extra stress of a backup which of course invalidates that backup but for some reason they also tend to break shortly before a backup. That must be one of those Murphy's law things :dunno:

Terra
02-06-2008, 02:42 PM
We were informed by a user that some of the audio files we posted yesterday were not working. All we did to fix the problem was upload the files again.
That could have happened if the file was halfway uploaded when the backup daemon came through and copied the file... There is no easy way to prevent this from happening on a *live* server nor does the backup procedure know that a file is in transit... This is still an issue that is being researched, but I have yet to see a logical way to handle this condition without introducing kernel and VFS races that could open up security holes and/or cause problems for the application that is writing to the open file due to filesystem caching buffers...
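
From the site owner's side, one way to spot a file that was captured mid-upload is to compare a checksum of your local original against a copy downloaded back from the server after a restore. This is only a sketch of that check, with hypothetical function names; it does nothing about the server-side race described above.

```python
import hashlib


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(local_path: str, downloaded_path: str) -> bool:
    """True if both files have identical contents."""
    return sha256_of(local_path) == sha256_of(downloaded_path)
```

If `same_content` returns False for a restored file, re-uploading the local original is the fix.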

Ouch!
I don't mean to sound obtuse, but the files we restored were backed up on 2/5/2008 between the hours of: 04:04:25 to 05:57:07 and they were not even 24 hours old yet...

The hard drives tanked before the next night's backup procedure had a chance to run... :(
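
To put numbers on that, here is the arithmetic using the backup window quoted above and Kevin's "about 10:40pm" start time for the failure. It assumes both are in the same time zone, which the thread does not state.

```python
from datetime import datetime, timedelta

# Times taken from this thread; the shared time zone is an assumption.
BACKUP_START = datetime(2008, 2, 5, 4, 4, 25)
BACKUP_END = datetime(2008, 2, 5, 5, 57, 7)
FAILURE = datetime(2008, 2, 5, 22, 40, 0)  # "about 10:40pm"


def exposure_window(backup_start: datetime, backup_end: datetime,
                    failure: datetime) -> tuple[timedelta, timedelta]:
    """Return (shortest, longest) span of changes not covered by the backup.

    A file copied at the very end of the backup run is exposed for the
    shortest span; one copied at the very start, for the longest.
    """
    return failure - backup_end, failure - backup_start


shortest, longest = exposure_window(BACKUP_START, BACKUP_END, FAILURE)
print(shortest)  # 16:42:53
print(longest)   # 18:35:35
```

So anything changed on the server during roughly the 17 to 18.5 hours before the failure was not in the restored backup.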

On the flipside, I have read *many* horror stories of hosts that have suffered the same problem, and the clients' data is missing for two or more weeks waiting for the hard drives to be recovered by a data recovery service...

Running client data backups is not only a liability for us, but is also an enormous expense - however we do it because we feel it is the right thing to do... Most hosts approach it from the standpoint that client data backups is the client's responsibility...

pvechi, we did the best we could do in light of this catastrophic failure, and what we restored was the best we had... If we could have lifted the data off the SDB drive, we would have... That is what initially took us so long because we tried every trick in the book to get the drive talking again... SDA was just plain toast and was screeching from a dropped head... The time spent, trying to get SDB going could have ended up costing us a bundle if we exceeded our uptime guarantees... Nevertheless, we tried anyway and that added around an extra hour to the downtime...

Would FQ allow a client to cover the cost to recover files from a failed hard drive through a 3rd-party service?
I wouldn't have a problem with it, as long as:
1) the recovery service is reputable and could be trusted
2) the client is willing to wait the time for the recovery to be done
3) the client is willing to foot the costs which can range between $5,000 to $12,000 for a server drive of this type

Overall, I don't feel that expense is needed, because we make a best effort to always have a backup that is not older than 24 hours... But due to the lawsuit happy nature of America, I must state the DISCLAIMER that WE DO NOT GUARANTEE CLIENT DATA BACKUPS, we only provide client data backups as a best effort convenience for the clients... In our 10 years of operation, we have only had to retrieve data from backups twice, with SAMSON being the other server... All other times, we have been able to bend a failed RAID drive to our will and retrieve the data within our recovery window...

All in all, the backups are there for when catastrophe strikes, and when bad things happen, we like to give the clients a warm fuzzy feeling that we can step up to the plate and recover from the worst of conditions... That is why we cost a little more than bargain basement hosts, as our backside infrastructure (all the stuff you never really see) is designed to be as fault tolerant as possible within the bounds of technology... Last night's catastrophic ordeal with HUGO should hopefully make that visible to our site owners...

No alligators or crocodiles were harmed during last night's HUGO restoration... :smile:

SteveYoung
02-06-2008, 02:50 PM
...always have a backup that is not older than 24 hours...
The audio files we uploaded (and lost) were uploaded after the backup and before the crash. We uploaded them again and all was well. We all do things to save time like editing files on the server, and not having a backup.

This was a great lesson for me and a conversation I will have with others on how they edit and backup files.

Last night's catastrophic ordeal with HUGO should hopefully make that visible to our site owners...

I was sleeping, you were not, thank you for that! I got my 10 bucks worth of support for this month, that is for sure!

tknterry
02-06-2008, 02:55 PM
...

All in all, the backups are there for when catastrophe strikes, and when bad things happen, we like to give the clients a warm fuzzy feeling that we can step up to the plate and recover from the worst of conditions... That is why we cost a little more than bargain basement hosts, as our backside infrastructure (all the stuff you never really see) is designed to be as fault tolerant as possible within the bounds of technology... Last night's catastrophic ordeal with HUGO should hopefully make that visible to our site owners...

No alligators or crocodiles were harmed during last night's HUGO restoration... :smile:
Hmmm, interesting, in fact I was feeling warm and fuzzy last night...but maybe that was the wine. This is the first time I've encountered significant downtime for my website and I was impressed by how you guys and gals at FQ handled it. It is good to know that my web site that I have worked very hard on is in good hands at FutureQuest. I'm also glad you had the welfare of the gators and crocs in mind.

hobbes
02-06-2008, 03:09 PM
Terra / Kev - fully agree about client backups, and we maintain daily off-site backups of our own clients' web/db data maintained at FQ. The question was geared more towards the feasibility of such a recovery if an urgent need arose. Thx.

Wassercrats
02-06-2008, 04:15 PM
I had actually edited some files over the server using the File Manager in CNC and uploaded some photos. The files that I edited are the concern. Is there any way we can get those from the fried hard disk?
They might still be in your browser's cache if you didn't view the old version after the update, but you probably did. That's why I suggested on Microsoft's newsgroup that there should be two caches of webpages. Currently, if you see a problem with a webpage, the browser's cache will have the same problem. Or maybe instead of two caches, the browser shouldn't write over the old cache until you move off the webpage, so if you see a problem you'd be able to open a new browser to view the cache and it will be the old, good version of the webpage. Unless it already works like that.

There's also a chance that a search engine cached the updated page.

Wassercrats
02-06-2008, 04:19 PM
Would FQ allow a client to cover the cost to recover files from a failed hard drive through a 3rd-party service?
I wouldn't have a problem with it...

Then you should probably have a system set up where you keep the crashed drive for a certain amount of time and it should be posted somewhere that you do this.

SneakyDave
02-06-2008, 04:35 PM
Were these RAIDs anything like Commodore Computer 1541 Disk Drives? I seem to have had a few of them crash and burn.


LOAD "*", 8,1


Thanks for keeping us in the loop.

hobbes
02-06-2008, 04:37 PM
Then you should probably have a system set up where you keep the crashed drive for a certain amount of time and it should be posted somewhere that you do this.
I'm guessing if there was interest, people would converge on these boards fairly quickly and those interested would pony up for recovery.

Hopefully the need will never arise, but should there be one, at least we'll know it's possible.

Wassercrats
02-06-2008, 04:41 PM
I was just worried that Kevin opened the drive and beat on the platters with an ax before leaving the data center. It would be too late then.

Kevin
02-06-2008, 04:42 PM
I was just worried that Kevin opened the drive and beat on the platters with an ax before leaving the data center. It would be too late then.

Actually, that is kind of tempting. The disks are off warranty so I don't have to worry about voiding anything :P

Terra
02-06-2008, 04:59 PM
Then you should probably have a system set up where you keep the crashed drive for a certain amount of time and it should be posted somewhere that you do this.
Actually, I should add that if the drives are under warranty, then we will have an RMA within 48 hours and they will be shipped out for replacement...

So if anyone is willing to pony up the big bucks for drive recovery, the window of opportunity is limited...

In this case, as Kevin stated, the drives are off warranty so they'll just sit on the shelf of broken dreams until we do some spring cleaning...

mromero
02-06-2008, 06:05 PM
Oh

I always thought techs were inside the server rooms at all times with hard drives, CPUS, power supplies etc. ready to hot swap :dunno:

HUGO is suffering from a failing hard drive.

A technician has been dispatched to the data center to perform a hard drive replacement. We are shutting down the server as a preventative measure to preserve the data on the drives.

Additional information will be posted as it becomes available.

Our apologies for this inconvenience,

Sheila

Terra
02-06-2008, 06:18 PM
I always thought techs were inside the server rooms at all times with hard drives, CPUS, power supplies etc. ready to hot swap
Nope, for the last 10 years we've operated a lights out shop which is facilitated by gear and custom software (much of it we've written ourselves) expressly designed for that purpose... 97% of the things we need to do can be done remotely... For anything that needs a tech onsite, they can be there in a short amount of time...

We've all done the daily commute stare at your cubicle walls grind, and for the FutureQuest staff - we've liberated them from that... Our time flexibility is astounding, and actually makes us much more efficient...

What happened last night, was only the 2nd server hard drive catastrophic crash in 10 years, and even with the travel time - I was already working behind the scenes prepping the restoration scenario which would have blocked Kevin anyways...

Even though we work remotely, that doesn't make us any less of a well oiled machine... :QTcylon:

tknterry
02-06-2008, 07:59 PM
I was just worried that Kevin opened the drive and beat on the platters with an ax before leaving the data center. It would be too late then.

No, no, he gave it to the crocs and gators as an offering.

Randall
02-06-2008, 08:24 PM
No alligators or crocodiles were harmed during last night's HUGO restoration... :smile:
Oh, but they were there. They were there...

What happened last night, was only the 2nd server hard drive catastrophic crash in 10 years
In my own idiosyncratic way, I'm kinda honored to have been sailing on that doomed ship (being rescued by the FQ Coast Guard may have something to do with it too). :winky:

Randall

Bradley
02-06-2008, 11:50 PM
I remember when this happened with SAMSON. I'm glad to see everything got taken care of so smoothly. Something like this crosses my mind every now and then about the colo but I never end up looking at some way to still have a drive with intact data in the server. I'm sure some form of RAID exists that would do that but I bet it's out of my price range. :rasberry:

I did tune into the video last night for a minute and was curious about the shirt myself but was too tired to remember about it today. Good job Terra and Kevin (and any others involved!)

frankc
02-07-2008, 11:35 AM
Terra and everyone at FQ, I gotta say that I am so very impressed with how quickly you folks responded to this double-HD failure and worked hard to minimize the impact on customers.

I've been with FQ since early '99 and have never regretted it. You are all a credit to FQ for the quality of customer service and technical support/design you offer. Well done! :yeah: