FQuest Alert: Unscheduled Downtime
At 9:45am EST 1/20/99, the FutureQuest server went down due to a hardware failure...
This was a particularly difficult problem to isolate and diagnose... Our 3COM Boomerang network card had intermittent failures with its EEPROM settings and kept having an 'off-by-one' IRQ shift...
The card is set to listen on IRQ 5, yet I finally saw that it was switching to IRQ 6 in the middle of handling a request... This is the first time I have ever seen a NIC do this, but it does make sense because those settings are held in rewritable EEPROM memory...
This caused queued-up network requests to overflow kernel memory buffers, eventually bringing the server down...
The card has been replaced, and everything is back at 100%...
Down: 9:45am EST
Up: 11:47am EST
Our apologies for any inconvenience that this downtime has caused...
--
Andrew Gillespie
Systems Administrator
FutureQuest.net