FQuest Alert: Scheduled Downtime and Upgrades
Terra
01-22-1999, 08:40 PM
Wow, so this is what stress and pressure are all about... :(
Today we had **inexcusable** downtime, and I don't have a clue why the server went down...
Server was running great, I walk away to go get a cup of coffee, 5 minutes later the server had come to a halt... We were well within operating parameters (2.79 load with 97meg memory free)...
In light of all this - the 450 cpu and memory came in today...
At 2:00am 1/23/99, the server will be going offline *intentionally* for a CPU and memory upgrade... Moving domains over to the new server will take a few days - so I am beefing up the existing server in the meantime...
I expect the downtime to be 1 - 2 hours total, as I have to reconfigure for the higher capacity...
--
Andrew Gillespie
Systems Administrator
FutureQuest.net
PS: I am now officially sleeping on the couch at the server room... Today has been a very bad day...
MikeA
01-22-1999, 09:41 PM
PS: I am now officially sleeping on the couch at the server room... Today has been a very bad day...
With my part of the refund, rent "Quigley Down Under" and listen to Mad Mira talk. "Just because the road is rocky, doesn't mean your spirit has to be rocky too." :) :)
Have a good night :)
Mike
------------------
www.webauthorities.com (http://www.webauthorities.com)
"To continue press any key"...Where's the anykey?
[This message has been edited by MikeA (edited 01-22-99).]
Sounds like you're not going to have much time to sleep on that couch :(
Also hope you're not going to leave Deb alone with all that paraphernalia she bought with Mike's money! :)
------------------
Rich
"What time is it in _____?"
www.timezoneconverter.com (http://www.timezoneconverter.com)
Terra
01-22-1999, 10:40 PM
Please excuse my beginning post... I was a tad bit frustrated, and I didn't mean for it to show...
99% of the time that the server is having difficulties, or goes down - I almost always have a pretty detailed account as to why or where something went wrong...
Today, I had nothing... I spent over 6 hours today, combing the server with a fine-tooth comb - heck, I even used a metal detector and seismic graphing equipment... ;)
As previously stated - everything was well within great operating parameters, then boom - it silently went offline...
My final conclusion is that the server itself had gotten overheated - and the memory chips called it quits...
I will be installing the new server case fans tonight and adding an extra to blow over the memory and cpu specifically... It will then have a total of 5 heavy duty fans within the case (has 2 now) - and also an external floor fan to help circulate the air better...
The server room is climate controlled - but at times the case temperatures can go beyond what the external climate can help out with...
This is the only thing that I can think of that would have caused the problems we had today... I shut it down for 15 minutes, to let it cool off and rebooted... It's been working great since then...
SideNote: Total time to reboot after a server crash:
10 min drive to server room (15 if I catch red lights)
18 minutes to bring the server up from power off... It takes a long time due to the 14gig drives that are checked on boot-up, and for all the daemons to stabilize (3 minutes before I plug back into the network)...
So from the 'Down -- Up', it's usually anywhere from 32 - 50 minutes to get everything back online...
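As a rough sketch of the arithmetic in that SideNote, here are the itemized steps added up in Python. The names are made up for illustration, and it assumes the 3-minute daemon settling time is already counted inside the 18-minute boot figure (the post doesn't say either way). The itemized steps alone come to less than the 32 - 50 minutes quoted, so the remainder is presumably time that isn't itemized here.

```python
# Figures taken from the SideNote, in minutes.
DRIVE_MIN = 10   # drive to the server room
DRIVE_MAX = 15   # same drive when catching red lights
BOOT = 18        # power off to up; ASSUMED to include the 3-minute daemon settle


def itemized_downtime():
    """Return (best, worst) minutes covering only the itemized steps."""
    return DRIVE_MIN + BOOT, DRIVE_MAX + BOOT


best, worst = itemized_downtime()
print(f"itemized steps: {best}-{worst} min")
```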
--
Andrew Gillespie
Systems Administrator
FutureQuest.net
--I can see clearly now--
[This message has been edited by ccTech (edited 01-22-99).]
Terra
01-23-1999, 05:23 AM
Upgrade completed in 1.5 hours... :)
The server is now P-II 450 w/512mb ram, with more cooling fans installed as well...
The 512mb is temporary, as 256mb is destined for the new server, until I can get my hands on the newly released Kingston 256mb unbuffered SDRAMs ($$$$$) and pump the servers up to 1 Gig each... Expensive - but these servers crave memory...
I must say though - when in doubt, always go for more memory... The 450 seems to have helped as well... One of my monitoring programs used to run at 8 - 10% CPU time, now it's a consistent 3.5%, which is really great news... The server is under heavy load at the moment processing stats, and it's not even showing the slightest bit of stress... We'll see how well this new combination works during crunch time, but from initial results - things are looking up...
Time will tell...
--
Andrew Gillespie
Systems Administrator
FutureQuest.net
Terra
01-23-1999, 04:36 PM
It's sorta like when Windows crashes, and forces you to run 'ScanDisk' on startup...
Linux uses the ext2fs 'file system', and ***thoroughly*** checking a 14gig drive takes a long time...
But I will say this much - I have never lost anything after a hard crash when using the 'e2fsck' program...
If there was not a crash, then the Linux box is up in 3 minutes or less - since no extended checking is going on...
The price you pay for completeness...
--
Terra
--The price you pay for sanity--
FutureQuest
meikel
01-24-1999, 01:21 AM
Hi,
18 minutes to bring the server up from power off... It takes a long time due to the 14gig drives that are checked on boot-up, and for all the daemons to stabilize (3 minutes before I plug back into the network)...
18 minutes sounds very long. I've never seen a PC that needs that long to get up and running. I know of many servers that need hours to shut down correctly, but usually they boot up smoothly.
What part of the boot process takes this long?
Greetings from someone who needs an hour to get a "just booted" feeling in the morning
Meikel Weber
http://www.meikel.com