FQuest Alert: Downtime
Terra
02-22-1999, 03:39 AM
Tonight, around 1:28am EST, the TAZ server went offline due to a runaway PHP3 file consuming all of system memory...
PHP3 was built to handle this error, and not allow more than 8mb to be used, per instance... For some reason, this protection mechanism did not work... I am investigating this matter further...
I could not locate the problem PHP3 file, but whoever owns 'blurbs.php3' - you have an error on line 25... :(
Since I could not find the above file, I have reconfigured PHP3 to only allow 10 seconds of CPU time, instead of the previous 30 seconds... I will raise the limit again, after I can ensure that the PHP3 memory protection is functioning properly...
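For anyone curious what per-process caps of this kind look like at the operating-system level, here is a minimal Python sketch. It is not how PHP3 enforces its own limits and it is not FutureQuest's configuration; it only applies the two numbers mentioned above (8 MB of memory, 10 seconds of CPU time) to a child process through Unix resource limits. The names limits_for, apply_limits and run_limited are made up for illustration.

```python
import resource
import subprocess

MEMORY_LIMIT_BYTES = 8 * 1024 * 1024  # the 8 MB per-instance cap from the post
CPU_LIMIT_SECONDS = 10                # the temporary 10-second CPU cap


def limits_for(memory_bytes=MEMORY_LIMIT_BYTES, cpu_seconds=CPU_LIMIT_SECONDS):
    """Return (resource, (soft, hard)) pairs to apply to a child process."""
    return [
        (resource.RLIMIT_AS, (memory_bytes, memory_bytes)),   # address space
        (resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds)),    # CPU seconds
    ]


def apply_limits(limits):
    """Apply each limit to the current process (Unix only)."""
    for which, pair in limits:
        resource.setrlimit(which, pair)


def run_limited(argv, limits=None):
    """Run argv as a child process with the limits set just before exec."""
    chosen = limits_for() if limits is None else limits
    # preexec_fn runs in the child after fork and before exec,
    # so the parent process keeps its own limits.
    return subprocess.run(argv, preexec_fn=lambda: apply_limits(chosen))
```

An 8 MB address-space cap is too tight for most modern programs to even start, so treat the numbers as placeholders taken from the post rather than recommended values.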
The (SIX) server will combat excessive resource usage issues by running a backend pool of PHP3 daemons (3 to be exact), so this should further protect us against such unforeseen situations...
So much for the immaculate Uptime, last read:
22d 3:50:54.47
Our apologies for any inconvenience this has caused...
--
Andrew Gillespie
Systems Administrator
FutureQuest.net
PS: Special thanks to Dan Web (www.maxcess.net) for getting us back online quickly!
[This message has been edited by ccTech (edited 02-22-99).]
Justin
02-22-1999, 07:43 AM
No blurbs.php3 here :) But wow - 22 days is excellent! This month has been a lot better than January, even with all of the upgrades going on. I'm impressed! And at least it happened in the wee hours of the morning, too. I was actually watching TV :) so I didn't even notice.
Does that make the first outage this month? That's too cool! Keep up the excellent work!
------------------
Justin Nelson, SFE Inc.
http://www.vdj.net
Jacob Stetser
02-22-1999, 10:43 AM
Ummm.. Andrew?
:(
I was testing what I thought was a simple, unassuming PHP script right around that time... =(
It mighta been me. Take a look at my php scripts in my www directory, specifically responder.php and less importantly, fmail3.php and the other fmailxxx.php files.
Let me know if I crashed you. If I did, I'm sorry =(
Jake
Terra
02-22-1999, 02:24 PM
It doesn't really matter who did it, the problem shouldn't have happened to begin with... ;)
Something did not work right at the server level to prevent this situation, and for that I only have myself to blame...
FQuest is meant for development freedom, not analistic iron-fisted restrictive environment... It's a difficult line to configure a server for, and I keep a watchful eye on maintaining the proper balance of freedom and uptime...
--
Terra
--Should have joined the circus as a tight-rope walker--
FutureQuest
PS: Jacob, it wasn't you... :)
[This message has been edited by ccTech (edited 02-22-99).]
Jacob Stetser
02-22-1999, 02:48 PM
Are you suuuuuure? :)
I know it's all your responsibility, Terra. ;)
But as developers in a shared environment, we tenants of your server too must share some of the task of being good neighbors, making sure we don't run scripts that are a security risk or are too much of resource hogs :)
Now, sometimes we make mistakes, all of us, and that's a fact. But mistakes are misteaks, and can be fixed :)
BTW, I didn't think my script woulda done it, but I wasn't sure :)
Jake
alexandra
02-22-1999, 07:34 PM
Thanks again for your honesty -- it's so nice to send an email saying, "Uh, I just tried to log on to my site and couldn't get in," and get one back that says, "Yes, sorry, the server is down. We're working on it" instead of "Well, we had a power outage in Atlanta for six hours" (one that mysteriously never made the papers) or "maybe it's time to upgrade your browser," which is what my last web hosts liked to say.
FYI in an effort to help you try to pinpoint, I just checked my email to you, which I wrote within a minute of trying to get on, and it was at 1:04 a.m. So maybe it was something else that caused it?
Alexandra
Terra
02-22-1999, 09:23 PM
At 1:02am, the FQguardian program kicked in and shut down the Apache primaries, in an effort to keep the server from spiraling down... The Apache primaries came back online shortly thereafter, once the load had gotten back under control...
The second time, around 1:27am, the malfunctioning PHP3 file **quickly** consumed every bit of available memory, causing the server to thrash (excessive swapping), and this was compounded by FQguardian shutting the primaries down... There was just not enough memory to perform a proper shutdown, and the server eventually spiraled down...
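The shut-down-then-come-back behaviour described above amounts to a load watchdog with two thresholds. FQguardian's actual logic is not shown anywhere in this thread, so the Python sketch below is only an assumption about the general idea; the threshold values and the names next_state and current_load are hypothetical.

```python
import os

HIGH_WATER = 8.0  # assumed load average at which serving is stopped
LOW_WATER = 2.0   # assumed load average at which serving resumes


def next_state(serving, load, high=HIGH_WATER, low=LOW_WATER):
    """Decide whether the web server should be serving after this check.

    Two thresholds give hysteresis: once stopped, the server stays
    stopped until the load has dropped well below the stop threshold.
    """
    if serving and load >= high:
        return False   # overloaded: shed load
    if not serving and load <= low:
        return True    # load is back under control: resume
    return serving     # otherwise keep the current state


def current_load():
    """One-minute load average (Unix only)."""
    return os.getloadavg()[0]
```

A watchdog like this only helps while there is enough free memory for the watchdog itself to act, which matches what went wrong the second time.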
When you have a multi-user server running, this is an unfortunate potential problem that can, does and will happen... The Linux OS does a great job of recovering after this condition and is 10 times more resilient than Windows NT at handling this situation... It is a very delicate balance of resources, and there are many controls that are in place to stop this from happening, but at times the rush can happen so quickly that nothing much can stem the tide...
It's a tradeoff in providing power and capability, while maintaining freedom to develop and broaden one's horizons... Sometimes the best of us get bitten by high-tech...
If you look at what is requested of a server from a year ago to now - you will understand the deep complexities to deliver what is asked of it...
All I can do in this situation is study the cause/problem, and tighten down the bolts that can keep a better lid on it... Most of it unforeseen, some of it just plain gremlins...
Downtime is never wanted or desired, but it is the unfortunate side-effect when you run a high-tech service company... Technology will break and it's up to us to fix it...
In perspective - how many times has your computer locked up in the last day / week / month / year? Then look at a server that is running constantly and performing millions of requests a minute... 24/7/365...
All in all, the FQuest server has been performing its duty very well... Downtime has been below average (after you take all variables and place them in perspective)...
I'm proud of our servers, and what they have accomplished... Providing developers with the power tools they need, while walking the UpTime fine line...
The things you can do with our server are only limited by your imagination... I know - I live and breathe it everyday... ;)
--
Terra
sysAdmin
FutureQuest
PS: I wish I had a penny every time I've emailed tech support and wrote, 'Duh, I know that - I'm not a blooming idiot... Oh wait, my sunglasses were on, I can see my web page now --- nevermind'...
[This message has been edited by ccTech (edited 02-22-99).]
Terra
02-22-1999, 09:36 PM
Thinking about uptime for a moment, and ways to increase it... I could almost guarantee 99.9% uptime if we were to:
disable all Telnet access
disable all SSH access
disable all CGI execution (utilize only scripts written/authorized by FQuest)
disable all PHP3 execution
disable MySQL databases
disable Reverse Resolving Stats
disable SSI
serve only static .html pages
allow FTP only access
disable cron jobs
disable all non-system daemons
limit email capacity (size and quantity)
As you can see, by providing 99.9% uptime (or better) - we would quickly go out of business because we wouldn't have what it takes to be a real hosting company...
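To put the 99.9% figure in concrete terms, here is a small Python sketch of the arithmetic. It is plain percentage math, not FutureQuest's own accounting, and the function name allowed_downtime_minutes is just for illustration. At 99.9% the allowance works out to roughly 43 minutes over a 30-day month, or a bit under 9 hours over a year.

```python
MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(uptime_percent, days):
    """Minutes of downtime permitted over `days` at the given uptime percentage."""
    return (100.0 - uptime_percent) / 100.0 * days * MINUTES_PER_DAY


if __name__ == "__main__":
    for label, days in (("30-day month", 30), ("365-day year", 365)):
        minutes = allowed_downtime_minutes(99.9, days)
        print(f"99.9% over a {label}: {minutes:.1f} minutes")
```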
--
Terra
--Just thinking out loud--
FutureQuest
hearts
02-22-1999, 10:27 PM
geeeeeeeeeee Andrew.. are you stressing or what?
It almost seems like you are upset.. with whom? YOURSELF? Well, sir, relax, have a cigarette.. and breathe.
You just seemed a bit defensive, and we have all been behind your work and your commitment.. and no you can't please everyone. BUT YOU HAVE AN ENTIRE community that is very proud of you and Deb, and the caring service you offer us.
And heck, it coulda been me? I was uploading a html file when I got the "memory" message via ws_ftp.. so maybe it was a combo of things happening simultaneously?
Maybe I misread ya... and if I did I apologize. I just don't like to see ya get hard on yourself, cuz of something unexpected.
My favorite outta all you had to say:
FQuest is meant for development freedom, not analistic iron-fisted restrictive environment...
and you guys know... you are MUCH MORE than all that! *hugs to you both*
-----
hearts
[This message has been edited by hearts (edited 02-22-99).]
Justin
02-22-1999, 11:07 PM
I agree with Hearts - I've taken the uptime for granted for over three weeks without a single Downtime notice. How can one beat that??
My computer has been locking up a lot lately due to a bad video card (first the monitor, now this, what's going to break next month?).
I am extremely pleased with the server. And you're exactly right - given a choice of 99.9% uptime or cool toys with > 99% uptime, I'd much prefer to have a downtime now and then. I actually learn from the downtimes - reading what went wrong, etc, and obviously you learn a lot each time too. Just reading what's going to be done to prevent it from repeating shows that.
I didn't even get upset for a second when I read the post. I thought to myself "Wow, it's been a while now!". I forgot what it was like to read a Downtime post!
One final thought: Nobody is perfect :)
------------------
Justin Nelson, SFE Inc.
http://www.vdj.net
Terra
02-22-1999, 11:20 PM
Thanks Hearts, I needed to take a moment and relax...
We have suffered much downtime, whether it was our fault or not, and I'm a bit on edge, always worrying if the server is up or not...
The end of December, and parts of January (toss in all the Internic delays) had finally taken their toll... Now we have a major upgrade situation before us, and trying to perform it transparently without causing any more downtime has got me on pins and needles...
I usually take stress and pressure in stride, but the last downtime cost us tremendously (financially / customers running / lost trust / failed expectations)... I want FutureQuest to succeed in every way possible, but I know that we can't take another major hit like we did the last time... Money is not really the underlying issue, but losing people's trust is...
Community Forums can either be a blessing or a curse - within them you can find all of our weaknesses... Whenever I have to make another 'Downtime' post, I just have to glance down and count all the other ones... Sometimes it's difficult to find the words stating 'we crashed again'... Many times, it just plain hurts...
But I tell you what, I wouldn't want to have any other job than the one I have now, because in all senses of the phrase - 'It's a never ending adventure, fraught with perils and victories of a most unique kind'...
--
Terra
--SysAdmins are human beans too--
FutureQuest
hearts
02-23-1999, 01:25 AM
hehehe.. and ya know.. I was gonna blame you justin! ;)
well, i am very happy with the service, and very impressed, that you are doing all that you are doing to accommodate us! Thanks for taking the extra steps for us.. *hugs*
hearts
02-23-1999, 01:59 AM
hey Andrew,
I guess I know the feeling, *sorta* just for different reasons.
Losing customers, is not fun nor is it an easy thing to deal with. The people that chose to leave, were aiming for perfection, and there is no such thing. *really*
heck, I didn't like the downtime at all.. I got pissed, frustrated, angry.. all of that. I had even considered leaving too. Then I thought, what would be the point? I won't be able to find another hosting company that cared about its service the way you all do. I was even worried about losing my privileges in the forums. I cannot live without these forums for nothing. Why? Cuz they have taught me so much. I wouldn't know what to do without them. My favorite thing I learned? .htaccess files and installing cgi scripts! whew.. that .htaccess trick has come in so handy for me. I outsmarted my tech with it at my other server. He won't allow telnetting, and I needed the path to perl... he wouldn't even set up my cgi-bin.. so.. I outsmarted him.. *EG* He thought I should allow his "pro programmer" to install my scripts cuz he thought I would screw up. Well.. when I installed that guestbook for my Mike.. I posted him.. the server guy.. and told him.. I am the web master.. the pro programmer! *EG*.. cuz I was able to go around his stubborn egotistical self. BUT you guys gave me this independence. I had to count on my Mike to do things I didn't know how. I come here, and now I can do more than my Mike.. *the power is with me*.. granted I have so much to learn. BUT without you guys.. I wouldn't have this opportunity.
The downtimes might not have been the nicest thing to have happened, but look at what you have learned from this. YOUR business is growing.. it didn't ruin ya's. Sometimes a bunch of good can come outta something not so great.
The strongest thing going for ya's is this forum. Without it, I would have left here during that time. NEVER close down these forums. This is the heart to FQ.
Your honesty.. the way you will go out of the way to help even when it is not your place, the way you guys treat us as family instead of an invoice number and whatever else I left out, is why you will succeed.
Something that is engraved in my heart, FQ was yours and Deb's dream.. It is like you are sharing your dream with us.
I know business is business..
but people are people with hearts.
and the story continues.. ;)
-------
hearts
and for those of ya lurking.. I have checked out many other hosting companies.. none can hold a candle to the efforts and dedication of FQ.
disable all Telnet access
disable all SSH access
disable all CGI execution (utilize only scripts written/authorized by FQuest)
disable all PHP3 execution
disable MySQL databases
disable Reverse Resolving Stats
disable SSI
serve only static .html pages
allow FTP only access
disable cron jobs
disable all non-system daemons
limit email capacity (size and quantity)
Geeessshhh I leave for a few hours and.... hehe Has he forgotten who he would have to deal with if those things were disabled???? muhahahahhahaha
I was at the hospital most of the night for my mother... I should have snuck out with some of that Demorol<sp?> they were passing out eh ;)
All is well -- FutureQuest is well -- We are well -- Life is ummm well errr Life is life <grin>
Downtime -- Some are open about it.. some are not... we happen to be one of the ones open about it so even those who may never have known about it... get to know. The only diff about our down time and another hosts is how well it's hidden or not --- for those who would rather not know their site was down.. tis fine.. those who want to know -- know where to find out ;)
I am absolutely thrilled with the community here... heck yeah you guys are our main selling point ;) I only hope we can continue to grow this area with the same type of people for a long long long time to come.
Thanks for taking care of our Sysadmin for me while I was away -- I'll remember to leave more milk and cookies for him next time ;)
Deb
--Reality is a place one dreams of entering but runs from when given the chance.
[This message has been edited by Deb (edited 02-23-99).]
hearts
02-23-1999, 02:35 AM
not to sound dumb.. but what are
cron jobs
kinda sounds personal.. like i really don't wanna know the answer to this question. *LOL*
Terra
02-23-1999, 02:42 AM
cron jobs == task scheduler (Win95)
You can set a program to be executed by the system at predetermined times, either 1 shot, or specific intervals of time...
e.g.
At 7:00pm 2/23 (order_pizza)
The system would execute the order_pizza program, and finish in about 45 minutes...
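In an actual crontab the pizza example would be a single line of five time fields (minute, hour, day of month, month, day of week) followed by the command, something like `0 19 23 2 * order_pizza`. As a rough illustration of the "predetermined times" idea, here is a small Python sketch that works out when a daily 7:00pm job would next fire. It is not how cron is implemented, and the name next_daily_run is made up for this example.

```python
from datetime import datetime, timedelta


def next_daily_run(now, hour, minute):
    """Return the next moment at hour:minute that is strictly after `now`."""
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed (or is exactly now): use tomorrow's.
        candidate += timedelta(days=1)
    return candidate


if __name__ == "__main__":
    print("Next 7:00pm run:", next_daily_run(datetime.now(), 19, 0))
```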
--
Terra
--We are Homer of Borg, resistance is futile, prepare to be assimi- yuuummmmmm pizza--
FutureQuest
[This message has been edited by ccTech (edited 02-23-99).]
Julie
02-23-1999, 05:44 AM
Terra,
Oh boy. I am sorry that you are having a rough time. I do want to interject this....
Personally when I am looking into a hosting service, I want a no surprise sort of deal. You are honest with everyone, including your potential clients. I love that. When I was with Hypermart, they never told anyone what was wrong and why (with the exception of when FTP would go out and nothing belonged to me, and they told me that I must have been spamming).
I know it hurts when you lose clients. Forget the financial hassles, but I know it hurts emotionally. These people sound like they weren't ready to be a part of a community. FQuest is designed to be a community. (At least that is how I see it)
Durn, I lost my train of thought. Broken URLs. I think my server is down. The image on page four isn't showing!!!!!!
Well anyway, since I don't know what I am babbling about anymore, I think I should go.
Talk to you later Terra.
------------------
Julie
http://www.splitangels.org
Armand
02-23-1999, 08:09 AM
I admit when I stop in to the forums which is at least twice a day I always go to Server News first and work my way up. My reason for this started in January because of the downtime, but in the last month it switched to finding out about all the great work you folks are doing to correct and upgrade things.
I personally really stopped caring if downtime things popped up and now that one did I still don't care. That is I'm not worried for a second. I just imagine you and Deb working your butts off to fix it really quick. That image only makes me more patient and think they're taking care of it the best possible way they can so let me go do something else productive here or there.
Okay this is starting to sound like a ramble. So just know this if you two quit I'm personally coming down to Florida to crack some heads! :) :) And even though I usually just lurk more than post I'd be lost without these forums and of course FQ.
------------------
Armand (aka Joe)
joe@arcanumcafe.com
www.arcanumcafe.com (http://www.arcanumcafe.com)
Terra
02-23-1999, 12:45 PM
So just know this if you two quit
The word q**t isn't in my vocabulary... ;)
--
Terra
--is off chasing the future--
FutureQuest
Armand
02-24-1999, 08:18 AM
Glad to hear Terra, cuz cracking heads isn't in my nature :)