Sys Admin Skillset
phppete
04-02-2009, 03:12 PM
After reading about Erza as FQ's new sys admin/tech, I was wondering what an entry-level sys admin should have in his/her skill set. I assume:
1) working knowledge of HTML, CSS, web languages
2) expert in - shell scripting, bash etc
3) expert in - perl
4) expert in - regexp
5) expert in - sed
6) expert in - awk
7) basic - mysql (and/or postgres if on server)
What about C? I assume the above would be the very minimum required? Just curious :)
Bruce
04-02-2009, 03:21 PM
Strictly speaking, for a system administrator the most important one of what you listed is the second -- shell scripting. Pretty much all of the rest is secondary, particularly knowledge of HTML and CSS. Perl (and probably PHP and Python) are also useful, but not quite so necessary. C not so much, as it is rare to need to create or modify administrative tools written in C. Every additional skill helps provide a broader view on the picture, but C has rather limited direct value for system administration.
What is perhaps more useful, and harder to quantify, is a good knowledge of the underlying hardware and operating system. For administration, knowing programming (including scripting) can make jobs easier, but without a sound knowledge of the underlying system it becomes difficult to determine what is even possible or not.
phppete
04-02-2009, 03:40 PM
Thanks for the info :) Do you think someone could learn enough 'home alone' through reading books and messing about with a local, 'non-real-world' server? I'm guessing the issues you face on a real box with real clients are very different from anything you could simulate yourself.
Kevin
04-02-2009, 03:52 PM
The biggest thing for an admin/tech IMO is basic troubleshooting skills. The ability to isolate and reproduce a problem while eliminating all the things it might be but isn't until you know exactly what is wrong. The other big skill is the ability to research a problem or new technology. Once you have those skills everything else is just experience.
Kevin
04-02-2009, 03:55 PM
I should also mention thoroughness and attention to detail. This generally comes with experience but it is nice to have up front.
Terra
04-02-2009, 04:24 PM
Both Bruce and Kevin make good points...
I'd like to toss in another quality that is desired in a good sysAdmin: the ability to reduce complex systems and/or problems to simplicity, while realizing that this will be the most difficult and frustrating thing you will ever have to do...
Case in point is that I will be making another pass at our network core this weekend in an effort to simplify the interactions between the various switches and routers... The complexity I'm trying to reduce is the interdependencies that breed nasty cascading and convergence problems... The very same problems that are simply an integral part of BGP4 technical life and are extremely difficult to 'fix'... To be honest, they can't really be fixed, but what can be done is to minimize the interactions and drive in wedges that will hopefully reduce cascading problems... The hiccup we had the other day finally gave me enough information and understanding of how best to evolve the network core through simplification... So even though that hiccup was painful in several ways (clients upset), it was not a loss at all, as improvements can only really be made once you can see the true nature of the situation... In the end you will find that we will have a more resilient (not perfect though) network core that can better serve the demands of the future... For those that remember the networking issues we had several months ago, a lot of work was invested to turn that situation around - which we did... Sadly, that also increased complexity, and growth has now brought forth new problems that must be addressed...
We received an email wondering when the network core will be fixed permanently and sadly it never will because as things grow - complexity increases and/or previous things that were fine start to strain under the increased loading/demands... It is a never ending evolution, just as with active programs that evolve over time with new features as well as current/old bugs being fixed... Something that every sysAdmin needs is a thick skin knowing that they walk a very thin line between the needs of the clients vs. the needs of the technology and be able to balance them both without fail (impossible, yet we still try)... That - IMHO - is what separates your average sysAdmins from the really good ones because you have to have a foot on both sides of the fence without tearing yourself apart in the process...
Even now, I don't really want to touch the network, because of the potential problems it can create - but I know I have to in order to improve it... So I have to do it, knowing that we'll lose a few clients over it and I'm going to get yelled at, but in the long run it really will be for the best and those that believe in us will benefit from it... These are just simply growing pains that must be endured and I will always do my best to reduce their exposure to the clients, but sometimes technology itself has different plans and ends up morphing into a kicking, screaming and cantankerous two year old that forces you to tap into inventive/creative solutions that you would not have had otherwise...
Terra
04-02-2009, 04:43 PM
"I'm guessing the issues you face on a real box with real clients are very different from anything you could simulate yourself."
Exactly!
I'm a firm believer that trial by fire is the best way to gain *useful* administrative skills...
Simulations are just that - simulations... You will find that a production server will constantly surprise you, especially ours because these are not closed loop servers... They are open loop because they have to be a bit of everything for everyone, which tosses design consistency out the window... What is requested of them is so volatile, and at times bordering on the bizarre, that you can never hope to come even close to simulating it, as the requests are simply all over the map... What you simulate today will most likely not be designed for the volatility of tomorrow... An admirable trait of a good sysAdmin is being able to find a happy middle ground without leaning too far in one direction or another... There is something to be said for having the ability to find the sweet spots in any technical design that can cover the extremes of swinging volatility... That - IMHO - is not perfection, but rather a design that is good enough to handle the demands of the real world... I've found that 'perfection' does not handle the real world so well because it ends up being fragile or brittle or just simply too tightly scoped...
phppete
04-03-2009, 04:07 AM
"So I have to do it, knowing that we'll lose a few clients over it and I'm going to get yelled at, but in the long run it really will be for the best and those that believe in us will benefit from it..."
For some it isn't about believing in you; I have no doubt that the vast majority of your clients do believe in you. The problem is that for ecommerce sites, when network issues arise, money is lost. If a site takes 10 orders a day on average, for example, and due to network issues we lose 2 orders, then that is like you losing 20% of your salary.
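To put numbers on the example above, here is a minimal Python sketch of the arithmetic. The function name is made up for illustration, and it assumes every order is worth roughly the same amount, which a real shop would need to check against its own figures.

```python
def lost_order_fraction(orders_per_day: int, orders_lost: int) -> float:
    """Return the fraction of an average day's orders lost to an outage."""
    if orders_per_day <= 0:
        raise ValueError("orders_per_day must be positive")
    if not 0 <= orders_lost <= orders_per_day:
        raise ValueError("orders_lost must be between 0 and orders_per_day")
    return orders_lost / orders_per_day


# The figures from the post: 10 orders a day on average, 2 lost.
fraction = lost_order_fraction(10, 2)
print(f"{fraction:.0%} of the day's orders lost")  # prints "20% of the day's orders lost"
```

The smaller the shop, the bigger the jump: at 10 orders a day, each lost order is a tenth of that day's takings.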
Those who aren't that tech savvy get angry, not for the hell of it and not to be awkward but because they might already be struggling as it is and losing even one order is one too many. It is very difficult to convince a client to 'believe' when they hear of other hosts who aren't experiencing these growing pains.
The person who emailed you about the network might be someone I know and unfortunately it might be hard to prevent him from using a piece of fruit instead of you if things get a little flaky. Most people have unrealistic expectations about hosting so even 10 mins of downtime feels like a monumental failure.
One thing I have noticed that makes clients angry is when any downtime is under-reported, e.g. a notice stating the downtime was 7 mins when in reality the client experienced 15 mins +. Clients don't understand the issues of networks and why their downtime was longer than reported. Because of this they often feel the host is being dishonest and trying to cover something up. It might be a good idea to always give an explanation of why the downtime for some is actually longer than the reported length of time.
It also angers those whose sites have been totally unreachable when hosts report 'intermittent network issues' and suggest clients might have experienced a slight lag. Once again the client feels they are being deceived and lied to. I'm sure those living around the corner from a data center experience less packet loss than those living many thousands of miles away. I think more effort (for all hosts, not just FQ) needs to be made to ensure clients, especially those with absolutely no knowledge of anything technical, do not feel they are being lied to.
My final point is that the detailed explanations posted here about what happened when an issue occurs are very interesting to people like myself; anything that is beyond my understanding I usually look up and read up on, even if it is just on Wikipedia. Other clients often find your responses as clear as reading Mandarin, so it might also be useful to offer a 'dummies' explanation alongside the tech explanation. I think some of the above might go some way toward smoothing over any potential cracks that may or may not appear in the future.
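As a rough illustration of how the two numbers can both be "true", here is a small Python sketch that adds up downtime from a list of reachability checks taken from two different places. Everything in it is hypothetical: the function names, the one-minute probe interval and the sample logs are invented to mirror the 7 mins vs 15 mins example, not taken from any real monitoring data.

```python
from datetime import datetime, timedelta


def observed_downtime(samples):
    """Sum the gaps that follow each failed probe.

    samples: list of (timestamp, reachable) tuples, sorted by timestamp.
    A failed probe counts as downtime until the next probe is taken.
    """
    total = timedelta(0)
    for (t0, ok0), (t1, _) in zip(samples, samples[1:]):
        if not ok0:
            total += t1 - t0
    return total


def probe_log(start, minutes_down, total_minutes=20):
    """Build a made-up log: one probe a minute, failing for the first minutes_down."""
    return [(start + timedelta(minutes=m), m >= minutes_down)
            for m in range(total_minutes + 1)]


start = datetime(2009, 4, 2, 12, 0)      # arbitrary timestamp for the sample data
host_view = probe_log(start, 7)          # what a monitor next to the server might see
client_view = probe_log(start, 15)       # what a far-away client might see

print(observed_downtime(host_view))      # 0:07:00
print(observed_downtime(client_view))    # 0:15:00
```

A notice that gave both figures, or at least said where the measurement was taken from, would probably go a long way with the less technical clients.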