[FQuest Alert] Network Upgrade


Kevin
09-03-2010, 04:17 PM
When: Sunday September 5, 2010
Time Frame: 12:01 AM to 3:00 AM ET
Who: The entire FutureQuest network
Duration: 2-10 minutes total spread out over several short blips

What:
One of the older network switches that make up the back end of the FutureQuest network has become somewhat overloaded and is likely to cause increased latency as things continue to grow. There is nothing actually wrong with the switch; it was simply purchased many years ago, and bandwidth usage has increased by a huge amount since then.

We have purchased a newer switch that can handle many times more bandwidth and need to replace the old switch. The process of replacing a switch on a multi-switch network will cause several blips where portions of the network stop responding for about 30 seconds at a time.
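
For a rough sense of scale, the figures above (2-10 minutes total, about 30 seconds per blip) can be turned into an approximate blip count. This is only back-of-the-envelope arithmetic, assuming every blip lasts the full 30 seconds; the function name is made up for illustration.

```python
# Rough estimate only: assumes every blip lasts the quoted ~30 seconds.
BLIP_SECONDS = 30

def blip_count_range(total_min_minutes, total_max_minutes, blip_seconds=BLIP_SECONDS):
    """Return (fewest, most) blips that fit in the announced total downtime."""
    fewest = (total_min_minutes * 60) // blip_seconds
    most = (total_max_minutes * 60) // blip_seconds
    return fewest, most

fewest, most = blip_count_range(2, 10)
print(f"roughly {fewest} to {most} blips")  # prints "roughly 4 to 20 blips"
```

Any given server sits behind only part of the network, so it would likely see just a subset of these.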

After we finish replacing the old switch with the brand new one, we will reuse the old switch to replace an even older one. This will cause a few more blips.

It is doubtful that site owners will see any immediate improvement from this upgrade, because we have already done work to minimize the latency problem, even though we have never received a complaint about it. This upgrade is all about making room for future expansion.

Some technical details for the curious:

The brand new switch, named GLINDA, is an HP ProCurve 2510 48-port switch with 2 gig-e uplink ports. We have been using the ProCurve line of switches for a while now and have had very good luck with them.

The old switch it is replacing, named DOROTHY, is a Foundry FastIron 24-port switch, which unfortunately lacks gig-e uplink ports.

The even older switch, named ELMIRA, is a Cisco Catalyst 24-port switch, which lacks both gig-e uplink ports and the modern Rapid Spanning Tree Protocol feature. We will be retiring the ELMIRA switch after this upgrade.

Kevin
09-05-2010, 12:34 AM
We are about to begin the network upgrade. We will try to keep the interruptions as brief as possible.

Kevin
09-05-2010, 01:27 AM
We are done with the first switch. Starting on the next one.

Kevin
09-05-2010, 02:13 AM
The network upgrades have been completed. The network interruptions were mostly too short for our monitoring system to notice, so it is difficult to quantify them, but they were probably less than a minute total for each affected server.
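
For the curious, here is one plausible reason a short blip can slip past a monitor that checks on a fixed schedule. This is a generic sketch, not a description of our actual monitoring setup; the 60-second poll interval and the function name are assumptions made up for the example.

```python
def poll_sees_blip(poll_interval, blip_start, blip_length):
    """Return True if a poll taken every `poll_interval` seconds (at t = 0,
    poll_interval, 2 * poll_interval, ...) lands inside the blip window
    [blip_start, blip_start + blip_length). All values are whole seconds."""
    # First poll at or after the blip starts (integer ceiling division).
    first_poll = -(-blip_start // poll_interval) * poll_interval
    return first_poll < blip_start + blip_length

# Hypothetical 60-second poll interval, 30-second blips:
print(poll_sees_blip(60, 10, 30))  # blip from t=10 to t=40, polls at 0 and 60
print(poll_sees_blip(60, 45, 30))  # blip from t=45 to t=75, poll at 60
```

In this sketch the first blip falls entirely between two polls and goes unnoticed, while the second one happens to overlap a poll and is seen.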

On a more interesting note...
It turns out we picked exactly the right time to retire the ELMIRA switch. While we were working on the DOROTHY switch upgrade, two fans in the ELMIRA switch started to fail, making a surprising amount of noise and vibration. It may not have lasted much longer, and we might have been in here tomorrow doing a switch swap even without these upgrades.

Now that is timing!