Friday, July 29, 2011

VTP on Cisco Switches in a Small Company (aka: my network just drops)

Sorry it's been a while.  Here's the most recent fun bang-head-here problem I was able to resolve.

Situation:

3 Cisco Switches in an office.  1x 3750, 2x 2960S

Every so often, at random intervals, the network connections for all the clients would just vanish. Connectivity through the main switch was fine (I used Zenoss to monitor, and it only reported failures for the switches and a printer beyond them), but I couldn't get to any of the clients, and they couldn't use the network, let alone the internet.

Troubleshooting:
I tried everything I could think of to identify this problem.  I checked spanning tree, checked logging to see if I could catch it (this was one of those really random problems, highly unpredictable), made sure the VLANs were set correctly, had Zenoss pulling SNMP data for interface utilization % on the trunks, etc.

What I noticed was the following: the graphs didn't show any vertical breaks, so the interfaces never went down, even though the network connections would drop.  This meant the switches were up, and there was no problem with the physical wiring or with the power to the switches.

I asked someone more knowledgeable than me, and he pointed me in the direction of the VTP settings.

What I learned (they probably cover this in CCNA 101): VTP (VLAN Trunking Protocol) is a proprietary Cisco protocol used to simplify VLAN management across a large number of switches (think triple digits or higher), allowing network admins to manage them all from one point.  Makes sense: it cuts down on the number of mistakes and the time needed to configure a switch fabric.  My problem was that the three switches installed in the company had never had VTP configured correctly, and since they were all unconfigured when they were added, they all became servers.  Apparently, they couldn't agree on which switch was authoritative (as I understand it, VTP servers in the same domain accept whichever VLAN database carries the highest configuration revision number, so any server can overwrite the others), and whenever the switch treated as the true master changed, all the VLANs would be deleted from these switches and then added back.  The net result was that the switches looked like they were going down.  Highly unpredictable, highly annoying (to everyone).

Resolution:

Set the switches to VTP transparent mode.  The commands were really as simple as:

log in
config t
vtp mode transparent
end
write mem

Some things to remember: check your VTP status to see where you are on a given switch (show vtp status), and make sure you are not using VTP pruning when you make the change.  The change does not prevent you from connecting to the switch (some people reported a delay, but I didn't experience one), but if VTP pruning is in place, it can cause problems getting your clients to connect as you change the switches over in the environment.  Since the environment I'm in is so small, I just set VTP transparent mode: I can still set the VLANs locally on those switches, and they still forward VTP packets.
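
For what it's worth, a quick before-and-after check on each switch might look like the sketch below. These are ordinary IOS show commands, but the exact output and field names vary by IOS version, so treat the comment lines as my assumptions about what to look for rather than guaranteed output.

```
! Run from privileged EXEC mode, before and after changing the VTP mode
show vtp status
! look for the VTP operating mode, the pruning mode, and the configuration revision
show vlan brief
! confirm the VLANs you expect still exist locally on this switch
show interfaces trunk
! confirm the trunks are up and carrying the VLANs you expect
```

Comparing the output before and after the change on each switch is a cheap way to confirm that the VLAN list survived the switch to transparent mode.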

Info that I used included the following:
https://supportforums.cisco.com/thread/2029581 (be sure to read the whole forum thread)
http://www.cisco.com/en/US/tech/tk389/tk689/technologies_tech_note09186a0080094c52.shtml (main page about VTP configuration and what it is and does)
http://www.cisco.com/warp/public/473/vtp_flash/ (this really helped with my understanding of VLAN Trunking Protocol, VTP; the first problem discussed is exactly what I was facing, called Problem #1, of all things)

Something else I learned (again, probably CCNA 101) is a good way to protect yourself when making changes that might cost you connectivity to a switch.  Start with the following before you make your change:

reload in {mmm|hhh:mm}
<make your change>
reload cancel (after change is complete)

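This allows you to work, and if something happens and you can no longer connect to the switch, it will reload when the timer runs out and come back up with the saved config that worked before you started.  That only helps if you have not saved the change yet, so hold off on write mem until you know you can still get in.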
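
Put together, the whole sequence might look something like the sketch below. The 10-minute timer is just an example value and the change itself is a placeholder. Expect IOS to ask you to confirm the reload; the exact prompts depend on the IOS version.

```
! Schedule a safety reload 10 minutes from now (10 is an arbitrary example value)
reload in 10
! Make the change, but do not save it yet
configure terminal
! ... your change goes here ...
end
! Check that you can still reach the switch and that things look right, then:
reload cancel
! Only now save the running config
write memory
```

One caveat, as far as I know: on these switches the VTP and VLAN settings also live in the VLAN database (vlan.dat), outside the startup config, so a reload may not roll those particular settings back.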

I'm sure there are more knowledgeable networking folks out there, but this was how I solved this problem for the time being.  Simply putting this out there for anyone who could use it; like me when I run into this problem again. (=