OK, once in a while a blog entry for Google; perhaps somebody stumbles upon this warning just in time:
Cisco's spanning-tree loop guard feature ('spanning-tree guard loop') can cause a migration from PVST (or Rapid-PVST) to MSTP to fail miserably, or at least cause a major disruption in your network!
(Aaaaaargh, this is not fair. I just spent 2 hours on a very thorough description of the problem, and then my blog system crashed, with only the two lines above remaining. OK, I have to live with it. Now the short version.)
Migrating a Cisco network from PVST or Rapid-PVST to MST is usually very simple and convenient. Create your MST region on every switch (name, revision, VLAN-to-instance mapping), and then issue the command 'spanning-tree mode mst', starting in the core and working outwards to the access layer. (See Configuration example to migrate Spanning Tree from PVST+ to MST.)
The important mechanism that keeps spanning tree stable while MST and PVST are mixed is the following: MST sends its BPDUs only untagged (=> only on VLAN 1). The not-yet-migrated PVST switch expects a tagged BPDU on every VLAN, but only gets one on VLAN 1. The PVST switch therefore thinks that the uplink is no longer the root port for VLANs 2-1004 and starts sending PVST BPDUs on its former root port. The MST switch (in the first step, the core) then sees tagged BPDUs on its port and concludes that its neighbour is not MST but PVST. The port is then classified as 'P2p Bound(PVST)' (in show spanning-tree). The MST switch then sends tagged copies of its CST BPDU on all VLANs, whereupon the PVST switch reselects the uplink as root port (everything and more in Understanding Multiple Spanning Tree Protocol, CCO login required). This lets you run MST and PVST in parallel and makes an easy migration without service downtime (at least with RPVST) possible.
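The region setup described above could look roughly like the following sketch. The region name, revision number, and VLAN-to-instance mapping are made-up placeholders, not values from my network; whatever you choose must be identical on every switch in the region.

```
configure terminal
! Hypothetical region parameters: use your own, identical on every switch
spanning-tree mst configuration
 name REGION1
 revision 1
 instance 1 vlan 10-19
 instance 2 vlan 20-29
 exit
!
! Only once the region is defined everywhere: core first, then outwards
spanning-tree mode mst
end
```

Afterwards, 'show spanning-tree mst configuration' should list the same name, revision, and mapping on every switch, and 'show spanning-tree' on an already-migrated switch should show the ports towards not-yet-migrated neighbours as 'P2p Bound(PVST)'.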
But if you use 'spanning-tree guard loop' (a.k.a. Cisco loop guard) on your uplinks, your network breaks. The reason is not too difficult to see, but as it is not mentioned in any Cisco paper and doesn't show up in Google's first results, we did not think of it:
Loop guard alters the way a port changes its state when it stops receiving BPDUs, and it does so on a per-port, per-VLAN basis (at least with PVST). A normal port would become a designated port once its root BPDU ages out. A port with loop guard instead goes into the loop-inconsistent state and therefore does not send any BPDUs itself. The already-migrated MST switch therefore never learns that its neighbour is still PVST, and does not send the tagged pseudo-PVST-but-really-CST BPDUs. On all VLANs except VLAN 1, your spanning tree gets segmented and your service disrupted. You have to hurry and switch all switches in your network to MST for the network to recover.
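To find out beforehand whether loop guard is configured, and to see afterwards whether it has blocked anything, the following standard show commands should be enough. This is only a sketch of what I would check; the exact output differs between platforms and IOS versions.

```
! Is loop guard enabled globally or on individual interfaces?
show running-config | include loopguard
show running-config | include guard loop
show spanning-tree summary
!
! Which ports/VLANs are currently held in an inconsistent state?
show spanning-tree inconsistentports
```

On the platforms I know, a port blocked by loop guard also shows up in 'show spanning-tree' with a type such as '*LOOP_Inc' for the affected VLAN.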
I see 2 ways to avoid this problem:
1) Before migrating, remove loop guard from all uplinks. Then switch your complete VTP domain to MST and re-enable loop guard on all uplinks.
2) As the above solution is quite tedious in a large network, there is a simpler possibility if you have a simple network (every access switch has a direct connection to a core switch) with RPVST. Shut down all switchports on one core switch (of course you need a connection to it through a routed port or the console). Once the other core switch is root for all VLANs on all access switches, change the first core to 'spanning-tree mode mst'. Then bring all switchports back up. As the access switches never saw a root BPDU on this uplink for any VLAN but VLAN 1, loop guard does not trigger and the core can recognize the link to the access switch as an MST/PVST boundary. Then do the same thing with the other core.
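Both options can be sketched in IOS terms as follows. The interface names and ranges are placeholders and have to be adapted to your hardware; treat this as an outline of the steps above, not a tested change script.

```
! Option 1, on every switch with loop guard on its uplinks, before migrating
configure terminal
interface GigabitEthernet0/1
 no spanning-tree guard loop
 exit
! If loop guard was enabled globally instead of per interface:
no spanning-tree loopguard default
end
! ... migrate all switches to MST, then re-enable:
configure terminal
interface GigabitEthernet0/1
 spanning-tree guard loop
end
```

```
! Option 2, on the first core switch, via console or a routed port
configure terminal
interface range GigabitEthernet1/1 - 24
 shutdown
 exit
! Wait until the other core is root for all VLANs on all access switches
spanning-tree mode mst
interface range GigabitEthernet1/1 - 24
 no shutdown
end
```

In both cases I would check 'show spanning-tree inconsistentports' on the access switches before moving on to the next switch.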
Spanning-tree migrations almost never work out as expected. But I hope this article will someday help someone avoid the trap I fell into last week.