Virtual Link Aggregation Control Protocol (VLACP) is an extension of the Link Aggregation Control Protocol (LACP) developed by Nortel to detect end-to-end failure over an Ethernet network. We’ve been deploying VLACP within our network for the past year with great success. We were eager to deploy VLACP because the Gigabit Ethernet fiber ports (GBIC) on the Nortel Ethernet Switch 470 do not support autonegotiation and must be hard set to 1000/Full Duplex when connecting to a Nortel Ethernet Routing Switch 8600. Without autonegotiation there is no mechanism to provide link failure notification (RFI, FEFI) on the specific interface. A problem arises if a GBIC malfunctions or a single fiber strand breaks, leaving one side of the link up and the other side down. VLACP mitigates this problem by providing a mechanism to detect the path failure, and it can also be applied to provide end-to-end failure notification over a telco carrier network.
Here’s what Nortel has to say in their document, “Link Aggregation Control Protocol (LACP) 802.3ad and VLACP Technical Configuration Guide”, dated August 2007:
Virtual LACP (VLACP) is an extension to LACP, used to detect end-to-end failure. VLACP takes the point-to-point hello mechanism of LACP and uses it to periodically send hello packets to ensure end-to-end reachability and provide failure detection (across any L2 domain). When Hello packets are not received, VLACP transitions to a failure state and the port will be brought down. The benefit of this over LACP is that VLACP timers can be reduced to 400 milliseconds between a pair of ERS8600 switches. This will allow for approximately one second failure detection and switchover. Note that the lowest VLACP timer on an ES460/470 is 500ms.
VLACP can also be used with Nortel’s proprietary aggregation mechanism (MLT) to complement its capabilities and provide quick failure detection. VLACP is recommended for all SMLT access links when the links are configured as MLT to ensure both end devices are able to communicate. By using VLACP over Single-Port SMLT, enhanced failure detection is extended beyond the limits of the number of SMLT or LACP instances that can be created on the ERS8600. VLACP can also be used as a loop prevention mechanism in SMLT configurations and should be used when setting up the IST. It also protects against CPU failures by causing traffic to be switched or rerouted to the SMLT peer in the case the CPU fails or gets hung up. Please refer to the Technical Configuration Guide for Switch Clustering using Split-Multilink Trunking (SMLT) with ERS8600 for more details.
NOTE: With regard to the ERS8600, although either the CLI or JDM interface allows you to configure the short timers to less than 400ms, Nortel does not support this configuration unless the ERS8600 is equipped with the SuperMezz daughter module for the 8692SF. The SuperMezz allows for very quick sub-100ms failure detection.
Although functions such as Remote fault indication (RFI) or Far-end fault indication (FEFI) can be used to indicate link failure, there are some limitations with these mechanisms. The first limitation is that both of these mechanisms terminate at the next Ethernet hop. Hence, failures cannot be detected on an end-to-end basis over multiple hops such as LAN Extension services. The second limitation is that both of these mechanisms require Auto-Negotiation to be enabled on the Ethernet interface. Hence, if an Ethernet interface does not support Auto-Negotiation, neither of these mechanisms can be used. The third limitation is that if an Ethernet interface should fail and still provide a transmit signal, neither RFI nor FEFI will be able to detect a failure. Hence, the far-end interface will still think the link is up and continue to transmit traffic. VLACP will only work for port-to-port applications when there is a guarantee for a logical port-to-port match. It will not work in a port-to-multi-port scenario where there is no guarantee for a point-to-point match.
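To make the hello-based idea concrete, here is a minimal Python sketch (my own illustration, not taken from the Nortel guide and not how the switch firmware is written). It assumes a hypothetical `HelloMonitor` that declares the path down once no hello has arrived within a timeout, regardless of what the physical link reports. The 500ms interval and 1500ms timeout are example values only.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class HelloMonitor:
    """Hypothetical per-port monitor: the path is up only while hellos keep arriving."""
    timeout_ms: int                    # assumed: max silence tolerated before declaring failure
    last_rx_ms: Optional[int] = None   # time of the most recent hello, if any

    def hello_received(self, now_ms: int) -> None:
        # Record the arrival time of a hello from the far end.
        self.last_rx_ms = now_ms

    def path_up(self, now_ms: int) -> bool:
        # Physical link state is deliberately not consulted here.
        if self.last_rx_ms is None:
            return False
        return (now_ms - self.last_rx_ms) <= self.timeout_ms


def simulate() -> List[Tuple[int, bool]]:
    """Hellos arrive every 500 ms until t=2000 ms, then the receive strand breaks."""
    monitor = HelloMonitor(timeout_ms=1500)
    states = []
    for now in range(0, 5001, 500):
        if now <= 2000:
            monitor.hello_received(now)
        states.append((now, monitor.path_up(now)))
    return states


if __name__ == "__main__":
    for now, up in simulate():
        print(f"t={now:>4} ms  path {'up' if up else 'DOWN'}")
```

In this toy run the last hello arrives at 2000ms, so the monitor still reports the path up at 3500ms and reports it down from 4000ms onward, even though nothing in the sketch ever saw a loss of light.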
NOTE: Please note that VLACP does not perform link aggregation. It is simply used to detect end-to-end link failures and can be enabled over single links or even MLT trunks. VLACP does not require LACP to be enabled; LACP and VLACP are independent features.
NOTE: When configuring VLACP, both ends of the link must be configured with the same EtherType, Multicast MAC address, and timers. By default, the VLACP parameters across all ES and ERS switches are the same with the exception of the FastPeriodicTimer, which is set to 200ms on the ERS8600 and 500ms on all other switches. When connecting, for example, an ERS8600 to an ERS5500, the recommendation is to use 500ms FastPeriodicTimers with ShortTimeout in order to achieve fast failover. Also, when using the ES460/470 in the 3.6.x software release, the VLACP EtherType must be configured with a different value on each MLT link. The EtherType must match the EtherType value at the far end of the MLT link.
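As an aside of my own (not part of the Nortel guide), the matching requirement is easy to check before you touch the switches. The Python sketch below compares two hypothetical settings records and lists the fields that differ. The class, the field names, and the EtherType and MAC values are placeholders I chose for illustration; only the 200ms and 500ms timer values come from the note above.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class VlacpPortSettings:
    """Hypothetical record of the VLACP settings on one end of a link."""
    ethertype: int          # placeholder value below, not a documented default
    multicast_mac: str      # placeholder value below, not a documented default
    fast_periodic_ms: int
    timeout: str            # "short" or "long"


def _normalize(value):
    # Compare strings case-insensitively so 01:80:C2 equals 01:80:c2.
    return value.lower() if isinstance(value, str) else value


def mismatches(local: VlacpPortSettings, remote: VlacpPortSettings) -> List[str]:
    """Return the names of settings that differ between the two ends."""
    return [
        f.name
        for f in fields(local)
        if _normalize(getattr(local, f.name)) != _normalize(getattr(remote, f.name))
    ]


if __name__ == "__main__":
    ers8600 = VlacpPortSettings(0x8103, "01:80:C2:00:11:00", 200, "short")
    ers5500 = VlacpPortSettings(0x8103, "01:80:c2:00:11:00", 500, "short")
    print(mismatches(ers8600, ers5500))
```

With the timers left at 200ms on one side and 500ms on the other, the only field reported is `fast_periodic_ms`, which is exactly the mismatch the note warns about.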
NOTE: If VLACP is used with LACP, there is no difference in how VLACP and LACP bring down a port if no LACP or VLACP PDUs are received. VLACP will declare the VLACP status as down and will report the event in the log file whereas LACP will not synchronize, not activate Collecting and Distributing on this port, and not report a message in the log file. The end result is the same where the port will block traffic; the physical layer for this port will remain up. Although you can enable VLACP with LACP, there is no practical reason why you would do so.
Before VLACP, Nortel developed an interim solution called Single Fiber Fault Detection (SFFD), designed specifically to allow remote fault detection on Gigabit Ethernet fiber ports that did not support autonegotiation. Unfortunately we had some issues with SFFD and never really deployed the feature beyond our test lab environment.
Ethernet Routing Switch 5510
Here’s how you would configure VLACP on the MLT uplinks to an ERS 8600 Switch. You’ll need to connect to the 5510 switch and enter the “Command Line Interface” if you have the menu up.
5510> enable
5510# configure terminal
5510(config)# interface fastEthernet 47,48
5510(config-if)# vlacp port 47,48 timeout short
5510(config-if)# vlacp port 47,48 enable
5510(config-if)# exit
5510(config)# vlacp enable
5510(config)# exit
Ethernet Routing Switch 8600
Here’s how you would configure VLACP on the MLT uplinks to the ERS 5510 Switch above.
ERS-8610:6# config ethernet 1/1, 2/1 vlacp enable
ERS-8610:6# config ethernet 1/1, 2/1 vlacp timeout short
ERS-8610:6# config ethernet 1/1, 2/1 vlacp fast-periodic-time 500
ERS-8610:6# config vlacp enable
In this example we’re using ports 1/1 and 2/1 as the uplinks to ports 47 and 48 on the ERS 5510 respectively. The VLACP fast periodic timer on the ERS 8600 defaults to 200ms, so we need to set it to 500ms, the minimum possible on the ERS 5500 series switches, so that both ends match.
If the interface appears to be bouncing you should definitely check the timers.
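If you want a rough feel for what a given timer choice means, here is a small Python sketch. It rests on an assumption that is not spelled out in this post: that the time to declare a failure is roughly the fast periodic timer multiplied by a timeout scale (a retry count). The function name is my own, and the scale values are only examples; the comments below mention a scale of 5.

```python
def detection_time_ms(fast_periodic_ms: int, timeout_scale: int) -> int:
    """Approximate time to declare a failure, ASSUMING timeout = periodic timer x scale."""
    if fast_periodic_ms <= 0 or timeout_scale <= 0:
        raise ValueError("timer and scale must both be positive")
    return fast_periodic_ms * timeout_scale


if __name__ == "__main__":
    # Example combinations only; check your own switch configuration.
    for periodic, scale in [(400, 3), (500, 3), (500, 5)]:
        print(f"{periodic} ms x {scale} = {detection_time_ms(periodic, scale)} ms")
```

Under that assumption, 400ms with a scale of 3 gives 1200ms, which lines up with the "approximately one second" figure in the Nortel guide, while 500ms with a scale of 5 gives 2500ms. A larger scale trades slower detection for more tolerance of a busy CPU missing the odd hello.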
VLACP is great on paper, but it has some real-life shortcomings:
– to be efficient, you need to use “short timers” (<10s), but this mode is currently incompatible with HA-mode (high-availability with 2 CPU cards).
This limitation is apparently global to HA-mode (unless you use SuperMezz), which doesn’t support short timers on any protocol (there’s a few seconds of blackout when a CPU failover occurs).
– The VLACP implementation differs between the ERS 8600 and 5500. When you plug in an SMLT link, all traffic sent from the 5510 to the 8600 is dropped for the duration of the VLACP timer.
If you use the long timer, you lose packets for at least 10s…
Michael McNamara says
You’re touching on a few different limitations and design considerations in your comment.
In the example we are using “short timers” so yes I would agree with you that short timers should be used. You are correct that the default timers between the ERS 8600 and ERS 5500 series don’t match. The example provided specifically configures the VLACP timeout to 500ms.
I don’t actually use HA-mode on the ERS 8600 switch although I have many switches with dual 8692SFs that also have SuperMezz daughter cards installed on them. There are quite a few limitations with the HA-mode configuration including a number of unsupported protocols, one being BGP.
Thanks for the post.
First of all, congratulations on your blog, its content is very useful and interesting. I use VLACP between a Nortel 8610 and a Nortel 5520, and there is an issue that I can’t explain. Despite the fact that I have followed Nortel’s recommendation regarding the latest workaround between these two devices (especially the timeout scale to modify), my VLACP sometimes seems to “link flap”. I run SMLT over this layer and one of the two ports involved is put down by VLACP without any reason (no link or transceiver failure …)
I will send you the firmware versions used on each device later (from memory, 5.1 for the 5520 and an old release for the 8610 like 4.1.2 I think), but have you already faced this kind of issue?
Many thanks for your help and keep your blog as it is :-)
Michael McNamara says
Thanks for the thoughtful words!
There were a lot of known issues with VLACP flapping back in the early software code for the ERS 8600 and ERS 5500 series. While the software you are running on the ERS 5520 is fairly reliable (5.1), I would suggest that you upgrade the software on your ERS 8600 to 188.8.131.52. VLACP can also be impacted by how busy your CPU is (and which model of CPU you have) and how many software features you have enabled on your ERS 8600 switch.
I’m going to assume that you have configured VLACP with short timers (value of 500ms) and a retry of 5 which Nortel now recommends.
Many thanks for your answer ! You’re right, my VLACP configuration is exactly what Nortel recommends (500 ms short timers and a timeout scale fixed at 5).
From my point of view, the software version on my 8600 is obsolete, and I need to upgrade it as soon as possible. What do you think about the 5.x versions recently released for the 8600?
Regarding the CPU utilization, I will monitor it to check whether there is any link with these VLACP flaps.
Have a nice day !
Michael McNamara says
I would strongly advise you to review the release notes of 184.108.40.206 and/or 5.1.1 depending on which version you choose to upgrade to. I will point out that 5.1 and 5.1.1 were pulled from Nortel’s website and replaced by 220.127.116.11 (I suspect there was a pretty significant problem, although the release notes didn’t really give any hints).
If you are seriously contemplating 5.x software you should make sure that you don’t have any unsupported hardware in your switch like non-E cards or 8690SFs. I should also warn you that 4.1.8.x and 5.x work the CPUs much harder than previous software releases, so you can expect to see an increase in CPU utilization.
If you’re using VRRP you’ll also need to make sure that you have unique VRRP IDs across the entire switch.
Rodgers Jeffrey says
I was researching information on Cisco L3/L2 resiliency compared to SMLT and came across your blog. Nice job.
I am an ex-Nortel IT employee with SMLT experience. We replaced our Accelar 1200’s in the Alpharetta campus with one ERS8600 (then Passport) and later added a second, creating an SMLT/IST access L3/L2 core with MLT’s to access L2 (BS460) in the closets. It was great! At first.
Then we had the problem with single fiber failure. Everything worked for about 24-48 hours while a storm built up to the point it shut down both switches. I enjoyed driving into the office at 1 AM to replace a GBIC or pull a fiber patch and reboot the 8600’s. (not!) SFFD fixed the problem and allowed me to sleep better. Network services didn’t implement VLACP until after I got laid off in 2007.
SMLT was the greatest thing since sliced bread. We could upgrade one switch and reboot without missing a lick. I just hope Avaya appreciates the technology they got from Nortel and develops it further.
Michael McNamara says
Thanks for taking the time to comment.
We’ve been using SMLT for quite some time now and it’s worked extremely well. Unfortunately due to Nortel’s situation I’m now working with Cisco Nexus 7010 switches and vPC (Virtual Port Channel).
Did you manage to enable VLACP on Cisco switches? Do you mind sharing your experience?
By the way, I’m going to do a lab test on VLACP features and my setup looks something like this:-
Site A: Cisco Cat6509 Switch -> Safenet Layer-2 Ethernet Encryptor -> BTI Switch_A -> WAN link to Site B
Site B: Cisco Cat6509 Switch -> Safenet Layer-2 Ethernet Encryptor -> BTI Switch_B -> WAN link to Site A
Should I enable/configure VLACP in all switches residing at Sites A & B? With the encryptor in place, will it block VLACP updates between both sites?
Michael McNamara says
VLACP is a Nortel/Avaya proprietary protocol. It won’t work with Cisco equipment. VLACP will only work between Nortel/Avaya switches.
I have connected switch port 32 to patch panel 120 end A (switch room) and patch panel 120 end B (work place) to connect a hub. Configuration below:
(config)t#interface fastethernet 32
But auto-negotiation is showing as disabled. Please help me with how to enable auto-negotiation.
Michael McNamara says
Your question isn’t really on-topic with respect to VLACP. Please post any additional questions over on the discussion forums.
When you set the speed to 100 you disable auto-negotiation. If you are trying to set the speed to auto and the duplex to full you need to research CANA (Customizable Auto-negotiation Advertisements) which is only available in certain switch models running specific software releases.
If you want true auto-negotiation, then set the speed to auto and the duplex to auto.
I have been using the forum for guides and advice, and it’s proved very valuable.
I have 3 stacks of 5520 & 5510. Would the timeout-scale 5 & the fast-periodic-time 500 be different as there is no 8600 in our setup? There is one stack with 2 x 5520 and the other 2 stacks have 3 switches in (1 x 5520 Base and 2 x 5510). I have set ports 1/47,2/47 on the 5520 stack to go to one of the 3 switch stacks 1/47,3/47 & then on the other stack it will be 1/48,3/48.
I assume this will work? Do I then create an MLT for the relevant ports (2 MLT’s that contain the ports in the VLACP)?
Michael McNamara says
So VLACP is just a Layer 2 heartbeat to detect path failures… you should leave VLACP out of the mix until you have the MLT/SMLT configuration setup and working properly.