What is VLACP and why do I need it? That’s a question I see asked quite frequently, and not all the answers are correct. I’m hoping to answer it once and for all by explaining the rationale behind the protocol and describing some of the issues and difficulties that VLACP helps address.
What is VLACP?
VLACP is an Avaya proprietary protocol used to detect end-to-end failures. It takes the point-to-point hello mechanism of LACP and uses it to periodically send hello packets that verify end-to-end reachability and provide failure detection across a Layer 2 network. When hello packets stop arriving, VLACP transitions to a failure state and brings the port down.
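To make the hello/timeout idea concrete, here is a minimal sketch in Python. It is a toy model of a hello-based liveness check, not the real VLACP state machine; the class name `HelloMonitor` and its fields are hypothetical and chosen only for illustration. It assumes the port is declared down once no hello has been seen for the periodic interval multiplied by a timeout scale.

```python
from dataclasses import dataclass


@dataclass
class HelloMonitor:
    """Toy hello-based liveness check (illustrative only, not the VLACP spec)."""
    periodic_ms: int      # interval at which the far end is expected to send hellos
    timeout_scale: int    # number of intervals that may pass without a hello
    last_rx_ms: int = 0   # timestamp of the most recent hello received

    def receive_hello(self, now_ms: int) -> None:
        # Record the arrival time of a hello from the far end.
        self.last_rx_ms = now_ms

    def is_up(self, now_ms: int) -> bool:
        # The port stays up while the silence is shorter than the timeout.
        silence_ms = now_ms - self.last_rx_ms
        return silence_ms < self.periodic_ms * self.timeout_scale


monitor = HelloMonitor(periodic_ms=500, timeout_scale=5)
for t in range(0, 3000, 500):      # hellos arrive at 0, 500, ..., 2500 ms
    monitor.receive_hello(t)

print(monitor.is_up(4000))   # 1500 ms of silence
print(monitor.is_up(5000))   # 2500 ms of silence
```

With these values the first check still reports the port as up, while the second has reached the 2500 ms timeout and reports it as down.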
Why use VLACP?
We know that auto-negotiation supports both RFI (Remote Fault Indication) and FEFI (Far End Fault Indication), so if our interfaces are configured for auto-negotiation those mechanisms will protect us by detecting a link failure. If auto-negotiation is not available (for example, on Ethernet Switch 470 GBICs), then VLACP can help detect link failures and prevent frames from being transmitted into oblivion when the far end is down. If we are using a LAN extension or Q-in-Q solution, link status only proves that we have connectivity to the edge of the carrier network. VLACP PDUs flow across the entire carrier network, verifying that we have connectivity end to end across the whole Layer 2 path.
Why is VLACP so important in an IST/SMLT configuration?
Well, we know that failure detection is a big issue and something that a lot of vendors work towards refining. You have a link failure or core failure, and the network quickly converges and re-routes traffic so there’s little or no disruption. Let me ask you: what happens when that link or core recovers?
Let me take a core switch failure as an example, which should easily make the point. When that switch starts to recover, we don’t want it to immediately start accepting packets from the edge/distribution switches in the network until it has re-established the IST link. Let’s say we also want a complete routing table populated before we start accepting packets. We need a way to bring up the port to the edge/distribution switch, but we also need to let that switch know that we’re not yet ready to bridge/route traffic. VLACP solves that problem by allowing the link to establish and sending VLACP PDUs to the far end switch telling it not to start forwarding frames until we’re ready to receive them.
Avaya has been working to refine how VLACP works and released a significant improvement in March 2011, which is available in the following software releases:
- Ethernet Routing Switch 8600 v5.1.4 (or later)
- Ethernet Routing Switch 5000 Series v6.1.6 (or later)
- Ethernet Routing Switch 4000 Series v5.5 (or later)
Here’s a blurb on the change from the ERS 8600 release notes:
VLACP HOLD Enhancement
During SMLT node failure scenarios, traffic loss may be observed in certain scaled SMLT configurations with hundreds of SLTs, hundreds of ports and tens of VLANs. The root cause for the traffic loss was that the ERS8600 ports would come up prematurely at the physical layer causing the remote end to start sending traffic toward the ERS8600 that just came up. On the ERS8600 that just rebooted, the communication between the line cards and the CP may take several seconds in such scaled configurations. This resulted in black-holing the traffic arriving on such ports which were physically up but all operational configuration was not yet performed on those ports by the CP. The VLACP SUBTYPE HOLD feature introduces a new VLACP PDU with a new subtype HOLD to help reduce traffic loss in such scenarios.
The goal of this new implementation is to “hold down” all VLACP enabled links for a specific period of time after a reboot. This prevents remote VLACP enabled devices that understand the new VLACP HOLD PDU from sending data to the ERS8600. This will ensure that all VLACP enabled ports on the ERS8600 have had sufficient time to come up with all operational configuration and are ready to receive and forward the ingress traffic.
ERS8600 switches with 5.1.4.0 release are capable of both sending and receiving VLACP HOLD PDUs. Future code revisions of the Baystack switch family will support receipt and processing of VLACP HOLD PDUs, but will not generate them. Please refer to the applicable product release notes for information regarding product specific software levels required for support of this VLACP enhancement. VLACP is an Avaya proprietary protocol and hence this enhancement is not applicable when connecting to switches from other vendors.
By default, the VLACP HOLD feature will be disabled. The feature is enabled by configuring a positive value for VLACP HOLD Time. The VLACP Hold Time value configured should be selected based on the specific recovery implementation requirements, size and recovery characteristics for your network implementation.
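The hold behaviour described in the release notes can be pictured with a small Python sketch. This is only a toy model under stated assumptions: the function name `peer_may_forward` is hypothetical, the hold time is assumed to be in seconds, and the rebooted switch is assumed to send HOLD PDUs until the hold time expires while a HOLD-aware peer keeps traffic off the link for as long as it sees them.

```python
def peer_may_forward(seconds_since_reboot: float, hold_time_s: int) -> bool:
    """Return True if the remote switch may send traffic to the rebooted switch.

    Assumption: HOLD PDUs are sent until hold_time_s has elapsed, and a
    HOLD-aware peer does not forward while it receives them.
    A hold time of 0 models the default, where the feature is disabled.
    """
    if hold_time_s <= 0:
        return True   # feature disabled: nothing holds the peer back
    return seconds_since_reboot >= hold_time_s


for t in (0, 10, 19, 20, 30):
    print(t, peer_may_forward(t, hold_time_s=20))
```

In this model a larger hold time gives the rebooted switch longer to finish programming its ports, at the cost of keeping the link out of service for that long after a reboot.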
How do you configure VLACP?
Since VLACP is an Avaya (formerly Nortel) proprietary protocol, you can only configure VLACP on a point-to-point link between two Avaya switches. In a scenario where you are utilizing a carrier TLS (Transparent LAN Services), the two switches at the ends of the network need to be Avaya, but the switches in the carrier network can be from any manufacturer so long as they forward the Layer 2 VLACP PDUs through the network.
Here’s a quick example of how to enable VLACP on a DMLT (Distributed MultiLink Trunk) between an edge Ethernet Routing Switch 5000 and an Ethernet Routing Switch 8600:
Ethernet Routing Switch 5520
interface fastEthernet 1/48,2/48
vlacp port 1/48,2/48 timeout short
vlacp port 1/48,2/48 timeout-scale 5
vlacp port 1/48,2/48 fast-periodic-time 500
vlacp port 1/48,2/48 enable
exit
vlacp macaddress 01:80:c2:00:00:0f
vlacp hold_time 20
vlacp enable
Ethernet Routing Switch 8600 (CLI)
config ethernet 3/1,4/1 vlacp fast-periodic-time 500
config ethernet 3/1,4/1 vlacp timeout short
config ethernet 3/1,4/1 vlacp timeout-scale 5
config ethernet 3/1,4/1 vlacp macaddress 01:80:c2:00:00:0f
config ethernet 3/1,4/1 vlacp enable
config vlacp hold-time 20
config vlacp enable
Is VLACP right for me?
If you are running a pair of Avaya Ethernet Routing Switch 5000 or 8600/8800s in a switch cluster, then you should most definitely be utilizing VLACP. If you are running a multi-vendor network, then VLACP might not be possible since it’s an Avaya proprietary protocol. If you are running a simple flat network with MLT or DMLT links between Ethernet Routing Switch 4000 and 5000 series switches, then VLACP might not provide a whole lot of value, assuming you are running auto-negotiation and have RFI and FEFI capabilities.
There have been a number of issues with VLACP over the past few years, but a great many of those have been resolved in later software releases. If you have hundreds of interfaces running VLACP, you can run into scaling issues depending on the CPU/SF that you have in your Ethernet Routing Switch 8600. If you stick with the recommended short timer value of 500 ms with a timeout scale of 5 retries, you shouldn’t have any issues. Yes, that equates to 2.5 seconds before an interface is marked down by VLACP, but that’s a value that most peripherals should be able to tolerate, including Avaya’s IP telephony. You can be more aggressive with the retry count, but you might end up missing VLACP polls and have interfaces marked down when they actually aren’t down.
I’ve been running VLACP at every site I have for the past 3 years and have had very few problems. It’s actually saved me on a number of occasions: the Ethernet Switch 470-48Ts don’t support auto-negotiation on the GBICs, and VLACP has been able to detect the problem and mark the link as down, allowing the traffic to flow over the remaining link(s) with no interruption to user traffic.
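The timer trade-off above is simple arithmetic, sketched here in Python. The helper names `detection_time_ms` and `survives_stall` are hypothetical, and the model assumes a port is marked down once the far end has been silent for the fast periodic time multiplied by the timeout scale. The 2000 ms stall is an invented figure used only to show how a busy CPU could trip an aggressive setting.

```python
def detection_time_ms(fast_periodic_ms: int, timeout_scale: int) -> int:
    """Time of silence after which the port is marked down (assumed model)."""
    return fast_periodic_ms * timeout_scale


def survives_stall(stall_ms: int, fast_periodic_ms: int, timeout_scale: int) -> bool:
    """True if a pause in PDUs of stall_ms would NOT bring the port down."""
    return stall_ms < detection_time_ms(fast_periodic_ms, timeout_scale)


# Compare an aggressive timeout scale with the recommended one,
# using a hypothetical 2000 ms gap in PDUs while the link itself is healthy.
for scale in (3, 5):
    timeout = detection_time_ms(500, scale)
    print(scale, timeout, survives_stall(2000, 500, scale))
```

With 500 ms timers, a scale of 5 gives the 2.5 second detection time mentioned above and rides out the hypothetical 2 second gap, whereas a scale of 3 detects failures in 1.5 seconds but would mark a healthy link down during the same gap.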
Are you running VLACP?
Cheers!
References: