This week I took on the task of upgrading a pair of Nortel Ethernet Routing Switch 8600s from 4.1.8.0 to 5.1.1.1 software. I used the opportunity to ‘test’ out the upgrade process and procedure for a much larger site that I will be upgrading next week.
The site that I upgraded has two ERS 8600 switches (dual 8692SF w/Mezz) running 15 VLANs, OSPF, VRRP, IST/SMLT, 15 edge ERS 5520 switches, 150 IP phones, 300 personal computers, printers, etc. It’s a fairly small site but it’s all IP telephony with a CS1000B.
The site that I will be upgrading next week has two ERS 8600 switches (dual 8692SF w/Mezz) running 80 VLANs, OSPF, VRRP, BGP, PIM-SM, IST/SMLT, 36 edge switches/stacks, 50 IP phones, 4000+ personal computers, printers, etc. This is a much larger site, running a lot more VLANs along with BGP and PIM-SM.
With all the issues I’ve run into these past 12 months regarding the ERS 8600 switch I must admit that I was pleasantly surprised. Nothing blew up, nothing broke (knock on wood), and everything just seemed to work – there’s a surprise. I thought I would share the steps I took and the process I used. I will readily admit that I don’t have any RS blades in my environment so customers with RS blades might want to think twice before upgrading to 5.1.1.1 software (this is based on my discussions with other Nortel customers and their experience with 5.1.1.1 and RS blades).
I’ll also need to follow up late next week and let everyone know how the larger upgrade went.
How did I do it?
Well, I did it remotely but with access to the serial ports of all four CPUs (8692SF w/Mezz). I had to make a few configuration changes before I performed the upgrade; these are outlined in the release notes but I’ll touch on them here.
– Enable System Monitor (JDM, Edit Chassis -> System Flags and then look under “System Monitoring”)
– Reconfigure the SNMPv3 trap hosts with a retry value of 0 (read the release notes!)
I also took the opportunity to enable Jumbo Frame support with the ERS 8600 switch because that configuration change requires a reboot to take effect. (config sys set mtu 9600)
With those configuration changes saved I set out to copy the software up to the primary CPUs and then across to the standby CPUs. I took the extra precaution of copying all the software and configuration files from the FLASH to the PCMCIA card just in case something came up and I needed them there.
I started the process by upgrading the standby CPU (8692SFw/Mezz) on B switch first. From the primary CPU on the B switch I issued the following commands;
config bootconfig choice primary image-file /flash/p80a5111.img
config bootconfig choice secondary image-file /flash/p80a4180.img
save bootconfig
save config
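The file names in the commands above appear to follow a pattern: a p80 prefix, a letter for the image type (a for the runtime image, b for the boot monitor), and the release digits with the dots removed. Here is a minimal Python sketch that builds the same four lines for any pair of releases. It assumes the pattern holds, which I've inferred only from the file names in this post and not from Nortel documentation. The function names are my own.

```python
def image_name(kind: str, release: str, ext: str = "img") -> str:
    """Build a file name such as p80a5111.img from a release such as 5.1.1.1.

    Assumption: the name is 'p80' + type letter + release digits without dots.
    This matches the file names seen in this post; it is not verified
    against other releases.
    """
    digits = release.replace(".", "")
    return f"p80{kind}{digits}.{ext}"


def bootconfig_commands(new_release: str, old_release: str) -> list[str]:
    """Return CLI lines that point the primary boot choice at the new image,
    keep the old image as the secondary choice, and save both configs."""
    new_image = image_name("a", new_release)
    old_image = image_name("a", old_release)
    return [
        f"config bootconfig choice primary image-file /flash/{new_image}",
        f"config bootconfig choice secondary image-file /flash/{old_image}",
        "save bootconfig",
        "save config",
    ]


if __name__ == "__main__":
    for line in bootconfig_commands("5.1.1.1", "4.1.8.0"):
        print(line)
```

Keeping the old image as the secondary choice is what leaves a fallback in place if the new image fails to load.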
With the configuration files changed and saved I connected to the standby CPU (peer telnet) and issued the command to upgrade the boot software on the standby CPU;
boot /flash/p80b5111.img
I watched the console as the CPU restarted and upgraded the boot flash;
################ 8K CPU BOOT FLASH Update ################
File obj-boot/p80b5111-mpc740.romH found in loaded image
File size: 786624 bytes
Number of flash sectors to be programmed: 7
WARNING: You are about to re-program your Boot Monitor FLASH image.
Do NOT turn off power or press reset until this procedure is completed.
Otherwise the card may be permanently damaged!!!
Press <Return> to stop monitor upgrade....
Erased 7 sectors of bootflash
Programmed BOOTFLASH Image
Verifying new BOOTFLASH Image
786624 matches, 0 mismatches
Updating Fileheader
Erased 1 sectors of bootflash
Fileheader update complete
Verifying new Fileheader
512 matches, 0 mismatches
Update complete!
Press return to reboot
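If you capture the serial console to a file, the two verification lines in that output are easy to check automatically. This is a rough Python sketch, assuming the wording of the "matches, mismatches" and "Update complete!" lines stays exactly as shown above. The function name and the sample text are mine, with the sample copied from the output in this post.

```python
import re


def bootflash_update_ok(console_text: str) -> bool:
    """Return True only if every 'N matches, M mismatches' line reports
    zero mismatches and the 'Update complete!' marker is present."""
    results = re.findall(r"(\d+) matches, (\d+) mismatches", console_text)
    if not results:
        # No verification lines at all: treat as not verified.
        return False
    no_mismatches = all(int(mismatches) == 0 for _, mismatches in results)
    return no_mismatches and "Update complete!" in console_text


# Excerpt of the boot flash update output shown in this post.
SAMPLE = """\
Verifying new BOOTFLASH Image
786624 matches, 0 mismatches
Updating Fileheader
Verifying new Fileheader
512 matches, 0 mismatches
Update complete!
"""

if __name__ == "__main__":
    print(bootflash_update_ok(SAMPLE))
```

The check is deliberately strict: a capture with no verification lines counts as a failure rather than a pass.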
The CPU then started to load the 5.1.1.1 software for the first time;
Copyright (c) 2009 Nortel, Inc.
CPU Slot 5: PPC 745 Map B
Version: 5.1.1.1
Creation Time: Sep 30 2009, 15:13:36
Hardware Time: DEC 11 2009, 02:16:31 UTC
Memory Size: 0x10000000
Start Type: cold
SMART MODULAR TECH SMART 221 CF
The /pcmcia device mounted successfully, but it appears to have been formatted with pre-Release5.1 file system code. Nortel recommends backing up the files from /pcmcia, and executing dos-format /pcmcia to bring the file system on the /pcmcia device to the latest ERS8600 baseline.
open_file:can't open "/pcmcia/pcmboot.cfg" 0x380003 S_dosFsLib_FILE_NOT_FOUND
/flash/ - Volume is OK
The /flash device mounted successfully, but it appears to have been formatted with pre-Release5.1 file system code. Nortel recommends backing up the files from /flash, and executing dos-format /flash to bring the file system on the /flash device to the latest ERS8600 baseline.
Loaded boot configuration from file /flash/boot.cfg
Attaching network interface lo0... done.
Press <Return> to stop auto-boot...
Loading /flash/p80a5111.img ... 12606133 to 43734492 (43734492)
Starting at 0x1000000...
SMART MODULAR TECH SMART 221 CF
Booting PMC280 Mezz HW. Please wait.....
The BootCode address is 0xe000100 3303 .
Mezz taking over console and modem.....
Mezz CPU Booted successfully
Initializing backplane net with anchor at 0x4100... done.
Backplane anchor at 0x4100... ..
Mounting /flash: .done.
License File <license.dat> does not exist
License File <license.dat> does not exist
License File <license.dat> does not exist
CPU6 [12/10/09 21:18:23] SW INFO Trial Period will expire in 60 days
Ethernet Routing Switch 8600 System Software Release 5.1.1.1
Copyright (c) 1996-2009 Nortel, Inc.
File does not exist
Critical Log file created
CPU6 [12/10/09 21:12:11] SW INFO System boot
CPU6 [12/10/09 21:12:11] SW INFO ERS System Software Release 5.1.1.1
CPU6 [12/10/09 21:12:11] SW INFO Waiting for cpu in slot 5 ... 2 seconds
CPU6 [12/10/09 21:12:13] SW INFO CPU card entering warm-standby mode...
CPU6 [12/10/09 21:12:16] SW INFO Loading configuration from /flash/config.cfg
**************************************************
* Copyright (c) 2009 Nortel, Inc. *
* All Rights Reserved *
* Ethernet Routing Switch 8010 *
* Software Release 5.1.1.1 *
**************************************************
With that the standby CPU was upgraded to 5.1.1.1 software and I was set to upgrade the primary CPU which would cause the switch to fail over to the standby CPU. When I issued the boot /flash/p80b5111.img command on the primary CPU the standby CPU (slot 5) became the master and I observed the following on the console;
CPU5 [12/10/09 21:20:23] HW INFO Stand-by CPU in slot # 5 becoming master...
CPU5 [12/10/09 21:20:28] MPLS INFO All MPLS components are up and active
CPU5 [12/10/09 21:20:28] HW INFO Card inserted: Slot=5 Type=8692SF
CPU5 [12/10/09 21:20:29] HW INFO Card inserted: Slot=6 Type=8692SF
CPU5 [12/10/09 21:20:29] SW INFO R-Module inserted: Slot=1 Type=8630GBR, waiting to bootup...
CPU5 [12/10/09 21:20:29] SW INFO R-Module inserted: Slot=2 Type=8648GTR, waiting to bootup...
CPU5 [12/10/09 21:20:29] SW INFO R-Module inserted: Slot=3 Type=8683XLR, waiting to bootup...
CPU5 [12/10/09 21:20:29] HW INFO Initializing 8692SF in slot #5 ...
CPU5 [12/10/09 21:20:31] HW INFO Initializing 8692SF in slot #6 ...
CPU5 [12/10/09 21:20:37] SW INFO Slot 1: Loading /flash/p80j5111.dld
CPU5 [12/10/09 21:20:37] SW INFO Slot 2: Loading /flash/p80j5111.dld
CPU5 [12/10/09 21:20:49] SW INFO Slot 3: Loading /flash/p80j5111.dld
CPU5 [12/10/09 21:21:16] SW INFO Slot 1: 8630GBR Initializing. Do not remove board.
CPU5 [12/10/09 21:21:16] SW INFO Slot 2: 8648GTR Initializing. Do not remove board.
CPU5 [12/10/09 21:21:21] SW INFO Slot 1: 8630GBR Initialization completed.
CPU5 [12/10/09 21:21:21] SW INFO Slot 2: 8648GTR Initialization completed.
CPU5 [12/10/09 21:21:22] SW INFO Slot 1: Restart new image version 5.1.1.1
CPU5 [12/10/09 21:21:22] SW INFO Slot 2: Restart new image version 5.1.1.1
CPU5 [12/10/09 21:21:30] SW INFO Slot 3: 8683XLR Initializing. Do not remove board.
CPU5 [12/10/09 21:21:31] SW INFO Slot 1: Loading /flash/p80j5111.dld
CPU5 [12/10/09 21:21:31] SW INFO Slot 2: Loading /flash/p80j5111.dld
CPU5 [12/10/09 21:21:36] SW INFO Slot 3: 8683XLR Initialization completed.
CPU5 [12/10/09 21:21:36] SW INFO Slot 3: Restart new image version 5.1.1.1
CPU5 [12/10/09 21:21:45] SW INFO Slot 3: Loading /flash/p80j5111.dld
CPU5 [12/10/09 21:21:49] SW INFO slot 2 found NP heartbeat - R-Module is online
CPU5 [12/10/09 21:21:51] SW INFO slot 1 found NP heartbeat - R-Module is online
CPU5 [12/10/09 21:22:12] SW INFO slot 3 found NP heartbeat - R-Module is online
CPU5 [12/10/09 21:22:12] HW INFO Initializing 8630GBR in slot #1 ...
SNMP-v3 VACM configuration is currently using default parameters. These parameters should be changed for maximum security.
SNMP-v3 Having more than one entry in Group-access table for the same group-name with different security levels can cause a security hole
WARNING: THE ALLOWED LOG FILE SIZE HAS EXCEEDED CONFIGURATION LIMITS. THE FILE SIZE IS CURRENTLY 1071131 BYTES!!!!
CPU5 [12/10/09 21:22:12] HW INFO Initializing 8648GTR in slot #2 ...
**************************************************
* Copyright (c) 2009 Nortel, Inc. *
* All Rights Reserved *
* Ethernet Routing Switch 8010 *
* Software Release 5.1.1.1 *
**************************************************
Login:
CPU5 [12/10/09 21:22:13] HW INFO Initializing 8683XLR in slot #3 ...
CPU5 [12/10/09 21:22:14] SW INFO Loading configuration from /flash/config.cfg
CPU5 [12/10/09 21:22:15] SW INFO NTP Enabled
CPU5 [12/10/09 21:22:15] SW INFO The system is ready
CPU5 [12/10/09 21:22:15] SNMP INFO Booted with PRIMARY boot image source - /flash/p80a5111.img
CPU5 [12/10/09 21:22:17] SW INFO All the configured hosts not reachable
CPU5 [12/10/09 21:22:17] SW INFO A new log file = /pcmcia/ccf00005.001 is created
CPU5 [12/10/09 21:22:17] SW INFO PCMCIA card detected in Master CPU "sw-8600-ccr-a.site1.acme.org" slot 5, Chassis S/N SSPN6C0ANC
CPU5 [12/10/09 21:22:17] SNMP INFO Chassis with Power Supply redundancy
CPU5 [12/10/09 21:22:17] SNMP INFO Fan Up(FanId=1, OperStatus=2)
CPU5 [12/10/09 21:22:17] IP INFO the VRF OSPF Md5 key file 1 does not exist
CPU5 [12/10/09 21:22:17] SNMP INFO Fan Up(FanId=2, OperStatus=2)
CPU5 [12/10/09 21:23:03] SNMP INFO CPU switch over, stand-by CPU becoming master
CPU5 [12/10/09 21:23:03] SNMP INFO Sending Warm-Start Trap
CPU5 [12/10/09 21:23:03] SNMP INFO CPU switch over, stand-by CPU in slot # 5 became master
CPU5 [12/10/09 21:23:35] SNMP INFO Communication established with backup CPU
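To put a number on how long the failover took, the timestamps in the console log above can be parsed and subtracted. This is a small Python sketch, assuming each log entry sits on its own line in the layout "CPUn [MM/DD/YY HH:MM:SS] MODULE SEVERITY message". I inferred that layout, including the month/day/year order, from the output in this post rather than from any Nortel specification. The function names are hypothetical.

```python
import re
from datetime import datetime

# Layout inferred from the console output in this post.
LOG_RE = re.compile(
    r"^(CPU\d+) \[(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})\] (\S+) (\S+) (.*)$"
)


def parse_line(line: str):
    """Return (timestamp, cpu, module, severity, message), or None for
    lines that are not log entries (banners, prompts, warnings)."""
    match = LOG_RE.match(line.strip())
    if match is None:
        return None
    cpu, stamp, module, severity, message = match.groups()
    when = datetime.strptime(stamp, "%m/%d/%y %H:%M:%S")
    return when, cpu, module, severity, message


def seconds_between(lines, start_text: str, end_text: str):
    """Seconds from the first entry containing start_text to the next
    entry containing end_text; None if either one is missing."""
    start = None
    for line in lines:
        record = parse_line(line)
        if record is None:
            continue
        when, message = record[0], record[4]
        if start is None:
            if start_text in message:
                start = when
        elif end_text in message:
            return (when - start).total_seconds()
    return None


# Three lines copied from the log above.
SAMPLE = [
    "CPU5 [12/10/09 21:20:23] HW INFO Stand-by CPU in slot # 5 becoming master...",
    "Login:",
    "CPU5 [12/10/09 21:22:15] SW INFO The system is ready",
]

if __name__ == "__main__":
    print(seconds_between(SAMPLE, "becoming master", "The system is ready"))
```

For the two entries in the sample that works out to 112 seconds between the standby CPU starting to take over and the system reporting ready.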
I will comment that for the time that the standby CPU was running 5.1.1.1 software and the primary CPU was still running 4.1.8.0 software the console became very sluggish and unresponsive. The CPU utilization also surged to 97%. I suspect the CPUs didn’t like trying to communicate with each other being so far apart on software releases.
I didn’t need to do anything special while upgrading the switches to have the other switch in the cluster maintain the network. The IST link was re-established between the two upgrades, while B was running 5.1.1.1 and A was still running 4.1.8.0. I just repeated the steps above for the A switch and everything worked just fine.
I did go back and clean up the log files; you might have noticed the warning in there about the log file exceeding its configured size limit. I didn’t reformat the /flash or /pcmcia filesystems because I wanted the option to downgrade if necessary. I can reformat those filesystems at a later point in time if the software proves stable and reliable.
I’m impressed with 5.1.1.1 so far, let’s see how it stands the test of time.
Cheers!
Update: Thursday December 17, 2009
I completed the 5.1.1.1 upgrade of the larger site I referred to above on Wednesday morning and I’m happy to report that everything is working well. I did get an initial scare when one of the core ERS 8600 switches (running 4.1.6.3) went belly up just before I started the upgrade. I had just completed re-configuring the VRRP interfaces on both ERS 8600 switches so the VRRP IDs would be unique. While the switch was still forwarding Layer 2 traffic, it stopped processing all Layer 3 traffic, wouldn’t respond to ICMP ping, and the IST went down.
The upgrade itself took longer than it usually does, especially with the SuperMezz cards installed. Once the switch came up, though, everything was working fine: OSPF, BGP, FDB, ARP, IST, SMLT, PIM-SM, etc.
I’m very hopeful that 5.1.1.1 will provide some much needed stability to the ERS 8600 switch!
Cheers!