SAN - Michael McNamara

Storage Area Networks

Michael McNamara — Sat, 18 Oct 2014 18:15:30 +0000

I’ve recently found myself dealing with a lot of storage area network issues. My team has the privilege of maintaining 22 different Fibre Channel Storage Area Network (SAN) switches across multiple independent and redundant SAN fabrics. I’ve recently run into a number of situations where improper cabling and FC SAN switch zoning were the root cause of some very visible application outages. A SAN provides “block” level access to a multitude of devices including disk arrays such as the EMC VNX 5300 and 7500, and tape devices such as the HP MSL 4048 tape library. Storage Area Networks provide centralized storage and usually play an important role in cluster solutions due to their ability to share storage between multiple servers or hosts. If you have a traditional (non-VSAN) VMware ESXi cluster you have either a Fibre Channel or iSCSI SAN providing the VMware datastores between the ESXi hosts.

A few weeks ago I had the opportunity to expand the SAN fabric by installing a pair of Brocade 5100 FC SAN switches. You need to follow a pretty simple process to ISL an FC switch to an existing SAN fabric.

You must make sure that the new switch has a unique Domain ID in relation to all the other switches in the fabric
You must make sure that the new switch doesn’t have an active ZONE configuration
With those two steps completed you only need to physically connect the switches.

The switches will automatically configure the FC ports as E-Ports and establish an Inter-Switch Link (ISL) trunk between the two FC switches. The ZONE configuration will be copied over to the new switch so there’s very little to do beyond configuring an IP address for management.
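The two pre-checks above can be sketched as a small Python helper. This is only an illustration: the `Switch` class, its fields and the `premerge_problems` function are hypothetical names, and the values would have to be collected by hand from each switch. It does not talk to Fabric OS.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Switch:
    """Hypothetical record of the facts gathered from one FC switch."""
    name: str
    domain_id: int
    has_active_zone_config: bool = False


def premerge_problems(fabric: list[Switch], new_switch: Switch) -> list[str]:
    """Return the reasons the new switch should not be cabled in yet."""
    problems = []
    # Check 1: the Domain ID must be unique across the whole fabric.
    for existing in fabric:
        if existing.domain_id == new_switch.domain_id:
            problems.append(
                f"domain ID {new_switch.domain_id} is already used by {existing.name}"
            )
    # Check 2: the new switch must not carry an active zone configuration.
    if new_switch.has_active_zone_config:
        problems.append("new switch has an active zone configuration")
    return problems


if __name__ == "__main__":
    fabric = [Switch("fabric-a-sw1", 1), Switch("fabric-a-sw2", 2)]
    print(premerge_problems(fabric, Switch("new-5100", 2, True)))
```

An empty list means both conditions are met and the only remaining step is the physical connection.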

Whether you have a large storage network or you just have a single FC switch you might want to look at Brocade SAN Health.

I was very surprised by the level of detail that the tool was able to capture, and the Visio diagram was awesome!

I also took the opportunity to do some housekeeping by going back through all the FC switches and making some general configuration changes: enabling SNTP, setting the timezone and configuring SYSLOG logging. I can’t tell you how maddening it is trying to troubleshoot a problem where the log time stamps are all wrong. Here are the CLI commands that I used to make those configuration changes:

tsTimeZone US/Eastern
tsclockserver 1.2.3.4
syslogdipadd 1.2.3.4

While changing the timezone the switch reported that I needed to reboot the switch for the change to take effect. I have yet to reboot the switch but the time is reporting the correct offset from GMT. This message was consistent across all versions of Brocade’s Fabric OS including v6.x and v7.x.

Cheers!

The post Storage Area Networks first appeared on Michael McNamara.

Secondary Data Center – Where have I been?

Michael McNamara — Sat, 08 Dec 2012 16:40:20 +0000

It was just over 2 years ago that I designed and stood up our first off-campus data center in Philadelphia, PA. Since that time we’ve completely vacated our original data center, migrating all the servers, applications and services out to our new data center. Last month we relocated our offices, leaving the old data center and office space behind forever. The new office space is very nice and has a lot of (much needed) conference rooms, all of which have built-in audio/video capabilities with either an overhead projector or flat screen TV. I’m still hoping to have a LAN party someday on those 61″ monster displays, perhaps with Call of Duty: Black Ops 2?

In June we started deploying our secondary data center with the intent of providing our own business continuity and disaster recovery services for our tier 1 applications including all our data storage needs. The design allows us the flexibility to utilize both DCs in an active/active configuration with the ability to move workloads (virtual machines) between DCs. While the design allows us that option we’re still testing how we’re going to handle all the different disaster scenarios – blade, enclosure, rack, SAN, cage, entire data center, etc. While our primary data center rings in at 800 sq ft our secondary data center is only 300 sq ft. This is possible because we’re utilizing a traditional disaster recovery model for our big box non-tier 1 applications that for one reason or another aren’t virtualized. This helps reduce the number of lazy assets hanging around and helps control some of the budget numbers. I totally expect the number of big box applications to continue to shrink over time as more and more application vendors embrace virtualization.

We’ve had pretty good success with the design of our first data center so we only made a few corrections. There are a lot of logistics that need to be considered in any design, especially around all the power and cooling requirements.

The Equipment

What equipment did we use? We already deployed Cisco at our primary data center so we decided to stay with Cisco at our secondary data center.

Cisco Nexus 7010
Cisco Nexus 5010
Cisco Nexus 2248
Cisco Nexus 1000V
Cisco Catalyst 3750X
Cisco Catalyst 2960G
Cisco ASA5520
Cisco ACE 4710
Cisco 3945 Router (Internet)
Cisco 2811 Router (internal T1 locations)

What racks did we use for the network equipment?

Liebert Knurr Racks
Liebert MPH/MPX PDUs

What equipment did we use for the servers/blades?

HP Rack 10000 G2
HP Rack PDU (AF503A)
HP IP KVM Console (AF601A)
HP BladeSystem C7000 Enclosure
HP Virtual Connect Flex-10 Interconnect
HP SAN 8Gb Interconnect
Cisco Catalyst 3120X
HP BL460c G7
HP BL620c G7
HP DL380 G8
HP DL360 G8

What are we using for storage?

IBM XIV System Storage Gen3 (SAN) (w/4 1Gbps iSCSI replication ports)
IBM SAN80B-4 SAN Switch
EMC DD860 (Disk-Disk backup via Symantec NetBackup)

Additional miscellaneous equipment:

MRV LX-4048T (terminal server)

We had some challenges with designing our secondary data center due to the density of our equipment. We had to stay under the maximum kW per sq ft load that the room (data center) was designed to handle. This is a simple calculation based on the kW utilization of the equipment to determine if there is adequate power and cooling available to meet that demand. We also had to maintain an N+1 design so we really can’t consume more than 40% of our capacity, leaving 10% for reserve. While some vendors charge a flat fee for the space (includes power) others charge per kWh so it’s very important to understand what type of demand you’re going to be placing on the data center.
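As a rough illustration of that calculation, here is a minimal Python sketch. The equipment loads, the room's design limit and the provisioned capacity in the example are made-up numbers; only the 300 sq ft room size and the 40% ceiling come from this post. Treating "capacity" as the total provisioned power is also an assumption.

```python
def within_power_budget(equipment_kw, room_sqft, max_kw_per_sqft,
                        provisioned_kw, max_utilization=0.40):
    """Check a planned load against the room's density limit and our own ceiling.

    equipment_kw    -- list of per-device loads in kW (hypothetical values)
    room_sqft       -- floor space of the room in square feet
    max_kw_per_sqft -- density the room was designed for (assumed figure)
    provisioned_kw  -- total power capacity available to us (assumed figure)
    max_utilization -- fraction of capacity we allow ourselves to consume
    """
    total_kw = sum(equipment_kw)
    # Density check: total load spread over the floor space.
    density_ok = total_kw / room_sqft <= max_kw_per_sqft
    # Redundancy check: stay at or below the utilization ceiling.
    utilization_ok = total_kw <= provisioned_kw * max_utilization
    return density_ok and utilization_ok


if __name__ == "__main__":
    planned = [4.0, 4.0, 2.0]  # kW per enclosure or rack, illustrative only
    print(within_power_budget(planned, room_sqft=300,
                              max_kw_per_sqft=0.05, provisioned_kw=30))
```

Running the same check with measured figures from the PDUs rather than nameplate ratings gives a more realistic picture of the demand being placed on the room.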

My Design

We stood up a pair of Ciena 5200s from Zayo (formerly AboveNet) providing us a DWDM ring with 4 wavelengths between our primary data center and secondary data center. We’re using 2 wavelengths for the IP network between 2 pairs of Cisco Nexus 7010s and 2 wavelengths for the SAN Fibre Channel network between 2 pairs of IBM SAN switches. We have the option of adding upwards of 4 additional wavelengths before we need to add any hardware so we have room for growth. The 4 wavelengths are diverse between an east and west path but they are not protected so it’s up to the higher layer protocols to provide the redundancy and failover.

Not visible in the diagram above is a 10GE WAN ring that connects all our hospitals together. The primary and secondary data centers are also tied into that ring via multiple peering points for redundancy. You might be asking yourself why I’m using a Cisco 3750E as a termination switch in our primary data center. At the time we deployed our Cisco Nexus 7010s they didn’t support the 10GBase-ER SFP+ optic so I had to use the Cisco 3750E (with RPSU) as a glorified media transceiver/converter from 10GBase-ER to 10GBase-SR. The Cisco Nexus 7010 now has a 10GBase-ER SFP+ optic available so we didn’t need to use the Cisco 3750E in the secondary data center.

We are essentially stretching a Layer 2 vPC connection between the 2 data centers. It’s possible that some folks will get excited at the mention of Layer 2 between the data centers but it’s the best solution for us at this time and it certainly has pros and cons like everything in networking. We looked at potentially running OTV between the Cisco Nexus 7010s but ultimately decided to use a vPC configuration. We are only stretching the virtual machine VLANs that we need between the data centers.

My Thoughts

There’s a lot of work required to design any data center or even an ICR (Intermediate Communications Room), CCR (Central Communications Room), MDF (Main Distribution Frame) or IDF (Intermediate Distribution Frame). You’re immediately confronted with space, power and cooling challenges, never mind coming up with the actual IP addressing scheme, VLAN assignments, routing vs bridging, etc. You need to determine how much cabling you’ll need, both CAT6 and fiber, and perhaps you’ll look to use twinax or DAC (Direct Attach Copper) for your 10GE connections. Let’s not forget to include the ladder racks, basket trays, fiber conduits, PDUs, out-of-band networking, etc.

You also need to design the data center as if it was 300+ miles away… license those iLOs (HP Integrated Lights Out), purchase IP enabled KVMs, purchase console/terminal servers (Opengear or MRV) and wire everything up as if you will never have the opportunity to visit it again. We’ve had a few issues in the past few years that were quickly (less than 15 minutes) resolved thanks to having all our iLOs licensed, all our KVMs IP enabled, all our console/serial ports connected to a console/terminal server and the ability to dial-up into the console/terminal server should the problem get really bad.

Here’s a short story… We had a number of billing issues in the first few months of our contract with our current primary data center provider and the data from our Liebert PDUs, HP PDUs, and HP C7000 enclosures was invaluable in calling into question the numbers that were being reported to us. In all honesty when they told me we were consuming 53A on a 50A circuit I knew that something was grossly wrong with their math. In the end the provider admitted that their numbers were grossly wrong and the corrected numbers were in line with the data we collected from our equipment.

It’s never a good idea to skimp on the documentation and I really advise taking lots of pictures. You’d be surprised how quickly you can forget what the back of a specific rack looks like when you’re trying to walk Smart Hands through replacing a component at 2 AM.

Cheers!

The post Secondary Data Center – Where have I been? first appeared on Michael McNamara.

vSphere SCSI reservation conflict

Michael McNamara — Wed, 02 Sep 2009 00:00:38 +0000

We had our first issue today with our recent VMware vSphere 4 installation. We’re currently up to about 30 virtual machines spread across five BL460c (36GB) blades in an HP C7000 Enclosure. The problem started with a few virtual machines just going south, like they had lost their minds. It was discovered that all the virtual machines that were affected were on the same datastore (LUN). One of the engineers put the ESX host that was running those VMs into maintenance mode and rebooted it. After the reboot the ESX host was unable to mount the datastore. Everything seemed fine from a SAN standpoint and the Fibre Channel switches were working fine. A quick look at /var/log/vmkwarning on the ESX host revealed the following messages:

Sep  1 13:04:35 mdcc01h10r242 vmkernel: 0:00:26:02.384 cpu4:4119)WARNING: ScsiDeviceIO: 1374: I/O failed due to too many reservation conflicts. naa.600508b4000547cc0000b00001540000 (920 0 3)
Sep  1 13:04:40 mdcc01h10r242 vmkernel: 0:00:26:07.400 cpu6:4119)WARNING: ScsiDeviceIO: 1374: I/O failed due to too many reservation conflicts. naa.600508b4000547cc0000b00001540000 (920 0 3)
Sep  1 13:04:40 mdcc01h10r242 vmkernel: 0:00:26:07.400 cpu6:4119)WARNING: Partition: 705: Partition table read from device naa.600508b4000547cc0000b00001540000 failed: SCSI reservation conflict

A quick examination of the other ESX hosts revealed the following:

Sep  1 13:04:26 mdcc01h09r242 vmkernel: 21:22:13:25.727 cpu10:4124)WARNING: FS3: 6509: Reservation error: SCSI reservation conflict
Sep  1 13:04:31 mdcc01h09r242 vmkernel: 21:22:13:30.715 cpu12:4124)WARNING: FS3: 6509: Reservation error: SCSI reservation conflict
Sep  1 13:04:36 mdcc01h09r242 vmkernel: 21:22:13:35.761 cpu9:4124)WARNING: FS3: 6509: Reservation error: SCSI reservation conflict

We had a SCSI reservation issue that was locking out the LUN from any of the ESX hosts. The immediate suspect was the VCB host as it was the only other host that was being presented the same datastores (LUNs) as the ESX hosts from the SAN (HP EVA 6000).
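When several hosts are logging these warnings at once, a quick tally per host helps confirm that the whole cluster is locked out rather than a single blade. The following Python sketch assumes the log lines follow the same layout as the excerpts above; the regular expression and the `conflicts_per_host` function are illustrative names and not part of any VMware tooling.

```python
import re
from collections import Counter

# Matches syslog-style vmkernel lines that mention a reservation conflict
# and captures the hostname field that follows the timestamp.
CONFLICT = re.compile(
    r"^\w{3}\s+\d+\s+[\d:]+\s+(?P<host>\S+)\s+vmkernel:.*reservation conflict",
    re.IGNORECASE,
)


def conflicts_per_host(lines):
    """Count reservation-conflict warnings per ESX host."""
    counts = Counter()
    for line in lines:
        match = CONFLICT.search(line)
        if match:
            counts[match.group("host")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python conflicts.py < combined-vmkwarning.log
    for host, count in conflicts_per_host(sys.stdin).most_common():
        print(host, count)
```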

We rebooted the VCB host and then issued the following command to reset the LUN from one of the ESX hosts:

vmkfstools -L lunreset /vmfs/devices/disks/naa.600508b4000547cc0000b00001540000

After issuing the LUN reset we observed the following message in the log:

Sep  1 13:04:40 mdcc01h10r242 vmkernel: 0:00:26:07.400 cpu9:4209)WARNING: NMP: nmp_DeviceTaskMgmt: Attempt to issue lun reset on device naa.600508b4000547cc0000b00001540000. This will clear any SCSI-2 reservations on the device.

The ESX hosts were almost immediately able to see the datastore and the problem was resolved.

We believe the problem occurred when the VCB host tried to back up multiple virtual machines from the same datastore (LUN) at the same time. The VCB host appears to have held its lock on the LUN for too long, causing the SCSI queue to fill up on the ESX hosts. This is new to us and to me so we’re still trying to figure it out.

Cheers!

References:
http://kb.vmware.com/kb/1009899
http://www.vmware.com/files/pdf/vcb_best_practices.pdf

The post vSphere SCSI reservation conflict first appeared on Michael McNamara.