It’s the network’s fault? Well no not really but here’s another entertaining story…
Within the past six months my employer opened a new 1 million square foot distribution facility here in southeastern Pennsylvania. It’s been an exciting challenge for me personally. One of the bigger challenges has been getting the wireless network tuned in since the building opened essentially empty and now almost 4 months later is filled to the gills with product as we prepare for the upcoming holiday season. It’s about 45′ to the ceiling framing joists so I had our integrator mount the APs (Cisco AP 2702e) to 6′ of electrical conduit hanging from the ceiling. This way the APs are a little closer to the intended coverage space yet they are high enough to avoid being hit by the many fork lifts and other vehicles that traverse the distribution center.
Weekly I review the power settings and RF coverage as the building literally fills up with product…which attenuates the signal from the wireless APs little by little. The majority of my WiFi experience is in hospitals, very old buildings and corporate office space where the RF signal only covers about 40-50′ at max and sometimes much smaller (hospitals). The biggest surprise in this facility is the propagation of the RF signal, there’s literally too much RF signal. I’ve had to turn off quite a few AP radios especially in the 2.4Ghz (802.11b/g) band to keep the channel interference to a minimum.
While RF coverage issues are no surprise to anyone working in the industry I have had an interesting time tracking down an idle timeout issue between the Motorola handheld scanners and the IBM AS400 host which runs our warehouse inventory system. The default idle user timeout on the Cisco WLC 5508 is 5 minutes, so I had to changed that parameter to 43200 seconds (12 hours). I also changed the session timeout to 86400 seconds (24 hours). Our employee shifts run 10 hours so these settings looked like they would work – and they do work, except they didn’t work for everyone. In my initial testing I found that I could login from one side of the distribution center, walk to the other side of the distribution center (6 minute walk – it’s that big) and I could pickup my session where I left off without issue even with the handheld going to sleep as I physically moved through the distribution center. I thought the problem was solved and moved onto other issues only to eventually hear that some users were still reporting that they were being “kicked out” – but not everyone.
It was a very specific set of users… those working the receiving dock and the wholesale racking. What was the common denominator? These users would occasionally put down their Motorola handhelds for upwards of 10-15 minutes as they unloaded a truck or prepared to put away a pallet. When they tried to use the handheld it would prompt that their AS400 session had ended and they would need to log in again.
I ran a few tests from the office space in the distribution center and quickly determined that if I was idle longer than 12 minutes I would generally get disconnected from my AS400 session. Was this issue coming from the wireless network or the AS400 host itself? I setup a quick and easy control test. I enabled telnet on one of my Linux jump boxes and established a connection from my test MC9190. After logging into my Linux jumpbox I sat the device on my desk for 25 minutes with no input from me. The Motorola MC9190 went to sleep in about 30 seconds, 25 minutes later I picked up the device, waited for it to power up and connect to the network and found my telnet session still working without issue so this test essentially eliminated the wireless network.
It turns out that the AS400 uses a TCP keepalive to determine if a user is still connected to the system. I’m guessing that since the handheld is sleeping it’s not responding to the keepalive packets so the host system is eventually closing the connection to the handheld. I have some inquires into our AS400 team and IBM to validate my theory.
Update: Sunday October 25, 2015
We were able to update the session keep-alive parameter on the IBM AS400 host. This value defaults to 10 minutes. So every 10 minutes the host will check if the session is still valid by sending a keep-alive packet. If the IBM AS400 host doesn’t get a response to the check it will logoff the session. We increased this value to 20 minutes and that’s helped greatly.