Posted: Mon Sep 10, 2012 21:02 Post subject: Intermittent loss of IP broadcast packets over Client Bridge
Background:
We are running a number of wireless networks based on Engenius EOC5611/2611 devices. The networks are fed from an Internet-connected backhaul, which is then distributed over a 5GHz network of EOC5611s in a star topology, with EOC5611s in AP mode at the centre, which feed via a wired connection to EOC5611s in CB mode, which in turn may feed down through another layer (or two in some cases) of APs and CBs. At the end of the 5GHz core network, EOC2611s are used to give end users 2.4GHz access. Network services such as DNS and DHCP are external to the Engenius devices, running on a server at the backhaul. End user devices connect to the 2.4GHz APs and obtain their IP configuration via DHCP from the backhaul DHCP server. All of this works ok with the native Engenius firmware.
New configuration:
However, we found the native Engenius firmware to be somewhat lacking in functionality and a little buggy. So on our next deployment we decided to use DD-WRT in all the EOC5611 and EOC2611 devices. We built up a network in the labs and ran all sorts of tests for loading, speed, interoperability etc. All worked ok. Then we went on to deploy a network for real, and in this more complex environment we found a problem that did not appear in the labs.
The problem:
From time to time, in different parts of the network, for a period of time which can run into hours, end user devices on the 2.4GHz network failed to connect to the internet even though they got a good wireless connection. Further investigation showed they were failing to pick up an IP address from the central DHCP server so were falling back to a 169.... IP address which would not route through the network. But other parts of the network continued to function ok and get their IP configuration from the DHCP server.
Looking at it from the point of view of the DHCP server, we see DHCP DISCOVER broadcast packets arriving from the end user device, to which the DHCP server responds with a DHCP OFFER, but the rest of the DHCP conversation (the client's REQUEST and the server's ACK or NAK) does not follow. Instead, the end user device makes a further attempt to get its configuration, resulting in further DISCOVER/OFFER pairs, on and on.
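For anyone wanting to spot this pattern in their own server logs, here is a minimal sketch in Python. It assumes the DHCP events have already been reduced to (client MAC, message type) tuples in arrival order. The function name find_stalled_clients, the event format and the min_repeats threshold are all hypothetical and not tied to any particular DHCP server.

```python
from collections import defaultdict

def find_stalled_clients(events, min_repeats=3):
    """Return the sorted MACs that received at least `min_repeats` OFFERs
    in a row without a REQUEST or ACK in between.

    `events` is an iterable of (mac, msg_type) tuples in arrival order;
    this format is an assumption made for illustration only.
    """
    offers_in_a_row = defaultdict(int)
    stalled = set()
    for mac, msg_type in events:
        if msg_type == "OFFER":
            offers_in_a_row[mac] += 1
            if offers_in_a_row[mac] >= min_repeats:
                stalled.add(mac)
        elif msg_type in ("REQUEST", "ACK"):
            # The conversation progressed, so this client is not stuck.
            offers_in_a_row[mac] = 0
            stalled.discard(mac)
        # DISCOVER (or anything else) leaves the counter unchanged.
    return sorted(stalled)

# Made-up sample: "aa" loops on DISCOVER/OFFER, "bb" completes normally.
sample = [("aa", "DISCOVER"), ("aa", "OFFER")] * 3 + [
    ("bb", "DISCOVER"), ("bb", "OFFER"), ("bb", "REQUEST"), ("bb", "ACK"),
]
stalled_macs = find_stalled_clients(sample)
```

With the made-up sample above, only the client that never gets past the OFFER stage is reported.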
When a failing section of the network was located, we found that we could track it down to a 5611 CB to 5611 AP connection that was not passing the DHCP OFFER broadcast back down to the end user device. Using tcpdump on both ends of the AP/CB wireless connection, we could see that IP broadcast packets (of any type, not just DHCP) were flowing ok from CB to AP but none were flowing back from AP to CB. Sometimes, after we had been observing the problem for some time, it suddenly started to work ok and broadcast packets flowed in both directions.
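The two-ended comparison can be sketched roughly as follows in Python. This assumes the broadcast frames from each capture have been reduced to some identifier that is the same on both sides, and that both captures cover the same time window. The function name missing_by_direction and the frame labels are hypothetical, purely for illustration.

```python
def missing_by_direction(ap_capture, cb_capture):
    """Compare broadcast frames seen on the AP side and the CB side.

    A frame seen only on the AP side did not make it from AP to CB;
    a frame seen only on the CB side did not make it from CB to AP.
    Inputs are iterables of hashable frame identifiers (an assumption).
    """
    ap_seen = set(ap_capture)
    cb_seen = set(cb_capture)
    return {
        "ap_to_cb": sorted(ap_seen - cb_seen),
        "cb_to_ap": sorted(cb_seen - ap_seen),
    }

# Made-up example matching the symptom: the DISCOVER crosses from CB to AP,
# but the OFFER and an unrelated ARP broadcast never reach the CB side.
result = missing_by_direction(
    ap_capture=["discover-1", "offer-1", "arp-7"],
    cb_capture=["discover-1"],
)
```

In the failing state we would expect the "ap_to_cb" list to contain every broadcast that originated upstream, and the "cb_to_ap" list to be empty.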
Other factors:
The SPI Firewall has been disabled although this may well be irrelevant as the 5611 has no WAN port as such. The Ethernet port is used as a LAN port (eth0) which is bridged to the Wireless adapter (ath0).
The 5GHz network runs WPA2. However, we still get the same problem if we drop down to Open (Disabled) security.