I think it’s fair to say that, of the many challenges that awaited us on joining OVO’s Infrastructure team in mid-2014, it’s the wireless that we have ended up investing the most time in.
We inherited a cloud-based system which uses a virtual controller as opposed to an on-premise hardware appliance. In theory, the system is a very intelligent one and is more than adequate for a small business office with a few users browsing the web and sending and receiving emails. However, OVO is much more than that and we soon found that the existing system was not going to cut the mustard for the levels of usage that were being demanded of it.
Wifi is a shared medium - the airspace is shared not only by OVO’s own Wifi usage but also surrounding companies’ Wifi and such things as Bluetooth, weather radars, wireless headsets, cordless phones and even microwaves - so everyone has to share the airspace and the bandwidth with everyone else in the vicinity.
When you access the internet over Wifi your Wireless Network card is generating hundreds of thousands of “packets” which need to be transmitted across the airspace to the Wireless Access Point (AP) that you are connected to. From there the traffic is sent via the Access-point’s network cable across the OVO network before being sent out into the world wide web of fibre optics and transatlantic cables to reach your desired website. Whilst your Wifi card is sending and receiving these packets, so is anyone else’s who is connected to that AP - often 20-30 clients will share an AP - and they all have to wait their turn to transmit, waiting for the airspace to clear.
So with OVO’s open culture of allowing staff to stream music and video whilst working, we have a situation where users’ Wifi network cards are constantly sending and receiving data throughout the day whilst users are also performing work functions such as web browsing, Hangouts, Google Drive, file sharing and sending and receiving email.
Before Google Drive came along, users had to move their files around using tools such as Dropbox, which meant very large transfers of data across the Wifi, placing a huge demand on the airspace.
Our initial attempt at fixing what we inherited was to fine-tune the Wifi to make best use of the airspace available to us. Wifi comes in two different “flavours”: the older one runs at a frequency of 2.4Ghz and the newer one at 5Ghz. At 2.4Ghz you have the option to use one of eleven “channels” per access-point. Any traffic transmitted on a certain channel is subject to interference from any other traffic being transmitted on that channel, regardless of whether it belongs to OVO’s network or another in close proximity (or comes from the aforementioned Bluetooth or microwave sources).
Because the eleven 2.4Ghz channels overlap their neighbours, we can only actually use 3 of them if we want to avoid interference on our own network. They are channels 1, 6 and 11.
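To see why only channels 1, 6 and 11 are usable together, here is a minimal Python sketch. It assumes the standard 2.4 GHz channel plan (channel centres 5 MHz apart, starting at 2412 MHz for channel 1) and the classic figure of roughly 22 MHz for the width of a transmission. The function names are just for illustration.

```python
# Assumptions: standard 2.4 GHz channel plan, channel centres 5 MHz apart,
# and each transmission occupying roughly 22 MHz of spectrum.
CHANNEL_WIDTH_MHZ = 22


def centre_mhz(channel: int) -> int:
    """Centre frequency of a 2.4 GHz channel (valid for channels 1-13)."""
    return 2407 + 5 * channel


def overlaps(a: int, b: int) -> bool:
    """True if the 22 MHz-wide signals on channels a and b share spectrum."""
    return abs(centre_mhz(a) - centre_mhz(b)) < CHANNEL_WIDTH_MHZ


def non_overlapping(channels) -> list:
    """Greedily pick the lowest channels that do not overlap each other."""
    picked = []
    for ch in channels:
        if all(not overlaps(ch, p) for p in picked):
            picked.append(ch)
    return picked


print(non_overlapping(range(1, 12)))  # [1, 6, 11]
```

Channels 1 and 6 are 25 MHz apart, which is just enough to clear a 22 MHz-wide signal, whereas channels 1 and 5 are only 20 MHz apart and so bleed into each other.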
However, it is impossible to stop surrounding companies from breaking this rule and using the channels in between, which then causes interference on every channel they overlap. This was a huge issue in one of our London offices, where we could see over 100 interfering Wifi networks.
It quickly became apparent that trying to get stable, usable Wifi on the 2.4Ghz spectrum was a losing battle. There are a limited number of options that you can try to tune the APs to use a quiet channel and not suffer from interference, and we had exhausted them all with little noticeable improvement.
So one of the first things we did was buy a higher-spec model of Access Point (AP) and configure a “5Ghz only” network on them.
5Ghz has many more channels to use than 2.4Ghz - depending on other settings you can technically have 19 different channels to choose from, although these days they are often bonded together to achieve higher throughput. The channels are often a lot less congested than the 2.4Ghz ones, so there is a larger selection to choose from. With assistance from our IT support team we also began a programme of replacing older laptops that only had 2.4Ghz wifi cards with ones that support 5Ghz.
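The trade-off between channel count and bonding can be sketched in a few lines of Python. The channel list below is an assumption: it is the set of 19 20 MHz channels commonly available in the UK and Europe (36 to 64 and 100 to 140), which may not match the exact settings on any particular system. The function name `bonded_channels` is made up for this illustration.

```python
# Assumption: the 19 20 MHz channels commonly available in the UK/Europe,
# split into two contiguous blocks (36-64 and 100-140).
BLOCKS = [list(range(36, 65, 4)), list(range(100, 141, 4))]


def bonded_channels(width_mhz: int) -> list:
    """Group adjacent 20 MHz channels into bonded channels of the given width.

    Bonding only combines neighbours within the same contiguous block,
    and any left-over channels at the end of a block are not used.
    """
    size = width_mhz // 20  # number of 20 MHz channels per bonded channel
    groups = []
    for block in BLOCKS:
        for i in range(0, len(block) - size + 1, size):
            groups.append(tuple(block[i:i + size]))
    return groups


for width in (20, 40, 80):
    print(width, "MHz:", len(bonded_channels(width)), "channels")
# 20 MHz: 19 channels
# 40 MHz: 9 channels
# 80 MHz: 4 channels
```

Every doubling of the channel width roughly halves the number of channels you can hand out to neighbouring APs, which is why wider is not automatically better in a busy office.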
Unfortunately, after moving into our new Bristol office it became apparent that even running at 5Ghz the cloud-based APs simply weren’t going to be powerful enough to support the levels of traffic that our users were throwing at them.
Clients were being indiscriminately booted off the AP as it ran out of resources, causing frustration for both the users and us, the administrators. With the cloud vendor’s support team also at a loss as to how to improve our situation, a decision was made that we would have to change vendor.
And so began the summer of POC (proof of concept) as we tested new systems to see if they could provide the level of service and reliability that we demanded. First up was “Vendor #1” and initial tests in our London office proved positive with stability vastly improved and the number of support tickets raised regarding wifi performance dropping to almost zero.
Due to the positive testing we rolled out the Vendor #1 system in our Bristol office which involved a Sunday of climbing ladders and replacing over 40 access points across all 5 floors.
However, on Monday morning it was clear that the system was in melt-down and not performing at all, because its unique channel bonding method was adversely affected by the open nature of the office’s floors and atrium. Despite a week of troubleshooting and tinkering alongside the vendor’s top engineers, the system could not be made to work to an acceptable level, so it was deemed to have failed the POC and was rejected.
There followed another Sunday up and down ladders taking down the APs and putting the original cloud-based units back up.
The next vendor on the list was approached and their salesman assured us “it just works.” Well, after the couple of years we had been through it’s fair to say we were sceptical, but we had nothing to lose, so we spent yet another Sunday climbing ladders and replacing APs. As the ceiling tiles are metal, each AP replacement also came with the free gift of a static shock, which added to the fun.
So, another Monday morning came around with the new system in place and… it just worked.
No disconnections, no slowness, no drop-outs, no buffering, no yellow triangle, just seamless, working, stable wifi with speeds of over 100Mb/s to each 5Ghz client.
Since implementing the new system things have improved beyond recognition. But it’s still not all plain sailing; the complexities of configuring and tuning a Wifi network are many and we have had plenty of challenges along the way from “DFS radar events” to bugs with the system and APs “vanishing” from the controller.
One annoyance with the 5Ghz spectrum is the concept of DFS (Dynamic Frequency Selection). Parts of the 5Ghz band are shared with radar systems, primarily weather radars, and if an access-point detects radar on its channel (a “DFS event”) then it has to immediately stop transmitting and change the channel it’s on.
This causes all clients to be disconnected and they must roam to another AP whilst the original one is offline and scanning for a new channel to use.
Due to the way we designed the positional overlap of the APs, most clients won’t even realise this has happened, but in a smaller office space where there is only one AP we cannot allow this to happen.
So in this scenario, we have had to tune the AP to use a non-DFS channel. This does come at a slight loss of bandwidth as we have to run with two channels bonded (40Mhz) rather than four (80Mhz), but clients should still achieve throughput of well over 100Mb/s, so they will ultimately be restricted only by the bandwidth of the office’s internet pipe rather than the local speed of their wifi connection.
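As a rough illustration of that tuning decision, the Python sketch below filters a list of candidate 40 MHz bonded channels down to those that avoid DFS entirely. It assumes the usual UK/European rule that channels 36 to 48 are exempt from DFS while 52 and above require it. The candidate list and the names `NON_DFS` and `requires_dfs` are hypothetical and not taken from any vendor’s configuration.

```python
# Assumption: in the UK/Europe, channels 36-48 do not require DFS,
# whereas channels 52-64 and 100-140 do.
NON_DFS = {36, 40, 44, 48}


def requires_dfs(bonded_channel: tuple) -> bool:
    """A bonded channel needs DFS if any of its 20 MHz members does."""
    return any(ch not in NON_DFS for ch in bonded_channel)


# Hypothetical candidate 40 MHz channels (pairs of bonded 20 MHz channels).
candidates_40mhz = [(36, 40), (44, 48), (52, 56), (60, 64), (100, 104)]

# For a single-AP office, keep only channels that can never be hit
# by a DFS event.
safe = [c for c in candidates_40mhz if not requires_dfs(c)]
print(safe)  # [(36, 40), (44, 48)]
```

In a larger office with overlapping APs the DFS channels are still worth using, since a client that loses its AP to a radar event can roam to a neighbour.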
So, what lessons have we learned over the past 3 years?
Firstly, it is well worth getting a professional Wifi survey done to ensure you will have cell coverage across the entire office space and meeting rooms and also to discover any sources of interference that can be isolated or removed.
Secondly, don't be afraid to request a Proof of Concept trial from the various vendors you are considering.
Different systems work better in certain environments and if you have a particularly challenging environment like we did at our Bristol HQ then there is no guarantee that it will work well just because you have spent lots of money on the system. Get some trial APs installed and test it thoroughly, especially in your most demanding areas.
Thirdly, get as many of your clients on to the 5Ghz spectrum as you can - these days pretty much every new device is 5Ghz capable and our current client pool is well above 70% on the 5Ghz spectrum, which is great, but there are still older devices out there on 2.4Ghz and they will not be having anywhere near as good a time as those on 5Ghz.
It’s also well worth testing the different channel widths at 5Ghz - running at 20Mhz might technically reduce the available bandwidth but it gives you a large number of channels to play with, ensuring minimal interference, and you will still see speeds of over 100Mb/s at that width.
And finally... never give up! Wifi is a complicated beast but it can be tamed!