The Customer Interaction Platform team here at OVO wanted to automate some manual post-release checks as part of their CI release stage. This post details some of the challenges of automating that process.
These checks were generally executed manually by an engineer following the post-release run-book. It is essentially a high-level integration test to ensure that the telephony, Data Protection Act verification and CRM systems are working together as expected. These three systems are tied together using a Chrome extension to provide a seamless user experience for our call centre agents.
As all three of the systems are browser based, our thoughts immediately went to browser automation tools such as Cypress. However, to complete the checks fully we needed to automate an entire browser, including multiple tabs and extensions.
The corporate browser here at OVO is Google's Chrome as we are heavy users of G-Suite and Meet internally.
The obvious choice from here is Selenium. Because it operates outside the browser rather than within it, it can control things that cannot securely be accessed from within the browser, such as multiple tabs.
Headed Chrome in Docker
Running headless Chrome in Docker is pretty much a solved problem: Chrome has first-class support for running headless, and Selenium ships pre-made Chrome-based Docker images. Indeed, we initially used these images until we discovered that headless Chrome does not support extensions.
The flow we were trying to automate relied on an in-house Chrome extension to function, so we were out of luck.
To run a full-blown headed Chrome we would need a display server. Running a full X server inside Docker seemed overly complex though, so it was Xvfb, a virtual framebuffer, to the rescue. A small wrapper script starts Xvfb and then launches Chrome against it.
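A minimal sketch of such a wrapper, written here in Python for illustration. The original script is not reproduced in this post, so the path of the real Chrome binary, the display number and the screen size are all assumptions.

```python
import os
import subprocess

# Assumed values for illustration only; adjust to your image.
REAL_CHROME = "/opt/google/chrome/chrome"  # hypothetical path of the real binary
DISPLAY = ":99"                            # hypothetical display number


def xvfb_command(display=DISPLAY):
    """Command line that starts a virtual framebuffer on the given display."""
    return ["Xvfb", display, "-screen", "0", "1920x1080x24"]


def chrome_command(extra_args, chrome=REAL_CHROME):
    """Command line for headed Chrome, forwarding chromedriver's own arguments."""
    return [chrome, "--disable-gpu", *extra_args]


def main(argv):
    """Start Xvfb, run Chrome against it, and stop Xvfb once Chrome exits."""
    xvfb = subprocess.Popen(xvfb_command())
    try:
        env = dict(os.environ, DISPLAY=DISPLAY)
        return subprocess.call(chrome_command(argv), env=env)
    finally:
        # Runs whether Chrome exits cleanly or not, so Xvfb never lingers.
        xvfb.terminate()
        xvfb.wait()
```

Installed as an executable, the script would end by calling sys.exit(main(sys.argv[1:])) so that Chrome's exit code is passed back to chromedriver.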
We also pass --disable-gpu to stop various warnings due to resources (cgroups and graphics hardware) being unavailable in our Docker environment.
We put this script in the location where chromedriver expects the Chrome binary, /usr/bin/google-chrome. It ensures that Chrome can start, and that Xvfb is killed if Chrome exits.
Python Selenium can provide chromedriver with a list of packed extensions, which will be loaded into the temporary profile used by Chrome. This didn't work in our testing with unpacked extensions, and the internal extension is unpacked.
To solve this we downloaded the internal extension to a known path at Docker build time, and loaded it in as an unpacked extension with the load-extension=/path/to/extension argument to Chrome.
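As a rough illustration, passing that argument from Python Selenium might look like the following. The helper names are hypothetical, and the directories you pass in stand for wherever your Docker build places the unpacked extensions.

```python
def chrome_arguments(extension_dirs):
    """Build the Chrome argument that loads the given unpacked extensions."""
    if not extension_dirs:
        return []
    # Chrome takes several unpacked extensions as one comma-separated list.
    return ["--load-extension=" + ",".join(str(d) for d in extension_dirs)]


def build_driver(extension_dirs):
    """Start Chrome through chromedriver with the unpacked extensions loaded."""
    # Imported lazily so the argument helper works without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    for argument in chrome_arguments(extension_dirs):
        options.add_argument(argument)
    return webdriver.Chrome(options=options)
```

Keeping the argument building separate from driver creation makes it easy to add a second extension directory later without touching the driver setup.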
Now we were ready to start automating the processes. The first task was to authenticate to the systems, all of which used different mechanisms.
Selenium and Salesforce 2FA
Our Salesforce instance enforces 2FA via email. Usually this is a pretty infrequent prompt, but as the tests are starting with an empty browser profile our cookies get lost each time. This meant we would have to provide a numerical pin that is emailed to the account address at login time.
Enquiries were made as to whether we could disable the second factor on this account for ease of integration, but reports from the team suggested it was a global option only.
So a workaround was needed. It seems simple enough: connect to the inbox via IMAP, wait for an email from Salesforce, load the email, then grab the code via regex and pass it back into Selenium.
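The steps above can be sketched with the standard library alone. This is an illustration rather than our exact code: the wording matched by the regex, the polling interval and the function names are all assumptions.

```python
import email
import imaplib
import re
import time

# Assumed wording of the verification email; adjust to the real message.
CODE_PATTERN = re.compile(r"Verification Code:\s*(\d{4,8})")


def extract_code(body):
    """Return the numeric code found in the email body, or None."""
    match = CODE_PATTERN.search(body)
    return match.group(1) if match else None


def message_text(msg):
    """Return the first text/plain part of a parsed email as a string."""
    for part in msg.walk():
        if part.get_content_type() == "text/plain":
            payload = part.get_payload(decode=True)
            if payload is not None:
                charset = part.get_content_charset() or "utf-8"
                return payload.decode(charset, errors="replace")
    return ""


def wait_for_code(host, user, password, sender, timeout=120, poll=5):
    """Poll the inbox until an unread email from sender contains a code."""
    deadline = time.monotonic() + timeout
    with imaplib.IMAP4_SSL(host) as imap:
        imap.login(user, password)
        while time.monotonic() < deadline:
            imap.select("INBOX")  # re-select so new mail becomes visible
            _, data = imap.search(None, "UNSEEN", "FROM", f'"{sender}"')
            for num in reversed(data[0].split()):  # newest message first
                _, fetched = imap.fetch(num, "(RFC822)")
                msg = email.message_from_bytes(fetched[0][1])
                code = extract_code(message_text(msg))
                if code:
                    return code
            time.sleep(poll)
    raise TimeoutError("no verification email arrived in time")
```

The returned code can then be typed into the verification form with Selenium's send_keys on the relevant input element.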
Selenium and Basic Authentication
Another unexpected challenge was that the login to the telephony system was protected by "Basic Auth". Due to recent security enhancements in Chrome, the tried and tested user:password@example.org URL syntax is no longer allowed as of Chrome v59.
Surprisingly, there is also no way to access the pop-up authentication box you receive as a user from Selenium, or to inject a custom header (so we can send the Authorization header ourselves).
In the end we used another Chrome extension to fill in the authentication details, again loaded via an argument to Chrome at start time. This second small extension is dynamically generated and adds an authentication listener via chrome.webRequest.onAuthRequired.addListener(). The listener then returns the appropriate login information, as supplied by environment variables injected from the Docker environment.
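Generating such an extension could look roughly like this. The sketch assumes a Manifest V2 background script with a blocking listener; the extension name and the function name are made up for illustration.

```python
import json
from pathlib import Path

# Assumed Manifest V2 layout for a background-script extension.
MANIFEST = {
    "manifest_version": 2,
    "name": "basic-auth-helper",  # hypothetical name
    "version": "1.0",
    "permissions": ["webRequest", "webRequestBlocking", "<all_urls>"],
    "background": {"scripts": ["background.js"]},
}

# Answers every Basic Auth challenge with the credentials baked in below.
BACKGROUND_TEMPLATE = """\
chrome.webRequest.onAuthRequired.addListener(
  function (details) {
    return {authCredentials: {username: %s, password: %s}};
  },
  {urls: ["<all_urls>"]},
  ["blocking"]
);
"""


def write_auth_extension(target_dir, username, password):
    """Write an unpacked extension that supplies the given credentials."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    (target / "manifest.json").write_text(json.dumps(MANIFEST, indent=2))
    # json.dumps yields safely quoted JavaScript string literals.
    script = BACKGROUND_TEMPLATE % (json.dumps(username), json.dumps(password))
    (target / "background.js").write_text(script)
    return target
```

The username and password would be read from the environment (for example with os.environ) before calling write_auth_extension, and the resulting directory passed to Chrome alongside the internal extension.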
Dealing with multiple tabs
As the tests were written as a pytest pack, we made each tab available to the tests via a fixture. We store a list of the windowIds and then switch to the relevant tab before the test is run.
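The original snippet is not reproduced here, so the following is a reconstruction of the idea rather than our exact code. The class and tab names are hypothetical, and it assumes a Selenium 4 driver, where switch_to.new_window is available.

```python
class TabRegistry:
    """Remembers which window handle belongs to which named tab."""

    def __init__(self, driver):
        self.driver = driver
        self.handles = {}

    def open(self, name, url):
        """Open url in a new tab and remember its handle under name."""
        self.driver.switch_to.new_window("tab")
        self.driver.get(url)
        self.handles[name] = self.driver.current_window_handle

    def switch_to(self, name):
        """Make the named tab active and return the driver for the test."""
        self.driver.switch_to.window(self.handles[name])
        return self.driver
```

In conftest.py, a session-scoped fixture would build one TabRegistry, and a small fixture per system (for example one for the CRM tab) would call switch_to with the matching name and hand the driver to the test.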
This places inherent limits on the number of tests you can run at once, as you can only have a single active tab.
When developing using Selenium locally you get the relative luxury of being able to actually see the state of the browser and its contents, a luxury which is very much not available when your Docker image is running in Kubernetes somewhere!
Selenium has good support for taking screenshots, so we had the script upload them to Google Cloud Storage at various stages.
To assist with debugging we also upload to the same bucket a copy of the DOM state at exit in case of an error, a copy of the formatted dev-console logs and finally the chromedriver log itself.
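A sketch of how those artifacts might be gathered and uploaded is below. The object naming scheme and function names are assumptions, reading the console log with get_log("browser") assumes browser logging is enabled in the Chrome options, and the upload uses the google-cloud-storage client library.

```python
import json


def collect_artifacts(driver, stage, failed=False):
    """Return {object name: (bytes, content type)} for the current browser state."""
    artifacts = {
        f"{stage}/screenshot.png": (driver.get_screenshot_as_png(), "image/png"),
    }
    if failed:
        # In this sketch the DOM and console log are only gathered on failure.
        dom = driver.page_source.encode("utf-8")
        artifacts[f"{stage}/dom.html"] = (dom, "text/html")
        console = json.dumps(driver.get_log("browser"), indent=2)
        artifacts[f"{stage}/console.json"] = (console.encode("utf-8"), "application/json")
    return artifacts


def upload_artifacts(bucket, prefix, artifacts):
    """Upload each artifact to a google-cloud-storage bucket under prefix."""
    for name, (data, content_type) in artifacts.items():
        blob = bucket.blob(f"{prefix}/{name}")
        blob.upload_from_string(data, content_type=content_type)


def open_bucket(bucket_name):
    """Return a bucket handle; needs google-cloud-storage and credentials."""
    # Imported lazily so collecting artifacts works without the client library.
    from google.cloud import storage

    return storage.Client().bucket(bucket_name)
```

Using a per-run prefix, such as a build number, keeps the artifacts of different releases apart in the same bucket.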
A final quick note on chromedriver logs: if the parent directory of the path you set for service_log_path does not exist when chromedriver starts, it will exit with no error message! Hopefully that titbit will save you some minutes of confusion.
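One way to guard against this is to create the directory before the driver is started. The helper name below is hypothetical.

```python
from pathlib import Path


def prepare_log_path(log_path):
    """Create the parent directory of log_path and return the path as a string."""
    path = Path(log_path)
    # chromedriver will not create missing directories itself.
    path.parent.mkdir(parents=True, exist_ok=True)
    return str(path)
```

The returned value can then be passed as service_log_path when creating the driver.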
As you can see, what seems a conceptually simple task has many layers, each with problems to solve as you work through them. Hopefully these notes will give an idea of the kinds of issues that need navigating when building working solutions with real-world software.