
OVO has grown in complexity, and so has our data
OVO was built on the foundations of simplicity and customer centricity; simplicity across our organisation enabled us to provide excellence in our customer interactions.
Our rapid growth over the last decade has inevitably led to complexity - more systems, more data, as well as more people, with more ideas and ambition.
Our Data Platform team started as a small group of engineers building a data warehouse to better understand our members, our financial position, and the needs of our most vulnerable customers. We built something pretty useful. So much so that it became invaluable to many colleagues beyond those we were immediately trying to help.
We found that as well as analysts and data scientists, software engineers were using it to seed their applications and services with data from its easy-to-understand data model. This saved them having to find and interpret data from across our various source systems separately.
As demand grew, our on-premises infrastructure became expensive to scale and run, so we moved the warehouse to the cloud. The petabyte-scale capabilities of Google’s BigQuery let us provide a more reliable self-serve capability to our users.
At the same time, OVO’s software engineers were starting to use Kafka to publish data centrally, driven in part by the Smart Metering revolution in the energy industry. They wanted to access this data in real time to power services such as giving customers an up-to-date view of their usage. Unfortunately, the data warehouse in BigQuery wasn’t scoped to meet these needs.
Our data warehouse was optimised for aggregate reporting and analytics, where 99.9% accuracy is acceptable (i.e. 1 error in 1,000). It wasn’t designed for operational use cases that require much higher levels of precision (1 error in 100,000 or fewer), for example to avoid charging a customer the wrong amount or attempting to install a meter at the wrong address.
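To make the gap between these two tolerances concrete, here is a small sketch of the arithmetic in Python. The volume of one million records is an assumption chosen purely for illustration, not an OVO figure, and the function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical volume, chosen only for illustration (not an OVO figure).
RECORDS = 1_000_000


def expected_errors(records: int, error_rate: Fraction) -> Fraction:
    """Expected number of wrong records at a given error rate."""
    return records * error_rate


def accuracy_percent(error_rate: Fraction) -> float:
    """Convert an error rate into an accuracy percentage."""
    return float((1 - error_rate) * 100)


analytics_rate = Fraction(1, 1_000)      # 1 error in 1,000
operational_rate = Fraction(1, 100_000)  # 1 error in 100,000

analytics_errors = expected_errors(RECORDS, analytics_rate)
operational_errors = expected_errors(RECORDS, operational_rate)

print(f"Analytics:   {accuracy_percent(analytics_rate)}% accurate, "
      f"{analytics_errors} expected errors per {RECORDS:,} records")
print(f"Operational: {accuracy_percent(operational_rate)}% accurate, "
      f"{operational_errors} expected errors per {RECORDS:,} records")
```

At that assumed volume, an error rate that is invisible in an aggregate report would still mean around a thousand individual records being wrong, which matters when each record drives a bill or an engineer visit.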
We attempted to build a single product to meet these needs, but failed due to the size of the task
We knew that a new solution was needed to meet the growing demand for real-time data, to support the various operational use cases throughout OVO.
So, as a central data engineering team, we devised a data product that we hoped would provide high-quality data from across our source-system estate, in a single consistent data model, in real time via Kafka.
We focused on the technology side of the problem: we evaluated multiple stream processing libraries, devised a well thought-out data model encapsulating all the needed metadata and naming standards, and created tooling to make the process of building the pipelines more straightforward.
We hoped that in time we would encourage contribution from teams across the organisation, and built a product to meet this vision.
But after all of this upfront investment, we realised that our central data engineering team was the only one contributing to it. With only a small group of people working on it, we could not build it quickly enough, or to the right level of quality, to meet the demands we had set out to satisfy.
Technology alone couldn't solve this problem; we needed a change in mindset towards data at OVO
Instead of trying to build the whole thing ourselves, we needed a culture shift in how our product, tech, and business teams consider data. From our own experiences, and from keeping an eye on those of others in the wider tech and data world (e.g. 1,2), we have learned that we need to educate and empower our leaders and teams to treat data as being as important as features.
We have begun to work with leaders across OVO to define the right ownership model for data. This means identifying gaps in data ownership, and finding the right home for source systems, data pipelines, topics, and tables within each of the cross-functional teams across the company. These teams, which we refer to as domain teams, should be aligned to specific business areas or goals.
We are also helping these domain teams to map out any additional people or skills they may need to take on this new set of responsibilities. This will give them truly cross-functional teams spanning product, design, software engineering, data engineering, data analysis and data science. This is a further evolution of the team setup envisioned in this post, which previously covered OVO’s tech culture.
We hope that the domain teams will in time be able to say:
- what data and data products they are responsible for in their domain
- who their data users are, and how they are using their data
- they actively engage with their users’ requests for new data, changes, and bugs
- their data is published centrally, and is accurate and timely
- their data is discoverable, well documented, and simple to self-serve
- their data is easy to combine with data from other domains
- their data is stored securely, and actively managed in line with GDPR requirements
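One way to picture this checklist is as a small descriptor that a domain team fills in for each data product, plus a check that reports which expectations are not yet met. The Python sketch below is illustrative only: the class, field names, and example values are all hypothetical and do not describe OVO's actual tooling.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataProduct:
    """Hypothetical descriptor a domain team might keep per data product."""
    name: str
    owning_domain: str = ""                               # who is responsible
    known_users: List[str] = field(default_factory=list)  # who uses the data
    request_channel: str = ""           # where users raise requests and bugs
    published_centrally: bool = False
    documentation: str = ""             # pointer to docs, for discoverability
    shared_keys: List[str] = field(default_factory=list)  # for cross-domain joins
    retention_days: Optional[int] = None  # part of managing data under GDPR


def unmet_expectations(product: DataProduct) -> List[str]:
    """Return the checklist items this product does not yet satisfy."""
    checks = [
        (bool(product.owning_domain), "no owning domain"),
        (bool(product.known_users), "no known users recorded"),
        (bool(product.request_channel), "no channel for user requests"),
        (product.published_centrally, "not published centrally"),
        (bool(product.documentation), "not documented"),
        (bool(product.shared_keys), "no shared keys for combining with other domains"),
        (product.retention_days is not None, "no retention period set"),
    ]
    return [message for passed, message in checks if not passed]


# Example with made-up values: an owned, published product with some gaps.
readings = DataProduct(
    name="meter_readings",
    owning_domain="metering",
    known_users=["billing", "usage-insights"],
    published_centrally=True,
    shared_keys=["account_id"],
)

for gap in unmet_expectations(readings):
    print(f"{readings.name}: {gap}")
```

A check like this cannot measure accuracy or timeliness, which need monitoring of the data itself; it only makes visible which ownership questions a team has not yet answered.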
Data Platform teams as an enabler
Within the Data Platform teams, we are moving away from building and owning lots of data pipelines ourselves. Instead, we are providing tools and services that enable the rest of OVO to do this for themselves, in many cases underpinned by open source technologies. This work falls into three areas:
- Tooling: helping OVO’s teams make their data available efficiently with pre-composed modules to move data into or out of the Data Platform
- Discovery: connecting our people with the data they need and its owners, allowing them to more effectively draw insight and build applications more quickly
- Architecture: guidance on best practices for data modelling, classification, storage, and naming conventions to provide consistency across domains
Conclusion
In OVO's Data Platform teams, we are focused on building a set of data tools and products that will support our business’s insatiable demand for progress, on our journey to net zero. We want to put as much data as possible about our members in the hands of our data scientists, software engineers, and customer service agents who can use it to help our members reduce their carbon footprints and energy bills. Whilst doing this we also want to ensure our members have full control over what data they give us, and how we are able to use it.
We’ve realised that new technology isn’t the only thing that is needed to make high-quality, real-time data easily available to those who need it. We need a culture shift in the way that everyone thinks about data: from our business leaders, to our product managers, software engineers, and designers.