Attempting to define a culture in words is challenging. Often, when we discuss tech culture, we’re really talking about a set of behaviours that we ask ourselves and our teams to exhibit. Creating the right environment to work in is arguably the most important thing any technology company can do. We take it seriously here, and truly believe that creating a forward-thinking, leading technology culture enables us to perform at our best and to attract, develop and retain top talent. This translates to a faster, more agile organisation that consistently delivers products and experiences that customers value, ultimately enabling us to build a successful business.
Dan Pink argues that to motivate a workforce you need to provide your employees with autonomy, mastery and purpose. The culture we’ve been building at OVO reflects these objectives, with particular attention paid to autonomy. In its simplest terms, we’d define our culture as “autonomy with responsibility”.
Autonomy has become a bit of a hot topic in technology, and there are of course varying levels on offer, so it’s worth clarifying at this point to what extent we believe in autonomy and how it manifests itself in decentralised decision making. There are two high-level elements of autonomy on offer if you work in a cross-functional feature team: technical autonomy and product autonomy.
We create feature teams around end-to-end customer problems. This means we don’t have frontend or backend teams; we have teams that have all the capabilities they need to solve the customer problems they own. Once we have defined the team’s overall mission and worked out the metrics that determine success from a customer’s perspective, we aspire to give each team true ownership of their roadmap. This means we view it as everyone’s responsibility, not just the product manager’s, to define and execute on the roadmap. The hypothesis supporting this is that teams will naturally seek the path of least resistance in trying to achieve their goals, picking the low-effort, high-value changes along the way. Conversations within the team should always be framed as value to customers versus effort.
From time to time we still have “top down” projects that cut across multiple feature teams. Technical autonomy remains in these scenarios; product autonomy, however, is challenged.
In this section we will explain our approach to building and maintaining software, the type of people we look to hire, our reactive microservice architecture, our org structure and finally our approach to building a community of practice.
In an environment where teams are motivated by improving customer outcomes, we believe they should have a choice in the tools that enable them to solve those problems most effectively. This means they select their preferred tools for continuous integration, monitoring and logging, as well as their frameworks, libraries and, to an extent, programming languages. Standardisation, where it does occur, is driven by communities of practice and not by the management team. We’re very conscious of team coupling, and although teams often agree on the use of certain tools, we’re cautious about sharing the implementation.
We give teams full responsibility for designing, building, testing and running their applications. As a by-product of this we don’t have some of the centralised technology teams you might see in other organisations such as architecture, devops or QA. We promote a devops culture within our teams and ask them to manage their own infrastructure in the cloud.
Designing the solutions to our customer problems is one of the most important aspects of the technical autonomy we offer. We believe it motivates our teams, produces better outcomes and emphasises speed over governance. The argument against this type of approach is the risk of duplication. However, we believe light duplication is acceptable, if not desirable, if it leads to greater understanding of the problem domain within the team, and our communities of practice help to guard against large-scale waste. We lean towards sharing code at a service level over a library level, particularly guarding against sharing large libraries with transitive dependencies.
Providing team autonomy isn’t (unfortunately) just a matter of conveying the message of autonomy. There are things we need to do as an organisation to allow teams to operate in an environment with decentralised governance. How we structure our teams and the technical decisions we make both support our cultural goals. We always keep team coupling in mind when we make decisions. Teams owning their own AWS accounts, a move away from large shared libraries and our adoption of Kafka all help us to keep our teams as independent as possible.
We’d like to start with arguably the most important part: you need employees with the desire and skill to operate in this model. It’s become common in our industry for organisations to spread the responsibility of delivering software over multiple teams, particularly teams managing infrastructure, QA teams testing software or architects designing systems. Some engineers prefer to work in environments like that, where you only need to concern yourself with producing code. That doesn’t align with what we’re working towards here. We want engineers who crave the ownership of a problem from start to finish and have the right skill set to execute a holistic solution.
HTTP and Kafka
Using point-to-point HTTP as your sole mechanism for inter-service communication is a sure-fire way to create tight coupling between systems and teams. The default microservices implementation you’ll read about online often uses REST over HTTP, which has led to most microservice deployments being implemented in a similar way. Whilst this might seem okay with relatively small deployments, as you grow you’ll build up a complex web of dependencies between your services, because each service becomes aware of all the other services interested in changes to its data set. Any time a new service comes online and is interested in the data residing in another service, you need to make changes in the originating service. As systems grow, point-to-point communication becomes a major blocker to team independence.
An example of a platform built with point to point communication. Taken from here.
Enter Apache Kafka.
Kafka is a distributed streaming platform that gives us platform-wide data accessibility via a publish / subscribe model. It safely stores and replicates data set changes. If you’re interested in the data, you can subscribe to a topic (a way of categorising event types) in Kafka and stay up to date as changes occur.
Our platform continues to grow and more and more services require data from across the system. When that data is available on Kafka, the originating service doesn’t need to make any additional changes. This feature enables teams to solve problems independently. Integrating with Kafka provides a single point of reference for all services. Other teams with different objectives can access information without interrupting the original roadmap.
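To make the contrast concrete, here is a minimal sketch in Python. It is not the Kafka client API: `InMemoryLog` is a toy, in-memory stand-in for a broker, and the class names, the `account-changes` topic and the event fields are all hypothetical, chosen purely for illustration. It shows the one property that matters for team coupling: a point-to-point service has to be told about every consumer, whereas a publisher to a topic does not.

```python
from collections import defaultdict
from typing import Callable, Dict, List

Event = Dict[str, str]


class PointToPointService:
    """Originating service that has to know every downstream consumer."""

    def __init__(self, downstream: List[Callable[[Event], None]]) -> None:
        # A new consumer means editing this list, i.e. changing this service.
        self.downstream = downstream

    def update(self, event: Event) -> None:
        for call in self.downstream:
            call(event)


class InMemoryLog:
    """Toy stand-in for a broker: one append-only list of events per topic."""

    def __init__(self) -> None:
        self._topics: Dict[str, List[Event]] = defaultdict(list)

    def publish(self, topic: str, event: Event) -> None:
        self._topics[topic].append(event)

    def read(self, topic: str, offset: int) -> List[Event]:
        return self._topics[topic][offset:]


class Consumer:
    """Tracks its own offset, so it can join late and still catch up."""

    def __init__(self, log: InMemoryLog, topic: str) -> None:
        self.log = log
        self.topic = topic
        self.offset = 0

    def poll(self) -> List[Event]:
        new_events = self.log.read(self.topic, self.offset)
        self.offset += len(new_events)
        return new_events


# Point to point: the originating service is constructed with its consumers.
billing_inbox: List[Event] = []
p2p = PointToPointService(downstream=[billing_inbox.append])
p2p.update({"account": "a1", "tariff": "fixed"})

# Publish / subscribe: the publisher only knows the topic name.
log = InMemoryLog()
billing = Consumer(log, "account-changes")
log.publish("account-changes", {"account": "a1", "tariff": "fixed"})
first_batch = billing.poll()

# A second team subscribes later; nothing on the publishing side changes.
analytics = Consumer(log, "account-changes")
log.publish("account-changes", {"account": "a1", "tariff": "variable"})
billing_batch = billing.poll()      # only the event billing has not yet seen
analytics_batch = analytics.poll()  # the stored history, read from offset 0
```

In the sketch, adding the `analytics` consumer required no change to the code that publishes events, while adding a consumer to `PointToPointService` would mean changing how that service is built. A real Kafka deployment adds replication, partitioning and durable storage on top of this basic shape.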
The leadership team along with the engineering teams have produced a handful of documents to help formalise the contractual obligations of services. Our key document is the “Service Checklist” which also points to both our Kafka and Security Manifestos. These documents cover the essentials of what services should do and deliberately don’t go into the implementation detail, leaving the team a good degree of solution freedom. Examples from the checklist include: “Have agreed operational metrics and report on them”, and “Have up to date documentation of your APIs”.
Managers and Coaches
Producing the right structure within a team is a key part of getting the overall culture right. We’ve recently removed all reporting lines from within a cross-functional feature team. We believe that teams operate best within a flat structure, where all technical challenges are welcome and everybody is a peer.
This alleviates the pressure on a team lead to be both the strongest technical individual and a people manager, two skill sets which are often at odds with each other. The Engineering Manager replaces the previous team lead. This is a more specialist management role: ensuring all team members get the right personal development, ensuring the team’s voice is being heard, removing blockers, encouraging communication and enabling the teams to make effective technical decisions. The Engineering Manager is not part of the team, but instead falls into an enabling category looking after multiple teams - getting the most out of the individuals and the teams.
In addition to the manager roles we also have some specialist coach roles: a Devops Coach, an Agile Coach and a Security Coach. These coaches work with teams to ensure they have the necessary skills and knowledge required to own the complete lifetime of their software.
Tribes, Champions and Katas
One of the interesting side effects of creating technically loosely coupled teams is that there are fewer concrete reasons for them to speak to each other. To combat a reduction in informal knowledge sharing, we have three more formalised forums for communication and knowledge sharing.
Our Tribe session acts as a technology-wide show-and-tell. At each meeting a few teams present their recent work, particularly calling out any important decisions, new services or tools and pieces of research. This creates technical transparency and acts as a forum for feedback and review. This cross-team awareness enables follow-up conversations between teams as engineers encounter similar problems or choices.
The Champion groups are formed around specific areas of interest like monitoring, security or testing. The discussions within these more focussed groups are around best practices and produce a bottom-up standardisation where it makes sense. These unenforced standards give teams good guidelines to work within, but don’t have the stifling effect on innovation and progression that a more rigid top-down governance structure can have.
Katas are our hands-on workshops designed to share knowledge practically. An engineer will choose to run a session on a particular topic (TDD for example) for a set number of developers. The kata facilitator will produce a number of exercises for the attendees and be on hand to work through them with individuals as they need assistance. They’re like our own internal training camps.
Building self-organising, self-managing teams helps to produce better product output, improves retention and is a powerful weapon in the war for talent. It’s not easy to do, though. We can’t just want it to happen; we have to build an organisation that can support autonomy. The technical choices we make, the communication forums we create and the organisational structures we put in place all play a part in supporting a progressive approach to building software. All with the sole purpose of building something better for our customers.