OVO Tech Blog

Our journey navigating the technosphere

Andy Hall



How We Learned to Stop Role Playing and Love the Bomb

How we've used Keep Talking and Nobody Explodes to build incident response capability.

Following our efforts last year to simulate major incidents and build great incident response skills in our autonomous teams, the service management coaches have been looking for ways to do it more efficiently.

One drawback to building and running a simulation as a role play is the large amount of preparation to be done up front, coupled with the limited-use nature of each scenario: if a team has seen it once, it's no longer effective.

On the recommendation of PagerDuty's incident training, we decided to trial the excellent Keep Talking and Nobody Explodes as an alternative.

In this post I'll tell you how we ran the exercise, what we learned, and how it compared to the role play scenario.

Playing the game


The rules of the game are simple...

To make things harder we revised the rules a little and split the players up into two teams, communicating via Slack...

We played several rounds and then reconvened to reflect on what we noticed and learned.

So what did we learn and how does this relate to building a great incident response?

The more you do something together, the slicker it gets!

In the first round the bomb exploded after five minutes, with two of three modules solved. Two of the module types were present every time they played. In the final round, the players solved those same two modules within the first minute.
They quickly began to focus on one module at a time. Once everyone knew what to expect and developed a common understanding of cadence and process, they solved problems faster.
They were able to progress to harder bombs in later rounds, and also learned to solve new modules more quickly.

Empathise with others for better communication.

Making decisions, taking technical ownership and communicating are all challenging and come with their own pressures.
Understanding each other's information needs enabled clear, concise exchange of relevant information.
The Expert Team learned quickly to wait for the Defusal Team to give them information about the bomb and modules, responding with targeted questions.

Develop a shared vocabulary.

By design, information contained within the bomb modules is not easily described with regular vocabulary. Take a look at this module and imagine describing each symbol to someone over Slack.
Within a couple of rounds the players had already started to build a common lexicon: "Ok, now press the wobbly cactus."

Make use of "dead time".

At first there was a tendency to sit and watch the Slack channel, waiting for a reply from the other side. Communication was very linear and transactional: from the Defuser, to the Commander, to the Communicator, to the Experts and back.
After a few rounds, the Defusal Team was looking ahead to the next module, gathering and sharing information while the Expert Team deciphered the manual.
If you're not typing or fixing, start thinking and anticipating what you may need to do next.

Separate roles and responsibilities.

Separating roles is a tried and tested principle for great incident response, so we chose to define the roles as part of the rules, which also made it easy for people to understand how to play.
We did leave the Experts to figure out how best to work together. Interestingly, they naturally began to align themselves in a way that best suited the modules: for simple modules, they devoted a single expert; for complex modules, they worked together to solve the problem and validate their understanding before communicating with one voice.
The players appreciated the benefits of allowing Experts to focus on technical detail and dedicating resource to effective communication.

How does this differ from the simulation role play?

Forcing the use of Slack was a more realistic representation of how we tend to run incident response at OVO, with our engineering teams based in several locations. Communicating well using text comes with specific challenges and requires different sensibilities. That said, learnings here can be carried over into running major incident war rooms with voice or video.
This format enabled us to play a round, reset and repeat quickly. The short rounds and role rotation mean players can make mistakes and learn faster.
The timer on the bomb gives tangible time pressure, which is harder to recreate with a simulation.
The focus here was on teaching great intra-team communication and the exercise did not extend to customer- or stakeholder-facing communication.
There is limited decision-making required when playing Keep Talking. Although the content and configuration of the bomb change as you play, the rules remain the same and there aren't many surprises once you've played a few rounds.
In this exercise we defined and preallocated roles. In the role plays we left the players to self organise. The latter is better suited to specifically teach the roles we use in our actual incident process.

Tips for doing this yourself

