OVO Tech Blog
OVO Tech Blog

Our journey navigating the technosphere

Monty West

Software Engineer



Levelling up in Pure Functional Programming

It's a common journey, especially here at OVO. We get a taste of pure functional programming and fall in love. However, we struggle to apply our extensive knowledge of imperative paradigms and it's slow and frustrating to gain understanding.

I'm on this journey and have been since I joined OVO nearly 3 years ago. I'd worked with Scala before, but not like this. Previously I was writing "pretty Java", very much confined to imperative and object-orientated patterns, and now I was trying to write "JVM Haskell", without the language restrictions to keep me in line.

This blog is not a critique of Scala, or a discussion on Java vs Scala vs Haskell. On the one hand, this blog is about a book and what it made me realise about this journey. On the other hand, this is a blog about abstraction, the benefits it brings and how we utilise it with functional programming, with some code examples in Scala for illustration.

Eugenia Cheng's The Art of Logic

In December 2018, one year into my journey, I attended Scala eXchange here in London. The first day featured a keynote from Eugenia Cheng. No code was demoed, there was no mention of monads or implicits, instead the topics included white privilege, police brutality and gay marriage. The talk centred around abstraction and how we can take practices from mathematics and science and apply them to the real world to improve understanding and resolution. It's one of the best talks I've seen and I highly recommend watching it.

The talk served as an introduction to her book, The Art of Logic. With the extra time 2020 has gifted us all, I finally sat down to read it.

The Benefit of Abstraction

The book starts in a similar way to the talk: abstraction improves our discourse and furthers our mutual understanding; it allows us to have better arguments.

It doesn't solve the problem, but it gives us a clearer framework for discussing the problem, that's what abstraction does[.]
-- Eugenia Cheng, Scala eXchange 2018 Keynote

Eugenia's example of a great use of abstraction. The London tube map abstracts over the detail of real-world distance, allowing people to more easily plan their journeys.

When seeking a solution, trying to keep all the specifics in our heads at once can hinder our exploration. Trying to include these details, when we're communicating our ideas, can be long-winded and distracting. Temporarily forgetting some of these minor details and instead considering a higher level view can allow us to focus on the essence of a problem.

Moreover, this higher level view may reveal a similarity or connection to other problems. In turn, these relationships may reveal more about our original problem. We can use techniques and learnings from previous problems and apply them our issue.

Once we discover that different things are somehow the same, we can study them at the same time, which is much more efficient. ...
    ... For example, I have found a relationship between a Bach prelude for the piano and the way we might braid our hair. Finding relationships between different situations helps us understand them from different points of view, but it is also fundamentally a unifying act. We can emphasise differences, or we can emphasise similarities. ...
    When we look for similarities between things we often have to discard more and more layers of outer details, until we get to the deep structures that are holding things together.
-- Engenia Cheng, The Art of Logic

Precision in Abstraction

Unfortunately, abstraction is difficult and confusing. In her keynote, Eugenia describes the perils of misapplied or misunderstood abstraction:

Problems can arise in arguments about analogies, because someone will start hallucinating that you're actually using a different level of abstraction from the one that you're using if you haven't made it completely clear.
-- Eugenia Cheng, Scala eXchange 2018 Keynote

This is where logic enters the frame. Eugenia posits that logic is how we can ensure trust, consistency and mutual understanding.

Logic forms the basis for communicating our ideas within our abstractions. It gives us the language that all abstractions can agree upon and all parties can interpret unambiguously.

The primary benefit here is one of expression. We can very precisely say what we mean, and trust that it's understood. Moreover, as we make something more abstract it becomes easier to apply logic. As we forget superficial details, logic becomes more powerful. Abstraction and logic together afford us a more concise and declarative expression of our ideas.

Logic and abstraction are like shining a light at things. As we get more abstract, it's like raising the light off the ground. We see a broader context, but with less fierce detail; however, understanding the broad context helps us understand the detail later. In all cases the aim should be illumination of some kind. First we need some light, and then we can decide where, and how, to shine it.
-- Engenia Cheng, The Art of Logic

The second benefit is more subtle: composition. If all our abstractions use the same vehicle of expression, we're able to consider different aspects in isolation, resolve them separately and later recombine them. If we are able to do this precisely (i.e. logically) we know that the act of recombination won't undermine our total argument.

Said simply, if we can trust our smaller arguments, we can combine them safely into an equally trustworthy, larger argument. The caveat: we must be precise.

[Logic] is dry and formal, which can make it seem offputting and irrelevant. But its dryness is there for the excellent reason of making things crisp and clear rather than soggy and mushy. It also helps make things more concise, which in turn helps us build bigger and more complex arguments. It's a bit like those vacuum bags into which you put your clothes and then suck all the air out with a vacuum cleaner, condensing a whole pile of clothes into one compact unit.
-- Eugenia Cheng, The Art of Logic

Functional Programming as Abstractions

In listening to the talk, and now in reading the book, I can't help but see the parallels to functional programming. Seemingly I'm not alone, Eugenia was asked to give a keynote at a functional programming focussed conference.

I believe that functional programming can be seen as an abstraction over imperative programming. This is a particularly attractive idea because it avoids the sometimes adversarial nature of functional vs imperative. They are both entirely legitimate and valuable, they simply exist at different levels of abstraction.

The commonly cited benefits of functional programming map exactly to the benefits we discussed about abstraction above:

Avoid this with functional programming. Source https://heeris.id.au/2013/this-is-why-you-shouldnt-interrupt-a-programmer/

Usually, the justification for these benefits is lost among the description of the detail: what a monad is, or how implicits work. Instead, I like to view these benefits as goals. We're aiming to abstract over the problem, utilising functional programming concepts, to unlock these benefits. Your code isn't automatically more readable just because you use flatMap.

This is similar to the misapplied or misunderstood abstractions from earlier. We need to understand why we are applying the abstractions of functional programming. Moreover, we need ensure that we remain precise as we do so. Using the wrong level of abstraction and allowing imprecisions can often turn our benefits into flaws, making our code hard to reason about and difficult to work with.

Precision in Functional Programming

The primary building block in functional programming is, fittingly, functions. These are the vehicle by which we build out abstractions. Therefore, they must be precise (in the sense of abstraction) to ensure our abstractions are consistent, understandable and correct.

Functional Purity

The most important version of precision in functional programming is purity. A function is pure if it does nothing else than compute its return value. This is the same as saying all of the following:

Pure functions are referentially transparent, the function call can be replaced by the return value with no change to the program. The process of doing this is called substitution, and forms a good test for purity:

var myInt = 1

// Impure function
def addToMyInt(x: Int): Int = {
  myInt += x

println(addToMyInt(2) + addToMyInt(3))
// > 9

// Substituting functions for return values fails
println((1+2) + (1+3))
// > 7

Without referential transparency it's impossible to trust our functions. With pure functions, we know that we can use them wherever we need to, in any order. We don't need to know the internals of pure functions, only what they do. This trust is essential when building up larger and larger programs from smaller functions. Purity allows us to build on the work of others without needing to fully understand how it works.

Functional Integrity

As we'll see in the next section, functional programming has several function names that come up very often. Consistency with these function names is extremely important. To allow programmers to trust the functions they use, functional programming uses a very strict naming convention. Even more than that, these common functions are normally complying with certain laws.

For this blog, I'll refer to this as functional integrity. This is the ability to trust that function does what it says it does, regardless of whether or not it is pure. We'll see an example of this as we go through the levels below.

The 4 Levels of Abstraction in Functional Programming

The concept of functional programming as abstraction can be taken even further. The process of abstracting over key features of imperative programming can be broken down into four areas, and in my experience these tend to be encountered in sequence on the journey from imperative to functional.

Each level of abstraction could easily be their own blog post, so in the interest of brevity we'll follow a single thread through each level. Our aim is to introduce the concept at each stage and demonstrate how they provide value. Scala code examples will be provided and we'll keep an eye on purity and integrity as we go. I'll provide links for further reading where I can.

Abstraction: A shortcut to better code, albeit a scary one

Level 1: Abstracting over Implementation

Here we're talking about functions like filter, fold or map on predefined List classes. This tends to be a developer's first taste of functional programming. Usually it's a positive one: it allows them to declare exactly what they want to do, allowing the function to take care of all the boilerplate. Consider the following code example, where we're taking a list of strings and building the corresponding list of those strings' lengths.

def lengths(ls: List[String]): List[Int] = {
  if (ls.isEmpty) Nil
  else {
    val out = new ListBuffer[Int]()
    for (i <- 0 to ls.length) {
      val string = ls(i)

This code is doing quite a lot. We have to construct a new instance ourselves, we have to worry about the empty edge case, and we have to know how to traverse a list. These are all things that the List class is better placed to do itself:

def lengths(ls: List[String]): List[Int] =
  ls.map {
    string => string.length

map allows us to describe how we want to transform our list, item by item. The input to map is a function, from the type we have in the list (String) to the type we want (Int). In our example that function is  string => string.length. It applies this function to each element of our list, building a new list with the results.

class List[A] {
  def map[B](f: A => B): List[B]
Signature for map.

This solution clearly shorter, but more importantly it's declarative. The programmer is declaring their intent, rather than describing how it happens. This code is also clearer (as long as you understand what map does).

Furthermore, the programmer has to know far less about List, they only need to know how to extract the length from a string. In this way, the concerns are better organised. We allow the List implementer (or at least the implementer of the map function on List) to decide how to access and build lists, which they are very likely to be more knowledgeable about.

Removing the 'how' from our side of the implementation allows the owner of the structure we're operating on to decide it. It may be better for this List to handle elements several at a time, or maybe perform differently based on length. We're able to forget these details at our level and simply state what we want to happen. This 'what we want' over 'how to do it' is a central concept to functional programming.

The style of functional programming is to describe what you want, rather than how to get it.
-- Chris Renham, Stack Overflow response on functional programming

Purity at this level is rather simple. Our lengths function is pure if map is pure. Map stays pure by constructing a new list every time it is used (similar to how the first example works). If map were to modify the list in place then it would fail the substitution test. This is why functional programming favours immutable structures, it helps us ensure purity.

Beyond the need for purity, does this level restrict our abilities in any way? Will we always be able to find a function that allows us to transform the list in the way we want? I believe this to be a common source of frustration for developers approaching this level for the first time, and they often resort back to implementation specific solutions. Luckily, there is nearly always such a function (for lists, that function is usually fold). The reason: these functions aren't simply helpers on List, they're applicable for many other structures and in fact hint at a deeper concept.

Level 2: Abstracting over Structure

Imagine we enter a codebase to find the following.

def program(input: String): Output[String] = ...

We have a class (or structure) Output as the return type of the function. Our task is to write a function copies the functionality of program, but transforms the string(s) in the output with another function, which has been provided to us:

def transformOutput(string: String): Int = ???

How can we achieve this? Do we need to dig into the internals of Output? Do we need to copy the code out of the transformOutput and program functions? If we have the correct abstractions in place we don't need to do any of that.

If the Output structure has the map function, and we know that it's pure and does what we expect (integrity), then it's as easy as:

def ourTask(input: String): Output[Int] = {
  val output = program(input)

Or even more concise:

def ourTask(input: String): Output[Int] =

Once again, this is declarative and clear. It reuses all the parts that have been provided to us. In fact, our task was almost trivial in terms of effort. We didn't have to understand what program does, or what transformOutput does. Moreover, we didn't have to understand, or even know, the structure of Output.

Output may exist to allow program to return multiple outputs ( List structure), it might need to cater for the empty/null case (Option structure), or it may need to instead return an error message ( Either / Try structure). Not only is this not a concern for our task, it can also change independently without affecting our work.

All we needed to know is whether Output was "mappable" (i.e. has the map function). The examples listed above (List, Option, Either, Try) are all "mappable" and can all be safely substituted in the code. We can categorise all these structures together as "mappable". With this categorisation we move to an abstraction where the structural differences are forgotten and we're able to view them all as the same.

Other functionalities  (e.g. fold, zip, flatMap) of these structures can be categorised similarly. This gives us a plane of abstraction for each functionality, where we only care what a structure can do, rather than what the structure is. This is useful where we can't know (or don't want to) the underlying structure of something, but we still need to work with or transform it. All we really need to know is what categories (functionalities) our structures have.

Here our need for integrity in naming becomes clear. If these structures all had map functions that did different things then the whole abstraction would fall apart. This is why categories often come with laws: strict guidelines for what the functions they describe must do. For more on laws, check out this Rob Norris talk.

Once we have these categories we can find relationships and connections between them. Being in one category may mean your structure is automatically in another. This may allow the structures to be automatically granted functionality (e.g. you can build zip from flatMap). This abstraction of categories goes far beyond programming, there is a deep connection between functional programming and Category Theory in mathematics.

This is the basis of Scala libraries like cats. Many of the concepts and names are taken directly (e.g. structures in the "mappable" category are called Functors). This can make this process appear daunting and we spend too long within these abstractions we can start to lose sight of the benefit. Instead, let's check out another practical example:

Interlude: the Importance of flatMap

Let's go back to our program function and the Output and talk about specifically flatMap. Imagine our task is to write a function that applies one program, then takes the output and puts it through a second one:

def programOne(input: String): Output[String] = ...

def programTwo(input: String): Output[Int] = ...

For this we need flatMap:

class Output[A]{
  def flatMap[B](f: A => Output[B]): Output[B]

For list structures this would be akin to mapping each element to a new list and then flattening. For option structures, this will only apply the function if there is something in the output, otherwise it will return empty. Generally, we call structures that can implement this function monadic. It allows us to do our task as described, sequentially chaining together programs.

def ourTask(input: String): Output[Int] =
  programOne(input).flatMap{ outputOne =>

Now say we had a third or fourth program, we can just keep chaining them together with flatMap.

def ourTask(input: String): Output[Int] =
  programOne(input).flatMap { outputOne => 
    programTwo(outputOne).flatMap { outputTwo =>
      programThree(outputTwo).flatMap { outputThree =>

To avoid this endless nesting, Scala gives us some nice syntactic sugar for flatMap, a for comprehension. This allows us to write our flatMap chain as if they were lines of code:

def ourTask(input: String): Output[Int] = 
  for {
    outputOne   <- programOne(input)
    outputTwo   <- programTwo(outputOne)
    outputThree <- programThree(outputTwo)
    outputFour  <- programFour(outputFour)
  } yield outputFour

This reimagining of flatMap chains as "lines of code" is quite apt. Often in imperative programming, lines of code are doing exactly what flatMap does here: sequentially chaining together expressions or calculations. However, "lines of code" in an imperative codebase have no structure baked in. At each step they often have to worry about whether an output is null or whether the program can throw an exception/error. With flatMap and our abstraction over structure, we can have any structure we want in our "lines of code". If we want to handle null output, we can have Output be like Option, if we're worried about errors then we can have Output be like Try.

For me, this is one of the big payoffs of abstracting over structure. The categorisation of structures allows us to write code at the category level. This code can have an inherent structure baked into its "lines" (e.g.flatMap chains). We'll never have to write null checks or try/catch blocks, our underlying structure will take care of it for us.

Level 3: Abstracting over Effects

Most programs are only useful if they can be effectful. They'll have to read from a file, interact with a database or call an API. These actions have an effect outside of the program. Our abstractions are only truly useful if we can affect the real world with them.

However, our functions must be pure, and therefore cannot have side-effects, they cannot effect anything outside of themselves. For example, this function that simply prints to the console is impure. But what if we need to print to console within our program?

def printHello(): Unit = println("Hello")

val print = printHello()
(print, print)               // Prints once

// Using the substitution principle
(printHello(), printHello()) // Prints twice
Impure function, necessary functionality

This is where the IO Monad comes in. The IO Monad is our vehicle for affecting the outside world. It is a class that allows us to describe the effect we want to have, without actually running it. It only runs when we tell it to (but when we do, that function call is impure).

The IO Monad is a class that wraps up the effectful code we need to write and delays its execution:

def printHello(): IO[Unit] = IO.delay(println("Hello"))

val print = printHello()
(print, print)               // Prints nothing, tuple containing two IOs

// Using substitution principle
(printHello(), printHello()) // Prints nothing, tuple containing two IOs
Pure as it passes the substitution principle

IO is a structure like the others we've seen. Therefore, it benefits from the abstractions introduced in Level 2. As the name suggests, it is monadic, and therefore we can call flatMap on it. It also means we can chain these IOs together to build up more and more effects. Once we have the full chain, the one that describes our entire program, we can then run it as the last thing we do. We usually refer to this as running "at the end of the world", and as it's the last thing we do the side effects don't affect our ability to reason about the code.

final def main(args: Array[String]): Unit = {
  program.unsafeRunSync()         //Impurity as the last thing we do

def program(): IO[Unit] =         // Pure function
  for {
    dbValue    <- getFromDatabase()   // Each function returns an IO
    apiReponse <- callApi(dbValue)    // and chains them with flatMap
    _          <- printResponse(apiReponse)
  } yield ()

We can bake the IO structure into the "lines" of our code. We can write effectful code and remain pure. In remaining pure we can ensure our abstractions are precise and reasonable. We can write real world code with all the benefits we've gained from abstractions in levels 1 and 2.

Level 4: Abstraction over Control Flow

The IO Monad allows us to write real world, effectful code. However, it only allows us to write it as a chain of sequential computations. Can we abstract over the concepts of concurrency or streams and remain pure?

In my journey to pure functional programming I am still learning the abstractions at this level. Luckily there are several good resources on the subject, my favourite being this talk by fellow OVO colleague, Fabio Labella.

This is the hardest level in pure functional programming, but mainly because control flow itself is difficult. It's also the level where we can reap the most benefit from abstraction. I'm personally unable to write imperative code that does what I can write in a few lines of pure functional control flow.

Structures that gives us purity in data streaming, global atomic value and resource management make those difficult concepts easy to reason about. Categorising these structures similarly to how we did in level 2 allow us to manipulate control flow in a declarative and clear manner. Moreover, it means the nasty details can be hidden (within libraries) away from us, the program writers, when we're writing our business logic.

The End of the Journey

All levels combined, pure functional programmers don't necessarily need to understand the internals of difficult concepts or advanced structures in order to utilise them. They can transform and build programs in a declarative manner. Commitment to purity throughout means that these programs can be composed of smaller independent pieces. Overall, we're able to write concise, clear and more powerful code with less effort.

In Closing

I find it most beneficial to think of functional programming as abstractions of imperative programming concepts. The benefits we can gain from functional programming are the same that we gain from abstraction in general.

I've arranged these abstractions into levels as I feel they're a useful tool when discussing functional programming. It also tends to reflect the timeline of learning for new functional programmers, so I'm hoping it's possible for you to spot at which level you're currently working.

Focussing on the abstractions of functional programming can provide motivation in learning this new paradigm. The largest benefit is unlocked at the end of the journey, so knowing what we're aiming for can make earlier levels feel less excessive. Moreover, instilling the importance of precision can contextualise the potentially daunting appearance of functional concepts.

Lastly, I hope this blog demonstrates the overall benefit of functional programming. It doesn't replace imperative programming, it provides us with abstraction where we can: be declarative and concise, express our ideas exactly and safely and, best of all, utilise functionality that we're unable to write imperatively ourselves!

Monty West

Monty West

Software Engineer

View Comments