Understanding Scala Options
The option type can be a tricky pattern for new Scala developers to pick up. Most of that difficulty lies in using options practically in code as opposed to the concept itself -- and you will certainly be using it. While there are some Scala features that are mostly invisible to you (e.g. macros), options are ubiquitous in Scala code. Options are unavoidable in Scala if you're writing anything worth anything.
As a concept, options are relatively simple and most new Scala developers can easily understand the motivation behind them -- but when the going gets rough and you're neck-deep in options, it's easy to find yourself thinking "why do we even use this anyway?" So let's start with the motivation behind options to keep our sights set straight. If you don't need to be convinced about options, you can skip this section.
Motivation for Options
If you've written in any language that doesn't use some form of the Option type (most languages), you've undoubtedly seen a NullPointerException or something similar pop up at runtime, e.g. the anxiety-inducing Cannot read property '…' of undefined in JavaScript land. There's no guarantee that a variable references an actual value. From here on, I'll use "empty value" to mean the absence of a value instead of using one language-specific keyword like null, undefined, or nil.
Let's consider a JavaScript example:
What if thing's value is an empty value? The world blows up. Okay, so that's an easy fix, right?
I'd rather not get into the rabbit hole of null/undefined checking in JavaScript, so this example will do.
Not awful, right? But now do this for every parameter of every one of your functions and this proposition becomes drastically less appealing. Sometimes you actually are sure that the value exists, because maybe you already checked it earlier on. But how can you be sure of that? What if you re-purpose this function or it starts being called from another location that doesn't actually do that empty value check? So to combat this, you resort to always doing these checks, often needlessly.
You could argue that code verbosity and tedious syntax are a solved issue through the use of an IDE or auto-completed code snippets. (I'd disagree, but that's beside the point.) Then how about this less trivial issue: when writing a function, how do you signal to its callers that you might not actually have a value to return, i.e. you might be returning an empty value?
An example might be a method that takes an array and tries to find some element -- if that element is not found, you'll likely return an empty value, but how does the caller know that?
Some likely answers are:
1. You find out the hard way: something breaks and then you make the change to your code
2. The method uses some other way of signalling this, e.g. returning -1
3. You look through the documentation (if it exists) and read about it there
At this point we can address the issue of calling methods that could return empty values or using parameters that could be empty values in two ways:
1. Simply don't check for empty values (this is decidedly the wrong way)
2. Do a bunch of manual empty value checking yourself

You must contend with the possibility of empty values one way or the other or risk potentially catastrophic failure of your application.
The secret third door here is to use options. If a reference may or may not have a value, you make its type an Option
to signify as much (we'll get into the details of precisely how to do that later).
Baking this feature into your language has a huge implication: all non-Option references are now guaranteed to be non-empty i.e. defined. This implicit consequence of using options may actually be more important than the use of options themselves.
This means no more null-checking for parameters -- they're guaranteed to exist (though we still have the ability to accept potentially empty values). Conversely, if your function may not have any value to return, make the return type Option
and the caller will be forced to deal with that possibility.
This design makes it so that we're always explicit about when we will have to deal with the possibility of empty values. This lets you have one less issue to worry about so you can focus on all the other terrible bugs you write.
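To make that concrete, here's a small sketch. The names findFirstNegative and double are made up for illustration; the point is only that the signature itself tells the caller whether an empty value is possible.

```scala
// Hypothetical lookup: the return type says "there may be no result".
def findFirstNegative(numbers: List[Int]): Option[Int] =
  numbers.find(_ < 0)

// A plain Int parameter, by contrast, needs no empty-value check.
def double(n: Int): Int = n * 2

val hit: Option[Int]  = findFirstNegative(List(3, -1, 4)) // Some(-1)
val miss: Option[Int] = findFirstNegative(List(3, 1, 4))  // None

// The caller has to decide what an empty result means before using it.
val doubledOrZero: Int = hit.map(double).getOrElse(0)
```

Nothing here needs documentation or a magic -1 to communicate "not found".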
Options Implemented
I'm operating under the assumption that you have some basic knowledge of Scala and type parameters.
Option[A]
is an option of type A
--or an A
option. You can think of Option
as a container for some value. In a concrete example, Option[Int]
is an option of type Int, or an Int option. If I say a variable x
is of type Option[Int]
you know that x
may or may not actually have a value.
Option is an abstract class so we can't create an instance of Option directly. We use one of its two implementations: None or Some[A]. It should be apparent what those two mean. None means a reference has no value, i.e. it plays the role that null plays elsewhere. Some[A] means a reference does have a value and the type of that value is A.
Assigning variables a value of None is simple (None is a Scala object so we can use it like a normal value):
val x: Option[Int] = None
-- make a variable x and explicitly define that it has no value, which can be used anywhere where options of Ints are accepted. An alternative could be Option.empty[Int], which does the same thing. You may have to use this alternative form when it's not clear to the compiler what Option type to infer.
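Here's a short sketch of where the two forms differ in practice. The fold at the end is my own illustration of a spot where the explicit type argument helps, since a bare None as the starting value would typically be inferred too narrowly.

```scala
val x: Option[Int] = None   // the type comes from the annotation
val y = Option.empty[Int]   // the type comes from the type argument

// Starting a fold from Option.empty[Int] tells the compiler that the
// accumulator is an Option[Int], not just None.
val firstEven: Option[Int] =
  List(1, 4, 6).foldLeft(Option.empty[Int]) { (acc, n) =>
    if (acc.isEmpty && n % 2 == 0) Some(n) else acc
  }
```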
Let's do the same for Some[A]
:
val y: Option[Int] = Some(42)
--make a variable y and say it has the value 42
, which can be used anywhere where options of Int
s are accepted. The constructor for Some
takes one value of the type your option is parameterized on--in this case Int
.
If you're interop'ing with Java, you can use the Option
object's apply
method to wrap a value in an option to prevent those nasty null
s from invading Scala land:
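A minimal sketch of that wrapping, assuming a hypothetical legacyLookup method standing in for a Java API that may return null:

```scala
// Hypothetical stand-in for a Java method that may return null.
def legacyLookup(key: String): String =
  if (key == "known") "value" else null

// Option(...) turns null into None and anything else into Some(...).
val found: Option[String]   = Option(legacyLookup("known"))
val missing: Option[String] = Option(legacyLookup("unknown"))

// Careful: Some(null) is still a Some, so prefer Option(...) at Java boundaries.
val wrappedNull: Option[String] = Some(null)
```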
Using Options
So this is cool and all, but how do we do anything meaningful with these values? For example, we can't just add two Option[Int]
s together, so how do we work with the actual underlying value in an option -- an Int
in this case -- if it in fact has one?
But first, since everyone loves an aside on frivolous things, let's talk about variable names. Typically you don't want to name your variables based on their types. For example, nameString: String is a bad variable name because it's superfluous -- name is more appropriate and concise.
However, when it comes to options I think it's useful, if not important, to name those variables with some identifier that lets you know that you'll have to deal with the possibility of an empty value. It signifies not just the type but how you should be dealing with this variable.
Some examples I've seen include nameOption, nameOpt, and nameO. My go-to is maybeName. If you're looking for a great reason here, I don't have one. It's probably because that's how most option variables were named in the codebase I worked on when I started using Scala and now I just use it out of habit.
Feel free to use whatever variable naming scheme you like. While I'd highly recommend signifying your variable is an option in some way, it's certainly still acceptable to go without it e.g. name: Option[String]
. Let's get back to more important stuff now.
.get
Maybe the simplest way to use an option is with a couple of convenient methods, .isDefined (or its inverse .isEmpty) and .get, which either returns the contained value for a Some or throws a NoSuchElementException (😱) if it's a None.
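As a quick sketch of that check-then-get pattern (maybeAge is just an illustrative name), including what happens when the check is skipped:

```scala
val maybeAge: Option[Int] = Some(30)

// Check first, then extract. Safe, but only because of the isDefined guard.
val description: String =
  if (maybeAge.isDefined) s"Age is ${maybeAge.get}"
  else "Age unknown"

// Without the guard, .get on a None throws NoSuchElementException.
// Try captures the exception here so the snippet keeps running.
val failure = scala.util.Try(Option.empty[Int].get)
```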
Now that that's out of the way, let's be clear on never using .get
. Ever. No seriously, do not ever use .get
on an option. The most important reason is that you run the risk of throwing an exception where you don't expect it. In this example we're checking if the option is defined first, so the .get
is technically safe here, but it's easy to forget to perform that check every time.
Even if you do check if an option is defined every time before using it, this pattern is pretty clunky and gets tedious very quickly - and if you know anything about Scala developers, we hate repetitive tedious syntax. We can definitely write this in a more pleasing way, so there's basically no reason to ever use .get
.
Default values
Another easy way to deal with options is to just use a default value if the option is empty. This is fairly straightforward to do with the .getOrElse method. This is a pretty common method of handling options.
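A small sketch with made-up values:

```scala
val maybeNickname: Option[String] = None

// Empty option: we get the default.
val nickname: String = maybeNickname.getOrElse("Anonymous")

// Defined option: we get the contained value and the default is never evaluated.
val known: String = Some("Ace").getOrElse("Anonymous")
```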
Pretty simple, right?
Pattern matching
Pattern matching may be the closest analogue to our first example with .get
. Just like pattern matching in general, using it on options allows you to easily take specific actions on all the different possibilities of values.
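For instance, a sketch with a hypothetical describe function that handles both cases explicitly:

```scala
def describe(maybeName: Option[String]): String = maybeName match {
  case Some(name) => s"Hello, $name"   // name is the underlying String
  case None       => "Hello, stranger"
}

val greeting: String  = describe(Some("Ada"))
val anonymous: String = describe(None)
```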
You can easily get the underlying value of the Some
with pattern matching as shown above.
Options as Collections
Using options as if they were collections is probably more conceptually difficult for new Scala developers but it delivers on the promise of elegant, idiomatic Scala that you should expect.
Think of options as collections with a maximum of one element -- so it's either an empty collection (None), or a collection with just one element whose type is the A in Option[A]. Now that we have this special type of collection that can only be one element long, what can we do with it? Basically everything that a collection can do, which is what makes this comparison so powerful.
Let's take a look at some of the most common methods we use on a collection that we can also use on an option: map
, foreach
, flatMap
, flatten
, fold
, filter/filterNot
, and collect
. We'll also cover .getOrElse
a little more and a similar method, .orElse
.
.map
We can apply a function to the value contained in an option with the use of map. This works just like in a collection where we go through each element and apply some function. In the case that the option is empty, it just gets mapped to an empty option -- just like mapping on an empty collection.
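A minimal sketch of mapping over a defined and an empty option (the values are arbitrary):

```scala
val maybeCount: Option[Int] = Some(20)

// The function is applied to the contained value.
val doubled: Option[Int] = maybeCount.map(n => n * 2)

// Mapping over an empty option just gives back an empty option.
val stillEmpty: Option[Int] = Option.empty[Int].map(n => n * 2)
```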
While this is a perfectly suitable use for mapping with a transformative function, perhaps a more common use case is pulling out properties of a class that's wrapped in an option because it can get tedious quickly:
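Here's a sketch of the tedious version. The Player class is my assumption of what such a class might look like; only its name and height fields matter for these examples.

```scala
// Hypothetical Player class; the height may be unknown.
case class Player(name: String, height: Option[Double])

val maybePlayer: Option[Player] = Some(Player("Sam", Some(1.93)))

// Unwrap with a match, only to wrap the result right back up.
val maybeName: Option[String] = maybePlayer match {
  case Some(player) => Some(player.name)
  case None         => None
}
```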
We can replace that whole match with a single map:
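A sketch of the map version, again assuming a hypothetical Player class with name and height fields:

```scala
// Hypothetical Player class; the height may be unknown.
case class Player(name: String, height: Option[Double])

val maybePlayer: Option[Player] = Some(Player("Sam", Some(1.93)))

// Same result as matching on Some/None by hand.
val maybeName: Option[String] = maybePlayer.map(p => p.name)

// A None maps to None without the function ever being called.
val noName: Option[String] = Option.empty[Player].map(p => p.name)
```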
We really only want to do something when maybePlayer
has a value. Let's get into our options-as-collections mindset: mapping on an empty collection always gives back an empty collection--in the case of options, None
s always map to None
. Some
s work exactly like you'd expect a normal collection to map, just with a maximum of one element. We apply the function p => p.name
to each element of the "collection", so empty options stay as is, and non-empty options get the function applied to them and we get the result of the function application.
If you know a little bit of Scala syntactic sugar, you know we can further reduce this down to maybePlayer.map(_.name)
for some real clean, concise beauty.
.getOrElse
We can also use maps in conjunction with .getOrElse
that we used before since mapping returns another option.
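A sketch of the chained form, using the same hypothetical Player class as an assumption:

```scala
// Hypothetical Player class; the height may be unknown.
case class Player(name: String, height: Option[Double])

val maybePlayer: Option[Player] = Some(Player("Sam", None))
val noPlayer: Option[Player]    = None

// map gives an Option[String]; getOrElse collapses it to a String.
val playerName: String  = maybePlayer.map(_.name).getOrElse("No player")
val missingName: String = noPlayer.map(_.name).getOrElse("No player")
```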
Since .map
returns an Option
, we can chain on a .getOrElse
which will evaluate to the person's name if maybePlayer
is defined, or "No player" otherwise.
.orElse
A similar method is .orElse, which lets us provide a default value (just like .getOrElse). However, instead of providing a default value of type A we provide one of type Option[A]. While .getOrElse lets us either extract the contained value or provide a default, both of type A, .orElse gives us the still-contained original value or a completely different option as a default. We can easily chain on as many as we'd like to give us many different possibilities. For whatever reason, you'll often see .orElse called using infix notation, i.e. without a dot.
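A sketch of chaining fallbacks in both notations (the variable names and the address are invented for illustration):

```scala
val primary: Option[String]   = None
val secondary: Option[String] = None
val fallback: Option[String]  = Some("fallback@example.com")

// Dot notation: the first defined option in the chain wins.
val chosen: Option[String] = primary.orElse(secondary).orElse(fallback)

// The same chain in infix notation.
val chosenInfix: Option[String] = primary orElse secondary orElse fallback

// The result is still an option, so it can end with a getOrElse.
val address: String = chosen.getOrElse("nobody@example.com")
```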
The infix notation starts looking a little weird to me once you start chaining, so I personally avoid it. In general, I basically avoid infix notation entirely unless the operator is a symbol.
.foreach
.foreach
works almost exactly like .map
does (just like its collection-based equivalent) except it's only meant for performing some side-effect, i.e. returns Unit
. If the option has a value, execute some function, otherwise do nothing.
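A small sketch. Here the side effect is appending to a buffer rather than printing, purely so the effect is easy to observe; record and seen are made-up names.

```scala
import scala.collection.mutable.ListBuffer

val seen: ListBuffer[String] = ListBuffer.empty[String]

// Runs the side effect only when the option has a value.
def record(maybeName: Option[String]): Unit =
  maybeName.foreach(name => seen += name)
```

Calling record(Some("Sam")) appends to seen, while record(None) does nothing at all.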
.flatMap and .flatten
Just like in collections, flat mapping helps us avoid nesting values unnecessarily. If the function that we're using to map returns an option itself, it's probably best to use flat map to avoid something like Option[Option[A]], which can quickly become a nuisance to use. Flatten should be straightforward if you've ever used it on a collection: .flatMap is equivalent to .map(...).flatten.
In our Player
class, height
could be missing, since it's an Option[Double]
--flat mapping gives us a clean way to extract that data.
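A sketch of the three variants side by side, assuming a hypothetical Player class whose height is an Option[Double]:

```scala
// Hypothetical Player class; the height may be unknown.
case class Player(name: String, height: Option[Double])

val maybePlayer: Option[Player] = Some(Player("Sam", Some(1.93)))

// map alone leaves us with a nested option.
val nested: Option[Option[Double]] = maybePlayer.map(_.height)

// flatten removes one level of nesting...
val flattened: Option[Double] = maybePlayer.map(_.height).flatten

// ...and flatMap does both steps at once.
val maybeHeight: Option[Double] = maybePlayer.flatMap(_.height)
```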
This also enables us to cleanly provide a default value as well.
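Here's a sketch comparing the ways to supply that default, again with an assumed Player class. The explicit Any annotation on the last value is mine, added to make the loosely inferred type visible.

```scala
// Hypothetical Player class; the height may be unknown.
case class Player(name: String, height: Option[Double])

val maybePlayer: Option[Player] = Some(Player("Sam", Some(1.93)))

// flatMap: one default, and the result is a plain Double.
val height: Double = maybePlayer.flatMap(_.height).getOrElse(0.0)

// map: we need a default inside the function as well as outside.
val viaMap: Double = maybePlayer.map(_.height.getOrElse(0.0)).getOrElse(0.0)

// map without the inner default: the result is either an Option[Double]
// or a Double, so Any is the only annotation that fits both.
val loose: Any = maybePlayer.map(_.height).getOrElse(0.0)
```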
Notice the bad type inference on the lines where we don't do .getOrElse in the function passed to .map. The compiler infers an Any because the resulting type when evaluating just the map is Option[Option[Double]], so calling .get on it would give back Option[Double] and we're providing a Double as a default value. This forces us to also throw on a .getOrElse in the map to get the type we want -- so use flat map instead!
.fold
Folding with options works almost identically to the .map(...).getOrElse
pattern we saw before except in reverse order -- the default value comes first. Let's compare the two approaches:
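A sketch of both approaches, using an assumed Player class:

```scala
// Hypothetical Player class; the height may be unknown.
case class Player(name: String, height: Option[Double])

val maybePlayer: Option[Player] = Some(Player("Sam", None))
val noPlayer: Option[Player]    = None

// Default value last.
val viaMap: String = maybePlayer.map(_.name).getOrElse("No player")

// Default value first, then the function for the defined case.
val viaFold: String = maybePlayer.fold("No player")(_.name)

val emptyFold: String = noPlayer.fold("No player")(_.name)
```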
They're visually similar and functionally identical. I tend to go the .map & .getOrElse route because it's more intuitive to me to have the default value defined after. Whatever floats your boat here is fine.
.filter and .filterNot
These methods work the same as their collection-based counterparts. For .filter, if an element satisfies a predicate, it remains in, otherwise it gets filtered out. .filterNot simply inverts the predicate, as its name implies. Since there's only one "element" in an option, the predicate is only checked once. Filtering an option can make a Some turn into either a Some or None, but filtering on a None will always give back a None.
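A quick sketch with an arbitrary age check:

```scala
val maybeAge: Option[Int] = Some(17)

// The predicate fails, so the Some becomes a None.
val adult: Option[Int] = maybeAge.filter(_ >= 18)

// filterNot inverts the predicate, so the value stays.
val minor: Option[Int] = maybeAge.filterNot(_ >= 18)

// Filtering a None always gives back a None.
val stillNone: Option[Int] = Option.empty[Int].filter(_ >= 18)
```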
.collect
In my opinion, .collect on options is mostly not that useful. Most operations you can do with a collect you could also do with a map, combined with a filter where needed. What separates them is that collect accepts a partial function whereas map accepts a plain function (which means it can also accept a partial function), and even then the implementation of collect on Option actually calls lift on the partial function, which converts it to a plain function that returns an option. That implementation detail is actually the one thing that collect has over map -- you won't run into MatchErrors with collect.
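A sketch of the difference, using a deliberately incomplete partial function (label is a made-up name):

```scala
// Only defined for the value 1.
val label: PartialFunction[Int, String] = { case 1 => "one" }

// collect: values outside the partial function's domain become None.
val collectedHit: Option[String]  = Some(1).collect(label)
val collectedMiss: Option[String] = Some(2).collect(label)

// map: the same partial function throws a MatchError for 2.
// Try captures the exception here so the snippet keeps running.
val mapped = scala.util.Try(Some(2).map(label))
```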
As you can see, both map and collect accept partial functions as parameters but collect is safe from match errors. That said, this would be a rare use case as I'd say most partial functions for a collect don't have a possibility for MatchErrors anyway, so I typically find it of little utility.
Boolean Helpers
There are a few helpful methods that can concisely and idiomatically express common Boolean checks performed on options. Although these are technically still collection-like methods, I'd categorize them differently in my mental model because in the context of options they're more like convenient shorthands than collection methods.
For each of these, I'll give you what the actual implementation for them is (they're extremely short) and explain it in plain English to try and help give you intuition on how to use them. I'll also provide some alternative ways to write the same expression to give you an understanding of situations where you might use them.
.contains
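As a stand-in for the implementation, here's a sketch written as a free-standing function (containsSketch is my own name, not the library's) that follows the logic described in this section, together with a few alternative ways to express the same check:

```scala
// Sketch of the logic: the option is not empty AND its value equals elem.
def containsSketch[A](opt: Option[A], elem: A): Boolean =
  !opt.isEmpty && opt.get == elem

val maybeAnswer: Option[Int] = Some(42)

// The concise form.
val direct: Boolean = maybeAnswer.contains(42)

// Alternative forms that evaluate to the same thing.
val viaEquals: Boolean = maybeAnswer == Some(42)
val viaMap: Boolean    = maybeAnswer.map(_ == 42).getOrElse(false)
val viaMatch: Boolean  = maybeAnswer match {
  case Some(value) => value == 42
  case None        => false
}
```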
This implementation is extremely simple, right? An option contains element elem if the option is not empty AND the underlying value of the option is equal to elem. This explanation should allow you to think up what the truth table looks like for this. If the option is empty (None) it can't contain anything! If you think of this expression in plain English you can easily tell what the evaluation should be by asking: does my option contain this value?
Looking at the alternative forms, it's easy to see why we'd prefer .contains
.
These next two examples will use a variable p
, which gets passed as an argument to the methods. p
is a function A => Boolean
, i.e. a function that accepts a value of your option's contained type, A
, and returns a Boolean. It's p
for predicate -- we're checking whether an element of type A
satisfies the predicate p
, which returns true or false accordingly.
.exists
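Again as a stand-in for the implementation, a sketch as a free-standing function (existsSketch is a made-up name) that follows the description in this section, plus one alternative form:

```scala
// Sketch of the logic: the option is not empty AND its value satisfies p.
def existsSketch[A](opt: Option[A])(p: A => Boolean): Boolean =
  !opt.isEmpty && p(opt.get)

val maybeAge: Option[Int] = Some(21)

// The concise form.
val isAdult: Boolean = maybeAge.exists(_ >= 18)

// An alternative form: an empty option falls back to false.
val viaMap: Boolean = maybeAge.map(_ >= 18).getOrElse(false)

// Nothing exists in an empty option, so this is false.
val emptyCase: Boolean = Option.empty[Int].exists(_ >= 18)
```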
Read this as: there exists a value in this option that satisfies the predicate p. If this looks familiar, it's because it's almost exactly the same as .contains. .contains is basically a specific case of .exists where the predicate is this.get == elem.
I won't list the other alternatives again since they're mostly the same as for .contains
. This is a convenient and concise way to do simple Boolean checks on the underlying value of an option.
.forall
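One more stand-in for the implementation, as a free-standing sketch (forallSketch is my own name) that follows the description in this section, plus one alternative form:

```scala
// Sketch of the logic: the option is empty OR its value satisfies p.
def forallSketch[A](opt: Option[A])(p: A => Boolean): Boolean =
  opt.isEmpty || p(opt.get)

val maybeAge: Option[Int] = Some(16)

// The concise form.
val allAdults: Boolean = maybeAge.forall(_ >= 18)

// An alternative form: an empty option falls back to true.
val viaMap: Boolean = maybeAge.map(_ >= 18).getOrElse(true)

// An empty option has no value that could fail the predicate.
val emptyCase: Boolean = Option.empty[Int].forall(_ >= 18)
```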
Again, this implementation looks very familiar. The only difference this time is that empty options are acceptable to satisfy the predicate. Let's try to express this in plain English: every value contained in this option satisfies the predicate p. The key word here is contained, because an option that doesn't have any values (None) doesn't have any values that need to satisfy the predicate. This is expressed with the || in the implementation. An empty option OR a contained value that satisfies the predicate both make this evaluate to true.
You'll notice in the alternatives the only difference between this example and .exists
is that we're using .getOrElse(true)
instead of .getOrElse(false)
.
If cutting down on code verbosity is one of our goals, these Boolean helper methods certainly get us a step in that direction.
Simplified Source Code
In this section I'll provide what I think is an important way to understand Scala: looking directly at the source code. That can sound intimidating, but if you can cut to just the core logic of the code, you'll find it's actually quite simple. If you look at the source for Option, it can be a little difficult to follow as you're trying to read past the comments looking for actual code.
What I've done is stripped down the option code to just its barest, removing comments, annotations, even some keywords like final
, and reorganized the methods into what I think are logical groupings to provide you with the cleanest, simplest version of the source without changing the core of it.
Most of the type parameters shouldn't be difficult to follow, but the implicits in .orNull
and .flatten
aren't obvious, especially with the <:<
operator. If you'd like a good explanation of how that implicit parameter works, I'd recommend this excellent and thorough explanation. But if you just want to understand how options work, you can mostly ignore it.
Here's my simplified version of Option
:
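What follows is a sketch in that spirit rather than a copy of the standard library source. To keep it compilable next to the real thing, it uses the made-up names Opt, Sm and Nn in place of Option, Some and None. One deliberate difference: collect here checks isDefinedAt instead of calling lift, because lift would return a standard Option rather than an Opt.

```scala
// Hypothetical names (Opt, Sm, Nn) so nothing clashes with scala.Option.
sealed abstract class Opt[+A] {
  // The two primitives everything else is built on.
  def isEmpty: Boolean
  def get: A

  def isDefined: Boolean = !isEmpty

  // Extracting values and providing defaults.
  def getOrElse[B >: A](default: => B): B = if (isEmpty) default else get
  def orElse[B >: A](alternative: => Opt[B]): Opt[B] =
    if (isEmpty) alternative else this
  def orNull[A1 >: A](implicit ev: Null <:< A1): A1 = getOrElse(ev(null))

  // Collection-like methods.
  def map[B](f: A => B): Opt[B] = if (isEmpty) Nn else Sm(f(get))
  def flatMap[B](f: A => Opt[B]): Opt[B] = if (isEmpty) Nn else f(get)
  def flatten[B](implicit ev: A <:< Opt[B]): Opt[B] =
    if (isEmpty) Nn else ev(get)
  def fold[B](ifEmpty: => B)(f: A => B): B = if (isEmpty) ifEmpty else f(get)
  def foreach[U](f: A => U): Unit = if (!isEmpty) f(get)
  def filter(p: A => Boolean): Opt[A] = if (isEmpty || p(get)) this else Nn
  def filterNot(p: A => Boolean): Opt[A] = if (isEmpty || !p(get)) this else Nn
  def collect[B](pf: PartialFunction[A, B]): Opt[B] =
    if (!isEmpty && pf.isDefinedAt(get)) Sm(pf(get)) else Nn

  // Boolean helpers.
  def contains[A1 >: A](elem: A1): Boolean = !isEmpty && get == elem
  def exists(p: A => Boolean): Boolean = !isEmpty && p(get)
  def forall(p: A => Boolean): Boolean = isEmpty || p(get)
}

// A defined value.
final case class Sm[+A](value: A) extends Opt[A] {
  def isEmpty: Boolean = false
  def get: A = value
}

// The empty value; get has nothing to return, so it throws.
case object Nn extends Opt[Nothing] {
  def isEmpty: Boolean = true
  def get: Nothing = throw new NoSuchElementException("Nn.get")
}
```

Notice that every method is an if/else on isEmpty around a call to get, which is the whole trick.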
Conclusions
Options are an inextricable and ubiquitous feature of Scala so it's important you're fully comfortable using and understanding them. Using options is not an option (😎) -- consider it mandatory for writing Scala. I hope this guide helped you navigate these new waters if you're a newer Scala developer, or gave you a little more intuition for something you've already been using.