Python's None problem

Nov 12, 2021 10:24 PM
JR Heard

Like many languages, Python has a "null value", which it calls None. By default, any Python variable can have the value None at any moment in time, and the only way to know whether something is currently None is to manually inspect it, like this:

foo = get_a_foo(123)

if foo is None:
	raise FooException("couldn't get a foo, oh no!")

In practice, the variables in our Python programs are non-None most of the time, and writing code that checks if things are None is tedious and ugly, so it can be very easy to conveniently forget that None exists. Instead of writing code like the snippet above, we often end up writing code like this:

foo = get_a_foo(123)
do_something_with(foo)

That second code snippet is a lot smaller. It's a lot easier to write and read. Look how pretty it is!

Unfortunately, if it's possible for get_a_foo() to sometimes return None, then that second code snippet can totally crash your entire program at any moment with an exception like this:

Traceback (most recent call last):
  File "foo_example.py", line 9, in <module>
  File "foo_example.py", line 5, in do_something_with
    return a_foo.x
AttributeError: 'NoneType' object has no attribute 'x'

This NoneType error is a very easy mistake to make, so it gets made very often. I have seen many production outages that were caused by this specific error.
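If you'd like to see the failure for yourself, here is a minimal sketch that reproduces it. The Foo class, its x attribute, the stubbed-out get_a_foo(), and the try_it() helper are assumptions made for illustration; the real program behind the traceback isn't shown in this article.

```python
class Foo:
    """Hypothetical stand-in for whatever a "foo" is in your program."""

    def __init__(self, x):
        self.x = x


def get_a_foo(foo_id):
    # Assumption for this sketch: pretend that only foo #1 exists.
    if foo_id == 1:
        return Foo(x=42)
    return None


def do_something_with(a_foo):
    return a_foo.x


def try_it(foo_id):
    """Return a_foo.x, or the error text if get_a_foo() handed back None."""
    try:
        return do_something_with(get_a_foo(foo_id))
    except AttributeError as error:
        return str(error)
```

Calling try_it(1) works fine, while try_it(123) hits the same AttributeError as the traceback above. Nothing in the code warns you about the second case until it actually runs.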

Solving The Problem With Optional

For decades, this has been the cost of doing business in Python. The problem can't be solved at its root: any Python variable can be None at any moment, we have to remember to handle None values by hand, and sometimes we forget. None is a deficiency that's built into the core language and can never be removed.

Recently, though, the Python community gained the ability to mitigate that deficiency. With the typing module and a static type checker program like mypy, we can now use the Optional type annotation in order to clearly indicate exactly which parts of our program can sometimes be None.

from typing import Optional

# This function sometimes returns a Foo and sometimes returns None!
def get_a_foo(foo_id: int) -> Optional[Foo]:
	try:
		return load_foo_from_database(foo_id)
	except DatabaseError:
		return None

Optional comes with a very powerful corollary: in a program that uses type annotations, anything that's not explicitly marked as Optional is now never allowed to be None.

To illustrate that, let's add type annotations to the do_something_with() function we were using earlier.

def do_something_with(a_foo: Foo) -> int:
	return a_foo.x

The do_something_with() function now says that it takes a single Foo. If you try to pass something else to it, mypy will detect your mistake.

Let's try this out by seeing what mypy says when we try to call this function by passing in an integer instead of a Foo.

foo_example.py:21: error:
	Argument 1 to "do_something_with" has incompatible type "int";
	expected "Foo"  [arg-type]

Mypy says: hey, this thing takes a Foo but you gave it an int, there's a bug in your program! How nice 🙂

Now let's see what mypy says about that sometimes-None example from way earlier:

foo = get_a_foo(123)
do_something_with(foo)

foo_example.py:21: error:
	Argument 1 to "do_something_with" has incompatible type "Optional[Foo]";
	expected "Foo"  [arg-type]

I want to point out three things about this error message:

  1. It's just like the message from a moment ago, where we tried to pass in an int instead of a Foo. This Optional[Foo] message isn't a special case - it's just mypy telling us that we've tried to put some non-square peg into a square hole. Our function accepts a Foo, and nothing else will do - not an int, not a Bar, and not an Optional[Foo].
  2. Mypy detected a NoneType error - the extremely troublesome category of error that I was talking about earlier - without even running our program!
  3. Mypy uses a technique called "static analysis" to find the bug by just looking at our program. This means that even if this bug were buried in a little-used function in a dusty corner of our codebase, mypy would still find our mistake instantly.

We now have the ability to find out about NoneType errors before they make it into production and crash our application!
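To make the "without even running our program" point concrete, here is a sketch of a complete file you could hand to mypy. The Foo dataclass and the _FOOS dictionary standing in for a database are assumptions for illustration; only the names get_a_foo() and do_something_with() come from the examples above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Foo:
    x: int


# Hypothetical in-memory "database" so that this sketch runs on its own.
_FOOS = {1: Foo(x=42)}


def get_a_foo(foo_id: int) -> Optional[Foo]:
    # dict.get() returns None when the key is missing.
    return _FOOS.get(foo_id)


def do_something_with(a_foo: Foo) -> int:
    return a_foo.x


foo = get_a_foo(1)

# Foo #1 exists, so this line happens to succeed at runtime. A type checker
# should still flag it, because `foo` is declared as Optional[Foo] and the
# function only accepts a Foo.
result = do_something_with(foo)
```

This program runs without crashing, yet mypy should report the call anyway. The check depends on the declared types, not on which values happen to show up during a particular run.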

Unwrapping an Optional

If you have an Optional[Foo] and you want to access the Foo inside of it, you can do that by using either an if/else or an assert.

In general, you should prefer the if/else approach, because it allows you to unwrap your inner value and provides an opportunity for you to add some error-handling code that gracefully handles the case where your value turns out to be None (where "handles" could mean "raises a particular exception", or "falls back to an appropriate default value", etc). Here's an example:

foo = get_a_foo(123)
# `foo` is currently an `Optional[Foo]`.

if foo is None:
	raise FooException("couldn't get a foo, oh no!")
else:
	# Within this `else` clause, `foo` is a plain old `Foo`.
	do_something_with(foo)

You may recognize that code from the beginning of this article!
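Raising isn't the only way to handle the None case. As a sketch of the "falls back to an appropriate default value" option, the function below substitutes a default Foo. The Foo dataclass, DEFAULT_FOO, the stubbed get_a_foo(), and get_a_foo_or_default() are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Foo:
    x: int


# Hypothetical default; pick whatever makes sense for your program.
DEFAULT_FOO = Foo(x=0)


def get_a_foo(foo_id: int) -> Optional[Foo]:
    # Stub for this sketch: pretend that only foo #1 exists.
    return Foo(x=42) if foo_id == 1 else None


def get_a_foo_or_default(foo_id: int) -> Foo:
    foo = get_a_foo(foo_id)
    # `foo` is currently an `Optional[Foo]`.

    if foo is None:
        return DEFAULT_FOO
    else:
        # Within this `else` clause, `foo` is a plain old `Foo`.
        return foo
```

Notice that the return type is Foo rather than Optional[Foo], so callers of get_a_foo_or_default() never have to think about None at all.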

Sometimes, though, you know something that mypy doesn't; an assert statement is appropriate in that kind of situation. This often occurs in OOP-heavy code, e.g. in situations where a class has an attribute that isn't initialized until some time after the class is instantiated:

class Bar:
    """Formats a Foo.

    You MUST call `.initialize_system()` before formatting your foo!
    """

    my_foo: Optional[Foo]

    def __init__(self):
        self.my_foo = None

    def initialize_system(self, foo: Foo) -> None:
        self.my_foo = foo

    def format_foo(self) -> str:
        return f"My foo's x is {self.my_foo.x}!"

When mypy sees this program, it says:

foo_example.py:41: error:
	Item "None" of "Optional[Foo]" has no attribute "x"  [union-attr]

That's mypy's way of saying that self.my_foo can be either a Foo or None, and that it's noticed that you're not handling the None case, so your program contains a potential crash.

In order to make the mypy error message go away, you can use an assert statement:

def format_foo(self) -> str:
    # self.my_foo is currently an `Optional[Foo]`.
    assert self.my_foo is not None
    # For the rest of this method, `self.my_foo` is now a `Foo`.

    return f"My foo's x is {self.my_foo.x}!"

This assert statement is you telling mypy: "I've thought it over, and I'm so convinced that self.my_foo will not be None that I want to crash the program if it turns out that I'm wrong." Mypy knows that your program can only proceed past the assert line if self.my_foo is not None, and so it "unwraps" the Optional[Foo] into a Foo for the rest of that method.

If you choose to document one of these assert statements, a comment like # mypy makes us do this isn't very helpful; a comment (or assert message) like # Callers MUST have first called self.initialize_system() does a much better job of communicating your intent to your maintainer.
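Here is a sketch of what the assert-message variant might look like in the Bar class. The Foo dataclass is an assumption for illustration, and the message wording is just one possibility.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Foo:
    x: int


class Bar:
    """Formats a Foo.

    You MUST call `.initialize_system()` before formatting your foo!
    """

    my_foo: Optional[Foo]

    def __init__(self) -> None:
        self.my_foo = None

    def initialize_system(self, foo: Foo) -> None:
        self.my_foo = foo

    def format_foo(self) -> str:
        # The message explains the precondition to whoever trips over it.
        assert self.my_foo is not None, (
            "Callers MUST have first called self.initialize_system()"
        )
        return f"My foo's x is {self.my_foo.x}!"
```

If the assert ever fails, the message shows up in the AssertionError, so whoever is debugging the crash learns what they were supposed to do.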

It's important to note that by using an assert, you're getting the exact same crash-y runtime behavior that we saw when we talked about NoneType errors! The main difference here is that with this assert statement, you're explicitly opting into that crash-y behavior at the time of writing your code. Clearly-visible assert statements are a big improvement over the previous status quo, in which silent NoneType errors could be lurking anywhere, waiting to crash your code weeks after it's shipped. Your coworkers can also see your asserts, and can help you double-check their correctness during code review. Alternatively, sometimes your coworkers might be able to help you come up with an alternative design for your class so that it no longer has an Optional attribute! If your class is able to do all of its initialization in its __init__() method, then it likely doesn't need to have an Optional attribute in the first place 😎
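As a sketch of that alternative design, here is one way Bar might look if it received its Foo up front. This assumes the Foo is available at construction time, which won't be true of every program; the Foo dataclass is again a stand-in for illustration.

```python
from dataclasses import dataclass


@dataclass
class Foo:
    x: int


class Bar:
    """Formats a Foo."""

    # No Optional here: a Bar cannot exist without a Foo.
    my_foo: Foo

    def __init__(self, foo: Foo) -> None:
        self.my_foo = foo

    def format_foo(self) -> str:
        # No assert and no None check needed.
        return f"My foo's x is {self.my_foo.x}!"
```

With this shape there is no .initialize_system() step to forget, and the "you MUST call this first" warning in the docstring can go away too.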

In summary: when you've got an Optional[Foo] that you want to unwrap into a Foo, mypy forces you to think about the situation and make an explicit choice: do you want to explicitly handle the None case, or are you convinced that it can never actually occur?

If you want to handle the None case, you should use an if/else. This is usually the way to go.

If you're convinced that it can never actually happen, you can use an assert.