The road to type checking 4 million lines of Python code. Part 1

Today we present the first part of a translation of an article about how Dropbox introduced type checking for its Python code.

Dropbox writes a lot of Python. It is the language we use most widely, both for backend services and for the desktop client application. We also use Go, TypeScript and Rust heavily, but Python is our main language. At our scale, and we are talking about millions of lines of Python code, dynamic typing made the code needlessly hard to understand and began to seriously affect productivity. To mitigate this, we have been gradually migrating our code to static type checking using mypy, probably the most popular standalone type checker for Python. Mypy is an open source project, and its core developers work at Dropbox.

Dropbox was one of the first companies to adopt static type checking for Python at this scale. Today mypy is used in thousands of projects and is thoroughly battle-tested. Getting to where we are now was a long journey, with many false starts and failed experiments along the way. This article tells the story of static type checking in Python, from its rocky beginnings as part of my academic research project to the present day, when type checking and type hints have become commonplace for countless Python developers. These mechanisms are now supported by many tools, such as IDEs and code analyzers.

→ Read the second part

Why is type checking necessary?

If you have only ever used dynamically typed Python, you may wonder why there has been so much fuss about static typing and mypy lately. Or perhaps you like Python precisely because of its dynamic typing, and the whole development upsets you. The key to the value of static typing is scale: the larger your project, the more you lean toward static typing and, eventually, the more you really need it.

Suppose a project reaches tens of thousands of lines of code and several programmers work on it. In our experience, understanding the code of such a project becomes the key to maintaining developer productivity. Without type annotations it is hard to figure out, for example, which arguments to pass to a function or what types of values a function can return. Here are typical questions that are often difficult to answer without type annotations:

Can this function return None?
What should this items argument be?
What is the type of the id attribute: is it int, is it str, or maybe some custom type?
Should this argument be a list? Is it possible to pass a tuple into it?

If you look at the following snippet with type annotations and try to answer similar questions, the task turns out to be trivial:

    class Resource:
        id: bytes
        ...
        def read_metadata(self,
                          items: Sequence[str]) -> Dict[str, MetadataItem]:
            ...

read_metadata does not return None, because the return type is not Optional[…].
The items argument is a sequence of strings. It cannot be an arbitrary iterable.
The id attribute is a byte string.
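
To make the first two answers more concrete, here is a minimal sketch. The names `find_user` and `first_item` and the sample data are hypothetical and exist only for illustration. The sketch assumes a type checker such as mypy is run over the file: `Optional[...]` marks a function that may return None, and `Sequence[str]` accepts lists and tuples but not arbitrary iterables such as generators.

```python
from typing import Dict, Optional, Sequence

# Hypothetical data, for illustration only.
_USERS: Dict[int, str] = {1: "alice", 2: "bob"}


def find_user(user_id: int) -> Optional[str]:
    """May return None: the Optional[...] return type says so explicitly."""
    return _USERS.get(user_id)


def first_item(items: Sequence[str]) -> str:
    """A Sequence supports len() and indexing, so lists and tuples both qualify."""
    if len(items) == 0:
        raise ValueError("items must not be empty")
    return items[0]


print(first_item(["a", "b"]))  # a list is a Sequence
print(first_item(("a", "b")))  # so is a tuple
print(find_user(3))            # None: there is no user with this id

# A generator is an Iterable but not a Sequence, so a type checker
# would be expected to reject this call:
# first_item(s for s in ["a", "b"])
```

At runtime Python does not enforce these annotations; the mismatch in the commented-out call is something a static checker reports before the code runs.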

In an ideal world, one would expect all such details to be described in the built-in documentation (docstrings). But experience shows that such documentation is often missing from the code you have to work with. Even when it is present, you cannot count on it being entirely correct. It may be vague or inaccurate, leaving plenty of room for misunderstanding. In large teams or large projects, this problem can become extremely acute.

Although Python performs very well in the early or intermediate stages of projects, at some point successful projects and companies that use Python may face a vital question: “Do we need to rewrite everything in a statically typed language?”

Type checkers like mypy solve this problem by giving the developer a formal language for describing types and by checking that the type declarations are consistent with the implementation (and, optionally, that they exist at all). In essence, these tools give us something like verified documentation.

There are other benefits as well, and they are far from trivial:

A type checker can detect some small (and not so small) errors. A typical example is forgetting to handle a None value or some other special condition.
Refactoring becomes much easier, since the type checker often reports very precisely which code needs to change. We do not have to hope for 100% test coverage, which is usually unattainable anyway, and we do not have to dig through deep stack traces to find the cause of a problem.
Even in large projects, mypy can often perform a full type check in a fraction of a second, whereas running tests usually takes tens of seconds or even minutes. A type checker gives the programmer instant feedback and lets them work faster. There is no longer a need to write fragile, hard-to-maintain unit tests that replace real entities with mocks and patches just to get test results faster.
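
The first point, a forgotten None check, can be sketched as follows. The functions `parse_port` and `describe_port` are hypothetical examples, not code from Dropbox. The assumption is that mypy is run with its default strict handling of Optional types, in which case removing the `if port is None` branch would make the checker complain about `port + 1`.

```python
from typing import Optional


def parse_port(text: str) -> Optional[int]:
    """Return the port number, or None when the text is not a valid integer."""
    try:
        return int(text)
    except ValueError:
        return None


def describe_port(text: str) -> str:
    port = parse_port(text)
    # Without this check, `port + 1` below would be flagged by a type checker,
    # because `port` may still be None at that point.
    if port is None:
        return "no port configured"
    return f"port {port}, next free candidate {port + 1}"


print(describe_port("8080"))
print(describe_port("n/a"))
```

The error is reported without running the program or writing a test for the failing input, which is the kind of instant feedback described above.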

IDEs and editors, such as PyCharm or Visual Studio Code, use type annotations to offer code completion, error highlighting, and support for commonly used language constructs. And these are only some of the benefits of typing. For some programmers this is the main argument for typing, because it pays off immediately after adoption. This use case does not require a separate type checker such as mypy, although mypy helps keep the annotations consistent with the code.

Mypy Background

The mypy story began in Cambridge, UK, a few years before I joined Dropbox. As part of my doctoral research, I worked on unifying statically typed and dynamic languages. I was inspired by the paper on gradual typing by Jeremy Siek and Walid Taha, as well as by the Typed Racket project. I was looking for ways to use the same programming language for projects of every size, from small scripts to code bases of many millions of lines, without having to compromise too much at any scale. An important part of this was the idea of gradually moving from an untyped prototype to a thoroughly tested, statically typed finished product. These days such ideas are largely taken for granted, but in 2010 this was still an active research problem.

My initial work on type checking did not target Python. Instead, I used a small "home-made" language called Alore. Here is an example that gives an idea of what it looked like (type annotations are optional here):

    def Fib(n as Int) as Int
      if n <= 1
        return n
      else
        return Fib(n - 1) + Fib(n - 2)
      end
    end

Using a simplified language of one's own design is a common approach in academic research. Not least, it makes it possible to run experiments quickly, and anything irrelevant to the research can be freely ignored. Programming languages in real-world use are usually large, with complex implementations, and this slows experiments down. However, any results based on a simplified language look a little suspect, since the researcher may have sacrificed considerations that matter for practical use of the language.

My type checker for Alore looked very promising, but I wanted to test it on real code, which, needless to say, was not written in Alore. Luckily, Alore was largely based on the same ideas as Python. It was easy enough to adapt the type checker to work with Python's syntax and semantics, which made it possible to try type checking open source Python code. I also wrote a transpiler to convert Alore code to Python and used it to translate the code of my type checker. Now I had a type checker written in Python that supported a subset of Python, sort of! (Certain architectural decisions that made sense for Alore were poorly suited to Python; this is still noticeable in parts of the mypy code base.)

In fact, the language supported by my type checker at that point could not quite be called Python: it was a Python variant, because of some limitations of the Python 3 type annotation syntax.

It looked like a mixture of Java and Python:

    int fib(int n):
        if n <= 1:
            return n
        else:
            return fib(n - 1) + fib(n - 2)

One of my ideas at the time was to use type annotations to improve performance by compiling this flavor of Python to C, or possibly to JVM bytecode. I got as far as writing a prototype compiler, but I dropped the idea, since type checking looked quite useful in its own right.

I ended up presenting my project at PyCon 2013 in Santa Clara. I also discussed it with Guido van Rossum, Python's benevolent dictator for life. He convinced me to drop my custom syntax and stick to standard Python 3 syntax. Python 3 supports function annotations, so my example could be rewritten as shown below, resulting in a normal Python program:

    def fib(n: int) -> int:
        if n <= 1:
            return n
        else:
            return fib(n - 1) + fib(n - 2)

I had to make some compromises (these were the reason I had invented my own syntax in the first place). In particular, Python 3.3, the most recent version of the language at that time, did not support variable annotations. Guido and I discussed various syntax options for such annotations by email. We decided to use type comments for variables. This did the job, but looked a bit cumbersome (Python 3.6 gave us a more pleasant syntax):

 products = [] # type: List[str] # Eww
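
For comparison, here is a small sketch of the two spellings side by side. The variable names are illustrative. The second form is the variable annotation syntax that became available in Python 3.6, so running this block assumes Python 3.6 or later.

```python
from typing import List

# Type comment: the older spelling, which also parses on earlier versions.
products_old = []  # type: List[str]

# Variable annotation: the spelling available from Python 3.6.
products: List[str] = []

products_old.append("keyboard")
products.append("mouse")
print(products_old, products)
```

Both forms carry the same information for a type checker; the annotation form simply puts the type in the language syntax rather than in a comment.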

Type comments also came in handy for supporting Python 2, which lacks built-in support for type annotations:

    def fib(n):
        # type: (int) -> int
        if n <= 1:
            return n
        else:
            return fib(n - 1) + fib(n - 2)

It turned out that these (and other) compromises did not really matter much: the benefits of static typing meant that users soon forgot about the less than ideal syntax. Since type-checked Python code used no special syntactic constructs, existing Python tools and workflows continued to work normally, which made it much easier for developers to adopt the new tool.

Guido also convinced me to join Dropbox after I graduated. This is where the most interesting part of the mypy story begins.

To be continued…

Dear readers! If you use Python, please tell us about the scale of projects you are developing in this language.
