The way to type checking 4 million lines of Python code. Part 3

We present to your attention the third part of the translation of material on the path that Dropbox has taken, introducing a system for checking the types of Python code.







โ†’ Previous parts: first and second



Reaching 4 million lines of typed code



Another important task (this was the second most popular problem that worried those who participated in internal polls) was to increase the amount of code in Dropbox covered with type checks. We tried several approaches to solve this problem - from the natural growth of the volume of the typed code base to the focus of the efforts of the mypy team on static and dynamic automated type inference. As a result, it seemed that there is no simple winning strategy, but we were able to achieve rapid growth in the volume of annotated code by combining many approaches.



As a result, in our largest Python repository (with backend code), the number of lines of annotated code has reached almost 4 million. Work on static typing of the code was carried out in about three years. Mypy now supports various types of code coverage reports that make it easier to monitor typing progress. In particular, we can generate reports on code with uncertainties in types, such as, for example, explicit use of the Any



type in annotations that cannot be verified, or such as importing third-party libraries in which there are no type annotations. As part of a project to increase the accuracy of type checking in Dropbox, we have contributed to improving type definitions (so-called stub files) for some popular open source libraries in the central typeshed Python repository.



We have implemented (and standardized in subsequent PEPs) new features of the type system, which allow us to use more precise types for some specific Python patterns. A notable example of this is TypeDict



, which provides types for JSON-like dictionaries that have a fixed set of string keys, each of which has a value of its own type. We will continue to expand the type system. Probably our next step will be to improve support for Python's ability to work with numbers.









Number of lines of annotated code: server









Number of lines of annotated code: client









The total number of lines of annotated code



Here is an overview of the main features of the actions that we performed to increase the volume of annotated code in Dropbox:



The rigor of annotation. We gradually increased the requirements for the rigor of annotating the new code. We started with linter tips that suggested adding annotations to files that already have some annotations. Now we require type annotations in the new Python files and in most existing files.



Typing reports. We send out weekly reports to teams about the level of typing of their code and give tips regarding what should be annotated in the first place.



Popularizing mypy. We talk about mypy at various events and communicate with teams to help them start using type annotations.



Polls. We conduct periodic user surveys to identify major issues. We are ready to go far enough in solving these problems (right up to creating a new language for the sake of accelerating mypy!).



Performance. We have greatly improved mypy performance through the use of the daemon and mypyc. This was done in order to smooth out the inconveniences that arise during the annotation process, and in order to be able to work with large amounts of code.



Integration with editors. We have created tools to support the launch of mypy in editors that are popular on Dropbox. This includes PyCharm, Vim, and VS Code. This greatly simplified the process of annotating the code and verifying its performance. Such actions are usually typical when annotating existing code.



Static Analysis We created a tool for outputting function signatures using static analysis tools. This tool can work only in relatively simple situations, but it helped us to increase the coverage of types with little effort.



Support for third-party libraries. Many of our projects use the SQLAlchemy toolkit. It uses the dynamic capabilities of Python, which the PEP 484 types are unable to model directly. According to PEP 561, we created the corresponding stub file and wrote a plugin for mypy ( open source ) that improves SQLAlchemy support.



The difficulties we met



The path to 4 million lines of typed code was not always easy for us. On this way we met a lot of holes and made some mistakes. Here are some of the problems we have encountered. We hope that the story about them will help others avoid such problems.



Skipped files. We started by checking only a small amount of files. Everything not included in the number of these files was not checked. Files were added to the check list when the first annotations appeared in them. If something was imported from a module located outside the scope of the check, then we were talking about working with values โ€‹โ€‹of type Any



, which were not checked at all. This led to a significant loss of typing accuracy, especially in the early stages of migration. This approach has worked surprisingly well until now, although it was typical that adding files to the scan area would reveal problems in other parts of the code base. In the worst case scenario, when two isolated areas of code were combined, in which, independently of each other, types were already checked, it turned out that the types of these areas were incompatible with each other. This made it necessary to make many changes to the annotations. Now, looking back, we understand that we should add basic library modules to the mypy type checking area as early as possible. This would make our work much more predictable.



Annotating old code. When we started, we had about 4 million lines of existing Python code. It was clear that annotating all of this code was no easy task. We created a tool called PyAnnotate, which can collect type information during test execution and can add type annotations to the code based on the collected information. However, we did not notice a particularly widespread introduction of this tool. Typing information about types was slow; automatically generated annotations often required many manual edits. We thought about automatically launching this tool every time you check the code, or about collecting type information based on an analysis of some small amount of real network requests, but decided not to, because any of these approaches is too risky.



As a result, it can be noted that most of the code was manually annotated by its owners. In order to direct this process in the right direction, we are preparing reports on especially important modules and functions that need to be annotated. For example, it is important to provide type annotations for the library module used in hundreds of places. But the old service, which is being replaced by a new one, annotating is no longer so important. We are also experimenting using static analysis to generate type annotations for old code.



Loop imports. Earlier, I talked about cyclic imports (โ€œtangle of dependenciesโ€), the existence of which complicated the acceleration of mypy. In addition, we had to work hard to provide mypy with support for all kinds of idioms caused by these cyclic imports. We recently completed a major system redesign project that fixed most of mypy's cyclic import issues. These problems, in fact, arose from the very early days of the project, also from Alore, the educational language that mypy was originally oriented to. Alore syntax makes it easy to solve the problems of cyclic import commands. Modern mypy has inherited some limitations from its early ingenuous implementation (which worked great for Alore). Python makes it difficult to work with circular imports, mainly due to the ambiguity of expressions. For example, during an assignment operation, a type alias may actually be determined. Mypy is not always able to detect such things until most of the import cycle has been processed. Alore did not have such ambiguities. Unsuccessful decisions made in the early stages of system development can present an unpleasant surprise to a programmer after many years.



Summary: the path to 5 million lines of code and new horizons



The mypy project has come a long way - from early prototypes to a system that controls the types of production code with a volume of 4 million lines. As mypy developed, type hints were standardized in Python. A powerful ecosystem has evolved around typing Python code these days. It found a place to support libraries, it contains auxiliary tools for IDEs and editors, it has several type control systems, each of which has its pros and cons.



Although type checking is already taken for granted in Dropbox, Iโ€™m sure that we still live at the dawn of Python typing. I think that type checking technologies will continue to evolve and improve.



If you have not used type checks in your large-scale Python project, then you should know that now is a good time to start the transition to static typing. I talked with those who made a similar transition. None of them regretted it. Type control turns Python into a language that is much better than "normal Python" for developing large projects.



Dear readers! Do you use type control in your Python projects?








All Articles