The author of the article, the translation of which we publish today, says that
this post and comments on it served as a source of inspiration for its writing. According to him, IT-specialists have misconceptions about types, use incorrect terminology and, discussing issues related to types, come to erroneous conclusions. He notes that he is not a defender of the static type system. The only thing that bothers him is the correct use of terms. This allows for constructive discussions. The author says that he wrote this material spontaneously, but hopes that there are no errors in it. If he confused something, he asks to let
him know about it.
Let’s understand once and for all what causes confusion when talking about type systems.
Dynamic typing and lack of typing
Some people think that a dynamic type system is the same as an untyped type system. The lack of typing means that in a certain type system it makes no sense to distinguish between types. It makes no sense to distinguish types even if there is only one type in the type system. For example:
- In assembler, the only type is a string of bits.
- In lambda calculus, the only type is function.
Someone may say this on this occasion: "But what difference does it make - dynamic typing or the lack of typing - also a question for me." But this, in fact, is a big and important issue. The fact is that if you equate dynamic typing to the lack of typing, it means the automatic adoption of the fact that a dynamic type system is the opposite of a static type system. As a result, two opposing camps of programmers are formed - the dynamic typing camp and the static typing camp (and this, as we will see in the corresponding section, is wrong).
Languages that do not limit the range of variable values are called untyped languages: they have no types, or, which is the same thing, they have only one universal type that contains all the values.
Type Systems, Luca Cardelli
Programming languages have one interesting feature that allows you to roughly divide their world into two groups:
- Untyped languages - programs simply execute. This happens quickly, without trying to check the “uniformity of forms”.
- Typed languages - an attempt is made to check the "uniformity of form" - either during compilation or during program execution.
“Type systems for programming languages,” Benjamin Pearce
Dynamic and static typing
A dynamic type system is a system in which types are checked dynamically (during program execution). A static type system is a system in which types are checked statically (during compilation or code translation).
Is one of these systems the opposite of the other? No, it's not. Both types systems can be used in the same language. In fact, most static type systems also have dynamic type checks. As an example, consider the validation of input-output operations (input-output, IO). Imagine that you need to read data provided by a user who must enter a number. You will check, during program execution, whether the number is the result of parsing the corresponding line (as a result of parsing, an exception may be thrown or something like
NaN
will be returned). When you check the data entered by the user, figuring out whether they can be considered as a number - you perform a dynamic type check.
As a result, we can note the absence of confrontation between static and dynamic types. You can use, in the same language, both those and others.
Moreover, it should be noted that static type checking is a complex process. Sometimes it is very difficult to statically check some parts of a program. As a result, instead of applying static type checks, you can resort to dynamic checks.
It is recommended to consider a static type system as types that are checked statically. A dynamic type system is like types that are checked dynamically.
Does using static types mean knowing types at compile time?
The question posed in the title of this section can be answered in the negative. If you open the source code of any parser (including the JavaScript parser), you can see that the parser knows the types of values during parsing (this is part of the compilation process).
let x = "test";
It turns out that the parser knows that
"test"
is a string. Does this make JavaScript a static typed language? No, it doesn’t.
Gradual typing
A gradual type system is a static type system that allows you to skip type checks for some parts of a program. For example, in TypeScript, this is implemented using
any
or
@ts-ignore
.
On the one hand, this makes the type system less secure. On the other hand, a type system with gradual typing allows you to gradually add type descriptions to languages with dynamic typing.
Reliable and unreliable type systems
If you use a reliable type system (sound type system), the program will not be “approved” during type checking if there are errors related to types in this program. Using an unsound type system leads to the fact that the program may contain errors associated with types. True, you should not panic after you find out about it. In practice, this may not concern you. Reliability or soundness is a mathematical property of a type checking algorithm that needs proof. Many existing compilers (internally, type checking systems) are unreliable.
If you want to work with reliable type systems, take a look at the programming languages of the ML family, which use the Hindley-Milner type system.
In addition, it must be understood that a reliable type system will not skip the wrong program (it does not give false positive test results, considering the wrong programs to be correct), but it may not miss the correct program (it can give false negative test results).
A type system that never rejects the correct program is called complete.
Does it happen that a type system is both reliable and complete? As far as I know, such type systems do not exist. I’m not sure about this to the end, but it seems to me that the existence of such type systems, if based on Gödel’s incompleteness theorem, is fundamentally impossible (I can be wrong, though).
Weak and strong typing
I find it inappropriate to use the terms “weak typing” and “strong typing”. These terms are ambiguous, their use can give more confusion than clarity. Let me give you a few quotes.
These languages can be called, figuratively speaking, languages with weak type checking (or weakly typed languages, as they are usually called in various publications). The use of weak type checking in the language means that some unsafe operations are detected statically, and some are not. The "weakness" of type checks in languages of this class varies greatly.
Type Systems, Luca Cardelli
Probably the most common way to classify type systems is to divide them into systems with “weak” and “strong” typing. This can only be regretted, since these words do not carry practically any meaning in themselves. It is possible, to a limited extent, to compare two languages that have very similar type systems and select one of them as having a stronger type system than the second. In other cases, the terms “strong typing” and “weak typing” are completely meaningless.
“What You Need to Know Before Discussing Type Systems,” Steve Klabnik
The terms “strong typing” and “weak typing” are extremely ambiguous. Here are some examples of their use:
- Sometimes “strong typing” is understood to mean “static typing”. It is not difficult to make such a “substitution”, but it’s better, speaking of static typing, to simply call it “static”. The fact is that most programmers quite clearly understand this term.
- Sometimes, when they say “strong typing”, they mean “the absence of an implicit type conversion”. For example, in JavaScript, you can use expressions like
"a" - 1
. This can be called a model of "weak typing." But almost all languages give the programmer some opportunities for implicit type conversion, for example, supporting the automatic conversion of integers to floating point numbers in expressions like 1 - 1.1
. In practice, most professionals who use the term “strong typing” in this way distinguish between “acceptable” and “unacceptable” type conversions. But there is no generally accepted boundary between such type conversions. The “acceptability” and “unacceptability” of transformations is a subjective assessment, depending on the opinion of a particular person. - Sometimes languages with "strong typing" are called those languages in which it is impossible to circumvent the rules of their type system.
- Sometimes “strong typing” means having a type system that allows you to safely work with memory. The C language is a notable example of a language that works unsafely with memory. For example, if
xs
is an array of four numbers, C will approve a code that uses constructs like xs[5]
or xs[1000]
without any problems. They will allow accessing the memory located after the addresses allocated for storage of the contents of the xs
array.
The Types, Gary Bernard
Do languages with static typing need type declarations?
Statically typed languages do not always need type declarations. Sometimes a type system can infer types (by making assumptions based on code structure). Here is an example (TypeScript):
const x = "test";
The type system knows that
"test"
is a string (this knowledge is based on the rules of code parsing). The type system also knows that
x
is a constant, that is, the value of
x
cannot be reassigned. As a result, it can be concluded that
x
is of string type.
Here is another example (Flow):
const add = (x, y) => x / y // ^ [1] . add(1, "2")
The type checking system sees that we call the
add
function, passing it a number and a string. This analyzes the declaration of the function. The type checking system knows that in order to perform the division operation, numbers must be on the right and left of the corresponding operator. One of the operands involved in the division operation is not a number. As a result, we are informed about the error.
There are no type declarations here, but this does not prevent a static type check from the above program. If you encounter similar situations in the real world, then, sooner or later, you will have to declare some types. The type system cannot deduce absolutely all types. But you need to understand that a language can be called "static" not because it uses type declarations, but because types are checked before the program starts.
Is TypeScript an unsafe language because code written in it is compiled into JavaScript code?
TypeScript is an unsound language. Therefore, the code written on it can turn into unsafe applications. But this has nothing to do with what it compiles into.
Most desktop compilers convert programs to something that looks like assembly language. And assembler is a language that is even less secure than JS.
Here, if you come back to the idea that TS is insecure due to compilation in JS, you might get the following thought: “Compiled code is executed in the browser, JS is an unsafe language, and it may well substitute
null
to the place where the line is expected ". Thought is sensible. But this, again, does not give a reason to call TS an unsafe language. In order for TS to guarantee security within the application, you need to place the "defense mechanisms" in those places where the TS code interacts with the outside world. That is, for example, you need to check the correctness of the data entering the program through the input-output mechanisms. Let's say this can be checking what the user enters, checking server responses, checking data read from the browser storage, and so on.
For example, the role of such “defense mechanisms” in Elm is played by “
ports ”. In TS, you can use something like io-ts for this.
The corresponding “protective mechanism” creates a bridge between static and dynamic type systems.
Here is a simplified example:
const makeSureIsNumber = (x: any) => { const result = parseFloat(x); if (isNaN(result)) { throw Error("Not a number"); } return result; } const read = (input: any) => { try { const n = makeSureIsNumber(input); // n, , // // makeSureIsNumber "" , n } catch (e) { } }
Is it true that types are only needed for compilers?
Types are just the hack needed to give hints to the compiler.
Waut Mertens
Are types only needed for compilers? This is a philosophical question. Types are needed for people, not cars. Compilers need types because they are programs written by people.
The phenomenon of types exists because of people. Types do not exist until a person perceives something in the form of a “data type”. The human mind distributes different entities into different categories. Types do not make sense without an observer.
Let's arrange a thought experiment. Think of the Life game. You have a two-dimensional grid consisting of square cells. Each of the cells can be in two possible states. It can be "alive" or "dead." Each cell can interact with its eight neighbors. These are cells that border it vertically, horizontally, or diagonally. In the process of finding the next state of the system, the following rules apply:
- A “living” cell with fewer than two “living” neighbors “dies”, as in a low population density.
- A “living” cell, having two or three “living” neighbors, survives, and falls into the next generation.
- A “living” cell with more than three “living” neighbors “dies”, as if it were overpopulated.
- The “dead” cell, which has exactly three “living” neighbors, becomes “live”, as in the reproduction of the population.
Outwardly, it looks like a field, divided into square cells, which are constantly “turned on” and “turned off”.
Here you can take a look at it.
If for some time to observe the “Life”, then on the field can appear stable structures like a “glider” (“glider”).
"Glider"
See him? A glider moves across the screen. True? Now let's slow down a bit. Does this glider really exist? These are just individual squares that appear and disappear. But our brain can perceive this structure as something objectively existing.
In addition, we can say that the “glider” exists because the squares are not independent (they depend on the neighbors), and even if the “glider” itself does not exist, then there is a “glider” in the form platonic ideas.
Summary
Consider any program written in a typed programming language. We can observe the types. Right? But the program compiles into machine codes. In these codes, the same thing is expressed as in the original program (though it’s hard for a person to read machine representations of programs). From a computer point of view, there are no types. He sees only sequences of bits - sets of zeros and ones (“dead” and “living” cells). Types exist for people, not for machines.
Dear readers! What type system could you consider ideal for web development?