Hello, my name is Dmitry Karlovsky and I ... I want to tell you about the fundamental feature of type systems, which often or do not understand or do not understand correctly through the prism of the implementation of a particular language, which, due to evolutionary development, has many atavisms. Therefore, even if you think you know what “variation” is, try to look at the issues with a fresh look. We will start from the very basics, so even a beginner will understand everything. And we continue without water, so that even the pros will be useful for structuring their knowledge. Code examples will be in a pseudo-language similar to TypeScript. Then the approaches of several real languages will be examined. And if you are developing your own language, then this article will help you not to step on someone else's rake.
Arguments and Parameters
The parameter is what we accept. Describing the type of parameter, we set a restriction on the set of types that can be passed to us. A few examples :
// function log( id : string | number ) {} // class Logger { constructor( readonly id : Natural ) {} } // class Node< Id extends Number > { id : Id }
An argument is what we pass on. At the time of transfer, the argument always has some particular type. However, in static analysis, a particular type may not be known, which is why the compiler again operates with type restrictions. A few examples:
log( 123 ) // new Logger( promptStringOrNumber( 'Enter id' ) ) // new Node( 'root' ) // ,
Subtypes
Types can form a hierarchy. A subtype is a special case of a supertype . A subtype can be formed by narrowing the set of possible values of the supertype. For example, the type Natural is a subtype of Integer and Positive. And all three are subtypes of Real at the same time. And the Prime type is a subtype of all of the above. At the same time, the Positive and Integer types are overlapping, but none of them constrict the other.
Another way to form a subtype is to expand it by combining it with another type orthogonal to it. For example, there is a “color figure” with the property “color”, and there is a “square” with the property “height”. By combining these types we get a "color square". And adding a “circle” with its “radius” we can get a “color cylinder”.
Hierarchies
For further narration, we need a small hierarchy of animals and a similar hierarchy of cells.
abstract class Animal {} abstract class Pet extends Animal {} class Cat extends Pet {} class Dog extends Pet {} class Fox extends Animal {} class AnimalCage { content : Animal } class PetCage extends AnimalCage { content : Pet } class CatCage extends PetCage { content : Cat } class DogCage extends PetCage { content : Dog } class FoxCage extends AnimalCage { content : Fox }
Everything below is a narrowing of the type above. A cage with a pet can only contain pets, but not wild animals. A cage with a dog can only contain dogs.
Covariance
The simplest and most understandable is the restriction of a supertype or covariance. In the following example, the function parameter is covariant to the type specified for it. That is, the function can accept both this type itself and any subtype of it, but cannot accept supertypes or other types.
function touchPet( cage : PetCage ) : void { log( `touch ${cage.content}` ) } touchPet( new AnimalCage ) // forbid touchPet( new PetCage ) // allow touchPet( new CatCage ) // allow touchPet( new DogCage ) // allow touchPet( new FoxCage ) // forbid
Since we do not change anything in the cage, we can safely transfer functions to the cage with the cat, since it is nothing more than a special case of the cage with a pet.
Contravariance
It’s a little harder to understand subtype restriction or contravariance. In the following example, the function parameter is contravariant to the type specified for it. That is, the function can accept both this type itself and any of its supertypes, but cannot accept subtypes or other types.
function pushPet( cage : PetCage ) : void { const Pet = random() > .5 ? Cat : Dog cage.content = new Pet } pushPet( new AnimalCage ) // allow pushPet( new PetCage ) // allow pushPet( new CatCage ) // forbid pushPet( new DogCage ) // forbid pushPet( new FoxCage ) // forbid
We can’t pass the cage with the cat, as the function can put the dog there, which is not permissible. But the cage with any animal can be safely transferred, since both the cat and the dog can be placed there.
Invariance
Subtype and supertype restriction can be at the same time. Such a case is called invariance. In the following example, the function parameter is invariant to the type specified for it. That is, the function can only accept the specified type and no more.
function replacePet( cage : PetCage ) : void { touchPet( cage ) pushPet( cage ) } replacePet( new AnimalCage ) // forbid replacePet( new PetCage ) // allow replacePet( new CatCage ) // forbid replacePet( new DogCage ) // forbid replacePet( new FoxCage ) // forbid
The replacePet
function inherits the restrictions of those functions that it uses internally: it took the restriction on the type from pushPet
, and the restriction on the subtype by pushPet
. If we give her a cage with any animal, she will not be able to transfer it to the touchPet function, which does not know how to work with foxes (a wild animal will simply bite off a finger). And if we transfer the cage with the cat, then it will not work to call pushPet
.
Bivariance
One cannot but mention the exotic absence of restrictions - bivariance. In the following example, a function can accept any type that is a subtype or subtype.
function enshurePet( cage : PetCage ) : void { if( cage.content instanceof Pet ) return pushPet( cage ) } replacePet( new AnimalCage ) // allow replacePet( new PetCage ) // allow replacePet( new CatCage ) // allow replacePet( new DogCage ) // allow replacePet( new FoxCage ) // forbid
In it you can transfer the cage with the animal. Then she will check that there is a pet in the cage, otherwise she will put it inside a random pet. And you can transfer, for example, a cage with a cat, then she just will not do anything.
Generalizations
Some believe that variance is somehow related to generalizations. Often because variance is often explained using generic containers as an example. However, in the whole story we still did not have a single generalization - it is entirely concrete classes:
class AnimalCage { content : Animal } class PetCage extends AnimalCage { content : Pet } class CatCage extends PetCage { content : Cat } class DogCage extends PetCage { content : Dog } class FoxCage extends AnimalCage { content : Fox }
This was done to show that the problems of variance are in no way connected with generalizations. Generalizations are needed only to reduce copy-paste. For example, the code above can be rewritten through a simple generalization:
class Cage<Animal> { content : Animal }
And now you can create instances of any cells:
const animalCage = new Cage<Animal>() const petCage = new Cage<Pet>() const catCage = new Cage<Cat>() const dogCage = new Cage<Dog>() const foxCage = new Cage<Fox>()
Declaration of limitations
Please note that the signatures of all four previously listed functions are exactly the same:
( cage : PetCage )=> void
That is, such a description of the accepted parameters of the function does not have completeness - it cannot be said from it that it can be transferred to the function. Well, unless it is clearly seen that it is definitely not worth passing the cage with the fox into it.
Therefore, in modern languages there is a means for explicitly indicating what type restrictions a parameter has. For example, the in and out modifiers in C #:
interface ICageIn<in T> { T content { set; } } // contravariant generic parameter interface ICageOut<out T> { T content { get; } } // covariant generic parameter interface ICageInOut<T> { T content { get; set; } } // invariant generic parameter
Unfortunately, in C # for each variant of modifiers it is necessary to start on a separate interface. In addition, as I understand it, bivariance in C # is generally inexpressible.
Output parameters
Functions can not only accept, but also return values. In general, the return value may not be one. As an example, take the function of taking a cage with a pet and returning two pets.
function getPets( input : PetCage ) : [ Pet , Pet ] { return [ input.content , new Cat ] }
Such a function is equivalent to a function that takes, in addition to one input parameter, two more outputs.
function getPets( input : PetCage , output1 : PetCage , output2 : PetCage ) : void { output1.content = input.content output2.content = new Cat }
External code allocates additional memory on the stack so that the function puts everything it wants to return into it. And upon completion, the calling code will already be able to use these containers for their own purposes.
From the equivalence of these two functions, it follows that the values returned by the function, in contrast to the parameters, are always contravariant to the specified output type. For a function can write to them, but cannot read from them.
Object Methods
Object methods are functions that take an extra pointer to an object as an implicit parameter. That is, the following two functions are equivalent.
class PetCage { pushPet() : void { const Pet = random() > .5 ? Cat : Dog this.content = new Pet } }
function pushPet( this : PetCage ) : void { const Pet = random() > .5 ? Cat : Dog this.content = new Pet }
However, it is important to note that a method, unlike a regular function, is also a member of the class, which is an extension of the type. This leads to the fact that there is an additional restriction of a supertype for functions calling this method:
function fillPetCage( cage : PetCage ) { cage.pushPet() }
We cannot pass such an pushPet
to it where the pushPet
method pushPet
not yet defined. This is similar to the case with invariance in that there is a restriction both from below and from above. However, the location of the pushPet
method may be higher in hierarchy. And that is where the overtype restriction will be.
Barbara Lisk Substitution Principle (LSP)
Many people believe that the ratio of a subtype-subtype is determined not based on the previously mentioned methods of narrowing and expanding the type, but by the ability to substitute the subtype at any place of using the supertype. Apparently, the reason for this error is precisely in the LSP. However, let's read the definition of this principle carefully, paying attention to what is primary and what is secondary:
Functions that use the base type should be able to use subtypes of the base type without knowing it and without violating the correctness of the program.
For immutable (including those not referencing mutable) objects, this principle is performed automatically, since there is no place to take the subtype restriction from.
With variable, it’s more and more difficult, since the following two situations are mutually exclusive for the LSP principle:
- Class
A
has a subclass ofB
, where the fieldB::foo
is a subtype ofA::foo
. - Class
A
has a method that can change theA::foo
field.
Accordingly, there are only three ways left:
- Prevent objects from inheriting from narrowing their field types. But then in the cage for the cat you can shove the elephant.
- Guided not by LSP, but by the variability of each parameter of each function separately. But then you have to think a lot and explain to the compiler where the type restrictions are.
- Spit on everything and go to
monasteryfunctional programming, where all objects are immutable, which means that their accepting parameters are covariant to the declared type.
TypeScript
The logic in the script is simple: all parameters of the function are considered covariant (which is not true), and the return values are considered contravariant (which is true). It was previously shown that the parameters of a function can have absolutely any variation, depending on what this function does with these parameters. Therefore, these are the following incidents:
abstract class Animal { is! : 'cat' | 'dog' | 'fox' } abstract class Pet extends Animal { is! : 'cat' | 'dog' } class Cat extends Pet { is! : 'cat' } class Dog extends Pet { is! : 'dog' } class Fox extends Animal { is! : 'fox' } class Cage<Animal> { content! : Animal } function pushPet( cage : Cage<Pet> ) : void { const Pet = Math.random() > .5 ? Cat : Dog cage.content = new Pet } pushPet( new Cage<Animal>() ) // forbid to push Pet to Animal Cage :-( pushPet( new Cage<Cat>() ) // allow to push Dog to Cat Cage :-(
To solve this problem, you have to help the compiler with rather nontrivial code:
function pushPet< PetCage extends Cage<Animal> >( cage: Cage<Pet> extends PetCage ? PetCage : never ): void { const Pet = Math.random() > .5 ? Cat : Dog cage.content = new Pet } pushPet( new Cage<Animal>() ) // allow :-) pushPet( new Cage<Pet>() ) // allow :-) pushPet( new Cage<Cat>() ) // forbid :-) pushPet( new Cage<Dog>() ) // forbid :-) pushPet( new Cage<Fox>() ) // forbid :-)
Flowjs
FlowJS has a more advanced type system. In particular, in the type description it is possible to indicate its variability for generalized parameters and for object fields. In our cell example, it looks something like this:
class Animal {} class Pet extends Animal {} class Cat extends Pet {} class Dog extends Pet {} class Fox extends Animal {} class Cage< Animal > { content : Animal } function touchPet( cage : { +content : Pet } ) : void { console.log( `touch ${typeof cage.content}` ) } function pushPet( cage: { -content: Pet } ): void { const Pet = Number((0: any)) > .5 ? Cat : Dog cage.content = new Pet } function replacePet( cage : { content : Pet } ) : void { touchPet( cage ) pushPet( cage ) } touchPet( new Cage<Animal> ) // forbid :-) touchPet( new Cage<Pet> ) // allow :-) touchPet( new Cage<Cat> ) // allow :-) touchPet( new Cage<Dog> ) // allow :-) touchPet( new Cage<Fox> ) // forbid :-) pushPet( new Cage<Animal> ) // allow :-) pushPet( new Cage<Pet> ) // allow :-) pushPet( new Cage<Cat> ) // forbid :-) pushPet( new Cage<Dog> ) // forbid :-) pushPet( new Cage<Fox> ) // forbid :-) replacePet( new Cage<Animal> ) // forbid :-) replacePet( new Cage<Pet> ) // allow :-) replacePet( new Cage<Cat> ) // forbid :-) replacePet( new Cage<Dog> ) // forbid :-) replacePet( new Cage<Fox>) // forbid :-)
Bivariance here is inexpressible. Unfortunately, I could not find a way to more conveniently set the variance without explicitly describing the types of all fields. For example, something like this:
function pushPet( cage: Contra< Cage<Pet> , 'content' > ): void { const Pet = Number((0: any)) > .5 ? Cat : Dog cage.content = new Pet }
C sharp
C # was originally designed without any understanding of variation. However, later in it out and parameter modifiers were added, which allowed the compiler to correctly check the types of arguments passed. Unfortunately, using these modifiers is again not very convenient.
using System; abstract class Animal {} abstract class Pet : Animal {} class Cat : Pet {} class Dog : Pet {} class Fox : Animal {} interface ICageIn<in T> { T content { set; } } interface ICageOut<out T> { T content { get; } } interface ICageInOut<T> { T content { get; set; } } class Cage<T> : ICageIn<T>, ICageOut<T>, ICageInOut<T> { public T content { get; set; } } public class Program { static void touchPet( ICageOut<Pet> cage ) { Console.WriteLine( cage.content ); } static void pushPet( ICageIn<Pet> cage ) { cage.content = new Dog(); } static void replacePet( ICageInOut<Pet> cage ) { touchPet( cage as ICageOut<Pet> ); pushPet( cage as ICageIn<Pet> ); } void enshurePet( Cage<Pet> cage ) { if( cage.content is Pet ) return; pushPet( cage as ICageIn<Pet> ); } public static void Main() { var animalCage = new Cage<Animal>(); var petCage = new Cage<Pet>(); var catCage = new Cage<Cat>(); var dogCage = new Cage<Dog>(); var foxCage = new Cage<Fox>(); touchPet( animalCage ); // forbid :-) touchPet( petCage ); // allow :-) touchPet( catCage ); // allow :-) touchPet( dogCage ); // allow :-) touchPet( foxCage ); // forbid :-) pushPet( animalCage ); // allow :-) pushPet( petCage ); // allow :-) pushPet( catCage ); // forbid :-) pushPet( dogCage ); // forbid :-) pushPet( foxCage ); // forbid :-) replacePet( animalCage ); // forbid :-) replacePet( petCage ); // allow :-) replacePet( catCage ); // forbid :-) replacePet( dogCage ); // forbid :-) replacePet( foxCage ); // forbid :-) } }
Java
To Java, the ability to switch varieties was added quite late and only for generalized parameters, which themselves appeared relatively recently. If the parameter is not generalized, then trouble.
abstract class Animal {} abstract class Pet extends Animal {} class Cat extends Pet {} class Dog extends Pet {} class Fox extends Animal {} class Cage<T> { public T content; } public class Main { static void touchPet( Cage<? extends Pet> cage ) { System.out.println( cage.content ); } static void pushPet( Cage<? super Pet> cage ) { cage.content = new Dog(); } static void replacePet(Cage<Pet> cage ) { touchPet( cage ); pushPet( cage ); } void enshurePet( Cage<Pet> cage ) { if( cage.content instanceof Pet ) return; pushPet( cage ); } public static void main(String[] args) { Cage<Animal> animalCage = new Cage<Animal>(); Cage<Pet> petCage = new Cage<Pet>(); Cage<Cat> catCage = new Cage<Cat>(); Cage<Dog> dogCage = new Cage<Dog>(); Cage<Fox> foxCage = new Cage<Fox>(); touchPet( animalCage ); // forbid :-) touchPet( petCage ); // allow :-) touchPet( catCage ); // allow :-) touchPet( dogCage ); // allow :-) touchPet( foxCage ); // forbid :-) pushPet( animalCage ); // allow :-) pushPet( petCage ); // allow :-) pushPet( catCage ); // forbid :-) pushPet( dogCage ); // forbid :-) pushPet( foxCage ); // forbid :-) replacePet( animalCage ); // forbid :-) replacePet( petCage ); // allow :-) replacePet( catCage ); // forbid :-) replacePet( dogCage ); // forbid :-) replacePet( foxCage ); // forbid :-) } }
C ++
C ++, thanks to its powerful template system, can express various variations, but of course there is a lot of code.
#include <iostream> #include <typeinfo> #include <type_traits> class Animal {}; class Pet: public Animal {}; class Cat: public Pet {}; class Dog: public Pet {}; class Fox: public Animal {}; template<class T> class Cage { public: T *content; }; template<class T, class = std::enable_if_t<std::is_base_of<Pet, T>::value>> void touchPet(const Cage<T> &cage) { std::cout << typeid(T).name(); } template<class T, class = std::enable_if_t<std::is_base_of<T, Pet>::value>> void pushPet(Cage<T> &cage) { cage.content = new Dog(); } void replacePet(Cage<Pet> &cage) { touchPet(cage); pushPet(cage); } int main(void) { Cage<Animal> animalCage {new Fox()}; Cage<Pet> petCage {new Cat()}; Cage<Cat> catCage {new Cat()}; Cage<Dog> dogCage {new Dog()}; Cage<Fox> foxCage {new Fox()}; touchPet( animalCage ); // forbid :-) touchPet( petCage ); // allow :-) touchPet( catCage ); // allow :-) touchPet( dogCage ); // allow :-) touchPet( foxCage ); // forbid :-) pushPet( animalCage ); // allow :-) pushPet( petCage ); // allow :-) pushPet( catCage ); // forbid :-) pushPet( dogCage ); // forbid :-) pushPet( foxCage ); // forbid :-) replacePet( animalCage ); // forbid :-) replacePet( petCage ); // allow :-) replacePet( catCage ); // forbid :-) replacePet( dogCage ); // forbid :-) replacePet( foxCage ); // forbid :-) return 0; }
D
D has no sane means of explicitly indicating variance, but he knows how to infer types based on their use.
import std.stdio, std.random; abstract class Animal {} abstract class Pet : Animal { string name; } class Cat : Pet {} class Dog : Pet {} class Fox : Animal {} class Cage(T) { T content; } void touchPet( PetCage )( PetCage cage ) { writeln( cage.content.name ); } void pushPet( PetCage )( PetCage cage ) { cage.content = ( uniform(0,2) > 0 ) ? new Dog() : new Cat(); } void replacePet( PetCage )( PetCage cage ) { touchPet( cage ); pushPet( cage); } void main() { Cage!Animal animalCage; Cage!Pet petCage; Cage!Cat catCage; Cage!Dog dogCage; Cage!Fox foxCage; animalCage.touchPet(); // forbid :-) petCage.touchPet(); // allow :-) catCage.touchPet(); // allow :-) dogCage.touchPet(); // allow :-) foxCage.touchPet(); // forbid :-) animalCage.pushPet(); // allow :-) petCage.pushPet(); // allow :-) catCage.pushPet(); // forbid :-) dogCage.pushPet(); // forbid :-) foxCage.pushPet(); // forbid :-) animalCage.replacePet(); // forbid :-) petCage.replacePet(); // allow :-) catCage.replacePet(); // forbid :-) dogCage.replacePet(); // forbid :-) foxCage.replacePet(); // forbid :-) }
Epilogue
That's all for now. I hope the material presented has helped you better understand the restrictions on types, and how they are implemented in different languages. Somewhere better, somewhere worse, somewhere nothing, but in general - so-so. Perhaps it is you who will develop the language in which all this will be implemented both conveniently and type-safe. In the meantime, join our telegram chat, where we sometimes discuss the theoretical concepts of programming languages .