Protocol Oriented Programming, Part 2

In continuation of the topic, we will understand protocol types and generalized code.







The following issues will be considered along the way:









Protocol Types



Implementation of polymorphism without inheritance and reference types:







protocol Drawable { func draw() } struct Point: Drawable { var x, y: Int func draw() { ... } } struct Line: Drawable { var x1, x2, y1, y2: Int func draw() { ... } } var drawbles = [Drawable]() for d in drawbles { d.draw() }
      
      





  1. Denote the Drawable protocol, which has a draw method.
  2. We implement this protocol for Point and Line - now you can handle them as with Drawable (call the draw method)


We still have a polymorphic code. The d element of the drawables array has one interface, which is indicated in the Drawable protocol, but has different implementations of its methods, which are indicated in Line and Point.







The main principle (ad-hoc) of polymorphism: "A common interface - many implementations"

Dynamic dispatch without virtual-table







Recall that the definition of the correct implementation of the method when working with classes (reference types) is achieved through Dynamic Sending and a virtual table. Each class type has a virtual table; it stores implementations of its methods. Dynamic dispatch defines the method implementation for the type by looking into its virtual table. All this is necessary due to the possibility of inheritance and overriding of methods.







In the case of structures, inheritance, as well as redefinition of methods, is impossible. Then, at first glance, there is no need for a virtual-table, but how then will dynamic dispatch work? How can a program understand which method will be called on d.draw ()?







It is worth noting that the number of implementations of this method is equal to the number of types that conform to the Drawable protocol.


Protocol witness table



is the answer to this question. Each type that implements a protocol has this table. Like a virtual table for classes, it stores implementations of the methods that the protocol requires.







hereinafter, the Protocol Witness Table will be called the โ€œprotocol-method tableโ€

Ok, now we know where to look for method implementations. Only two questions remain:







  1. How to find the appropriate protocol and method table for an object that implemented this protocol? How in our case to find this table for the d element of drawables array?
  2. The elements of the array must be of the same size (this is the essence of the array). Then how can a drawable array meet this requirement if it can store both Line and Point, and they have different sizes?


 MemoryLayout.size(ofValue: Line(...)) // 32 bits MemoryLayout.size(ofValue: Point(...)) // 16 bits
      
      





Existential container



To address these two issues, Swift uses a special storage scheme for instances of protocol types called an existential container. It looks like this:













It occupies 5 machine words (in x64 bit system 5 * 8 = 40 bits). It is divided into three parts:







value buffer - space for the instance itself

vwt - pointer to Value Witness Table

pwt - pointer to Protocol Witness Table







Consider all three parts in more detail:







Content Buffer







Just three machine words for storing an instance. If the instance can fit in the content buffer, then it is stored in it. If the instance has more than 3 machine words, then it will not fit in the buffer and the program is forced to allocate memory on the heap, put the instance there, and put a pointer to this memory in the contents buffer. Consider an example:







 let point: Drawable = Point(...)
      
      





Point () occupies 2 machine words and fits perfectly into the value buffer - the program will put it there:













 let line: Drawable = Line(...)
      
      





Line () occupies 4 machine words and cannot fit into a value buffer - the program will allocate memory for it for heap, and add a pointer to this memory in value buffer:













ptr points to an instance of Line () placed on the heap:













Life cycle table







As well as the protocol-method table, each type that has a protocol has this table. It contains an implementation of four methods: allocate, copy, destruct, deallocate. These methods control the entire life cycle of an object. Consider an example:







  1. When creating an object (Point (...) as Drawable), the allocate method from T.Zh. this object. The allocate method will decide where the contents of the object should be placed (in the value buffer or on the heap), and if it should be placed on the heap, it will allocate the required amount of memory
  2. The copy method will put the contents of the object in the appropriate place.
  3. After finishing work with the object, the destruct method will be called, which will reduce all reference counters, if any
  4. After destruct, the deallocate method will be called, which will free the memory allocated on the heap, if any


Protocol method table







As described above, it contains implementations of the methods required by the protocol for the type to which this table is bound.







Existential Container - Answers







Thus, we answered two questions posed:







  1. The protocol-method table is stored in the Existential container of this object and can be easily obtained from it
  2. If the element type of the array is a protocol, then any element of this array takes a fixed value of 5 machine words - this is exactly what is needed for an Existential container. If the contents of the element cannot be placed in the value buffer, then it will be placed on the heap. If it can, then all the content will be placed in the value buffer. In any case, we get that the size of the object with the protocol type is 5 machine words (40 bits), and it follows that all the elements of the array will have the same size.


 let line: Drawable = Line(...) MemoryLayout.size(ofValue: line) // 40 bits let drawables: [Drawable] = [Line(...), Point(...), Line(...)] MemoryLayout.size(ofValue: drawables._content) // 120 bits
      
      





Existential Container - Example







Consider the behavior of an existential container in this code:







 func drawACopy(local: Drawable) { local.draw() } let val: Drawable = Line(...) drawACopy(val)
      
      





An existential container can be represented like this:







 struct ExistContDrawable { var valueBuffer: (Int, Int, Int) var vwt: ValueWitnessTable var pwt: ProtocolWitnessTable }
      
      





Pseudo code







Behind the scenes, the drawACopy function takes in ExistContDrawable:







 func drawACopy(val: ExistContDrawable) { ... }
      
      





The function parameter is created manually: create a container, fill its fields from the received argument:







 func drawACopy(val: ExistContDrawable) { var local = ExistContDrawable() let vwt = val.vwt let pwt = val.pwt local.type = type local.pwt = pwt ... }
      
      





We decide where the content will be stored (in the buffer or heap). Call vwt.allocate and vwt.copy to populate local with the contents of val:







 func drawACopy(val: ExistContDrawable) { ... vwt.allocateBufferAndCopy(&local, val) }
      
      





We call the draw method and pass it a pointer to self (the projectBuffer method will decide where self is located - in the buffer or on the heap - and return the correct pointer):







 func drawACopy(val: ExistContDrawable) { ... pwt.draw(vwt.projectBuffer(&local)) }
      
      





We are finishing work with local. We clean all hip links from local. The function returns a value - we clear all the memory allocated for drawACopy (stack frame):







 func drawACopy(val: ExistContDrawable) { ... vwt.destructAndDeallocateBuffer(&local) }
      
      





Existential Container - Purpose







Using an existential container requires a lot of work - the example above confirmed this - but why is it even necessary, what is the purpose? The goal is to implement polymorphism using protocols and the types that implement them. In OOP, we use abstract classes and inherit from them by overriding methods. In EPP, we use protocols and implement their requirements. Again, even with protocols, implementing polymorphism is a big and energy-intensive job. Therefore, to avoid "unnecessary" work, you need to understand when polymorphism is needed, and when not.







Polymorphism in the implementation of EPP wins in the fact that, using structures, we do not need constant reference counting, there is no class inheritance. Yes, everything is very similar, classes use a virtual table to determine the implementation of a method, protocols use protocol-method. Classes are placed on the heap, structures can also sometimes be placed there. But the problem is that any class of pointers can be directed to the class placed on the heap, and reference counting is necessary, and only one pointer to the structures placed on the heap and it is stored in an existential container.







In fact, it is important to note that a structure that is stored in an existential container will retain the semantics of value types, regardless of whether it is placed on the stack or heap. The Life Cycle Table is responsible for the preservation of semantics since it describes methods that determine semantics.







Existential Container - Stored Properties







We examined how a protocol type variable is passed and used by a function. Let's consider how such variables are stored:







 struct Pair { init(_ f: Drawable, _ s: Drawable) { first = f second = s } var first: Drawable var second: Drawable } var pair = Pair(Line(), Point())
      
      





How are these two Drawable structures stored inside the Pair structure? What is the contents of pair? It consists of two existential containers - one for first, the other for second. Line cannot fit in the buffer and is placed on the heap. Point fit in the buffer. It also allows the Pair structure to store objects of different sizes:







 pair.second = Line()
      
      





Now, the contents of second are also placed on the heap, since it did not fit on the buffer. Consider what this may lead to:







 let aLine = Line(...) let pair = Pair(aLine, aLine) let copy = pair
      
      





After executing this code, the program will receive the following memory status:













We have 4 memory allocations on the heap, which is not good. Let's try to fix:







  1. Create an analog class Line


 class LineStorage: Drawable { var x1, y1, x2, y2: Double func draw() {} }
      
      





  1. We use it in Pair


 let lineStorage = LineStorage(...) let pair = Pair(lineStorage, lineStorage) let copy = pair
      
      





We get one placement on the heap and 4 pointers to it:













But we are dealing with referential behavior. Changing copy.first will affect pair.first (the same for .second), which is not always what we want.







Indirect storage and copying on change (copy-on-write)







Before that, it was mentioned that String is a copy-on-write structure (stores its contents on the heap and copies it when it changes). Consider how you can implement your structure, which is copied when changing:







 struct BetterLine: Drawable { private var storage: LineStorage init() { storage = LineStorage((0, 0), (10, 10)) } func draw() -> Double { ... } mutating func move() { if !isKnownUniquelyReferenced(&storage) { storage = LineStorage(self.storage) } // storage editing } }
      
      





  1. BetterLine stores all properties in storage, and storage is a class and is stored on the heap.
  2. Storage can only be changed using the move method. In it, we check that only one pointer points to storage. If there are more pointers, then this BetterLine shares storage with someone, and in order for BetterLine to behave completely as a structure, storage must be individual - we make a copy and work with it in the future.


Let's see how it works in memory:







 let aLine = BetterLine() let pair = Pair(aLine, aLine) let copy = pair copy.second.x1 = 3.0
      
      





As a result of executing this code, we get:













In other words, we have two instances of Pair that share the same storage: LineStorage. When changing storage in one of its users (first / second), a separate copy of storage will be created for this user so that its change does not affect others. This solves the problem of violation of the semantics of value types from the previous example.







Protocol Types - Summary



  1. Small values . If we work with objects that take up little memory and can be placed in the buffer of an existential container, then:




  1. Great value. If we work with objects that do not fit into the buffer, then:




The mechanisms of using rewriting for change and indirect storage have been demonstrated and can significantly improve the situation with reference counting in case of a large number of them.

We found that protocol types, like classes, are capable of realizing polymorphism. This happens by storing in an existential container and using protocol tables - life cycle tables and protocol-method tables.








All Articles