To dot all the i's, I wrote several tests that exercise the usual approaches to data processing: passing values to a method, copying, working with arrays, and so on. I decided not to draw any sweeping conclusions. You can decide for yourself whether the tests are worth believing, download the project, see how it behaves on your machine, and try to optimize a particular test. You may even find tricks that I did not mention, or that are used so rarely that I simply have not heard of them.
P.S. I started working on this article with Xcode 10.3 and considered comparing its speed with Xcode 11, but the article is about the speed of our applications, not about comparing two tools. I had no doubt that function run times would decrease and that poorly optimized code would become faster. In the end, I waited for Swift 5.1 and decided to test the hypotheses in practice. Enjoy reading.
Test 1: Compare Arrays on Structures and Classes
Suppose we have a class and want to store its instances in an array. The most common operation on an array is to loop over it.
When an array holds class instances and we walk through it, the reference count of each object is incremented when the object is accessed and decremented again once the access is over.
If the array holds structures, then accessing an element by index produces a copy of the value that looks at the same memory area but is marked immutable. It is hard to say which is faster: incrementing an object's reference count or creating an immutable reference to an area in memory. Let's check it in practice:
Fig. 1: Comparison of getting a variable from arrays based on structures and classes
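A minimal sketch of what such a measurement could look like. The types PointStruct and PointClass, the measure helper, and the element count are hypothetical and are not taken from the test project; absolute timings will depend on the machine and on the optimization level.

```swift
import Foundation

// Hypothetical element types; the article's own types are not shown.
struct PointStruct { var value: Int }
final class PointClass {
    var value: Int
    init(value: Int) { self.value = value }
}

// Runs `body` once and returns its result together with the elapsed time in seconds.
func measure<T>(_ body: () -> T) -> (result: T, seconds: Double) {
    let start = DispatchTime.now().uptimeNanoseconds
    let result = body()
    let end = DispatchTime.now().uptimeNanoseconds
    return (result, Double(end - start) / 1_000_000_000)
}

let count = 100_000
let structs = (0..<count).map { PointStruct(value: $0) }
let classes = (0..<count).map { PointClass(value: $0) }

// Read every element by index from the array of structures.
let structRun = measure { () -> Int in
    var sum = 0
    for i in structs.indices { sum &+= structs[i].value }
    return sum
}

// Read every element by index from the array of class instances.
let classRun = measure { () -> Int in
    var sum = 0
    for i in classes.indices { sum &+= classes[i].value }
    return sum
}

print("struct: \(structRun.seconds) s, class: \(classRun.seconds) s")
```

Both loops compute the same sum, so the only thing that differs between the two runs is how the elements are stored and accessed.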
Test 2: Compare ContiguousArray vs Array
It is even more interesting to compare the performance of a regular array (Array) with ContiguousArray, which always keeps its elements in a contiguous region of memory and is meant primarily for storing classes in an array.
Let's check the performance for the following cases:
- ContiguousArray storing a struct with a value type
- ContiguousArray storing a struct with a String
- ContiguousArray storing a class with a value type
- ContiguousArray storing a class with a String
- Array storing a struct with a value type
- Array storing a struct with a String
- Array storing a class with a value type
- Array storing a class with a String
The full results cover five tests for each of the eight arrays (passing to a function with inlining turned off, passing to a function with inlining turned on, removing elements, appending elements, and sequential access to elements in a loop), so I will give only the most significant ones:
- If you call a function and pass an array to it with inlining turned off, the call is very expensive (about 20,000 times slower for classes wrapping a String, and about 60,000 times slower for classes wrapping a value type).
- If inlining works for you, expect a slowdown of only about 2 times, depending on which type of data is stored in which array. The only exception was a value type wrapped in a structure stored in a ContiguousArray: it showed no slowdown.
- Removal: the difference between ContiguousArray and a regular Array was about 20% (in favor of the regular Array).
- Append: with objects wrapped in classes, ContiguousArray was about 20% faster than Array with the same objects, while with structures Array was faster than ContiguousArray.
- Element access: wrappers built on structures turned out to be faster than any wrappers built on classes, including those in a ContiguousArray (about 500 times faster).
In most cases, regular arrays are the more efficient choice for working with objects. We used them before, and we will keep using them.
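For readers who have not used ContiguousArray, here is a small sketch showing that it offers the same operations as Array, so swapping one for the other in a benchmark is a one-line change. The Box class and the element counts are hypothetical.

```swift
// Hypothetical class wrapper around a value type.
final class Box {
    let value: Int
    init(_ value: Int) { self.value = value }
}

var regular = Array<Box>()
var contiguous = ContiguousArray<Box>()

// Append: the same call on both collections.
for i in 0..<1_000 {
    regular.append(Box(i))
    contiguous.append(Box(i))
}

// Removal: drop the last ten elements from each.
regular.removeLast(10)
contiguous.removeLast(10)

// Sequential access: sum the wrapped values.
let regularSum = regular.reduce(0) { $0 + $1.value }
let contiguousSum = contiguous.reduce(0) { $0 + $1.value }
print(regular.count, contiguous.count, regularSum == contiguousSum)
```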
Loops over arrays can be optimized with a lazy collection (the lazy property), which lets you walk the entire array only once, even if several filter or map operations are applied to its elements.
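As an illustration, the sketch below compares an eager chain with a lazy one. The input data and the closures are arbitrary examples; the point is only that the lazy chain does not build an intermediate array between map and filter.

```swift
let numbers = Array(1...1_000)

// Eager: map builds a full intermediate array, then filter walks that array again.
let eager = numbers.map { $0 * 2 }.filter { $0 % 3 == 0 }

// Lazy: the chain is only described here; nothing is computed yet.
let lazyChain = numbers.lazy.map { $0 * 2 }.filter { $0 % 3 == 0 }

// The work happens while the chain is consumed, element by element.
var lazyResult: [Int] = []
for value in lazyChain {
    lazyResult.append(value)
}

print(eager.count, lazyResult.count)
```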
Using structures as an optimization tool has its pitfalls, such as properties whose types are reference-based inside: strings, dictionaries, arrays. When a value that stores such types is passed to a function, an additional reference is created for each member that is backed by a class. This has another side, which is discussed a little further on. You can try wrapping the value in a class. Then passing it to a function increases the reference count only for the wrapper, and the reference counts of the values inside the structure stay the same. In general, I wanted to see how many reference-type properties a structure must have before its performance drops below that of a class with the same properties.
There is an article on the web called “Stop Using Structs!” that asks the same question and answers it. I downloaded its project and decided to figure out what happens where, and in which cases structures become slow. The author shows low performance of structures compared to classes. It would be absurd to dispute that creating a new object is much slower than incrementing a reference count, so I removed the line where a new object is created on every loop iteration. But if we do not create a new reference to the object and just pass it into a function to work with it, the difference in performance is very small.
Each time we mark a function with @inline(never), the application must actually call it instead of inlining its code at the call site. Judging by the tests, Apple made it so that the object passed to the function is slightly modified: for structures, the compiler changes mutability and makes access to the non-mutable properties of the object lazy. Something similar happens for a class, but the reference count of the object is increased as well. So now we have a lazy object, all its fields are also lazy, and every time we read a property of the object, that property is initialized.
In this, structures have no equal: when a function reads two properties, the structure is only slightly slower than the class; when it reads three or more, the structure is always faster.
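The setup described above can be sketched as follows. Profile, ProfileBox and summarize are hypothetical names, not the code from the “Stop Using Structs!” project; the sketch only shows a structure with several reference-backed properties, a class wrapper around it, and two functions excluded from inlining.

```swift
// Hypothetical structure whose properties are all backed by reference-type storage.
struct Profile {
    var name: String
    var tags: [String]
    var scores: [String: Int]
}

// The wrapper class suggested in the text: callers pass one object reference.
final class ProfileBox {
    let profile: Profile
    init(_ profile: Profile) { self.profile = profile }
}

// @inline(never) forces a real call, as in the tests described above.
@inline(never)
func summarize(_ p: Profile) -> Int {
    return p.name.count + p.tags.count + p.scores.count
}

@inline(never)
func summarize(_ box: ProfileBox) -> Int {
    return box.profile.name.count + box.profile.tags.count + box.profile.scores.count
}

let profile = Profile(name: "Ann", tags: ["a", "b"], scores: ["x": 1])
let direct = summarize(profile)
let boxed = summarize(ProfileBox(profile))
print(direct, boxed)
```

Both calls read three properties, which is the threshold the text mentions; timing them in a loop is left to the benchmark harness.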
Test 3: Compare the performance of Structures and Classes storing large classes
I also slightly changed the benchmarked method itself: I added one more variable (so three variables are initialized in the method rather than two, as in the article) and, to avoid Int overflow, replaced the operations on the variables with addition and subtraction. I added clearer time metrics (the screenshot shows seconds, but what matters to us is the resulting proportions, not the absolute values), removed the Darwin framework (I do not use it in projects, perhaps wrongly; my tests showed no difference with or without it), enabled maximum optimization, and built in the Release configuration (which seems more honest). Here is the result:
Fig. 2: The performance of structures and classes from the article “Stop Using Structs”
The differences in test results are negligible.
Test 4: Function Accepting Generic, Protocol, and Function Without Generic
If we take a generic function and pass it two values that have nothing in common except that they can be compared (func min), then three lines of code turn into eight (as Apple says). But this is not always the case: the Swift compiler has an optimization (generic specialization) under which, if it sees at the call site that two structure values are passed, it automatically generates a function that takes those two structures and no longer copies the values.
Fig. 3: Typical Generic Function
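A small sketch of the idea, assuming a hand-written function named smaller in place of the standard library's min. The second function, smallerInt, is written by hand only to illustrate roughly what a specialization for Int amounts to; the compiler's actual output is not shown here.

```swift
// Generic version: works for any Comparable type.
func smaller<T: Comparable>(_ x: T, _ y: T) -> T {
    return y < x ? y : x
}

// Hand-written illustration of a version specialized for Int.
func smallerInt(_ x: Int, _ y: Int) -> Int {
    return y < x ? y : x
}

let smallerNumber = smaller(3, 5)
let smallerWord = smaller("pear", "apple")
let smallerSpecialized = smallerInt(3, 5)
print(smallerNumber, smallerWord, smallerSpecialized)
```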
I decided to test two functions: the first declares a generic type parameter, the second simply accepts the protocol type. In Swift 5.1, the protocol version is even slightly faster than the generic one (before Swift 5.1, protocols were 2 times slower), although according to Apple it should be the other way around. But when it comes to walking through an array, the elements have to be cast to a type, which slows the generic version down (generics are still great, though, because they remain faster than protocols):
Fig. 4: Comparison of functions accepting a Generic and a Protocol.
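The two kinds of function being compared can be sketched like this. Shape, Square, Rect and both function names are hypothetical; the project's own types may differ.

```swift
protocol Shape {
    var area: Double { get }
}

struct Square: Shape {
    var side: Double
    var area: Double { return side * side }
}

struct Rect: Shape {
    var width: Double
    var height: Double
    var area: Double { return width * height }
}

// Generic: the element type S is fixed for each call, so the array is homogeneous.
func totalAreaGeneric<S: Shape>(_ shapes: [S]) -> Double {
    return shapes.reduce(0) { $0 + $1.area }
}

// Protocol: each element is stored as a protocol-typed value and may be a different type.
func totalAreaProtocol(_ shapes: [Shape]) -> Double {
    return shapes.reduce(0) { $0 + $1.area }
}

let squares = [Square(side: 1), Square(side: 2)]
let mixed: [Shape] = [Square(side: 2), Rect(width: 2, height: 3)]

let genericTotal = totalAreaGeneric(squares)
let protocolTotal = totalAreaProtocol(mixed)
print(genericTotal, protocolTotal)
```

The generic function cannot accept the mixed array, which is the practical trade-off between the two signatures.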
Test 5: Compare calling a parent's method with calling the class's own method, and check a final class at the same time
I have also always been curious how slowly classes with a large number of ancestors work, and how quickly a class calls its own methods compared with those of a parent. When we call a method on a value that is passed in as a class, dynamic dispatch comes into play. What is it? Every time a method or property is accessed inside our function, the call is resolved at run time: the implementation is looked up in the dispatch table of the object's class, and if the method or property is overridden, the override is used; otherwise the implementation inherited from a base class is used.
Fig. 5: Class method calls, for dispatch testing
Several conclusions can be drawn from the test above: the more parent classes a class has, the slower it works, but the difference in speed is so small that it can be safely neglected; most likely, code optimization will eliminate the difference entirely. In this example, the final class modifier gives no advantage; on the contrary, the class works even slower, perhaps because the call does not actually turn into a truly fast function call.
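For reference, a hierarchy of the kind this test exercises might look like the sketch below. Base, Middle, Leaf and the two calling functions are hypothetical names; the sketch only shows which calls go through a base-class reference and which go through a final class.

```swift
class Base {
    func value() -> Int { return 1 }
    func inherited() -> Int { return 10 }
}

class Middle: Base {
    override func value() -> Int { return 2 }
}

// final: no class can inherit from Leaf or override its methods.
final class Leaf: Middle {
    override func value() -> Int { return 3 }
}

// The concrete class is unknown here, so value() is resolved at run time.
@inline(never)
func callThroughBase(_ object: Base) -> Int {
    return object.value() + object.inherited()
}

// The parameter type is a final class, so the compiler knows the exact implementation.
@inline(never)
func callOnFinal(_ object: Leaf) -> Int {
    return object.value() + object.inherited()
}

print(callThroughBase(Base()), callThroughBase(Middle()), callThroughBase(Leaf()), callOnFinal(Leaf()))
```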
Test 6: Accessing a variable with the final modifier versus a regular class variable
The results of applying the final modifier to a variable are also very interesting. You can use it when you know for sure that the variable will not be overridden anywhere in the class's subclasses. Let's try adding the final modifier to a variable. If the test creates only one object and reads its property, the property is initialized once (the lower result). If we honestly create a new object each time and read its variable, the speed drops noticeably (the upper result):
Fig. 6: Call final variable
Evidently, the modifier did not benefit the variable: it is always slower than its non-final competitor.
Test 7: The problem of polymorphism and protocols for structures, or the performance of the existential container
Problem: if we take a protocol that declares a certain method and several structures that conform to this protocol, what will the compiler do when we put structures with different amounts of stored data into one array typed by that protocol?
To solve the problem of calling a method implemented in the conforming types, the Protocol Witness Table mechanism is used. It creates wrapper tables that reference the necessary methods.
To solve the problem of data storage, an existential container is used. It consists of 5 cells of 8 bytes each. The first three hold the data stored in the structure (if the data does not fit, they hold a reference to the heap where the data is stored), the fourth stores information about the type of the stored data and tells us how to manage it, and the fifth contains a reference to the methods of the object.
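The five-cell layout can be checked with MemoryLayout. The protocol Drawable and the three structures below are hypothetical; the sketch compares sizes in machine words (MemoryLayout<Int>.size), which is 8 bytes on a 64-bit platform.

```swift
protocol Drawable {
    func draw() -> Int
}

// One, three and four word-sized stored properties.
struct OneWord: Drawable {
    var a = 1
    func draw() -> Int { return a }
}

struct ThreeWords: Drawable {
    var a = 1, b = 2, c = 3
    func draw() -> Int { return a + b + c }
}

struct FourWords: Drawable {
    var a = 1, b = 2, c = 3, d = 4
    func draw() -> Int { return a + b + c + d }
}

let word = MemoryLayout<Int>.size

// Structures of different sizes stored in one protocol-typed array.
let items: [Drawable] = [OneWord(), ThreeWords(), FourWords()]
let total = items.reduce(0) { $0 + $1.draw() }

print("container:", MemoryLayout<Drawable>.size / word, "words")
print("payloads:", MemoryLayout<OneWord>.size / word,
      MemoryLayout<ThreeWords>.size / word,
      MemoryLayout<FourWords>.size / word, "words")
```

Every array slot has the size of the container regardless of the structure inside it; FourWords is larger than the three-cell inline buffer, so by the description above its data lives on the heap.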
Fig. 7: Comparison of the performance of an array that stores a reference to an object and one that contains the object itself
Between the first and second results, the number of variables tripled. In theory they should still fit in the container and be stored there, so the difference in speed is due to the size of the structure. Interestingly, if you reduce the number of variables in the second structure, the run time does not change; that is, the container really does hold 3 or 2 variables, but apparently there are special conditions for a single variable that significantly increase the speed. The second structure fits the container exactly and is half the size of the third, and the third shows a strong degradation in run time compared with the other structures.
A bit of theory to optimize your projects
The following factors can influence the performance of structures:
- where its variables are stored (heap / stack);
- the need for reference counting for properties;
- method dispatch (static / dynamic);
- Copy-on-Write, which is used only by those data structures that are reference types under the hood pretending to be structures (String, Array, Set, Dictionary).
It is worth clarifying right away that the fastest objects are those that store their properties on the stack, do not need reference counting, and use static method dispatch.
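To make the Copy-on-Write item more concrete, here is a hand-rolled sketch of the mechanism using the standard library function isKnownUniquelyReferenced. Storage and CowBuffer are hypothetical types written for illustration; this is not how String or Array are implemented internally, only the same general idea.

```swift
// Reference-type storage hidden behind a structure.
final class Storage {
    var values: [Int]
    init(_ values: [Int]) { self.values = values }
}

struct CowBuffer {
    private var storage: Storage

    init(_ values: [Int]) {
        storage = Storage(values)
    }

    var values: [Int] { return storage.values }

    mutating func append(_ value: Int) {
        // Copy the storage only if another CowBuffer still shares it.
        if !isKnownUniquelyReferenced(&storage) {
            storage = Storage(storage.values)
        }
        storage.values.append(value)
    }
}

let first = CowBuffer([1, 2, 3])
var second = first      // cheap: both values share one Storage object
second.append(4)        // the storage is copied here, on the first write

print(first.values, second.values)
```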
Why classes are worse and more dangerous than structures
- We do not always control the copying of our objects, and when we do copy them, we can end up with too many copies that are difficult to manage (for example, objects in the project that are responsible for creating views).
- They are not as fast as structures.
- If we hold a reference to an object and run the application in a multi-threaded style, we can get a race condition when the object is used from two different places at once (which is not hard to run into, because a build run from Xcode is always a bit slower than the App Store version).
- If we try to avoid the race condition, we spend a lot of resources on locks around our data, which eats up resources and wastes time instead of processing quickly, and we get objects that are even slower than the same ones built on structures.
- If we do all of the above with our objects (references), the probability of unforeseen deadlocks is high.
- Code complexity increases because of this.
- More code = more bugs, always!
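The locking cost mentioned in the list can be seen in even the smallest shared object. Below is a sketch of a counter class protected by NSLock; the Counter class and the iteration count are hypothetical and only illustrate the extra code a shared reference needs.

```swift
import Foundation

// A shared mutable object: every access has to take the lock.
final class Counter {
    private let lock = NSLock()
    private var count = 0

    func increment() {
        lock.lock()
        defer { lock.unlock() }
        count += 1
    }

    var value: Int {
        lock.lock()
        defer { lock.unlock() }
        return count
    }
}

let counter = Counter()

// Call increment() from several threads at once; concurrentPerform returns when all calls finish.
DispatchQueue.concurrentPerform(iterations: 1_000) { _ in
    counter.increment()
}

print(counter.value)
```

Without the lock, the concurrent increments would race on count, which is exactly the situation the list warns about.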
Conclusions
I thought that conclusions are simply necessary in this article, because nobody wants to reread the whole article from time to time, and a consolidated list of points is useful. Summing up the tests, I want to highlight the following:
- It is best to put structures in an array.
- If you want to create an array from classes, it is better to choose a regular Array, since ContiguousArray rarely provides advantages, and they are not very high.
- Inline optimization speeds things up, so do not turn it off.
- Access to Array elements is always faster than access to ContiguousArray elements.
- Structures are always faster than classes (unless of course you enable Whole module optimization or similar optimization).
- When an object is passed into a function and its properties are read, structures are faster than classes starting from the third property.
- When passing a value to a function written for Generic and for Protocol, Generic will be faster.
- With multi-level class inheritance, the speed of a function call degrades.
- Variables marked final are slower than regular ones.
- If a function accepts a value typed as a protocol that unites several types, it works quickly when the value stores only one property and degrades greatly as more properties are added.
References:
medium.com/@vhart/protocols-generics-and-existential-containers-wait-what-e2e698262ab1
developer.apple.com/videos/play/wwdc2016/416
developer.apple.com/videos/play/wwdc2015/409
developer.apple.com/videos/play/wwdc2016/419
medium.com/commencis/stop-using-structs-e1be9a86376f
Test source code