Today we’ll talk about how to write your own AutoMapper. Well, I would really like to tell you about that, but I can’t. Solutions like that are very large, carry a long history of trial and error, and have come a long way to production use. What I can do is give an idea of how this works and a starting point for those who want to understand the mechanics of “mappers”. You could even say that we are going to reinvent the wheel.
Disclaimer
Let me remind you once again: we will write a primitive mapper. If you ever feel tempted to extend it and use it in production, don't. Take a ready-made solution that knows the pile of problems in this domain and already knows how to solve them. There are a few more or less valid reasons to write and use your own homemade mapper:
- You need some special customization.
- You need maximum performance in your particular conditions and you are ready to learn the hard way.
- You want to understand how a mapper works.
- You just enjoy reinventing the wheel.
What is a mapper?
It is the subsystem responsible for taking an object and converting it into another one by copying its values. A typical task: convert a DTO into a business-layer object. The most primitive mapper “walks” through the properties of the data source and matches them against the properties of the output type. After matching, the values are read from the source and written to the object that will be the result of the conversion. Somewhere along the way, that “result” object will most likely have to be created too.
For the consumer, a mapper is a service that provides the following interface:
public interface IMapper<out TOut>
{
    TOut Map(object source);
}
I emphasize: this is the most primitive interface, which I find convenient for explanation. In reality, we would most likely deal with a more specific mapper (IMapper<TIn, TOut>) or with a more general facade (IMapper) that picks a specific mapper for the given input and output types by itself.
Naive implementation
Note: even a naive mapper implementation requires basic knowledge of Reflection and ExpressionTrees. If you have not followed the links or have never heard of these technologies, do read up on them. I promise the world will never be the same.
Anyway, we are writing our own mapper. To get started, let's get all the properties (PropertyInfo) of the output data type (from here on I will call it TOut). This is quite simple: we know the type, since we are writing a generic class parameterized with TOut. Then, using an instance of the Type class, we get all its properties.
Type outType = typeof(TOut);
PropertyInfo[] outProperties = outType.GetProperties();
When getting the properties, I am skipping the special cases. For example, some properties may have no setter, some may be marked as ignored with an attribute, and some may have restricted access. We are considering the simplest option.
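The special cases skipped above can be filtered out with a few Reflection checks. The following is only a rough sketch: MapIgnoreAttribute, WritableProperties and SampleTarget are hypothetical names made up for this illustration and are not part of the mapper described in this article.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical marker attribute; the article does not define one.
[AttributeUsage(AttributeTargets.Property)]
public sealed class MapIgnoreAttribute : Attribute { }

public static class WritableProperties
{
    // Returns the public instance properties a mapper could write to,
    // keyed by property name.
    public static Dictionary<string, PropertyInfo> For(Type type)
    {
        return type
            .GetProperties(BindingFlags.Public | BindingFlags.Instance)
            .Where(p => p.GetSetMethod() != null)            // has a public setter
            .Where(p => p.GetIndexParameters().Length == 0)  // is not an indexer
            .Where(p => !p.IsDefined(typeof(MapIgnoreAttribute), true))
            .ToDictionary(p => p.Name);
    }
}

// Hypothetical target type used only to show which properties survive.
public class SampleTarget
{
    public int Id { get; set; }                 // kept
    public string Name { get; private set; }    // private setter: skipped
    public int Computed => Id * 2;              // no setter: skipped
    [MapIgnore] public string Secret { get; set; } // ignored by attribute: skipped
}
```

For SampleTarget, only Id passes all three filters. A dictionary keyed by name also makes the later lookup by property name cheap.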
Let's move on. It would be nice to be able to create an instance of TOut, that is, the very object we map the incoming object into. C# offers several ways to do this. For example, we can call System.Activator.CreateInstance(). Or even just new TOut(), but that requires a new() constraint on TOut, which we would rather not put on a generic interface. However, we both know something about ExpressionTrees, which means we can do it like this:
ConstructorInfo outConstructor = outType.GetConstructor(Array.Empty<Type>());
Func<TOut> activator = outConstructor == null
    ? throw new Exception($"Default constructor for {outType.Name} not found")
    : Expression.Lambda<Func<TOut>>(Expression.New(outConstructor)).Compile();
Why this way? Because an instance of the Type class can tell us which constructors the type has, which is very convenient if we later decide to evolve our mapper so that it passes data to a constructor. We have also learned a little more about ExpressionTrees: they let us create and compile code on the fly, and that code can then be reused. In this case it is a function that effectively looks like () => new TOut().
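To put the three ways of creating an instance side by side, here is a small sketch. ActivatorFactory and Order are hypothetical names used only for this comparison; the third method mirrors the ExpressionTree approach described above.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

public static class ActivatorFactory
{
    // Option 1: goes through Activator on every call; no constraint needed.
    public static T ViaActivator<T>() => (T)Activator.CreateInstance(typeof(T));

    // Option 2: plain "new", but it forces a new() constraint on the caller.
    public static T ViaConstraint<T>() where T : new() => new T();

    // Option 3: build "() => new T()" once and reuse the compiled delegate.
    public static Func<T> Compile<T>()
    {
        ConstructorInfo ctor = typeof(T).GetConstructor(Type.EmptyTypes);
        if (ctor == null)
        {
            throw new InvalidOperationException(
                $"Default constructor for {typeof(T).Name} not found");
        }

        return Expression.Lambda<Func<T>>(Expression.New(ctor)).Compile();
    }
}

// Hypothetical type used only to try the factory out.
public class Order
{
    public int Id { get; set; } = 7;
}
```

The delegate returned by Compile is meant to be stored in a field and called many times, so the cost of building and compiling the expression is paid only once.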
Now we need to write the main mapper method, which copies the values. We will take the simplest route: walk through the properties of the object we received as input and look for properties with the same name among the properties of the output object. If one is found, copy the value; if not, move on.
TOut outInstance = _activator();
PropertyInfo[] sourceProperties = source.GetType().GetProperties();
for (var i = 0; i < sourceProperties.Length; i++)
{
    PropertyInfo sourceProperty = sourceProperties[i];
    string propertyName = sourceProperty.Name;
    if (_outProperties.TryGetValue(propertyName, out PropertyInfo outProperty))
    {
        object sourceValue = sourceProperty.GetValue(source);
        outProperty.SetValue(outInstance, sourceValue);
    }
}
return outInstance;
With that, the BasicMapper class is complete. You can look at its tests here. Note that the source can be either an object of a specific type or an anonymous object.
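Put together, the fragments above could look roughly like this. This is my own minimal sketch and not necessarily the BasicMapper from the repository. The null check, the CanWrite and CanRead filters, and the type-compatibility check are assumptions added so that the example does not throw on mismatched properties, and UserDto is a hypothetical type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public interface IMapper<out TOut>
{
    TOut Map(object source);
}

public sealed class BasicMapper<TOut> : IMapper<TOut>
{
    private readonly Func<TOut> _activator;
    private readonly Dictionary<string, PropertyInfo> _outProperties;

    public BasicMapper()
    {
        Type outType = typeof(TOut);

        // Assumption: only writable, non-indexer properties take part in mapping.
        _outProperties = outType.GetProperties()
            .Where(p => p.CanWrite && p.GetIndexParameters().Length == 0)
            .ToDictionary(p => p.Name);

        ConstructorInfo ctor = outType.GetConstructor(Type.EmptyTypes);
        if (ctor == null)
        {
            throw new InvalidOperationException(
                $"Default constructor for {outType.Name} not found");
        }

        // Compiled once: () => new TOut()
        _activator = Expression.Lambda<Func<TOut>>(Expression.New(ctor)).Compile();
    }

    public TOut Map(object source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));

        TOut outInstance = _activator();
        foreach (PropertyInfo sourceProperty in source.GetType().GetProperties())
        {
            if (!sourceProperty.CanRead || sourceProperty.GetIndexParameters().Length != 0)
            {
                continue;
            }

            // Copy only when a same-named property with a compatible type exists.
            if (_outProperties.TryGetValue(sourceProperty.Name, out PropertyInfo outProperty)
                && outProperty.PropertyType.IsAssignableFrom(sourceProperty.PropertyType))
            {
                outProperty.SetValue(outInstance, sourceProperty.GetValue(source));
            }
        }

        return outInstance;
    }
}

// Hypothetical output type for trying the mapper out.
public class UserDto
{
    public int Id { get; set; }
    public string Name { get; set; }
}
```

With this sketch, new BasicMapper<UserDto>().Map(new { Id = 42, Name = "Ann" }) fills both properties from the anonymous object, and source properties without a match are simply skipped.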
Performance and boxing
Reflection is great, but slow. Moreover, frequent use of it increases memory traffic, which puts pressure on the GC and slows the application down even more. For example, we just used the PropertyInfo.SetValue and PropertyInfo.GetValue methods. GetValue returns an object, so a value-type value gets wrapped (boxing). This means we get an allocation out of thin air.
Mappers usually live where one object has to be turned into another... No, not one object, but many. For example, when we load something from the database. In such a place I would like to see decent performance and not waste memory on an elementary operation.
What can we do?
ExpressionTrees will help us again. .NET lets you create and compile code "on the fly": we describe it as an object model, specify what we will use and where... and compile it. Almost no magic.
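The boxing is easy to observe. In the sketch below (Point and BoxingDemo are hypothetical names), the same int property is read twice through Reflection; since each call wraps the value in a fresh object, the two results should be equal in value but not the same reference.

```csharp
using System;
using System.Reflection;

// Hypothetical type with a value-type property.
public class Point
{
    public int X { get; set; }
}

public static class BoxingDemo
{
    // Reads the same int property twice through Reflection and reports
    // whether both reads returned the very same heap object.
    public static bool SameBox()
    {
        var point = new Point { X = 5 };
        PropertyInfo x = typeof(Point).GetProperty(nameof(Point.X));

        object first = x.GetValue(point);   // the int is boxed into an object
        object second = x.GetValue(point);  // and boxed again on the next call

        return ReferenceEquals(first, second);
    }

    // The boxed values themselves are still equal.
    public static bool SameValue()
    {
        var point = new Point { X = 5 };
        PropertyInfo x = typeof(Point).GetProperty(nameof(Point.X));
        return Equals(x.GetValue(point), x.GetValue(point));
    }
}
```

One extra object per property per mapped item may look harmless, but it adds up quickly when thousands of rows are mapped.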
Compiled mapper
In fact, everything is relatively simple: we have already done a new via Expression.New(ConstructorInfo). You probably noticed that the static New method has exactly the same name as the operator. That is because almost all C# syntax is mirrored by static methods of the Expression class. If something is missing, then what you are looking for is most likely so-called "syntactic sugar".
Here are some of the operations we will use in our mapper:
- Variable declaration: Expression.Variable(Type, string). The Type argument specifies the type of the variable, and the string is the variable's name.
- Assignment: Expression.Assign(Expression, Expression). The first argument is the target we assign to, and the second is the value being assigned.
- Property access: Expression.Property(Expression, PropertyInfo). The Expression is the owner of the property, and the PropertyInfo is the object representation of the property obtained through Reflection.
With this knowledge, we can create variables, access object properties, and assign values to them. Most likely we also understand that the ExpressionTree has to be compiled into a delegate of the form Func<object, TOut>. The plan is this: we take the parameter that holds the input data, create an instance of TOut, and build expressions that assign one property to another.
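The three operations from the list can be tried out in isolation. The sketch below uses hypothetical names (Counter, ExpressionBasics) and builds a tiny delegate that copies one property through a temporary variable.

```csharp
using System;
using System.Linq.Expressions;
using System.Reflection;

// Hypothetical type with a single property to copy.
public class Counter
{
    public int Value { get; set; }
}

public static class ExpressionBasics
{
    // Builds and compiles the equivalent of:
    //   (from, to) => { int tmp; tmp = from.Value; to.Value = tmp; }
    public static Action<Counter, Counter> BuildCopy()
    {
        PropertyInfo value = typeof(Counter).GetProperty(nameof(Counter.Value));

        ParameterExpression from = Expression.Parameter(typeof(Counter), "from");
        ParameterExpression to = Expression.Parameter(typeof(Counter), "to");

        // Variable declaration
        ParameterExpression tmp = Expression.Variable(typeof(int), "tmp");

        BlockExpression body = Expression.Block(
            new[] { tmp },
            // Assignment: the target comes first, the value second
            Expression.Assign(tmp, Expression.Property(from, value)),
            // Property access works on both sides of an assignment
            Expression.Assign(Expression.Property(to, value), tmp));

        return Expression.Lambda<Action<Counter, Counter>>(body, from, to).Compile();
    }
}
```

Calling the resulting delegate with two Counter instances copies Value from the first to the second without any PropertyInfo.GetValue or SetValue call at run time.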
Unfortunately, the code is not very compact, so I suggest taking a look at the CompiledMapper implementation right away. Here I only show the key points.
First, we create an object representation of our function's parameter. Since the function takes an object as input, the parameter has type object.
var parameter = Expression.Parameter(typeof(object), "source");
Next, we create two variables and a list of Expression objects, to which we will append assignment expressions one by one. The order matters, because that is the order in which the statements will run when we call the compiled method. For example, we cannot assign a value to a variable that has not been declared yet.
Then, just as in the naive implementation, we walk through the properties of the types and try to match them by name. However, instead of assigning values immediately, we create expressions that read the value and assign it for each matched pair of properties.
Expression sourceValue = Expression.Property(sourceInstance, sourceProperty);
Expression outValue = Expression.Property(outInstance, outProperty);
expressions.Add(Expression.Assign(outValue, sourceValue));
An important point: once all the assignment operations have been created, we need to return the result from the function. To do this, the last expression in the list must be the Expression that holds the instance of the class we created. I left a comment next to that line. Why does the behavior corresponding to the return keyword look like this in an ExpressionTree? I'm afraid that is a separate topic. For now, I suggest simply remembering it.
Finally, at the very end, we have to compile all the expressions we have built. What is interesting here? The body variable holds the “body” of the function. “Normal functions” have a body, right? The part we enclose in curly braces. Well, Expression.Block is exactly that. Since curly braces also form a scope, we must pass in the variables that will be used inside it: in our case, sourceInstance and outInstance.
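In short, a block evaluates to its last expression, so putting the variable at the end plays the role of return. A tiny sketch with hypothetical names (BlockResultDemo, BuildDouble) shows the effect on its own:

```csharp
using System;
using System.Linq.Expressions;

public static class BlockResultDemo
{
    // Builds and compiles the equivalent of:
    //   x => { int doubled; doubled = x * 2; return doubled; }
    public static Func<int, int> BuildDouble()
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression doubled = Expression.Variable(typeof(int), "doubled");

        BlockExpression body = Expression.Block(
            new[] { doubled },
            Expression.Assign(doubled, Expression.Multiply(x, Expression.Constant(2))),
            doubled); // last expression: this is what the block "returns"

        return Expression.Lambda<Func<int, int>>(body, x).Compile();
    }
}
```

The mapper uses the same trick: the variable holding the new TOut instance goes last in the list of expressions.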
var body = Expression.Block(new[] {sourceInstance, outInstance}, expressions);
return Expression.Lambda<Func<object, TOut>>(body, parameter).Compile();
As a result, we get a Func<object, TOut>, i.e. a function that can convert data from one object to another. Why go to such trouble, you ask? Let me remind you that, first, we wanted to avoid boxing when copying ValueType values, and second, we wanted to get rid of the PropertyInfo.GetValue and PropertyInfo.SetValue methods, since they are rather slow.
Why is there no boxing? Because a compiled ExpressionTree is real IL, and to the runtime it looks (almost) like your own code. Why is the “compiled mapper” faster? Again: because it is just plain IL. By the way, we can easily confirm the speed with the BenchmarkDotNet library, and the benchmark itself can be found here.
In the Ratio column, the compiled mapper (CompiledMapper) showed a very good result, even compared to AutoMapper (it is the baseline, i.e. 1). However, let's not celebrate: AutoMapper has far greater capabilities than our homemade mapper. With this table, I just wanted to show that ExpressionTrees are much faster than the "classic Reflection approach".
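For those who want to see the key points assembled, here is a compact sketch. It is not the CompiledMapper from the repository. It rests on my own assumptions: one delegate is compiled and cached per source type, properties are matched only when their types are identical, and TOut has a parameterless constructor. CompiledMapperSketch, PersonDto and PersonModel are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;

public sealed class CompiledMapperSketch<TOut>
{
    // Assumption: the source type is only known at call time,
    // so one compiled delegate is cached per source type.
    private readonly ConcurrentDictionary<Type, Func<object, TOut>> _cache =
        new ConcurrentDictionary<Type, Func<object, TOut>>();

    public TOut Map(object source)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        return _cache.GetOrAdd(source.GetType(), type => Build(type))(source);
    }

    private static Func<object, TOut> Build(Type sourceType)
    {
        Type outType = typeof(TOut);
        Dictionary<string, PropertyInfo> outProperties = outType.GetProperties()
            .Where(p => p.GetSetMethod() != null && p.GetIndexParameters().Length == 0)
            .ToDictionary(p => p.Name);

        ParameterExpression parameter = Expression.Parameter(typeof(object), "source");
        ParameterExpression sourceInstance = Expression.Variable(sourceType, "typedSource");
        ParameterExpression outInstance = Expression.Variable(outType, "result");

        var expressions = new List<Expression>
        {
            // typedSource = (TSource)source;
            Expression.Assign(sourceInstance, Expression.Convert(parameter, sourceType)),
            // result = new TOut();  (throws if there is no parameterless constructor)
            Expression.Assign(outInstance, Expression.New(outType))
        };

        foreach (PropertyInfo sourceProperty in sourceType.GetProperties())
        {
            if (sourceProperty.GetGetMethod() == null
                || sourceProperty.GetIndexParameters().Length != 0)
            {
                continue;
            }

            // Assumption: only same-named properties of identical type are mapped.
            if (outProperties.TryGetValue(sourceProperty.Name, out PropertyInfo outProperty)
                && outProperty.PropertyType == sourceProperty.PropertyType)
            {
                // result.Prop = typedSource.Prop;
                expressions.Add(Expression.Assign(
                    Expression.Property(outInstance, outProperty),
                    Expression.Property(sourceInstance, sourceProperty)));
            }
        }

        // The block evaluates to its last expression, i.e. the mapped object.
        expressions.Add(outInstance);

        BlockExpression body = Expression.Block(
            new[] { sourceInstance, outInstance }, expressions);
        return Expression.Lambda<Func<object, TOut>>(body, parameter).Compile();
    }
}

// Hypothetical types for trying the sketch out.
public class PersonDto
{
    public int Id { get; set; }
    public string Name { get; set; }
    public int Age { get; set; }
}

public class PersonModel
{
    public int Id { get; set; }
    public string Name { get; set; }
}
```

The first call for a given source type pays for building and compiling the expression; subsequent calls only look up the cached delegate and run it.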
Summary
I hope I was able to show that writing your own mapper is quite simple. Reflection and ExpressionTrees are very powerful tools that developers use to solve many different tasks. Dependency injection, serialization and deserialization, CRUD repositories, building SQL queries, using other languages as scripts for .NET applications: all of this is done with Reflection, Reflection.Emit, and ExpressionTrees.
And what about the mapper? A mapper is a great example on which you can learn all of this.