C# is an incredibly flexible language. You can use it to write more than just backend or desktop applications. I use C# to work with scientific data, which imposes certain requirements on the tools available in the language. Although netcore dominates the agenda (considering that after netstandard2.0 most features of both the language and the runtime are not backported to netframework), I continue to work with legacy projects.
In this article, I look at one non-obvious (but probably desired?) application of Span<T> and at the difference between the Span<T> implementations in netframework and netcore caused by the specifics of the CLR.
The code snippets in this article are by no means intended for use in real projects.
The proposed solution to the (far-fetched?) problem is, rather, a proof of concept.
In any case, if you implement this in your project, you do so at your own risk.
I am absolutely sure that somewhere, in some case, this will shoot someone in the foot.
Bypassing type safety in C# is unlikely to lead to anything good.
For obvious reasons, I did not test this code in all possible situations; however, the preliminary results look promising.
Span<T>
A span lets you work with arrays of unmanaged types more conveniently while reducing the number of allocations. Although span support in the netframework BCL is almost completely absent, several tools are available through System.Memory, System.Buffers and System.Runtime.CompilerServices.Unsafe.
The use of spans in my legacy project is limited; however, I found a non-obvious use for them, disregarding type safety along the way.
What is this use? In my project I work with data obtained from a scientific instrument. These are images, which are, in general, arrays T[], where T is one of the unmanaged primitive types, for example Int32 (aka int). To serialize these images to disk correctly, I need to support an incredibly inconvenient legacy format that was proposed in 1981 and has changed little since. The main problem with this format is that it is BigEndian. Thus, to write (or read) an uncompressed T[] array, you need to change the endianness of each element. A trivial task.
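For a single value, the endianness change can be sketched as follows. This is only an illustration of the byte swap itself; the class and method names (EndianSwapSketch, ToBigEndian) are hypothetical and do not come from the project described here.

```csharp
using System;

public static class EndianSwapSketch
{
    // Hypothetical helper: returns the big-endian byte representation
    // of a 32-bit integer, whatever the byte order of the platform.
    public static byte[] ToBigEndian(int value)
    {
        byte[] bytes = BitConverter.GetBytes(value); // platform byte order
        if (BitConverter.IsLittleEndian)
            Array.Reverse(bytes);                    // little-endian -> big-endian
        return bytes;
    }
}
```

For example, EndianSwapSketch.ToBigEndian(999) yields the bytes 0, 0, 3, 231. Doing this per element allocates a small array every time, which is exactly the overhead the rest of the article tries to avoid.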
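What are some obvious solutions?
1) Iterate over T[], call BitConverter.GetBytes(T) for each element, reverse the resulting bytes, and write them to the target array.
2) Iterate over T[] and assemble the bytes by hand, e.g. new byte[] {(byte)((x & 0xFF00) >> 8), (byte)(x & 0x00FF)}; for a two-byte type.
3) Copy T[] into a byte array in one go with Buffer.BlockCopy(intArray, 0, byteArray, 0, intArray.Length * sizeof(int)); and swap the bytes in the copy.
4) Remember that C# is (C++)++: enable /unsafe, take a pointer with fixed(int* p = &intArr[0]) byte* bPtr = (byte*)p; and walk the byte representation of the source array directly, using stackalloc byte[] or ArrayPool<byte>.Shared for an intermediate buffer.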
It would seem that point 4 solves all the problems, but explicitly using an unsafe context and working with pointers somehow feels wrong. This is where Span<T> comes to our aid.
Span<T>
Span<T> should, technically, provide tools for working with memory regions almost like working through pointers, while eliminating the need to pin the array in memory. A kind of GC-aware pointer with array bounds. Everything is fine and safe.
There is one catch: despite the richness of System.Runtime.CompilerServices.Unsafe, Span<T> is nailed to the type T. Given that a span is essentially one pointer plus a length, what if you pull out the pointer, convert it to another type, recalculate the length, and build a new span? Fortunately, we have public Span<T>(void* pointer, int length).
Let's write a simple test:
[Test]
public void Test()
{
    void Flip(Span<byte> span) { /* endianess */ }

    Span<int> x = new[] { 123 };
    Span<byte> y = DangerousCast<int, byte>(x);

    Assert.AreEqual(123, x[0]);
    Flip(y);
    Assert.AreNotEqual(123, x[0]);
    Flip(y);
    Assert.AreEqual(123, x[0]);
}
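The body of Flip is elided in the test above. One possible sketch, assuming every element occupies a fixed number of bytes, is given here. The elementSize parameter and the FlipSketch class are assumptions made for illustration; the original Flip takes only the span.

```csharp
using System;

public static class FlipSketch
{
    // Hypothetical helper: reverses the byte order of every
    // elementSize-wide group of the span, in place.
    public static void Flip(Span<byte> bytes, int elementSize)
    {
        if (elementSize <= 0 || bytes.Length % elementSize != 0)
            throw new ArgumentException("Length must be a multiple of elementSize.");

        for (int start = 0; start < bytes.Length; start += elementSize)
        {
            int lo = start;
            int hi = start + elementSize - 1;
            while (lo < hi)
            {
                // Swap the outermost bytes of the group and move inwards
                byte tmp = bytes[lo];
                bytes[lo] = bytes[hi];
                bytes[hi] = tmp;
                lo++;
                hi--;
            }
        }
    }
}
```

Applying Flip twice with the same elementSize restores the original bytes, which is the property the test relies on.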
Developers more advanced than I am should immediately see what is wrong here. Will the test fail? The answer, as usual, is: it depends.
In this case, it depends primarily on the runtime. On netcore the test should pass, but on netframework the result varies.
Interestingly, if you remove some of the asserts, the test starts to pass in 100% of cases.
Let's figure it out.
1 I was wrong.
Why does the result vary?
Let's strip away everything unnecessary and write the following code:
private static void Main() => Check();

private static void Check()
{
    Span<int> x = new[] { 999, 123, 11, -100 };
    Span<byte> y = As<int, byte>(ref x);

    Console.WriteLine(@"FRAMEWORK_NAME");
    Write(ref x);
    Write(ref y);
    Console.WriteLine();

    Write<int, int>(ref x, "Span<int> [0]");
    Write<byte, int>(ref y, "Span<byte>[0]");
    Console.WriteLine();

    Write<int, int>(ref Offset<int, object>(ref x[0], 1), "Span<int> [0] offset by size_t");
    Write<byte, int>(ref Offset<byte, object>(ref y[0], 1), "Span<byte>[0] offset by size_t");
    Console.WriteLine();

    GC.Collect(0, GCCollectionMode.Forced, true, true);

    Write<int, int>(ref x, "Span<int> [0] after GC");
    Write<byte, int>(ref y, "Span<byte>[0] after GC");
    Console.WriteLine();

    Write(ref x);
    Write(ref y);
}
The Write<T, U> method accepts a span of type T, reads the address of its first element, and reads one element of type U through that pointer. In other words, Write<int, int>(ref x) prints the memory address followed by the number 999.
The plain Write prints the whole array.
Now about the As<,> method:
private static unsafe Span<U> As<T, U>(ref Span<T> span)
    where T : unmanaged
    where U : unmanaged
{
    // The span is pinned only for the duration of the fixed block
    fixed (T* ptr = span)
        return new Span<U>(ptr, span.Length * Unsafe.SizeOf<T>() / Unsafe.SizeOf<U>());
}
C# syntax now supports this form of the fixed statement by implicitly calling the Span<T>.GetPinnableReference() method.
Run this code on netframework4.8 in x64 mode and look at the result:
LEGACY
[ 999, 123, 11, -100 ]
[ 231, 3, 0, 0, 123, 0, 0, 0, 11, 0, 0, 0, 156, 255, 255, 255 ]

0x|00|00|02|8C|00|00|2F|B0 999 Span<int> [0]
0x|00|00|02|8C|00|00|2F|B0 999 Span<byte>[0]

0x|00|00|02|8C|00|00|2F|B8 11 Span<int> [0] offset by size_t
0x|00|00|02|8C|00|00|2F|B8 11 Span<byte>[0] offset by size_t

0x|00|00|02|8C|00|00|2B|18 999 Span<int> [0] after GC
0x|00|00|02|8C|00|00|2F|B0 6750318 Span<byte>[0] after GC

[ 999, 123, 11, -100 ]
[ 110, 0, 103, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]
Initially, both spans (despite the different types) behave identically, and Span<byte> is, in essence, a byte view of the original array. Exactly what is needed.
Okay, let's shift the start of the span by the size of one IntPtr (or 2 x int on x64) and read again. We get the third element of the array and the correct address. And then we collect the garbage...
GC.Collect(0, GCCollectionMode.Forced, true, true);
The last flag of this method asks the GC to compact the heap. After the GC.Collect call, the GC moves the original local array. Span<int> reflects this change, but our Span<byte> continues to point to the old address, where who knows what now lives. A great way to shoot yourself in both feet at once!
Now let's look at the result of exactly the same code run on netcore3.0.100-preview8.
CORE
[ 999, 123, 11, -100 ]
[ 231, 3, 0, 0, 123, 0, 0, 0, 11, 0, 0, 0, 156, 255, 255, 255 ]

0x|00|00|01|F2|8F|BD|C6|90 999 Span<int> [0]
0x|00|00|01|F2|8F|BD|C6|90 999 Span<byte>[0]

0x|00|00|01|F2|8F|BD|C6|98 11 Span<int> [0] offset by size_t
0x|00|00|01|F2|8F|BD|C6|98 11 Span<byte>[0] offset by size_t

0x|00|00|01|F2|8F|BD|BF|38 999 Span<int> [0] after GC
0x|00|00|01|F2|8F|BD|BF|38 999 Span<byte>[0] after GC

[ 999, 123, 11, -100 ]
[ 231, 3, 0, 0, 123, 0, 0, 0, 11, 0, 0, 0, 156, 255, 255, 255 ]
Everything works, and works stably, as far as I can tell. After compaction, both spans update their pointers. Great! But how do we make this work in a legacy project?
I completely forgot that span support in netcore is implemented through intrinsics. In other words, netcore can create interior pointers to a fragment of an array and correctly update them when the GC moves it. In netframework, the nuget implementation of span is a workaround. In effect, there are two different kinds of span: one is created from an array and tracks its reference, the other is created from a pointer and has no idea what it points to. After the source array moves, the pointer-based span keeps pointing wherever the pointer passed to its constructor pointed. For comparison, here is a sketch of the span implementation in netcore:
readonly ref struct Span<T> where T : unmanaged
{
    private readonly ByReference<T> _pointer; // GC-tracked interior reference
    private readonly int _length;
}
and in netframework:
readonly ref struct Span<T> where T : unmanaged
{
    private readonly Pinnable<T> _pinnable;
    private readonly IntPtr _byteOffset;
    private readonly int _length;
}
_pinnable holds a reference to the array, if one was passed to the constructor; _byteOffset holds an offset (even a span over the whole array has a non-zero offset, probably related to how the array is laid out in memory). If you pass a void* pointer to the constructor, it is simply converted to an absolute _byteOffset. The span ends up nailed to that memory region, and all instance methods are full of branches like if(_pinnable is null) {/* */} else {/* _pinnable */}. What can be done in this situation?
This section is devoted to various implementations, supported on netframework, that allow casting Span<T> -> Span<U> while preserving all the necessary references.
I warn you: this is a zone of abnormal programming, with possibly fundamental errors and Undefined Behavior at the end.
As the example showed, casting pointers does not give the desired result on netframework. We need the _pinnable value. Fine, let's break out reflection, pull out the private fields (very bad and not always possible), write them into a new span, and be happy. There is only one small problem: a span is a ref struct, so it can neither be a generic argument nor be boxed into an object. Standard reflection methods require, one way or another, pushing the span into a reference type. I did not find a simple way (even considering reflection on private fields).
Everything has already been done before me ( [1] , [2] , [3] ). A span is a structure, and regardless of T its three fields occupy the same amount of memory (on the same architecture). What if we use [FieldOffset(0)]? No sooner said than done.
[StructLayout(LayoutKind.Explicit)]
ref struct Exchange<T, U>
    where T : unmanaged
    where U : unmanaged
{
    [FieldOffset(0)] public Span<T> Span_1;
    [FieldOffset(0)] public Span<U> Span_2;
}
But when running the program (or rather, when the type is first used), we get a TypeLoadException: a generic type cannot be LayoutKind.Explicit. Okay, no matter, let's take the hard way:
[StructLayout(LayoutKind.Explicit)]
public ref struct Exchange
{
    [FieldOffset(0)] public Span<byte> ByteSpan;
    [FieldOffset(0)] public Span<sbyte> SByteSpan;
    [FieldOffset(0)] public Span<ushort> UShortSpan;
    [FieldOffset(0)] public Span<short> ShortSpan;
    [FieldOffset(0)] public Span<uint> UIntSpan;
    [FieldOffset(0)] public Span<int> IntSpan;
    [FieldOffset(0)] public Span<ulong> ULongSpan;
    [FieldOffset(0)] public Span<long> LongSpan;
    [FieldOffset(0)] public Span<float> FloatSpan;
    [FieldOffset(0)] public Span<double> DoubleSpan;
    [FieldOffset(0)] public Span<char> CharSpan;
}
Now you can do this:
private static Span<byte> As2(Span<int> span)
{
    var exchange = new Exchange() { IntSpan = span };
    return exchange.ByteSpan;
}
The method works, with one problem: the _length field is copied as is, so when casting int -> byte the byte span is 4 times shorter than the actual array.
No problem:
[StructLayout(LayoutKind.Sequential)]
public ref struct Raw
{
    public object Pinnable;
    public IntPtr Pointer;
    public int Length;
}

[StructLayout(LayoutKind.Explicit)]
public ref struct Exchange
{
    /* */
    [FieldOffset(0)] public Raw RawView;
}
Now, through RawView, you can access each individual field of the span.
private static Span<byte> As2(Span<int> span)
{
    var exchange = new Exchange() { IntSpan = span };
    var exchange2 = new Exchange()
    {
        RawView = new Raw()
        {
            Pinnable = exchange.RawView.Pinnable,
            Pointer = exchange.RawView.Pointer,
            // Recalculate the element count for the new element size
            Length = exchange.RawView.Length * sizeof(int) / sizeof(byte)
        }
    };
    return exchange2.ByteSpan;
}
And it works as it should, if you ignore the dirty tricks. The downside is that a generic version of the converter cannot be written, so you have to settle for predefined types.
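The length recalculation used in the cast can be isolated into a small helper. This is only a sketch of the arithmetic; SpanLengthSketch and ConvertLength are hypothetical names, and the divisibility check is an assumption about how a careful caller would want mismatched sizes handled.

```csharp
using System;

public static class SpanLengthSketch
{
    // Hypothetical helper: converts an element count for elements of
    // sizeIn bytes into an element count for elements of sizeOut bytes.
    public static int ConvertLength(int length, int sizeIn, int sizeOut)
    {
        long totalBytes = (long)length * sizeIn; // long avoids int overflow
        if (totalBytes % sizeOut != 0)
            throw new InvalidOperationException(
                "Total byte length is not a multiple of the target element size.");
        return checked((int)(totalBytes / sizeOut));
    }
}
```

For the int -> byte cast of a four-element array, ConvertLength(4, sizeof(int), sizeof(byte)) gives 16, matching the 16 bytes printed in the earlier output.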
Like any normal programmer, I like to automate things. Having to write converters for every pair of unmanaged types did not make me happy. What is the solution? That's right, get the CLR to write the code for you.
How? There are different ways, and there are articles. In short, the process looks like this:
Create an assembly builder -> create a module builder -> build a type -> {Fields, Methods, etc.} -> get a Type instance as the output.
To understand exactly what the type should look like (it is a ref struct), use any ildasm-like tool. In my case, it was dotPeek.
Creating a type builder looks something like this:
var typeBuilder = _mBuilder.DefineType(
    $"Generated_{typeof(T).Name}",
    TypeAttributes.Public
    | TypeAttributes.Sealed
    | TypeAttributes.ExplicitLayout // <- explicit field offsets
    | TypeAttributes.AnsiClass
    | TypeAttributes.BeforeFieldInit,
    typeof(ValueType));
Now the fields. Since we cannot directly copy Span<T> to Span<U> because of the difference in lengths, we need to create two helper types for each cast, each of this form:
[StructLayout(LayoutKind.Explicit)]
ref struct Generated_Int32
{
    [FieldOffset(0)] public Span<Int32> Span;
    [FieldOffset(0)] public Raw Raw;
}
Here Raw can be declared by hand and reused. Do not forget about IsByRefLikeAttribute. The fields are simple:
var spanField = typeBuilder.DefineField("Span", typeof(Span<T>), FieldAttributes.Private);
spanField.SetOffset(0);
var rawField = typeBuilder.DefineField("Raw", typeof(Raw), FieldAttributes.Private);
rawField.SetOffset(0);
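The whole chain (assembly builder -> module builder -> type builder -> overlapping fields -> Type) can be tried in isolation on ordinary primitive fields. The sketch below is an assumption-laden illustration, not the article's generator: it overlays an int and a float instead of Span<T> and Raw, makes the fields public so they can be reached through plain reflection, and all names (ExplicitLayoutEmitSketch, IntFloatUnion, AsInt, AsFloat) are made up.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class ExplicitLayoutEmitSketch
{
    // Builds a struct equivalent to:
    // [StructLayout(LayoutKind.Explicit)] struct IntFloatUnion
    // { [FieldOffset(0)] public int AsInt; [FieldOffset(0)] public float AsFloat; }
    public static Type BuildUnionType()
    {
        var assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("LayoutSketchAssembly"), AssemblyBuilderAccess.Run);
        var module = assembly.DefineDynamicModule("LayoutSketchModule");

        var typeBuilder = module.DefineType(
            "IntFloatUnion",
            TypeAttributes.Public | TypeAttributes.Sealed | TypeAttributes.ExplicitLayout
            | TypeAttributes.AnsiClass | TypeAttributes.BeforeFieldInit,
            typeof(ValueType));

        // Both fields start at offset 0, so they share the same bytes
        var intField = typeBuilder.DefineField("AsInt", typeof(int), FieldAttributes.Public);
        intField.SetOffset(0);
        var floatField = typeBuilder.DefineField("AsFloat", typeof(float), FieldAttributes.Public);
        floatField.SetOffset(0);

        return typeBuilder.CreateType();
    }

    // Writes through one field and reads through the other (on a boxed instance)
    public static float ReinterpretAsFloat(int bits)
    {
        Type type = BuildUnionType();
        object boxed = Activator.CreateInstance(type);
        type.GetField("AsInt").SetValue(boxed, bits);
        return (float)type.GetField("AsFloat").GetValue(boxed);
    }
}
```

Boxing is harmless here because the fields are plain primitives; with Span<T> fields the same reflection calls are exactly what becomes problematic.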
That's it, the simplest type is ready. Now we cache the assembly module. The generated types are cached, for example, in a dictionary ( T -> Generated_{nameof(T)} ). We create a wrapper that, given the two types TIn and TOut, generates two helper types and performs the necessary operations on the spans. There is one catch. As with reflection, it is almost impossible to use them on spans (or on other ref structs). Or I did not find a simple solution. What to do?
Reflection methods usually look something like this:
object Invoke(this MethodInfo mi, object @this, object[] otherArgs)
They carry no type information, so if boxing is acceptable to you, there is no problem.
In our case, @this and otherArgs must contain a ref struct, which I could not work around.
However, there is a simpler way. Let's imagine that the type has getter and setter methods (not properties, but simple hand-made methods).
For example:
void Generated_Int32.SetSpan(Span<Int32> span) => this.Span = span;
In addition to the method, we can declare a delegate type (explicitly, in code):
delegate void SpanSetterDelegate<T>(Span<T> span) where T : unmanaged;
We have to do this because the standard action would need the signature Action<Span<T>>, but spans cannot be used as generic arguments. SpanSetterDelegate, however, is a perfectly valid delegate.
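A small sketch shows such a delegate in use. Only the SpanSetterDelegate declaration comes from the text above; the SpanDelegateSketch class and the FillWithOnes and Demo methods are hypothetical and exist only to demonstrate that a span can pass through a custom delegate.

```csharp
using System;

public static class SpanDelegateSketch
{
    // Action<Span<T>> does not compile, but a dedicated delegate type does
    public delegate void SpanSetterDelegate<T>(Span<T> span) where T : unmanaged;

    // Hypothetical target method: fills the span with ones
    public static void FillWithOnes(Span<int> span)
    {
        for (int i = 0; i < span.Length; i++)
            span[i] = 1;
    }

    public static int[] Demo()
    {
        SpanSetterDelegate<int> setter = FillWithOnes;
        var data = new int[3];
        setter(data); // int[] converts implicitly to Span<int>
        return data;
    }
}
```

Demo() returns an array of three ones: the span was passed through the delegate without ever being boxed or used as a generic argument.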
Create the delegates you need. To do this, perform the standard manipulations:
var mi = type.GetMethod("Method_Name"); // the method is public & instance
var spanSetter = (SpanSetterDelegate<T>) mi.CreateDelegate(typeof(SpanSetterDelegate<T>), @this);
Now spanSetter can be used as, for example, spanSetter(Span<T>.Empty);. As for @this 2, it is an instance of our dynamic type, created, of course, via Activator.CreateInstance(type), because a structure has a default parameterless constructor.
So, the last frontier: we need to generate the methods dynamically.
2 You may notice that something is off here: Activator.CreateInstance() boxes the ref struct instance. See the end of the next section.
Reflection.Emit
I think the methods could also be generated using Expression, since the bodies of our trivial getters / setters consist of literally a couple of expressions. I chose a different, more straightforward approach.
If you look at the IL code of a trivial getter, you see something like this (Debug, X86, netframework4.8):
nop
ldarg.0
ldfld /* - */
stloc.0
br.s /* */
ldloc.0
ret
There are plenty of places to stop and debug.
In the Release build, only the essentials remain:
ldarg.0
ldfld /* - */
ret
The zeroth argument of an instance method is ... this. Thus, the IL says the following:
1) Load this
2) Load the field value
3) Return it
Simple, right? Reflection.Emit has a special Emit overload that takes, in addition to the opcode, a field descriptor. That is exactly what we obtained earlier, for example spanField.
var getSpan = typeBuilder.DefineMethod(
    "GetSpan",
    MethodAttributes.Public | MethodAttributes.HideBySig,
    CallingConventions.Standard,
    typeof(Span<T>),
    Array.Empty<Type>());
var gen = getSpan.GetILGenerator();
gen.Emit(OpCodes.Ldarg_0);          // load this
gen.Emit(OpCodes.Ldfld, spanField); // load the field value
gen.Emit(OpCodes.Ret);              // return it
The setter is a bit more complicated: you need to load this onto the stack, load the first argument of the function, then execute the store-field instruction and return nothing:
ldarg.0
ldarg.1
stfld /* */
ret
Having repeated this procedure for the Raw field and declared the necessary delegates (or used the standard ones), we get a dynamic type with four accessor methods, from which the proper generic delegates are created.
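The getter / setter emission and the delegate creation can be tried end to end on a deliberately simplified type. In the sketch below the dynamic type is an ordinary class with a single int field, not a ref struct with a span, so the standard Func<int> and Action<int> delegates suffice; every name in it (AccessorEmitSketch, Holder, _value, GetValue, SetValue) is an assumption made for illustration.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class AccessorEmitSketch
{
    // Emits: class Holder { private int _value;
    //                       public int GetValue(); public void SetValue(int v); }
    public static Type BuildHolderType()
    {
        var assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("AccessorSketchAssembly"), AssemblyBuilderAccess.Run);
        var module = assembly.DefineDynamicModule("AccessorSketchModule");
        var typeBuilder = module.DefineType(
            "Holder", TypeAttributes.Public | TypeAttributes.Sealed | TypeAttributes.Class);

        var field = typeBuilder.DefineField("_value", typeof(int), FieldAttributes.Private);

        // Getter: ldarg.0 / ldfld / ret
        var getter = typeBuilder.DefineMethod(
            "GetValue", MethodAttributes.Public | MethodAttributes.HideBySig,
            CallingConventions.Standard, typeof(int), Type.EmptyTypes);
        var il = getter.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldfld, field);
        il.Emit(OpCodes.Ret);

        // Setter: ldarg.0 / ldarg.1 / stfld / ret
        var setter = typeBuilder.DefineMethod(
            "SetValue", MethodAttributes.Public | MethodAttributes.HideBySig,
            CallingConventions.Standard, typeof(void), new[] { typeof(int) });
        il = setter.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);
        il.Emit(OpCodes.Ldarg_1);
        il.Emit(OpCodes.Stfld, field);
        il.Emit(OpCodes.Ret);

        return typeBuilder.CreateType();
    }

    // Stores a value through the emitted setter and reads it back through the getter
    public static int RoundTrip(int value)
    {
        Type type = BuildHolderType();
        object instance = Activator.CreateInstance(type);

        var set = (Action<int>)type.GetMethod("SetValue")
            .CreateDelegate(typeof(Action<int>), instance);
        var get = (Func<int>)type.GetMethod("GetValue")
            .CreateDelegate(typeof(Func<int>), instance);

        set(value);
        return get();
    }
}
```

Because both delegates are bound to the same instance, they share state, which is the same mechanism the converter relies on when it stores a span through one delegate and reads Raw through another.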
We write a wrapper class that, given two generic parameters (TIn, TOut), obtains the Type instances referring to the corresponding (cached) dynamic types, then creates one object of each type and generates four generic delegates, namely:
void SetSpan(Span<TIn> span): stores the source span in the first object
Raw GetRaw(): reads the stored span back as Raw
void SetRaw(Raw raw): stores the modified Raw in the second object
Span<TOut> GetSpan(): returns the span of the target type
Interestingly, instances of the dynamic types only need to be created once. When a delegate is created, a reference to these objects is passed as the @this parameter. This is where the rules get broken: Activator.CreateInstance returns object. Apparently this works because the dynamic type itself did not turn out ref-like (type.IsByRefLike == false), yet it was possible to give it ref-like fields. Apparently such a restriction exists in the language, but the CLR tolerates it. Perhaps this is exactly where feet will get shot in non-standard use. 3
So, we get an instance of a generic type that holds four delegates and two implicit references to instances of the dynamic types. The delegates and structures can be reused when performing the same cast repeatedly. To improve performance, we cache once more (this time the converter itself) for each pair (TIn, TOut) -> Generator<TIn, TOut>.
Span<TIn> -> Span<TOut>
public Span<TOut> Cast(Span<TIn> span)
{
    // Nothing to convert
    if (span.IsEmpty)
        return Span<TOut>.Empty;

    // The total byte length must be divisible by the size of TOut
    if (span.Length * Unsafe.SizeOf<TIn>() % Unsafe.SizeOf<TOut>() != 0)
        throw new InvalidOperationException();

    // Store the source span; equivalent to: _input.Span = span;
    _spanSetter(span);

    // Read it back as Raw; equivalent to: Raw raw = _input.Raw;
    var raw = _rawGetter();

    var newRaw = new Raw()
    {
        Pinnable = raw.Pinnable, // keep the reference to the array
        Pointer = raw.Pointer,   // keep the offset
        Length = raw.Length * Unsafe.SizeOf<TIn>() / Unsafe.SizeOf<TOut>() // recalculate the length
    };

    // Store the modified Raw; equivalent to: _output.Raw = newRaw;
    _rawSetter(newRaw);

    // Read it back as Span<TOut>; equivalent to: _output.Span
    return _spanGetter();
}
Sometimes, purely for sport, you can bypass some limitations of the language and implement non-standard functionality. Of course, at your own risk. Note that the dynamic approach lets you completely abandon pointers and unsafe / fixed contexts, which can be a bonus. The obvious downside is the need for reflection and type generation.
And how fast is it all?
I compared the speed of the casts in a silly scenario that does not reflect the actual / potential use of such casts and spans, but at least gives an idea of the speed.
Cast_Explicit
Cast_IL (Generator<TIn, TOut>)
Cast_IL_Cached (Generator<TIn, TOut>)
Buffer
Test data: int[N], N/2.
BenchmarkDotNet=v0.11.5, OS=Windows 10.0.18362
Intel Core i7-2700K CPU 3.50GHz (Sandy Bridge), 1 CPU, 8 logical and 4 physical cores
[Host] : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.8.3815.0
Clr : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.8.3815.0
Job=Clr Runtime=Clr InvocationCount=1 UnrollFactor=1
Method | N | Mean | Error | StdDev | Median | Ratio | RatioSD |
---|---|---|---|---|---|---|---|
Cast_Explicit | 100 | 362.2 ns | 18.0967 ns | 52.7888 ns | 400.0 ns | 1.00 | 0.00 |
Cast_IL | 100 | 1,237.9 ns | 28.5954 ns | 67.4027 ns | 1,200.0 ns | 3.47 | 0.51 |
Cast_IL_Cached | 100 | 522.8 ns | 25.2640 ns | 71.2576 ns | 500.0 ns | 1.46 | 0.27 |
Buffer | 100 | 300.0 ns | 0.0000 ns | 0.0000 ns | 300.0 ns | 0.78 | 0.11 |
Cast_Explicit | 1000 | 2,628.6 ns | 54.0688 ns | 64.3650 ns | 2,600.0 ns | 1.00 | 0.00 |
Cast_IL | 1000 | 3,216.7 ns | 49.8568 ns | 38.9249 ns | 3,200.0 ns | 1.21 | 0.03 |
Cast_IL_Cached | 1000 | 2,484.6 ns | 44.9717 ns | 37.5534 ns | 2,500.0 ns | 0.94 | 0.02 |
Buffer | 1000 | 2,055.6 ns | 43.9695 ns | 73.4631 ns | 2,000.0 ns | 0.78 | 0.03 |
Cast_Explicit | 1000000 | 2,515,157.1 ns | 11,809.8538 ns | 10,469.1278 ns | 2,516,050.0 ns | 1.00 | 0.00 |
Cast_IL | 1000000 | 2,263,826.7 ns | 23,724.4930 ns | 22,191.9054 ns | 2,262,000.0 ns | 0.90 | 0.01 |
Cast_IL_Cached | 1000000 | 2,265,186.7 ns | 19,505.5913 ns | 18,245.5422 ns | 2,266,300.0 ns | 0.90 | 0.01 |
Buffer | 1000000 | 1,959,547.8 ns | 39,175.7435 ns | 49,544.7719 ns | 1,959,200.0 ns | 0.78 | 0.02 |
Cast_Explicit | 100000000 | 255,751,392.9 ns | 2,595,107.7066 ns | 2,300,495.3873 ns | 255,298,950.0 ns | 1.00 | 0.00 |
Cast_IL | 100000000 | 228,709,457.1 ns | 527,430.9293 ns | 467,553.7809 ns | 228,864,100.0 ns | 0.89 | 0.01 |
Cast_IL_Cached | 100000000 | 227,966,553.8 ns | 355,027.3545 ns | 296,463.9203 ns | 227,903,600.0 ns | 0.89 | 0.01 |
Buffer | 100000000 | 213,216,776.9 ns | 1,198,565.1142 ns | 1,000,856.1536 ns | 213,517,800.0 ns | 0.83 | 0.01 |
Thanks to JetBrains ( :-)) for R# and the VS standalone dotPeek, to the BenchmarkDotNet team for BenchmarkDotNet, and to the youtube channels of NDC Conferences and DotNext.
3 To avoid boxing ref structs, the accessors can be made static, for example:
static Raw Generated_Int32.GetRaw(Span<int> span)
{
    var inst = new Generated_Int32() { Span = span };
    return inst.Raw;
}
Generating such a method with Reflection.Emit requires a local variable, see ILGenerator.DeclareLocal.
static Span<int> Generated_Int32.GetSpan(Raw raw);
The delegates then become:
delegate Raw GetRaw<T>(Span<T> span) where T : unmanaged;
delegate Span<T> GetSpan<T>(Raw raw) where T : unmanaged;
Because the methods are static, the delegate is created with a null target:
var getter = type.GetMethod(@"GetRaw", BindingFlags.Static | BindingFlags.Public).CreateDelegate(typeof(GetRaw<T>), null) as GetRaw<T>;
and the cast becomes:
Raw raw = getter(Span<TIn>.Empty);
Raw newRaw = convert(raw);
Span<TOut> result = setter(newRaw);
UPD01: Fixed typos.