Unsafe.AsSpan: Span<T> as a replacement for pointers?

C# is an incredibly flexible language. You can use it to write much more than backend or desktop applications. I use C# to work with scientific data, which imposes certain requirements on the tools available in the language. Although netcore is taking over the agenda (considering that after netstandard2.0 most new language and runtime features are not backported to netframework), I continue to work with legacy projects.

In this article, I look at one non-obvious (but probably desirable?) application of Span<T> and at the difference between the Span<T> implementations in netframework and netcore, which stems from the specifics of the clr.

Disclaimer 1

The code snippets in this article are by no means intended for use in real projects.

The proposed solution to the (far-fetched?) problem is more of a proof of concept.

In any case, if you implement this in your project, you do so at your own risk.

Disclaimer 2

I am absolutely sure that somewhere, in some case, this will definitely shoot someone in the knee.

Bypassing type safety in C# is unlikely to lead to anything good.

For obvious reasons, I did not test this code in all possible situations; however, the preliminary results look promising.

Why do I need `Span<T>`?

Span lets you work with arrays of unmanaged types in a more convenient form and reduces the number of allocations you need. Although span support is almost completely absent from the netframework BCL, several tools are available through System.Memory, System.Buffers and System.Runtime.CompilerServices.Unsafe.

Spans are of limited use in my legacy project, but I found a non-obvious application for them, spitting on type safety along the way.

What is this application? In my project I work with data obtained from a scientific instrument. These are images, which in general are arrays T[], where T is one of the unmanaged primitive types, for example Int32 (aka int). To serialize these images to disk correctly, I need to support an incredibly inconvenient legacy format, which was proposed in 1981 and has changed little since. The main problem with this format is that it is big-endian. Thus, to write (or read) an uncompressed T[] array, you need to change the endianness of each element. A trivial task.

What are some obvious solutions?

1) Iterate over the T[] array, call BitConverter.GetBytes(T), reverse those few bytes, and copy them to the target array.
2) Iterate over the T[] array, perform tricks like new byte[] {(byte)((x & 0xFF00) >> 8), (byte)(x & 0x00FF)}; (should work for two-byte types), and write to the target array.
3) But T[] is an array, isn't it? The elements are laid out contiguously, right? So you can go all the way, for example Buffer.BlockCopy(intArray, 0, byteArray, 0, intArray.Length * sizeof(int));. The method copies one array into another, ignoring type checks. You only need to get the bounds and the allocation right. Then shuffle the bytes of the result.
4) They say that C# is (C++)++. So enable /unsafe, write fixed(int* p = &intArr[0]) byte* bPtr = (byte*)p; and now you can walk over the byte representation of the source array, change the endianness on the fly and write blocks to disk (adding stackalloc byte[] or ArrayPool<byte>.Shared for the intermediate buffer) without allocating a whole new byte array.
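Options 1 and 3 from the list above can be combined into a short allocating sketch: copy the raw bytes with Buffer.BlockCopy, then reverse the bytes of each element in place. EndianCopy and ToSwappedBytes are hypothetical names chosen for illustration only.

```csharp
using System;

public static class EndianCopy
{
    // Hypothetical helper: copies an int[] into a fresh byte[] and reverses
    // every 4-byte group, swapping the byte order of each element.
    public static byte[] ToSwappedBytes(int[] source)
    {
        var bytes = new byte[source.Length * sizeof(int)];

        // Raw copy that ignores element types; the count is given in bytes.
        Buffer.BlockCopy(source, 0, bytes, 0, bytes.Length);

        // Reverse the bytes of each element in place.
        for (int offset = 0; offset < bytes.Length; offset += sizeof(int))
            Array.Reverse(bytes, offset, sizeof(int));

        return bytes;
    }
}
```

On a little-endian machine, ToSwappedBytes(new[] { 1 }) yields the bytes 0, 0, 0, 1. The price is a second buffer the size of the whole image, which is exactly what option 4 tries to avoid.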

It would seem that point 4 solves all the problems, but explicitly using an unsafe context and working with pointers somehow just doesn't feel right. This is where Span<T> comes to our aid.

`Span<T>`

Span<T> should technically provide tools for working with regions of memory almost as if through pointers, while eliminating the need to “fix” the array in memory. A kind of GC-aware pointer with array bounds. Everything is fine and safe.

There is one but: despite the wealth of System.Runtime.CompilerServices.Unsafe, Span<T> is nailed to the type T. Given that a span is essentially a pointer¹ + length, what if you pull out that pointer, cast it to another type, recalculate the length, and make a new span? Fortunately, we have public Span<T>(void* pointer, int length).

Let's write a simple test:

    [Test]
    public void Test()
    {
        void Flip(Span<byte> span) { /* changes endianness */ }

        Span<int> x = new[] { 123 };
        Span<byte> y = DangerousCast<int, byte>(x);

        Assert.AreEqual(123, x[0]);
        Flip(y);
        Assert.AreNotEqual(123, x[0]);
        Flip(y);
        Assert.AreEqual(123, x[0]);
    }
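The body of Flip is elided in the test. A minimal sketch of what such a helper might look like, assuming it reverses the bytes of every fixed-size element in place; the elementSize parameter and the Endian class are hypothetical additions, since the local function in the test takes only the span.

```csharp
using System;

public static class Endian
{
    // Hypothetical stand-in for the elided Flip: reverses the bytes of every
    // elementSize-byte group in place. Calling it twice restores the input.
    public static void Flip(Span<byte> bytes, int elementSize)
    {
        if (elementSize <= 0 || bytes.Length % elementSize != 0)
            throw new ArgumentException("Length must be a multiple of elementSize.");

        for (int offset = 0; offset < bytes.Length; offset += elementSize)
            bytes.Slice(offset, elementSize).Reverse();
    }
}
```

For a byte view over int elements, elementSize would be sizeof(int).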

Developers more experienced than I am should immediately see what is wrong here. Will the test fail? The answer, as usual, is: it depends.

In this case, it depends primarily on the runtime. On netcore the test should pass, but on netframework it is a matter of luck.

Interestingly, if you remove some of the asserts, the test starts to pass in 100% of cases.

Let's figure out why.

¹ I was wrong.

Correct answer: it depends

Why does the result vary?

Let's strip away everything unnecessary and write the following code:

    private static void Main() => Check();

    private static void Check()
    {
        Span<int> x = new[] { 999, 123, 11, -100 };
        Span<byte> y = As<int, byte>(ref x);

        Console.WriteLine(@"FRAMEWORK_NAME");
        Write(ref x);
        Write(ref y);
        Console.WriteLine();

        Write<int, int>(ref x, "Span<int> [0]");
        Write<byte, int>(ref y, "Span<byte>[0]");
        Console.WriteLine();

        Write<int, int>(ref Offset<int, object>(ref x[0], 1), "Span<int> [0] offset by size_t");
        Write<byte, int>(ref Offset<byte, object>(ref y[0], 1), "Span<byte>[0] offset by size_t");
        Console.WriteLine();

        GC.Collect(0, GCCollectionMode.Forced, true, true);

        Write<int, int>(ref x, "Span<int> [0] after GC");
        Write<byte, int>(ref y, "Span<byte>[0] after GC");
        Console.WriteLine();

        Write(ref x);
        Write(ref y);
    }

The Write<T, U> method accepts a span of type T, takes the address of its first element, and reads one element of type U through that pointer. In other words, Write<int, int>(ref x) prints the memory address plus the number 999.

The plain Write prints the array.
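The Offset<T, U> helper called in Check is not shown. A plausible reconstruction, assuming it moves a reference forward by count * sizeof(U) bytes; RefMath is a hypothetical class name, and the body is a guess based only on how the helper is called.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class RefMath
{
    // Hypothetical reconstruction of Offset: moves a reference forward by
    // count * sizeof(U) bytes. For U = object, Unsafe.SizeOf<U>() is the size
    // of a reference, i.e. of a pointer.
    // No bounds checking is done: stepping outside the array is undefined behavior.
    public static ref T Offset<T, U>(ref T source, int count)
    {
        return ref Unsafe.AddByteOffset(ref source, (IntPtr)(count * Unsafe.SizeOf<U>()));
    }
}
```

With the array { 999, 123, 11, -100 } in a 64-bit process, Offset<int, object>(ref data[0], 1) skips 8 bytes, that is, two int elements, and lands on 11.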

Now about the As<,> method:

    private static unsafe Span<U> As<T, U>(ref Span<T> span)
        where T : unmanaged
        where U : unmanaged
    {
        fixed (T* ptr = span)
            return new Span<U>(ptr, span.Length * Unsafe.SizeOf<T>() / Unsafe.SizeOf<U>());
    }

C# syntax now supports this form of the fixed statement by implicitly calling the Span<T>.GetPinnableReference() method.

Let's run this method on netframework4.8 in x64 mode and see what comes out:

    LEGACY
    [ 999, 123, 11, -100 ]
    [ 231, 3, 0, 0, 123, 0, 0, 0, 11, 0, 0, 0, 156, 255, 255, 255 ]

    0x|00|00|02|8C|00|00|2F|B0 999 Span<int> [0]
    0x|00|00|02|8C|00|00|2F|B0 999 Span<byte>[0]

    0x|00|00|02|8C|00|00|2F|B8 11 Span<int> [0] offset by size_t
    0x|00|00|02|8C|00|00|2F|B8 11 Span<byte>[0] offset by size_t

    0x|00|00|02|8C|00|00|2B|18 999 Span<int> [0] after GC
    0x|00|00|02|8C|00|00|2F|B0 6750318 Span<byte>[0] after GC

    [ 999, 123, 11, -100 ]
    [ 110, 0, 103, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 ]

Initially, both spans (despite their different types) behave identically, and Span<byte> is, in essence, a byte view of the original array. Exactly what we need.

Okay, let's try shifting the start of the span by the size of one IntPtr (or 2 x int on x64) and reading. We get the third element of the array and the correct address. And then we collect the garbage...

 GC.Collect(0, GCCollectionMode.Forced, true, true);

The last flag in this method asks the GC to compact the heap. After the GC.Collect call, the GC moves the original local array. Span<int> reflects this change, but our Span<byte> continues to point to the old address, where who knows what lives now. A great way to shoot yourself in both knees at once!
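The check behind that output can be condensed into one helper: create a byte view, force a compacting collection, then write through the array and read through the view. In this sketch the view comes from MemoryMarshal.Cast, a public System.Memory API that reinterprets a span without a raw pointer. It is used here only as a stand-in for the cast under test and is not the As<,> method above; ViewCheck and ByteViewTracksArray are hypothetical names.

```csharp
using System;
using System.Runtime.InteropServices;

public static class ViewCheck
{
    // Returns true if a byte view over an int[] still observes writes made
    // through the array after a forced compacting collection.
    public static bool ByteViewTracksArray()
    {
        int[] data = { 999, 123, 11, -100 };
        Span<int> ints = data;

        // Stand-in for the cast under test (assumption: any Span<int> -> Span<byte>
        // conversion could be plugged in here).
        Span<byte> bytes = MemoryMarshal.Cast<int, byte>(ints);

        // Forced, blocking, compacting collection, as in the Check method.
        GC.Collect(0, GCCollectionMode.Forced, true, true);

        // Write through the array, read through the byte view.
        data[0] = 1;
        return bytes.Length == data.Length * sizeof(int)
            && MemoryMarshal.Read<int>(bytes) == data[0];
    }
}
```

Whether the array actually moves during the collection is up to the GC, so a view that has lost track of the array could still pass such a check by luck.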

Now let's look at the output of exactly the same code run on netcore3.0.100-preview8.

    CORE
    [ 999, 123, 11, -100 ]
    [ 231, 3, 0, 0, 123, 0, 0, 0, 11, 0, 0, 0, 156, 255, 255, 255 ]

    0x|00|00|01|F2|8F|BD|C6|90 999 Span<int> [0]
    0x|00|00|01|F2|8F|BD|C6|90 999 Span<byte>[0]

    0x|00|00|01|F2|8F|BD|C6|98 11 Span<int> [0] offset by size_t
    0x|00|00|01|F2|8F|BD|C6|98 11 Span<byte>[0] offset by size_t

    0x|00|00|01|F2|8F|BD|BF|38 999 Span<int> [0] after GC
    0x|00|00|01|F2|8F|BD|BF|38 999 Span<byte>[0] after GC

    [ 999, 123, 11, -100 ]
    [ 231, 3, 0, 0, 123, 0, 0, 0, 11, 0, 0, 0, 156, 255, 255, 255 ]

Everything works, and works stably, as far as I can tell. After compaction, both spans update their pointers. Great! But how do we make this work in a legacy project?

Jit intrinsic

I had completely forgotten that span support in netcore is implemented as an intrinsic. In other words, netcore can create interior pointers into a fragment of an array and correctly update those references when the GC moves it. In netframework, the nuget implementation of span is a crutch. In effect, we have two different kinds of span: one is created from an array and tracks its reference, the other is created from a pointer and has no idea what it points to. After the source array is moved, such a span keeps pointing to wherever the pointer passed to its constructor pointed. For comparison, here is roughly what the span implementation looks like in netcore:

    readonly ref struct Span<T> where T : unmanaged
    {
        private readonly ByReference<T> _pointer;
        private readonly int _length;
    }

and in netframework:

    readonly ref struct Span<T> where T : unmanaged
    {
        private readonly Pinnable<T> _pinnable;
        private readonly IntPtr _byteOffset;
        private readonly int _length;
    }

_pinnable contains a reference to the array if one was passed to the constructor; _byteOffset contains an offset (even a span over the whole array has a certain non-zero offset, probably related to the way the array is laid out in memory). If you pass a void* pointer to the constructor, it is simply converted to an absolute _byteOffset. The span will be nailed tight to that memory region, and all instance methods abound with conditions like if(_pinnable is null) {/* */} else {/* _pinnable */}. What to do in such a situation?

How you shouldn't do it, but I did anyway

This section covers various implementations that work on netframework and allow casting Span<T> -> Span<U> while preserving all the necessary references.

I warn you: this is a zone of abnormal programming, possibly with fundamental errors and Undefined Behavior at the end.

Method 1: Naive

As the example showed, converting pointers does not give the desired result on netframework. We need the _pinnable value. Okay, let's break out reflection, pull out the private fields (very bad and not always possible), write them into a new span, and be happy. There is only one small problem: a span is a ref struct, so it can neither be a generic argument nor be boxed into an object. The standard reflection methods require, one way or another, pushing the span into a reference type. I did not find a simple way (even allowing for reflection over private fields).

Method 2: We need to go deeper

Everything has already been done before me ( [1] , [2] , [3] ). A span is a structure whose three fields occupy the same amount of memory regardless of T (on the same architecture). What if we use [FieldOffset(0)]? No sooner said than done.

    [StructLayout(LayoutKind.Explicit)]
    ref struct Exchange<T, U>
        where T : unmanaged
        where U : unmanaged
    {
        [FieldOffset(0)] public Span<T> Span_1;
        [FieldOffset(0)] public Span<U> Span_2;
    }

But when you run the program (or rather, when you try to use the type), you get a TypeLoadException: a generic type cannot be LayoutKind.Explicit. Okay, never mind, let's take the hard way:

    [StructLayout(LayoutKind.Explicit)]
    public ref struct Exchange
    {
        [FieldOffset(0)] public Span<byte> ByteSpan;
        [FieldOffset(0)] public Span<sbyte> SByteSpan;
        [FieldOffset(0)] public Span<ushort> UShortSpan;
        [FieldOffset(0)] public Span<short> ShortSpan;
        [FieldOffset(0)] public Span<uint> UIntSpan;
        [FieldOffset(0)] public Span<int> IntSpan;
        [FieldOffset(0)] public Span<ulong> ULongSpan;
        [FieldOffset(0)] public Span<long> LongSpan;
        [FieldOffset(0)] public Span<float> FloatSpan;
        [FieldOffset(0)] public Span<double> DoubleSpan;
        [FieldOffset(0)] public Span<char> CharSpan;
    }

Now you can do this:

    private static Span<byte> As2(Span<int> span)
    {
        var exchange = new Exchange() { IntSpan = span };
        return exchange.ByteSpan;
    }

The method works, with only one problem: the _length field is copied as is, so when casting int -> byte the byte span is 4 times shorter than the real array.

No problem:

    [StructLayout(LayoutKind.Sequential)]
    public ref struct Raw
    {
        public object Pinnable;
        public IntPtr Pointer;
        public int Length;
    }

    [StructLayout(LayoutKind.Explicit)]
    public ref struct Exchange
    {
        /* ... */

        [FieldOffset(0)] public Raw RawView;
    }

Now each individual field of the span can be accessed through RawView.

    private static Span<byte> As2(Span<int> span)
    {
        var exchange = new Exchange() { IntSpan = span };

        var exchange2 = new Exchange()
        {
            RawView = new Raw()
            {
                Pinnable = exchange.RawView.Pinnable,
                Pointer = exchange.RawView.Pointer,
                // Rescale the length from int elements to byte elements
                Length = exchange.RawView.Length * sizeof(int) / sizeof(byte)
            }
        };

        return exchange2.ByteSpan;
    }

And it works as it should, if you ignore the use of dirty tricks. The downside: a generic version of the converter cannot be created, so you have to settle for predefined types.

Method 3: Crazy

Like any normal programmer, I like to automate things. Having to write converters for every pair of unmanaged types did not please me. What solution can be offered? That's right: get the CLR to write the code for you.

How to achieve this? There are different ways, and there are articles about them. In short, the process looks like this:

Create an assembly builder -> create a module builder -> build a type -> {Fields, Methods, etc.} -> at the output we get an instance of Type.
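A minimal sketch of that pipeline, using only System.Reflection.Emit calls and a deliberately boring struct with a single int field instead of a span. DynamicTypeSketch, BuildStruct and the assembly and module names are hypothetical and chosen for illustration.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class DynamicTypeSketch
{
    // Hypothetical helper: builds a public sealed struct with one public int
    // field named "Value", following
    // assembly builder -> module builder -> type builder -> Type.
    public static Type BuildStruct(string typeName)
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("SketchAssembly"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("SketchModule");

        // A value type is declared by deriving from System.ValueType.
        TypeBuilder builder = module.DefineType(
            typeName,
            TypeAttributes.Public | TypeAttributes.Sealed | TypeAttributes.SequentialLayout,
            typeof(ValueType));

        builder.DefineField("Value", typeof(int), FieldAttributes.Public);

        // Bakes the type; after this call the builder can no longer be changed.
        return builder.CreateType();
    }
}
```

The returned Type can be passed to Activator.CreateInstance, which hands back a boxed instance of the generated struct.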

To understand exactly what the type should look like (it's a ref struct), we use any tool of the ildasm kind. In my case, it was dotPeek.

Creating a type builder looks something like this:

    var typeBuilder = _mBuilder.DefineType(
        $"Generated_{typeof(T).Name}",
        TypeAttributes.Public
        | TypeAttributes.Sealed
        | TypeAttributes.ExplicitLayout // <- explicit layout
        | TypeAttributes.AnsiClass
        | TypeAttributes.BeforeFieldInit,
        typeof(ValueType));

Now the fields. Since we cannot directly copy Span<T> to Span<U> because of the difference in lengths, we need to create two types for each cast, of the form

    [StructLayout(LayoutKind.Explicit)]
    ref struct Generated_Int32
    {
        [FieldOffset(0)] public Span<Int32> Span;
        [FieldOffset(0)] public Raw Raw;
    }

Here Raw can be declared by hand and reused. Do not forget about IsByRefLikeAttribute. With the fields, everything is simple:

    var spanField = typeBuilder.DefineField("Span", typeof(Span<T>), FieldAttributes.Private);
    spanField.SetOffset(0);

    var rawField = typeBuilder.DefineField("Raw", typeof(Raw), FieldAttributes.Private);
    rawField.SetOffset(0);

That's all, the simplest type is ready. Now we cache the assembly module. The generated types are cached too, for example in a dictionary ( T -> Generated_{nameof(T)} ). We create a wrapper that, given the two types TIn and TOut, generates two helper types and performs the necessary operations on the spans. There is one but. As with reflection, it is almost impossible to use it on spans (or on other ref structs). Or I did not find a simple solution. What to do?

Delegates to the rescue

Reflection methods usually look something like this:

  object Invoke(this MethodInfo mi, object @this, object[] otherArgs)

They carry no type information, so if boxing is acceptable to you, there are no problems.

In our case, @this and otherArgs would have to contain a ref struct, which I could not get around.

However, there is a simpler way. Let's imagine that a type has getter and setter methods (not properties, but manually created simple methods).

For example:

 void Generated_Int32.SetSpan(Span<Int32> span) => this.Span = span;

In addition to the method, we can declare a delegate type (explicitly in code):

 delegate void SpanSetterDelegate<T>(Span<T> span) where T : unmanaged;

We have to do this because the standard action would need the signature Action<Span<T>>, but spans cannot be used as generic arguments. SpanSetterDelegate, however, is a perfectly valid delegate.
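A minimal sketch of this idea in isolation: a hand-declared delegate type with a Span<T> parameter, bound through reflection to an ordinary static method. SpanLengthDelegate, SpanDelegateSketch and Count are hypothetical names; the method found by name here is a compiled one, not a generated one.

```csharp
using System;
using System.Reflection;

// A custom delegate type may take a Span<T> parameter, unlike Action<Span<T>>.
public delegate int SpanLengthDelegate<T>(Span<T> span) where T : unmanaged;

public static class SpanDelegateSketch
{
    // Hypothetical target method, looked up by name below.
    public static int Count(Span<int> span) => span.Length;

    // Binds the method to the custom delegate type without boxing any span.
    public static SpanLengthDelegate<int> Bind()
    {
        MethodInfo method = typeof(SpanDelegateSketch).GetMethod(nameof(Count));
        return (SpanLengthDelegate<int>)method.CreateDelegate(typeof(SpanLengthDelegate<int>));
    }
}
```

The resulting delegate is strongly typed, so the span travels through an ordinary call and never has to be packed into an object[].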

Create the delegates you need. To do this, perform the standard manipulations:

    var mi = type.GetMethod("Method_Name"); // provided the method is public & instance
    var spanSetter = (SpanSetterDelegate<T>) mi.CreateDelegate(typeof(SpanSetterDelegate<T>), @this);

Now spanSetter can be used as, for example, spanSetter(Span<T>.Empty);. As for @this², it is an instance of our dynamic type, created, of course, through Activator.CreateInstance(type), because a structure has a default parameterless constructor.

So, the last frontier: we need to generate the methods dynamically.

² You may notice that something goes wrong here: Activator.CreateInstance() boxes the ref struct instance. See the end of the next section.

Meet `Reflection.Emit`

I think the methods could be generated using Expression, since the bodies of our trivial getters/setters consist of literally a couple of expressions. I chose a different, more straightforward approach.

If you look at the IL code of a trivial getter, you can see something like this (Debug, X86, netframework4.8):

    nop
    ldarg.0
    ldfld /* field */
    stloc.0
    br.s /* label */
    ldloc.0
    ret

There are plenty of places to stop and debug.

In the release version, only the essentials remain:

    ldarg.0
    ldfld /* field */
    ret

The zeroth argument of an instance method is... this. Thus, the IL says the following:

1) Load this

2) Load the field value

3) Return it

Simple, huh? Reflection.Emit has a special overload that takes, in addition to the opcode, a field descriptor parameter: exactly what we obtained earlier, for example spanField.

    // DefineMethod is called on the TypeBuilder created earlier
    var getSpan = typeBuilder.DefineMethod(
        "GetSpan",
        MethodAttributes.Public | MethodAttributes.HideBySig,
        CallingConventions.Standard,
        typeof(Span<T>),
        Array.Empty<Type>());

    var gen = getSpan.GetILGenerator();
    gen.Emit(OpCodes.Ldarg_0);
    gen.Emit(OpCodes.Ldfld, spanField);
    gen.Emit(OpCodes.Ret);

The setter is a bit more complicated: you need to load this onto the stack, load the first argument of the function, then execute the store-to-field instruction and return nothing:

    ldarg.0
    ldarg.1
    stfld /* field */
    ret
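Both accessors can be tried end to end on a simplified type. The sketch below emits a struct with a private int field (instead of a span), generates GetValue and SetValue with exactly the IL shown above, and binds both to one boxed instance. AccessorEmitSketch, Holder and the method names are hypothetical; standard Func/Action delegates suffice here only because no span is involved.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public static class AccessorEmitSketch
{
    // Hypothetical demo: returns a getter and a setter bound to the same boxed
    // instance of a generated struct.
    public static Tuple<Func<int>, Action<int>> Build()
    {
        AssemblyBuilder assembly = AssemblyBuilder.DefineDynamicAssembly(
            new AssemblyName("AccessorSketch"), AssemblyBuilderAccess.Run);
        ModuleBuilder module = assembly.DefineDynamicModule("Main");
        TypeBuilder builder = module.DefineType(
            "Holder",
            TypeAttributes.Public | TypeAttributes.Sealed | TypeAttributes.SequentialLayout,
            typeof(ValueType));
        FieldBuilder field = builder.DefineField("Value", typeof(int), FieldAttributes.Private);

        MethodBuilder getter = builder.DefineMethod(
            "GetValue", MethodAttributes.Public | MethodAttributes.HideBySig,
            CallingConventions.HasThis, typeof(int), Type.EmptyTypes);
        ILGenerator il = getter.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);      // this
        il.Emit(OpCodes.Ldfld, field); // this.Value
        il.Emit(OpCodes.Ret);

        MethodBuilder setter = builder.DefineMethod(
            "SetValue", MethodAttributes.Public | MethodAttributes.HideBySig,
            CallingConventions.HasThis, typeof(void), new[] { typeof(int) });
        il = setter.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);      // this
        il.Emit(OpCodes.Ldarg_1);      // first argument
        il.Emit(OpCodes.Stfld, field); // this.Value = argument
        il.Emit(OpCodes.Ret);

        Type type = builder.CreateType();
        object instance = Activator.CreateInstance(type); // boxed struct

        var get = (Func<int>)type.GetMethod("GetValue")
            .CreateDelegate(typeof(Func<int>), instance);
        var set = (Action<int>)type.GetMethod("SetValue")
            .CreateDelegate(typeof(Action<int>), instance);
        return Tuple.Create(get, set);
    }
}
```

Because both delegates close over the same boxed object, a value written through the setter is visible through the getter.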

Having repeated this procedure for the Raw field and declared the necessary delegates (or used the standard ones), we get a dynamic type and four accessor methods, from which the correct generic delegates are created.

We write a wrapper class that, given two generic parameters (TIn, TOut), obtains the Type instances referring to the corresponding (cached) dynamic types, creates one object of each type, and creates four generic delegates, namely

void SetSpan(Span<TIn> span) to write the source span to the structure
Raw GetRaw() to read the contents of the span as a Raw structure
void SetRaw(Raw raw) to write the modified Raw structure to the second object
Span<TOut> GetSpan() to return a span of the desired type with correctly set and recalculated fields.

Interestingly, the dynamic type instances need to be created only once. When a delegate is created, a reference to these objects is passed as the @this parameter. Here the rules are violated: Activator.CreateInstance returns object. Apparently this is because the dynamic type itself did not turn out ref-like (type.IsByRefLike == false), yet it was possible to create ref-like fields in it. Apparently, such a restriction exists in the language, but the CLR digests it. Perhaps this is exactly where the knees will get shot in the case of non-standard use. ³

So, we get an instance of a generic type containing four delegates and two implicit references to instances of the dynamic types. The delegates and structures can be reused when performing the same casts in a row. To improve performance, we cache once more (this time the converter itself) for each pair (TIn, TOut) -> Generator<TIn, TOut>.

The final touch: casting `Span<TIn> -> Span<TOut>`

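    public Span<TOut> Cast(Span<TIn> span)
    {
        // Nothing to convert for an empty span
        if (span.IsEmpty)
            return Span<TOut>.Empty;

        // The caller must make sure the byte length divides evenly into TOut elements
        if (span.Length * Unsafe.SizeOf<TIn>() % Unsafe.SizeOf<TOut>() != 0)
            throw new InvalidOperationException();

        // Store the source span in the first helper structure,
        // equivalent to: Span<TIn> _input.Span = span;
        _spanSetter(span);

        // Read it back as Raw,
        // equivalent to: Raw raw = _input.Raw;
        var raw = _rawGetter();

        var newRaw = new Raw()
        {
            Pinnable = raw.Pinnable, // same Pinnable
            Pointer = raw.Pointer,   // same offset
            Length = raw.Length * Unsafe.SizeOf<TIn>() / Unsafe.SizeOf<TOut>() // recalculated length
        };

        // Store the new Raw in the second helper structure,
        // equivalent to: Raw _output.Raw = newRaw;
        _rawSetter(newRaw);

        // Read it back as a span of the target type,
        // equivalent to: Span<TOut> _output.Span
        return _spanGetter();
    }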
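The length arithmetic in Cast multiplies in int, so span.Length * Unsafe.SizeOf<TIn>() could in principle overflow for a very large span of wide elements. A minimal sketch of the same divisibility check and recalculation done in long; CastMath and ConvertLength are hypothetical names and not part of the converter above.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class CastMath
{
    // Hypothetical helper: element count of a Span<TOut> covering the same
    // bytes as a Span<TIn> with the given length.
    public static int ConvertLength<TIn, TOut>(int length)
        where TIn : unmanaged
        where TOut : unmanaged
    {
        // Widen to long so that length * sizeof(TIn) cannot overflow int.
        long byteLength = (long)length * Unsafe.SizeOf<TIn>();

        if (byteLength % Unsafe.SizeOf<TOut>() != 0)
            throw new InvalidOperationException("Byte length is not a multiple of sizeof(TOut).");

        // Throws OverflowException if the result does not fit in an int.
        return checked((int)(byteLength / Unsafe.SizeOf<TOut>()));
    }
}
```

For example, four int elements map to sixteen byte elements, and eight byte elements map to two int elements.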

Conclusion

Sometimes, purely for sport, you can bypass some of the limitations of the language and implement non-standard functionality. At your own risk, of course. It is worth noting that the dynamic method lets you abandon pointers and unsafe / fixed contexts entirely, which can be a bonus. The obvious downside is the need for reflection and type generation.

For those who have read to the end.

Naive Benchmark Results

And how fast is it all?

I compared the speed of the casts in a silly scenario that does not reflect the actual or potential use of such casts and spans, but at least gives an idea of the speed.

Cast_Explicit uses conversion through an explicitly declared type, as in Method 2. Each cast requires allocating two small structures and accessing their fields;
Cast_IL implements Method 3, but re-creates the Generator<TIn, TOut> instance every time, which leads to constant dictionary lookups once the first pass has generated all the types;
Cast_IL_Cached caches the Generator<TIn, TOut> converter instance directly, which is why it turns out faster on average: the entire cast boils down to calling four delegates;
Buffer […]

[…] int[N] […] N/2 […] unmanaged […]

    BenchmarkDotNet=v0.11.5, OS=Windows 10.0.18362
    Intel Core i7-2700K CPU 3.50GHz (Sandy Bridge), 1 CPU, 8 logical and 4 physical cores
      [Host] : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.8.3815.0
      Clr    : .NET Framework 4.7.2 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.8.3815.0

    Job=Clr  Runtime=Clr  InvocationCount=1  UnrollFactor=1

Method	N	Mean	Error	StdDev	Median	Ratio	RatioSD
Cast_Explicit	100	362.2 ns	18.0967 ns	52.7888 ns	400.0 ns	1.00	0.00
Cast_IL	100	1,237.9 ns	28.5954 ns	67.4027 ns	1,200.0 ns	3.47	0.51
Cast_IL_Cached	100	522.8 ns	25.2640 ns	71.2576 ns	500.0 ns	1.46	0.27
Buffer	100	300.0 ns	0.0000 ns	0.0000 ns	300.0 ns	0.78	0.11

Cast_Explicit	1000	2,628.6 ns	54.0688 ns	64.3650 ns	2,600.0 ns	1.00	0.00
Cast_IL	1000	3,216.7 ns	49.8568 ns	38.9249 ns	3,200.0 ns	1.21	0.03
Cast_IL_Cached	1000	2,484.6 ns	44.9717 ns	37.5534 ns	2,500.0 ns	0.94	0.02
Buffer	1000	2,055.6 ns	43.9695 ns	73.4631 ns	2,000.0 ns	0.78	0.03

Cast_Explicit	1000000	2,515,157.1 ns	11,809.8538 ns	10,469.1278 ns	2,516,050.0 ns	1.00	0.00
Cast_IL	1000000	2,263,826.7 ns	23,724.4930 ns	22,191.9054 ns	2,262,000.0 ns	0.90	0.01
Cast_IL_Cached	1000000	2,265,186.7 ns	19,505.5913 ns	18,245.5422 ns	2,266,300.0 ns	0.90	0.01
Buffer	1000000	1,959,547.8 ns	39,175.7435 ns	49,544.7719 ns	1,959,200.0 ns	0.78	0.02

Cast_Explicit	100000000	255,751,392.9 ns	2,595,107.7066 ns	2,300,495.3873 ns	255,298,950.0 ns	1.00	0.00
Cast_IL	100000000	228,709,457.1 ns	527,430.9293 ns	467,553.7809 ns	228,864,100.0 ns	0.89	0.01
Cast_IL_Cached	100000000	227,966,553.8 ns	355,027.3545 ns	296,463.9203 ns	227,903,600.0 ns	0.89	0.01
Buffer	100000000	213,216,776.9 ns	1,198,565.1142 ns	1,000,856.1536 ns	213,517,800.0 ns	0.83	0.01

Acknowledgments

Thanks to JetBrains (:-)) for R# for VS and the standalone dotPeek. Thanks to the BenchmarkDotNet team for BenchmarkDotNet, and to the YouTube channels NDC Conferences and DotNext.

PS

Fixing my mistakes

³ […] ref […] ref structs […]

    static Raw Generated_Int32.GetRaw(Span<int> span)
    {
        var inst = new Generated_Int32() { Span = span };
        return inst.Raw;
    }

[…] Reflection.Emit […] ILGenerator.DeclareLocal […]

 static Span<int> Generated_Int32.GetSpan(Raw raw);

    delegate Raw GetRaw<T>(Span<T> span) where T : unmanaged;
    delegate Span<T> GetSpan<T>(Raw raw) where T : unmanaged;

[…] ref […]

    var getter = type
        .GetMethod(@"GetRaw", BindingFlags.Static | BindingFlags.Public)
        .CreateDelegate(typeof(GetRaw<T>), null) as GetRaw<T>;

—

    Raw raw = getter(Span<TIn>.Empty);
    Raw newRaw = convert(raw);
    Span<TOut> result = setter(newRaw);

UPD01: Fixed typos
