On Reflection Performance in .NET

This must be one of the most covered topics in .NET; we all care about performance, and anyone familiar with .NET knows that the System.Reflection namespace sucks. I’ve written a library in the past to improve reflection performance, and recently I’ve been wondering what magic .NET 4 might make possible.

So, I decided to start from scratch with measurements. Extensive measurements on every alternative I could think of. And I learned a lot of unexpected things.

The Baseline

Throughout this post I will be using a very simple class to reflect over (I tried larger more complex classes just to make sure, but that had no significant impact on performance):

sealed public class Sample
{
    public int A { get; set; }
    public string B { get; set; }
}

The fastest implementation involves direct property accesses:

for (int repeats = 0; repeats < iterations; repeats++)
{
    sample.A = sample.A;
    sample.B = sample.B;
}

The slowest implementation involves .NET reflection:

PropertyInfo propA = typeof(Sample).GetProperty("A");
PropertyInfo propB = typeof(Sample).GetProperty("B");
for (int repeats = 0; repeats < iterations; repeats++)
{
    propA.SetValue(sample, propA.GetValue(sample, null), null);
    propB.SetValue(sample, propB.GetValue(sample, null), null);
}

Note that there are several environments in which you could test this code: inside the IDE or directly from Windows Explorer, and as a debug build or as a release build. I would not be surprised if most performance tests are performed inside the IDE, and possibly even on a debug build. The following should explain why this is a mistake.

Baseline measurements (1 million iterations)
                   Debug build              Release build
                   In IDE      Explorer     In IDE      Explorer
Direct Access      0.0532 s    0.0279 s     0.0206 s    0.0038 s
.NET Reflection    6.1271 s    4.9281 s     5.0297 s    5.0805 s

As you’d expect, running a debug build inside the IDE is clearly slower than any other environment. But even a release build of the direct-access loop is more than 5 times slower inside the IDE than when run directly from the shell (0.0206 s against 0.0038 s). If you do performance measurements inside your IDE you’re just not doing it right.
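A minimal sketch of a timing harness in this spirit, assuming System.Diagnostics.Stopwatch and one warm-up pass so that JIT compilation is not counted. The Timing class and its method names are hypothetical and are not the code behind the numbers above.

```csharp
using System;
using System.Diagnostics;

sealed public class Sample
{
    public int A { get; set; }
    public string B { get; set; }
}

static public class Timing
{
    // Hypothetical helper: runs the action once to warm up (JIT),
    // then times a second run with a Stopwatch.
    static public TimeSpan Measure(Action action)
    {
        action();
        Stopwatch watch = Stopwatch.StartNew();
        action();
        watch.Stop();
        return watch.Elapsed;
    }

    // Times the direct-access baseline loop from the post.
    static public TimeSpan MeasureDirectAccess(int iterations)
    {
        // An attached debugger skews timings, even for release builds.
        if (Debugger.IsAttached)
            Console.WriteLine("Warning: debugger attached.");

        var sample = new Sample { A = 1, B = "b" };
        return Measure(() =>
        {
            for (int repeats = 0; repeats < iterations; repeats++)
            {
                sample.A = sample.A;
                sample.B = sample.B;
            }
        });
    }
}
```

Calling Timing.MeasureDirectAccess(1000000) from a release build started outside the IDE gives the figure that matters. Note that the lambda captures sample, which may add a little overhead compared with a loop written inline.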

So, realistically, the performance we’re concerned about is that 1:1337 ratio (I didn’t mess with the measurements to get that!) of the last column. I’d like to have a reflection implementation that’s significantly closer to the 1 than the 1337.

Implementation Options

When I originally implemented a very fast reflection in .NET 1.1 I ended up with IL-emitted code generated into a set of abstract methods. Since then Dynamic Methods, Expressions (including Compiled), and Function and Action delegates have been added to the environment. I had been wondering whether there now was a better way to implement fast reflection than my original method. So let’s have a look at the candidates.

Delegates

Func<Sample, int> GetterA;
Action<Sample, int> SetterA;
Func<Sample, string> GetterB;
Action<Sample, string> SetterB;

Interface

public interface IReflection<T>
{
    int GetIntProperty(T item, int index);
    void SetIntProperty(T item, int index, int value);
    string GetStringProperty(T item, int index);
    void SetStringProperty(T item, int index, string value);
}

Class

abstract public class Reflection<T>
{
    abstract public int GetIntProperty(T item, int index);
    abstract public void SetIntProperty(T item, int index, int value);
    abstract public string GetStringProperty(T item, int index);
    abstract public void SetStringProperty(
        T item, int index, string value);
}

The interface and class variants take an index parameter that corresponds to the properties in order (the production version obviously has methods to map property names to indexes, and vice-versa, as well as accessors for all the other property types I care about); in this case A = 0 and B = 1.

The delegate variant does not need an index, because each delegate is for a specific property of the class. Again, in production you’d have a wrapper around the delegates that acts as a lookup container for a given type.
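A minimal sketch of what such a name-to-index lookup could look like. The PropertyMap class is hypothetical, and sorting the property names ordinally is an assumption made here to get a stable order, because Type.GetProperties does not promise one; the production version may order properties differently.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

sealed public class Sample
{
    public int A { get; set; }
    public string B { get; set; }
}

// Hypothetical lookup container mapping property names to indexes and back.
sealed public class PropertyMap
{
    readonly string[] names;
    readonly Dictionary<string, int> indexes =
        new Dictionary<string, int>();

    public PropertyMap(Type type)
    {
        PropertyInfo[] properties = type.GetProperties(
            BindingFlags.Public | BindingFlags.Instance);

        names = new string[properties.Length];
        for (int i = 0; i < properties.Length; i++)
            names[i] = properties[i].Name;

        // Assumption: ordinal sort by name gives a deterministic order.
        Array.Sort(names, StringComparer.Ordinal);

        for (int i = 0; i < names.Length; i++)
            indexes.Add(names[i], i);
    }

    public int IndexOf(string name) { return indexes[name]; }
    public string NameOf(int index) { return names[index]; }
}
```

For Sample this yields A = 0 and B = 1, matching the indexes used throughout this post.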

To get a better handle on the performance possibilities, I’ll be hand-coding implementations for each of these before trying to use compiled expressions and IL-emit to see how fast I can make a run-time generated implementation.

Hand-coded Performance

The implementations for the IReflection<T> interface and the Reflection<T> class are pretty much identical, so following is only the latter.

sealed public class CodedReflection : Reflection<Sample>
{
    override public int GetIntProperty(
        Sample item, int index)
    {
        switch (index)
        {
            case 0: return item.A;
            case 1: throw new NotImplementedException();
            default: throw new NotImplementedException();
        }
    }

    override public string GetStringProperty(
        Sample item, int index)
    {
        switch (index)
        {
            case 0: throw new NotImplementedException();
            case 1: return item.B;
            default: throw new NotImplementedException();
        }
    }

    override public void SetIntProperty(
        Sample item, int index, int value)
    {
        switch (index)
        {
            case 0: item.A = value; break;
            case 1: throw new NotImplementedException();
            default: throw new NotImplementedException();
        }
    }

    override public void SetStringProperty(
        Sample item, int index, string value)
    {
        switch (index)
        {
            case 0: throw new NotImplementedException();
            case 1: item.B = value; break;
            default: throw new NotImplementedException();
        }
    }
}

Of course this code could be simpler, but it is purposely very regular to correspond to what an IL-emit might be able to achieve. Switch statements are very fast, and other than that it’s just a property access. This is pretty much as lean as you can possibly make a general property getter/setter.

I would put some code for my hard-coded delegate implementation, but… there is none. It turns out that we can just bind directly to the getter and setter methods for the properties and convert them to delegates. My bet was this was going to be by far the performance winner. What could be leaner than what is in essence a function pointer to the property access methods?

MethodInfo getterMethodA =
    typeof(Sample).GetProperty("A").GetGetMethod();

Func<Sample, int> GetterA =
    (Func<Sample, int>)Delegate.CreateDelegate(
        typeof(Func<Sample, int>), getterMethodA);
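The setter can be bound the same way. The following is a sketch, with a hypothetical BoundAccessors class, that binds both accessors of A as open instance delegates through Delegate.CreateDelegate; the instance to operate on becomes the first delegate argument.

```csharp
using System;
using System.Reflection;

sealed public class Sample
{
    public int A { get; set; }
    public string B { get; set; }
}

// Hypothetical helper class, for illustration only.
static public class BoundAccessors
{
    // Binds the get accessor of Sample.A as an open instance delegate.
    static public Func<Sample, int> BindGetterA()
    {
        MethodInfo getter =
            typeof(Sample).GetProperty("A").GetGetMethod();

        return (Func<Sample, int>)Delegate.CreateDelegate(
            typeof(Func<Sample, int>), getter);
    }

    // Binds the set accessor of Sample.A the same way.
    static public Action<Sample, int> BindSetterA()
    {
        MethodInfo setter =
            typeof(Sample).GetProperty("A").GetSetMethod();

        return (Action<Sample, int>)Delegate.CreateDelegate(
            typeof(Action<Sample, int>), setter);
    }
}
```

With getA and setA obtained from these two methods, setA(sample, getA(sample)) is the delegate equivalent of sample.A = sample.A.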

The actual test code using these access methods looks as follows for the interface/class:

Reflection<Sample> reflection = ...;
for (int repeats = 0; repeats < iterations; repeats++)
{
    reflection.SetIntProperty(sample, 0,
        reflection.GetIntProperty(sample, 0));

    reflection.SetStringProperty(sample, 1,
        reflection.GetStringProperty(sample, 1));
}

And for delegates:

Func<Sample, int> getA = ...;
Action<Sample, int> setA = ...;
Func<Sample, string> getB = ...;
Action<Sample, string> setB = ...;
for (int repeats = 0; repeats < iterations; repeats++)
{
    setA(sample, getA(sample));
    setB(sample, getB(sample));
}

The following performance data is all normalised to the fastest operation (direct access); every other figure is a multiple of its run-time.

Hand-coded performance (relative)
                   Relative performance
Direct Access      1.00x
Delegates          4.62x
Interface          5.36x
Class              4.31x
.NET Reflection    1336.97x

Surprisingly (to me), the delegates do not actually beat the class implementation. The class implementation may actually be a little faster, albeit not decisively. Interfaces are slower than the almost-equivalent class implementation, which is no surprise, since interface calls require an extra level of indirection to resolve.

The biggest lesson here is that we can, in theory, make a fairly fast implementation. We can get to within a factor of about 4.3 of hard-coded property access performance for release builds. That’s not bad at all. Now we just need to see if we can generate the required code at run-time.

Run-time Attempt 1: Compiled Expressions

I was going to try and use expression trees and compile them into code for all three of my alternative access methods.

static public Func<Sample, int> BuildGetterA()
{
    ParameterExpression itemParamExpr =
        Expression.Parameter(typeof(Sample), "item");

    Expression<Func<Sample, int>> getterExpr =
        Expression.Lambda<Func<Sample, int>>(
            Expression.Property(itemParamExpr, "A"),
            itemParamExpr);

    return getterExpr.Compile();
}
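The matching setter is not shown in the post. A sketch of how it could be built, assuming Expression.Assign (available from .NET 4) and a hypothetical ExpressionAccessors class:

```csharp
using System;
using System.Linq.Expressions;

sealed public class Sample
{
    public int A { get; set; }
    public string B { get; set; }
}

// Hypothetical helper class, for illustration only.
static public class ExpressionAccessors
{
    // Builds (item, value) => item.A = value and compiles it.
    static public Action<Sample, int> BuildSetterA()
    {
        ParameterExpression itemParamExpr =
            Expression.Parameter(typeof(Sample), "item");
        ParameterExpression valueParamExpr =
            Expression.Parameter(typeof(int), "value");

        // The assignment yields an int, which is discarded
        // because the delegate type returns void.
        Expression<Action<Sample, int>> setterExpr =
            Expression.Lambda<Action<Sample, int>>(
                Expression.Assign(
                    Expression.Property(itemParamExpr, "A"),
                    valueParamExpr),
                itemParamExpr, valueParamExpr);

        return setterExpr.Compile();
    }
}
```

Expression.Assign does not exist before .NET 4, so earlier versions would need a call to the set accessor method instead.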

There was a getterExpr.CompileToMethod(...) method that looked promising to generate code into an interface or class implementation. Alas, it turns out that this can only generate into static methods. That would just not solve this particular problem. As a result, I only have a performance measurement here for the delegate access method.

Compiled expression performance (relative)
                   Relative performance
Direct Access      1.00x
Delegates          23.09x
.NET Reflection    1336.97x

Aaaand… that’s like a cold shower. It turns out that something about the generated method is nowhere near as performant as directly accessing the accessor methods or going through a switch statement. I guess it doesn’t take much to throw this off, because a raw property access is about as simple a thing as you can do. It doesn’t take many IL instructions to double (or in this case, roughly quintuple) the runtime of something that simple.

I guess this variant can go on the garbage heap… which is a shame, because expression trees are definitely more readable than IL-emit.

Run-time Attempt 2: IL-emit

This is definitely not going to be pretty. Generating raw IL is a very verbose process. To illustrate, the fragments below implement only GetIntProperty on the class and the GetterA alternative for the delegates.

Type sample = typeof(Sample);
MethodInfo getA = sample.GetProperty("A").GetGetMethod();
// ... other MethodInfos

MethodAttributes ma =
    MethodAttributes.Public |
    MethodAttributes.ReuseSlot |
    MethodAttributes.HideBySig |
    MethodAttributes.Final |
    MethodAttributes.Virtual;
Label[] labels;

var an = new AssemblyName("IlGenApiAsm");
var ab = AppDomain.CurrentDomain.DefineDynamicAssembly(
    an, AssemblyBuilderAccess.Run);
var mb = ab.DefineDynamicModule("IlGenApiMod");
// ...

var tb = mb.DefineType("Accessor",
    TypeAttributes.Class |
    TypeAttributes.Public |
    TypeAttributes.Sealed,
    typeof(Reflection<Sample>));
var method = tb.DefineMethod("GetIntProperty", ma,
    typeof(int),
    new Type[] { typeof(Sample), typeof(int) });

var il = method.GetILGenerator();
il.Emit(OpCodes.Ldarg_2);
labels = new Label[] { il.DefineLabel(), il.DefineLabel() };
il.Emit(OpCodes.Switch, labels);
il.MarkLabel(labels[1]);
il.Emit(OpCodes.Newobj,
    typeof(NotImplementedException).GetConstructor(...));
il.Emit(OpCodes.Throw);
il.MarkLabel(labels[0]);
il.Emit(OpCodes.Ldarg_1);
il.EmitCall(OpCodes.Callvirt, getA, null);
il.Emit(OpCodes.Ret);

// ... emitting other methods 

Reflection<Sample> accessor = (Reflection<Sample>)
    tb.CreateType().GetConstructor(Type.EmptyTypes).Invoke(...);

var dm = new DynamicMethod("Build8GetA",
    typeof(int), new Type[] { typeof(Sample) });
il = dm.GetILGenerator();
il.Emit(OpCodes.Ldarg_0);
il.EmitCall(OpCodes.Callvirt, getA, null);
il.Emit(OpCodes.Ret);

GetterA = (Func<Sample, int>)
    dm.CreateDelegate(typeof(Func<Sample, int>));

This obviously looks dreadful, but once you drive this logic from .NET Reflection to discover the properties of a given class once, most of this turns into fairly neat patterns and can be hidden from your view forever more.
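As a rough illustration of that pattern, here is a sketch that generalises the DynamicMethod fragment so the property is looked up by name once and the IL is emitted from what reflection finds. The EmittedAccessors class and BuildGetter method are hypothetical names, and the argument checks are additions made here for safety.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

sealed public class Sample
{
    public int A { get; set; }
    public string B { get; set; }
}

// Hypothetical helper class, for illustration only.
static public class EmittedAccessors
{
    // Emits a getter for a public instance property of class T.
    static public Func<T, TValue> BuildGetter<T, TValue>(
        string propertyName) where T : class
    {
        PropertyInfo property = typeof(T).GetProperty(propertyName);
        if (property == null)
            throw new ArgumentException("Unknown property.");

        // Emitting against the wrong type would produce invalid IL.
        if (property.PropertyType != typeof(TValue))
            throw new ArgumentException("Property type mismatch.");

        MethodInfo getter = property.GetGetMethod();

        var dm = new DynamicMethod("Get" + propertyName,
            typeof(TValue), new Type[] { typeof(T) });
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);                    // load item
        il.EmitCall(OpCodes.Callvirt, getter, null); // call accessor
        il.Emit(OpCodes.Ret);

        return (Func<T, TValue>)
            dm.CreateDelegate(typeof(Func<T, TValue>));
    }
}
```

EmittedAccessors.BuildGetter<Sample, int>("A") then plays the role of GetterA; the class-based variant follows the same idea, with a loop over the discovered properties emitting one switch label per index.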

The more important question is: does this get us anywhere near that factor 4.3 that the hand-crafted code achieved?

IL-emit performance (relative)
                   Relative performance
Direct Access      1.00x
Delegates          23.92x
Interface          6.79x
Class              4.54x
.NET Reflection    1336.97x

Again, surprise at how badly the delegates do… it appears to be merely the overhead of adding another level of indirection (remember that we can wire the delegates directly to the property accessors at about a factor 4.6). The interfaces do not do as well as the hand-coded version, but the classes are close enough to make no difference.

Excellent. And disappointing.

It turns out that the implementation I already had was pretty much optimal even with all the new features of .NET 4 at my fingertips. But at least I’ve now proven there is no better alternative among these.

We Need More Flexibility

Now, we’re not entirely home yet. There is a deficiency in the abstract class implementation I’ve shown.

abstract public class Reflection<T>
{
    abstract public int GetIntProperty(T item, int index);
    abstract public void SetIntProperty(T item, int index, int value);
    abstract public string GetStringProperty(T item, int index);
    abstract public void SetStringProperty(T item, int index, string value);
}

It’s all there in that single letter T. If I were to try and implement a serialisation engine on top of this abstract class, I’d run into some trouble. I cannot write any general serialisation method if all I have is a class that needs to know the type we’ll be working on up-front.

I need something more like this:

abstract public class Reflection
{
    abstract public int GetIntProperty(object item, int index);
    abstract public void SetIntProperty(object item, int index, int value);
    abstract public string GetStringProperty(object item, int index);
    abstract public void SetStringProperty(object item, int index, string value);
}

And now we raise the spectre of casts. If I were to hand-code an implementation for this, I’d need to constantly cast item to Sample to be able to access the properties on it. Casts are expensive when the work being wrapped is a single property access.
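For reference, the type-safe route in emitted code is a castclass instruction before the accessor call. The sketch below shows it on a DynamicMethod rather than on the abstract class, purely to keep it short; CheckedAccessors and BuildCheckedGetterA are hypothetical names, and this illustrates where the cast goes, not how fast it is.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

sealed public class Sample
{
    public int A { get; set; }
    public string B { get; set; }
}

// Hypothetical helper class, for illustration only.
static public class CheckedAccessors
{
    // Emits: (object item) => ((Sample)item).A
    static public Func<object, int> BuildCheckedGetterA()
    {
        MethodInfo getA =
            typeof(Sample).GetProperty("A").GetGetMethod();

        var dm = new DynamicMethod("CheckedGetA",
            typeof(int), new Type[] { typeof(object) });
        ILGenerator il = dm.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);                   // load item
        il.Emit(OpCodes.Castclass, typeof(Sample)); // the cast
        il.EmitCall(OpCodes.Callvirt, getA, null);  // call accessor
        il.Emit(OpCodes.Ret);

        return (Func<object, int>)
            dm.CreateDelegate(typeof(Func<object, int>));
    }
}
```

With the castclass in place, passing anything other than a Sample throws an InvalidCastException, at the price of a type check on every call.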

There’s an evil trick available to us. Very evil. Avert your eyes now, and never come back.

var method = tb.DefineMethod("GetIntProperty", ma,
    typeof(int),
    new Type[] { typeof(object), typeof(int) });

var il = method.GetILGenerator();
il.Emit(OpCodes.Ldarg_2);
labels = new Label[] { il.DefineLabel(), il.DefineLabel() };
il.Emit(OpCodes.Switch, labels);
il.MarkLabel(labels[1]);
il.Emit(OpCodes.Newobj,
    typeof(NotImplementedException).GetConstructor(...));
il.Emit(OpCodes.Throw);
il.MarkLabel(labels[0]);
il.Emit(OpCodes.Ldarg_1);
il.EmitCall(OpCodes.Callvirt, getA, null);
il.Emit(OpCodes.Ret);

At a casual glance, it may look like I copied the IL-generation from my Reflection<Sample> example. But look closer at the DefineMethod call.

Where are the casts? Well… funny that. I basically just omit them. I know that the argument is going to be a Sample because that’s why I am generating this code in the first place. And even funnier… the CLR lets me get away with it.

As long as the first argument is actually the right type.

Passing the wrong type can and will crash .NET, and I’m not even kidding. This is an extremely sharp tool, and you can really seriously cut yourself on it if you use it wrong. This is coding without a safety net. This is EVIL.

But it is also FAST.

When benchmarked, this code is exactly as fast as the strongly typed version through Reflection<T>. This is the best of both worlds. I can write general methods using reflection whilst getting strongly-typed performance, and it’s only about four times slower than direct property accesses.

And this is the implementation I’ll be sticking with for obvious reasons.