Sunday, July 18, 2010

Fastest Way To Retrieve Custom Attributes for a Type Member

In my previous posts (Performance Issues When Comparing Strings in .NET and When string.ToLower() is Evil) I discussed string-related operations.

In this post we'll examine performance issues when querying a type member's custom attributes.
Let us define two attributes and a class. The class has a single method decorated with an attribute. Here's the code:

class FooAttribute : Attribute
{ }

class BarAttribute : FooAttribute
{ }

class Item
{
    [Bar]
    public int Action()
    {
        return 0;
    }
}
Now the question is: what is the fastest way to check the Action method for the Bar custom attribute? Custom attributes can be queried through the static methods of the Attribute class or through an instance of a type that implements the ICustomAttributeProvider interface. In our case we shall use the Attribute class and MethodInfo.

The code below queries custom attributes using the Attribute class and then using the MethodInfo instance. Each query executes 10000 times and the duration is measured with the Stopwatch class. The code also measures the time required to check whether the attribute is applied.

int count = 10000;
Type tBar = typeof(Item);
MethodInfo mInfo = tBar.GetMethod("Action");
//warm up
mInfo.IsDefined(typeof(FooAttribute), true);
object[] attribs = null;
Stopwatch sw = new Stopwatch();

sw.Start();
for (int i = 0; i < count; i++)
{
 attribs = Attribute.GetCustomAttributes(mInfo, typeof(FooAttribute), true);
}
sw.Stop();

Console.WriteLine("Attribute(specific): {0}, Found: {1}", sw.ElapsedMilliseconds, 
 attribs.Length);
sw.Reset();

sw.Start();
for (int i = 0; i < count; i++)
{
 attribs = mInfo.GetCustomAttributes(typeof(FooAttribute), true);
}
sw.Stop();
Console.WriteLine("MethodInfo: {0}, Found: {1}", sw.ElapsedMilliseconds, 
 attribs.Length);
sw.Reset();

sw.Start();
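// Note: this overload is Attribute.GetCustomAttributes(MemberInfo element, bool inherit).
// typeof(FooAttribute) is the element being inspected here, not a filter, so the call
// returns the attributes applied to the FooAttribute type itself rather than to Action.
for (int i = 0; i < count; i++)
{
 attribs = Attribute.GetCustomAttributes(typeof(FooAttribute), true);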
}
sw.Stop();

Console.WriteLine("Attribute(general): {0}, Found: {1}", sw.ElapsedMilliseconds, 
 attribs.Length);
sw.Reset();
   
sw.Start();
for (int i = 0; i < count; i++)
{
 Attribute.IsDefined(mInfo, typeof(FooAttribute), true);
}
sw.Stop();

Console.WriteLine("Attribute::IsDefined: {0}", sw.ElapsedMilliseconds);
sw.Reset();

sw.Start();
for (int i = 0; i < count; i++)
{
 mInfo.IsDefined(typeof(FooAttribute), true);
}
sw.Stop();

Console.WriteLine("MethodInfo::IsDefined: {0}", sw.ElapsedMilliseconds);
sw.Reset();
The code above produces the following output:
Attribute(specific): 137, Found: 1
MethodInfo: 130, Found: 1
Attribute(general): 569, Found: 1
Attribute::IsDefined: 40
MethodInfo::IsDefined: 33
The results indicate that querying custom attributes through the MethodInfo instance is the fastest option, although its margin over Attribute.GetCustomAttributes(mInfo, ...) is small (130 ms versus 137 ms for 10000 calls). To generalize: the fastest way to get custom attributes is to use the closest reflection equivalent of the type member (e.g. MethodInfo for a method, PropertyInfo for a property, etc.).

The last two results show the time of the IsDefined operation, which is roughly three to four times faster than retrieving the attributes. Use it when you only need to check whether an attribute is applied to a type member.
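
The same idea applied to a property could look like the sketch below. MarkerAttribute, Customer and HasMarker are hypothetical names made up for this example; they are not part of the measurements above.

```csharp
using System;
using System.Reflection;

// Hypothetical attribute and class, introduced only for this example.
[AttributeUsage(AttributeTargets.Property)]
class MarkerAttribute : Attribute { }

class Customer
{
    [Marker]
    public string Name { get; set; }

    public int Age { get; set; }
}

static class AttributeCheckDemo
{
    // Returns true when the named public property carries MarkerAttribute.
    public static bool HasMarker(Type type, string propertyName)
    {
        // PropertyInfo is the closest reflection equivalent of a property.
        PropertyInfo property = type.GetProperty(propertyName);

        // IsDefined only answers yes/no, which is all we need here.
        return property != null && property.IsDefined(typeof(MarkerAttribute), true);
    }

    static void Main()
    {
        Console.WriteLine(HasMarker(typeof(Customer), "Name")); // True
        Console.WriteLine(HasMarker(typeof(Customer), "Age"));  // False
    }
}
```

If the attribute instance itself is needed (for example to read its properties), property.GetCustomAttributes(typeof(MarkerAttribute), true) would be the call to use instead.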

Thursday, July 08, 2010

Type inference in generic methods

Did you know that generic methods in .NET support type inference? It is sometimes also called implicit typing.

Let's see how type inference looks in code. The sample below contains a class with generic methods:

class NonGenericType
    {
        public static int GenericMethod1<TValue>(TValue p1, int p2)
        {
            Console.WriteLine(p1.GetType().Name);
            return default(int);
        }

        public static TValue GenericMethod2<TValue>(int p1, TValue p2)
        {
            Console.WriteLine(p2.GetType().Name);
            return default(TValue);
        }

        public static TReturn GenericMethod3<TValue, TReturn>(int p1, TValue p2)
        {
            Console.WriteLine(p2.GetType().Name);
            Console.WriteLine(typeof(TReturn).Name);
            return default(TReturn);
        }
    }
Here's the traditional way of using the above defined methods:
NonGenericType.GenericMethod1<string>("test", 5);
NonGenericType.GenericMethod2<double>(1, 0.5);
Nothing fancy here: we specify which type to substitute for the TValue type parameter.
Type inference lets us omit the type arguments. Instead we just call the methods as if they were non-generic:
NonGenericType.GenericMethod1("test", 5);
NonGenericType.GenericMethod2(1, 0.5);
Type inference can come in handy as it reduces typing, but in my opinion it makes code less readable.
Also, type inference works only from the method arguments, so it cannot "guess" a type parameter that appears only in the return type:
NonGenericType.GenericMethod3(1, 0.5);
 //error CS0411: The type arguments for method 'TypeInference.NonGenericType.GenericMethod3<TValue,TReturn>(int, TValue)' cannot be inferred from the usage. Try specifying the type arguments explicitly.
A nice explanation of why inference does not work in the scenario above was given by Eric Lippert.
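
The usual workaround is to write the type arguments out. Note that C# accepts either all type arguments or none, so TValue has to be spelled out as well. Here is a small sketch; InferenceDemo, Produce and Describe are hypothetical names used only for illustration.

```csharp
using System;

static class InferenceDemo
{
    // Shaped like GenericMethod3: TReturn appears only in the return type.
    public static TReturn Produce<TValue, TReturn>(int p1, TValue p2)
    {
        return default(TReturn);
    }

    // Both type parameters appear in the parameter list, so both can be inferred.
    public static string Describe<TFirst, TSecond>(TFirst first, TSecond second)
    {
        return typeof(TFirst).Name + ", " + typeof(TSecond).Name;
    }

    static void Main()
    {
        // TReturn cannot be inferred, so all type arguments are written out.
        string result = Produce<double, string>(1, 0.5);
        Console.WriteLine(result == null);          // True: default(string) is null

        // Inferred as Describe<string, double>.
        Console.WriteLine(Describe("test", 0.5));   // String, Double
    }
}
```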
Happy coding :)

Friday, June 18, 2010

Thread Safe Collection Iteration Techniques

In a multithreaded environment every operation should be tested and analyzed from the viewpoint of thread safety. That is, for every data structure check what will happen if it is accessed or changed from multiple threads.

Imagine we need to iterate over a collection of items and perform some action on each item. Since we're talking about threading, the iteration should be done in a thread-safe way. That is, while we are iterating over the collection no other thread is allowed to add or remove items from it.
No problemo! you may think - do the iteration under a lock.

But it is not that simple.

The code sample below illustrates two approaches to the iteration. Both have pros and cons. More on that after the code sample.

int initialItems = 5;
ICollection<string> coll = new List<string>(initialItems);

for(int i = 1; i <= initialItems; i++)
 coll.Add("item" + i.ToString());
   
//#1 iterating with lock approach
lock(coll)
{
 foreach(string item in coll)
 {
  PerformWorkWithItem(item);    
 }
}
//

//#2 iteration over a copy 
ICollection<string> copyColl = null;
lock(coll)
{
 copyColl = new List<string>(coll);
}

foreach(string item in copyColl)
{
 PerformWorkWithItem(item);    
}
//    

void PerformWorkWithItem(string item)
{
 //
 // perform operations that can take a
 // considerable amount of time
 //
}

Welcome back.

Approach #1 holds a single lock for the whole iteration. That means the collection is protected by the lock for as long as the iteration runs.
The pros are:
  • Simplicity (just take the lock and do the job)
  • Memory efficiency - no new objects are constructed
The cons are:
  • If PerformWorkWithItem takes a long time to complete or is blocking (e.g. reading data from the network), access to the collection is blocked for a considerable amount of time
  • The action on each collection item also runs under the lock

Approach #2 uses a different technique. It locks access to the collection only to make a copy (snapshot) of the original collection. The iteration and the PerformWorkWithItem calls run over the snapshot and are not protected by the lock.
The pros are:
  • Operations on collection items are done without locking the collection. If PerformWorkWithItem takes a long time to complete, the original collection is not locked as in #1
  • It allows scheduling the actions on collection items on a separate thread
The cons are:
  • If the original collection is large, copying the data can become inefficient
  • Added complexity: while actions are performed on the snapshot, items of the original collection may already have been changed
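
The last con of approach #2 is easy to see in a small single-threaded sketch. SnapshotDemo and its members are hypothetical names, and the dedicated Sync lock object is an assumption of this sketch (the sample above locks on the collection itself). The extra Add call stands in for another thread modifying the collection while the snapshot is being processed.

```csharp
using System;
using System.Collections.Generic;

static class SnapshotDemo
{
    // Dedicated lock object: an assumption of this sketch.
    private static readonly object Sync = new object();
    private static readonly List<string> Items = new List<string>();

    public static void Add(string item)
    {
        lock (Sync) { Items.Add(item); }
    }

    // Copies the collection under the lock; the caller iterates the copy without holding it.
    public static List<string> TakeSnapshot()
    {
        lock (Sync) { return new List<string>(Items); }
    }

    public static int CurrentCount()
    {
        lock (Sync) { return Items.Count; }
    }

    static void Main()
    {
        for (int i = 1; i <= 5; i++) Add("item" + i);

        List<string> snapshot = TakeSnapshot();

        // Stands in for another thread changing the collection in the meantime.
        Add("item6");

        Console.WriteLine(snapshot.Count);   // 5: the snapshot does not see item6
        Console.WriteLine(CurrentCount());   // 6
    }
}
```

Whether working on slightly stale data is acceptable depends on what PerformWorkWithItem does with each item.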

Now that we know the pros and cons of these two approaches, we can deduce some hints that help to choose the appropriate technique.

For instance, if the PerformWorkWithItem action is relatively fast and the rest of the application can afford to wait for the iteration to finish, then approach #1 is the best.

On the other hand, if PerformWorkWithItem can take a considerable amount of time and other parts of the application frequently access the collection (i.e. it is not desirable to block access to the collection for a long time), then #2 is the better fit.

P.S. There is also an approach #3, which uses lock-free data structures. But that is a whole new story and a topic for a separate post.

Tuesday, June 15, 2010

AesManaged class Key and KeySize properties issue

Today, while working with the AesManaged class, I encountered very strange behavior.
If you have code like this, you're in trouble:

AesManaged aes = new AesManaged();
aes.Key = key;
aes.KeySize = key.Length * 8; //the problem (KeySize is in bits, key.Length is in bytes)
The problem with this code is that KeySize is set after the Key value.
When you set KeySize after Key, the previously specified key is discarded and a brand new key value is generated and put into the Key property.

I find this behavior rather strange, especially since there is no information describing what will happen after setting KeySize.

I would expect that, once the Key value is set, setting KeySize would throw an exception if the specified key's size is bigger or smaller than the new one.
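
The behavior can be reproduced with a short sketch. It uses Aes.Create() instead of AesManaged, on the assumption that both share the relevant SymmetricAlgorithm behavior. AesKeyDemo and HoldsKey are hypothetical names. Note that assigning Key already updates KeySize, so setting KeySize afterwards is not needed at all.

```csharp
using System;
using System.Linq;
using System.Security.Cryptography;

static class AesKeyDemo
{
    // True if the algorithm still holds exactly the bytes we supplied.
    public static bool HoldsKey(SymmetricAlgorithm algorithm, byte[] key)
    {
        return algorithm.Key.SequenceEqual(key);
    }

    static void Main()
    {
        byte[] key = new byte[32];                      // 32 bytes = 256 bits
        for (int i = 0; i < key.Length; i++) key[i] = (byte)i;

        using (Aes aes = Aes.Create())
        {
            aes.Key = key;                              // also updates KeySize
            Console.WriteLine(aes.KeySize);             // 256
            Console.WriteLine(HoldsKey(aes, key));      // True

            aes.KeySize = key.Length * 8;               // discards the key set above
            Console.WriteLine(HoldsKey(aes, key));      // False: a random key was generated

            aes.Key = key;                              // setting Key last restores it
            Console.WriteLine(HoldsKey(aes, key));      // True
        }
    }
}
```

So if both properties have to be set, setting KeySize first and Key last keeps the intended key.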

Wednesday, April 28, 2010

The Big Bang Theory sitcom scientific background

Usually I do not write about TV, but the series in the subject is one of my favorites.

Recently I found the blog of the guy who provides the scientific background for that sitcom.
There are a lot of interesting scientific facts on that blog, presented in the context of the TV show.

I totally recommend reading it even if you do not watch The Big Bang Theory.
The URL of the blog is http://thebigblogtheory.wordpress.com/

Friday, April 16, 2010

Refactoring code with lambda expressions

Without much ado, let's go straight to the code that needs to be refactored:

bool SomeMethod(long param)
{
   //
   // some prefix code
   //
   try
   {
      //do specific job here
      return DoSpecificJob(param);
   }
   finally
   {
      //
      // some suffix code
      //
   }
}

Result SomeOtherMethod(string name, int count)
{
   //
   // some prefix code
   //
   try
   {
      return DoOtherSpecificJob(name, count);
   }
   finally
   {
      //
      // some suffix code
      //
   }
}
The question is how to bring the prefix and suffix code from the example above into one place (a method) without changing the code logic.
The goal is to have these two methods rewritten like this:
bool SomeMethod(long param)
{
   return DoSpecificJob(param);
}

Result SomeOtherMethod(string name, int count)
{
   return DoOtherSpecificJob(name, count);
}
A separate method will then execute the prefix and suffix code.

There are several ways to do that:
1. Create a method that contains the prefix and suffix code and accepts a Delegate object and a params object[] array
object ExecuteCommon(Delegate d, params object[] args)
{
   //
   // prefix code
   //
   try { return d.DynamicInvoke(args); }
   finally
   {
      //
      // suffix code
      //
   }
}

bool SomeMethod_First(long param)
{
   Delegate d = new Func<long, bool>((b) => SomeMethod(b));
   return (bool)ExecuteCommon(d, new object[] {param});
}

Result SomeOtherMethod_First(string name, int count)
{
   Delegate d = new Func<string, int, Result>((n, c) => SomeOtherMethod(n, c));
   return (Result)ExecuteCommon(d, new object[] { name, count });
}
The approach looks nice but has several caveats: boxing (wrapping value types into reference types), casting of the result, and the late-bound DynamicInvoke call, which is slower than a direct delegate invocation.
2. Move the repeated code up the call stack
This approach is possible if SomeMethod and SomeOtherMethod are on the same call stack level or are called from the same method.

3. Create a generic method that accepts generic delegate and defines several parameters
TResult ExecuteCommon<T1,TResult>(Func<T1, TResult> func, T1 param1)
{
   //
   // some prefix code
   //
   try { return func(param1); }
   finally
   {
      //
      // some suffix code
      //
   }
}

TResult ExecuteCommon<T1, T2, TResult>(Func<T1, T2, TResult> func, T1 param1, T2 param2)
{
   //
   // some prefix code
   //
   try { return func(param1, param2); }
   finally
   {
      //
      // some suffix code
      //
   }
}

bool SomeMethod_Third(long param)
{
   return ExecuteCommon<long, bool>((p) => SomeMethod(p), param);
}

Result SomeOtherMethod_Third(string name, int count)
{
   return ExecuteCommon<string, int, Result>(
      (n, c) => SomeOtherMethod(n, c), name, count);
}
This approach uses no casting and no boxing occurs when value type parameters are passed. The caveat, however, is that you need to define a separate overload for each number of type parameters. In my opinion the third approach is the best, even though we have to define two methods that execute the prefix/suffix code.
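
A possible variation, not one of the approaches listed above, is to let the lambda capture the arguments, so that a single wrapper taking Func<TResult> is enough. The sketch below assumes this is acceptable for the code in question. WrapperDemo, Log and the two Do... stand-ins are hypothetical, and the Log entries merely mark where the prefix and suffix code would run.

```csharp
using System;
using System.Collections.Generic;

static class WrapperDemo
{
    // Records the order of calls so the effect of the wrapper is visible.
    public static readonly List<string> Log = new List<string>();

    // Single wrapper: the lambda captures whatever arguments the job needs.
    public static TResult ExecuteCommon<TResult>(Func<TResult> func)
    {
        Log.Add("prefix");
        try { return func(); }
        finally { Log.Add("suffix"); }
    }

    // Hypothetical stand-ins for DoSpecificJob and DoOtherSpecificJob.
    static bool DoSpecificJob(long param) { Log.Add("job1"); return param > 0; }
    static string DoOtherSpecificJob(string name, int count) { Log.Add("job2"); return name + count; }

    public static bool SomeMethod(long param)
    {
        return ExecuteCommon(() => DoSpecificJob(param));
    }

    public static string SomeOtherMethod(string name, int count)
    {
        return ExecuteCommon(() => DoOtherSpecificJob(name, count));
    }

    static void Main()
    {
        Console.WriteLine(SomeMethod(5));             // True
        Console.WriteLine(SomeOtherMethod("a", 2));   // a2
        Console.WriteLine(string.Join(",", Log));     // prefix,job1,suffix,prefix,job2,suffix
    }
}
```

The trade-off is that each call allocates a closure object for the captured arguments, which the overloads with explicit parameters avoid.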

P.S. Another way of refactoring here is code generation: emitting code on the fly or using template tools like T4 templates in Visual Studio.
