A place for spare thoughts

17/08/2012

Implicit closures in C# lambdas, part 1

Filed under: c# — Ivan Danilov @ 16:38

In my day-to-day work I was once alerted by a ReSharper (R#) warning saying ‘Implicitly captured closure: variableName’. After checking the actual IL generated by the compiler, I found that the warning is pretty much correct. So I did some groundwork and wrote a small sample demonstrating this:

    public class Observer
    {
        public event EventHandler X = delegate { };
    }

    public class Receiver
    {
        public void Method(object o) {}
    }

    public class Program
    {
        public static void DoSomething(object a, object b, Observer observer, Receiver r)
        {
            EventHandler action1 = (s, e) => r.Method(a); //Implicitly captured closure: b
            EventHandler action2 = (s, e) => r.Method(b); //Implicitly captured closure: a
            observer.X += action1;
            observer.X += action2;
        }

        public static void Main(string[] args)
        {
            var observer = new Observer();
            var receiver = new Receiver();
            DoSomething(new object(), new object(), observer, receiver);
        }
    }

It seems the cause is simple: when several lambdas share variables that need to be captured (in this case r is present in both), the compiler generates only one class, with an unspeakable name, that contains both methods and all of their variables.

Below is the generated IL code. It is there just for completeness, because afterwards I will explain it in terms of equivalent C#, so you don’t need to understand IL to get the idea. It is fairly long, so you most probably don’t want to read it all.

.class public auto ansi beforefieldinit ConsoleApplication1.Program
	extends [mscorlib]System.Object
{
	// Nested Types
	.class nested private auto ansi sealed beforefieldinit '<>c__DisplayClass2'
		extends [mscorlib]System.Object
	{
		.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
			01 00 00 00
		)
		// Fields
		.field public object a
		.field public object b
		.field public class ConsoleApplication1.Receiver r

		// Methods
		.method public hidebysig specialname rtspecialname 
			instance void .ctor () cil managed 
		{
			// Method begins at RVA 0x2104
			// Code size 7 (0x7)
			.maxstack 8

			IL_0000: ldarg.0
			IL_0001: call instance void [mscorlib]System.Object::.ctor()
			IL_0006: ret
		} // end of method '<>c__DisplayClass2'::.ctor

		.method public hidebysig 
			instance void '<DoSomething>b__0' (
				object s,
				class [mscorlib]System.EventArgs e
			) cil managed 
		{
			// Method begins at RVA 0x210c
			// Code size 19 (0x13)
			.maxstack 8

			IL_0000: ldarg.0
			IL_0001: ldfld class ConsoleApplication1.Receiver ConsoleApplication1.Program/'<>c__DisplayClass2'::r
			IL_0006: ldarg.0
			IL_0007: ldfld object ConsoleApplication1.Program/'<>c__DisplayClass2'::a
			IL_000c: callvirt instance void ConsoleApplication1.Receiver::Method(object)
			IL_0011: nop
			IL_0012: ret
		} // end of method '<>c__DisplayClass2'::'<DoSomething>b__0'

		.method public hidebysig 
			instance void '<DoSomething>b__1' (
				object s,
				class [mscorlib]System.EventArgs e
			) cil managed 
		{
			// Method begins at RVA 0x2120
			// Code size 19 (0x13)
			.maxstack 8

			IL_0000: ldarg.0
			IL_0001: ldfld class ConsoleApplication1.Receiver ConsoleApplication1.Program/'<>c__DisplayClass2'::r
			IL_0006: ldarg.0
			IL_0007: ldfld object ConsoleApplication1.Program/'<>c__DisplayClass2'::b
			IL_000c: callvirt instance void ConsoleApplication1.Receiver::Method(object)
			IL_0011: nop
			IL_0012: ret
		} // end of method '<>c__DisplayClass2'::'<DoSomething>b__1'

	} // end of class <>c__DisplayClass2


	// Methods
	.method public hidebysig static 
		void DoSomething (
			object a,
			object b,
			class ConsoleApplication1.Observer observer,
			class ConsoleApplication1.Receiver r
		) cil managed 
	{
		// Method begins at RVA 0x2134
		// Code size 72 (0x48)
		.maxstack 2
		.locals init (
			[0] class [mscorlib]System.EventHandler action1,
			[1] class [mscorlib]System.EventHandler action2,
			[2] class ConsoleApplication1.Program/'<>c__DisplayClass2' 'CS$<>8__locals3'
		)

		IL_0000: newobj instance void ConsoleApplication1.Program/'<>c__DisplayClass2'::.ctor()
		IL_0005: stloc.2
		IL_0006: ldloc.2
		IL_0007: ldarg.0
		IL_0008: stfld object ConsoleApplication1.Program/'<>c__DisplayClass2'::a
		IL_000d: ldloc.2
		IL_000e: ldarg.1
		IL_000f: stfld object ConsoleApplication1.Program/'<>c__DisplayClass2'::b
		IL_0014: ldloc.2
		IL_0015: ldarg.3
		IL_0016: stfld class ConsoleApplication1.Receiver ConsoleApplication1.Program/'<>c__DisplayClass2'::r
		IL_001b: nop
		IL_001c: ldloc.2
		IL_001d: ldftn instance void ConsoleApplication1.Program/'<>c__DisplayClass2'::'<DoSomething>b__0'(object, class [mscorlib]System.EventArgs)
		IL_0023: newobj instance void [mscorlib]System.EventHandler::.ctor(object, native int)
		IL_0028: stloc.0
		IL_0029: ldloc.2
		IL_002a: ldftn instance void ConsoleApplication1.Program/'<>c__DisplayClass2'::'<DoSomething>b__1'(object, class [mscorlib]System.EventArgs)
		IL_0030: newobj instance void [mscorlib]System.EventHandler::.ctor(object, native int)
		IL_0035: stloc.1
		IL_0036: ldarg.2
		IL_0037: ldloc.0
		IL_0038: callvirt instance void ConsoleApplication1.Observer::add_X(class [mscorlib]System.EventHandler)
		IL_003d: nop
		IL_003e: ldarg.2
		IL_003f: ldloc.1
		IL_0040: callvirt instance void ConsoleApplication1.Observer::add_X(class [mscorlib]System.EventHandler)
		IL_0045: nop
		IL_0046: nop
		IL_0047: ret
	} // end of method Program::DoSomething

	.method public hidebysig static 
		void Main (
			string[] args
		) cil managed 
	{
		// Method begins at RVA 0x2188
		// Code size 32 (0x20)
		.maxstack 4
		.entrypoint
		.locals init (
			[0] class ConsoleApplication1.Observer observer,
			[1] class ConsoleApplication1.Receiver receiver
		)

		IL_0000: nop
		IL_0001: newobj instance void ConsoleApplication1.Observer::.ctor()
		IL_0006: stloc.0
		IL_0007: newobj instance void ConsoleApplication1.Receiver::.ctor()
		IL_000c: stloc.1
		IL_000d: newobj instance void [mscorlib]System.Object::.ctor()
		IL_0012: newobj instance void [mscorlib]System.Object::.ctor()
		IL_0017: ldloc.0
		IL_0018: ldloc.1
		IL_0019: call void ConsoleApplication1.Program::DoSomething(object, object, class ConsoleApplication1.Observer, class ConsoleApplication1.Receiver)
		IL_001e: nop
		IL_001f: ret
	} // end of method Program::Main

	.method public hidebysig specialname rtspecialname 
		instance void .ctor () cil managed 
	{
		// Method begins at RVA 0x21b4
		// Code size 7 (0x7)
		.maxstack 8

		IL_0000: ldarg.0
		IL_0001: call instance void [mscorlib]System.Object::.ctor()
		IL_0006: ret
	} // end of method Program::.ctor

} // end of class ConsoleApplication1.Program

So, basically, what the C# compiler did overall is something like this:

        // actually named <>c__DisplayClass2 in IL
        private class CompilerGeneratedClass 
        {
            public object a;
            public object b;
            public Receiver r;

            // actually named <DoSomething>b__0 in IL
            public void Method1(object s, EventArgs e)
            {
                r.Method(a);
            }
            // actually named <DoSomething>b__1 in IL
            public void Method2(object s, EventArgs e)
            {
                r.Method(b);
            }
        }

        public static void DoSomething(object a, object b, Observer observer, Receiver r)
        {
            // actually named CS$<>8__locals3 in IL
            CompilerGeneratedClass helper = new CompilerGeneratedClass();
            helper.a = a;
            helper.b = b;
            helper.r = r;
            EventHandler action1 = helper.Method1;
            EventHandler action2 = helper.Method2;
            observer.X += action1;
            observer.X += action2;
        }

So, you see, there are no lambdas here. The compiler replaced every variable used in a closure with a field of the nested class. That’s fine, but “what about the subject and these implicit closures?” you may ask. Well, it’s simple from this point. In the original code you have two unconnected lambdas, and you might assume that once the first one and the object a are no longer used, they will be GCed regardless of the state of the second lambda, right? But now it is clear that as long as the CompilerGeneratedClass instance is alive, both a and b will live. And that instance will live at least as long as someone needs either of these two lambdas.

Essentially, we can treat it as if both lambdas had closed over both a and b. Hence ReSharper’s warning.
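One way to see the shared helper object without reading IL is to look at the delegates’ Target property. The following is a minimal sketch, not part of the original sample: the class ClosureInspection and the helper CapturedFieldNames are hypothetical names, and the field names of the generated class are a compiler implementation detail rather than anything the language promises.

```csharp
using System;
using System.Linq;
using System.Reflection;

public class Receiver
{
    public void Method(object o) { }
}

public static class ClosureInspection
{
    // Hypothetical helper: sorted field names of the object a delegate is bound to.
    public static string[] CapturedFieldNames(Delegate d)
    {
        return d.Target.GetType()
            .GetFields(BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic)
            .Select(f => f.Name)
            .OrderBy(n => n, StringComparer.Ordinal)
            .ToArray();
    }

    public static void Main()
    {
        object a = new object(), b = new object();
        var r = new Receiver();

        // Same shape as in DoSomething: two lambdas, each using only one of a and b.
        EventHandler action1 = (s, e) => r.Method(a);
        EventHandler action2 = (s, e) => r.Method(b);

        // Are both delegates bound to the very same generated object?
        Console.WriteLine(ReferenceEquals(action1.Target, action2.Target));

        // Which variables does the object behind action1 hold?
        Console.WriteLine(string.Join(", ", CapturedFieldNames(action1)));
    }
}
```

If the compiler behaves as described above, I would expect the first line to report that both delegates share one target and the second line to list b next to a and r, even though action1 never touches b.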

What problems can this bring you as a developer? Suppose a is a cheap object and action1 is passed to a long-living service class; meanwhile b is a costly object and action2 is used here once and forgotten. One would assume that the latter will be called and then GCed, along with the costly b object, almost immediately. But b will live as long as the service holds the former lambda, causing a hard-to-find memory leak.

Why can’t the compiler do better? I can’t say for certain, so these are only my speculations on the subject. Nevertheless: why can’t the compiler create two generated classes, so that one captures r and a while the other captures r and b? It could, and in this simple case that would be OK. But in more complicated cases it would lead to problems, and those problems would be more severe than implicit captures. For example, suppose the method with lambdas not only captures essentially read-only arguments but also writes to them. How would you capture a variable that is written to in two different generated classes? And even if you managed to do that, how would you synchronize the two values? And what about multi-threaded code? You see, a bunch of hard questions arise the very moment you allow one variable to be ‘distributed’ between several generated classes.
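To make the “writes” case concrete, here is a small sketch (the names SharedVariableSketch, counter, increment and read are made up for illustration). Two lambdas and the enclosing method all use the same variable, and every write must be visible to all of them, which is straightforward when there is exactly one storage location for it.

```csharp
using System;

public static class SharedVariableSketch
{
    public static void Main()
    {
        int counter = 0;

        // Both lambdas capture the same variable, not a snapshot of its value.
        Action increment = () => counter++;
        Func<int> read = () => counter;

        increment();
        increment();
        counter += 10; // a write from the enclosing method itself

        Console.WriteLine(read()); // prints 12
    }
}
```

If counter were copied into two separate generated classes, the increment made through one lambda would somehow have to reach the copy read by the other, and that is exactly the synchronization problem described above.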

So, what can I do to avoid leaks and problems? Well, ReSharper has a pretty good eye for these things, so watch out for its warnings, and if one arises, try to resolve it. [The following advice is not correct! See part 2 for details] The general solution is to copy the captured variable to another variable manually. That is what the compiler can’t do safely for you because of the reasons above, but you know the context better and in most cases can do this easily. For example, for my code it would be this:

        public static void DoSomething(object a, object b, Observer observer, Receiver r)
        {
            var receiverCopy = r;
            EventHandler action1 = (s, e) => receiverCopy.Method(a);
            EventHandler action2 = (s, e) => r.Method(b);
            observer.X += action1;
            observer.X += action2;
        }
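As the bracketed note says, this copy does not actually help. A rough way to check is to compare the Target of the two delegates. The sketch below is my own illustration, not code from part 2: ScopeSketch, WithCopy, WithBlocks and ShareTarget are hypothetical names, and WithBlocks rests on the assumption that the compiler groups captured variables by the scope in which they are declared.

```csharp
using System;

public class Receiver
{
    public void Method(object o) { }
}

public static class ScopeSketch
{
    // Same shape as the snippet above: the copy lives in the method's top-level scope.
    public static EventHandler[] WithCopy(object a, object b, Receiver r)
    {
        var receiverCopy = r;
        EventHandler action1 = (s, e) => receiverCopy.Method(a);
        EventHandler action2 = (s, e) => r.Method(b);
        return new[] { action1, action2 };
    }

    // Assumed alternative: each lambda only touches locals declared in its own block.
    public static EventHandler[] WithBlocks(object a, object b, Receiver r)
    {
        EventHandler action1, action2;
        {
            object a1 = a;
            Receiver r1 = r;
            action1 = (s, e) => r1.Method(a1);
        }
        {
            object b2 = b;
            Receiver r2 = r;
            action2 = (s, e) => r2.Method(b2);
        }
        return new[] { action1, action2 };
    }

    // True when both delegates are bound to the same generated object.
    public static bool ShareTarget(EventHandler[] pair)
    {
        return ReferenceEquals(pair[0].Target, pair[1].Target);
    }

    public static void Main()
    {
        var r = new Receiver();
        Console.WriteLine(ShareTarget(WithCopy(new object(), new object(), r)));
        Console.WriteLine(ShareTarget(WithBlocks(new object(), new object(), r)));
    }
}
```

I would expect the first line to print True (the copy changed nothing, because a and b are still captured from the same scope) and the second to print False, though this is compiler behaviour rather than a language guarantee, so treat it as something to verify on your own toolchain.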

In some cases, though, you can eliminate some captures without much effort at all. For example, if the sender argument of an event handler already references what you need, use it instead of capturing a variable!
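A small sketch of the sender idea, under the assumption that the object raising the event is the one the handler needs. Sensor, Triggered, Trigger and LastSeen are hypothetical names invented for this example.

```csharp
using System;

public class Sensor
{
    public string Name { get; set; }
    public event EventHandler Triggered = delegate { };

    // Raises the event, passing this object as the sender.
    public void Trigger()
    {
        Triggered(this, EventArgs.Empty);
    }
}

public static class SenderSketch
{
    public static string LastSeen;

    public static void Main()
    {
        var sensor = new Sensor { Name = "door" };

        // A capturing handler would be: (s, e) => LastSeen = sensor.Name
        // This one reads the same object through the sender argument instead,
        // so the lambda does not close over any local variable.
        sensor.Triggered += (s, e) => LastSeen = ((Sensor)s).Name;

        sensor.Trigger();
        Console.WriteLine(LastSeen); // prints door
    }
}
```

The cast is the price you pay: it only works when the event really is raised with the expected object as sender.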

