A place for spare thoughts

21/05/2013

WPF UI Validation technique

Filed under: c#, wpf — Ivan Danilov @ 13:51

WPF provides several methods of validation, most common are implementing IDataErrorInfo on ViewModel, implementing custom ValidationRule, use ValidatesOnException whatever.

But for MVVM pattern really only IDataErrorInfor is good. The problem with other approaches – well, the VM can’t know with them if error happened. Yeah, user gets error indication, error is handled… but the VM thinks everything is good. Suppose VM controls if OK button is enabled or disabled (e.g. via CanExecute of bound command) – how could VM know when to disable the button if it doesn’t know if error happened?

Let’s try to imagine absolutely trivial example: window with TextBox and a Button. User should type integer in the TextBox and click the button. If user doesn’t enter a number at all or entered some invalid value (like ‘abc’ or non-integer like ‘1.23’) – button should be disabled and TextBox should have normal error notification (default WPF red outline is ok).

The task is trivial, so the code should better not be complicated as well. Of course we want MVVM-ish thing, so let’s write a VM:

class VM
{
    public int Value { get; set; }
    public ICommand Ok { get; set; }
}

VM most probably needs to implement INotifyPropertyChanged for its properties, but it is not the case here, so I’ve omitted these. Can you see the problem already? Value has int type. When user types ‘abc’ in the TextBox – Value won’t receive changes until value becomes parseable integer again. Binding just can’t pass them to VM – it has no means to do so. And thus we can’t notify Ok command that it can’t be executed right now. Moreover, if OK button is supposed to save something… imagine a scenario: user entered 100 to the TextBox, this value has propagated to Value property, then user accidentaly typed ‘o’ (letter o) instead of ‘0’ (zero) – they are pretty close on my keyboard – and clicked OK. Oops. We just overwritten old data with 100, when user thought it will be 1000. Definitely not good.

So, what is the solution? One apparent one is to change Value’s type: let it be a string. We can parse it in the VM and determine if it is correct, and disable Ok command. It will work, of course, but if you have two dozens of such pseudo-integers – VM becomes a pile of ToString() and Int32.Parse() calls here and there, which hides all the logic. A mess, shortly said.

What is the problem we’re trying to solve, exactly? Inability of View to notify ViewModel that error has happened. Ok. Lets pretend we found a place in the View that knows about the problem. How could it notify the VM? Well, View has very precise knowledge about its VM – it is the DataContext. But the DataContext is of type object… making View know precise type of its VM is probably not a good idea (besides it will make the solution require separate modification for each case), so we will encapsulate details of error notifications in the interface:

    public interface IUIValidationErrorSink
    {
        void AddUIValidationError(string propertyPath, string errorContent);
        void RemoveUIValidationError(string propertyPath);
    }

Don’t be amused where this precise arguments are came from – it will become clear after a couple of minutes. So, if our validation mechanism could get instanse of IUIValidationErrorSink object – it could notify interested parties that new error appeared or existing error is gone. So far so good.

VM will implement this in very straightforward way:

class VM : IUIValidationErrorSink
{
    private Dictionary<string, string> _uiErrors = new Dictionary<string, string>();

    public VM()
    {
        Ok = new RelayCommand(DoOk, CanOk); // RelayCommand is from famous John Smith's MVVM article here: http://msdn.microsoft.com/en-us/magazine/dd419663.aspx#id0090051
    }

    public int Value { get; set; }
    public ICommand Ok { get; set; }

    private void DoOk()
    {
        // Do something useful
    }

    private bool CanOk(object _)
    {
        return _uiErrors.Empty();
    }

    void IUIValidationErrorSink.AddUIValidationError(string propertyPath, string errorContent)
    {
        UIErrors[propertyPath] = errorContent;
    }

    void IUIValidationErrorSink.RemoveUIValidationError(string propertyPath)
    {
        UIErrors.Remove(propertyPath);
    }
}

Another problem is how we would find which control’s DataContext we should take? For example, if you have ItemsControl of something editable – each item has its own DataContext.

To answer this question we need to know a bit about WPF’s Binding Validation mechanism. It is implemented via Routed Events, particularly ErrorEvent in the System.Windows.Controls.Validation class (if you want to know more about Routed Events – you may start here). ErrorEvent is raised in the WPF Binding mechanism whenever error appears or disappears (actually, Binding calls Validation.AddValidationError and Validation.RemoveValidationError methods, which in turn call RaiseEvent). The target of this routed event is the DependenencyObject that owns bound DependencyProperty. As this event has Bubbling strategy, it then re-raised for each parent control in the Visual Tree from target until either the root is reached or someone marks event as Handled.

If only we could set routed event handler for ErrorEvent somewhere near the top of the View’s visual tree, where we always know our ViewModel… The good news are that we can – attached properties make this quite easy:

        public static readonly DependencyProperty IsUIValidationErrorSinkProperty =
            DependencyProperty.RegisterAttached("IsUIValidationErrorSink",
                                                typeof (bool),
                                                typeof (UIValidation),
                                                new PropertyMetadata(false, IsErrorSinkPropertyChangedCallback));

        public static bool GetIsUIValidationErrorSink(DependencyObject obj)
        {
            return (bool) obj.GetValue(IsUIValidationErrorSinkProperty);
        }

        public static void SetIsUIValidationErrorSink(DependencyObject obj, object value)
        {
            obj.SetValue(IsUIValidationErrorSinkProperty, value);
        }

        private static void IsErrorSinkPropertyChangedCallback(DependencyObject dependencyObject, DependencyPropertyChangedEventArgs e)
        {
            var control = dependencyObject as FrameworkElement;
            if (control == null)
                throw new ArgumentException("IsUIValidationErrorSink property could be attached only to FrameworkElement descendants");

            control.AddHandler(Validation.ErrorEvent, HandleErrorEvent);
        }

Here we restrict the property to FrameworkElement because we need access to FrameworkElement.DataContext. Assuming we have written UIValidationRule already (see below), our handler might look like this:

        private static void HandleErrorEvent(object sender, RoutedEventArgs e)
        {
            var args = (ValidationErrorEventArgs) e;
            if (!(args.Error.RuleInError is UIValidationRule)) return;
            
            var sinkControl = (FrameworkElement) sender;
            var sink = sinkControl.DataContext as IUIValidationErrorSink;
            if (sink == null)
            {
                PresentationTraceSources.DataBindingSource.TraceEvent(TraceEventType.Error, 0, "UIValidation.IsUIValidationErrorSink attached property's HandleError encountered an error: DataContext does not implement IUIValidationErrorSink");
                return;
            }

            // filter out rules that are not UiValidationRule - we're handling only these
            var error = Validation.GetErrors((DependencyObject)args.OriginalSource)
                .FirstOrDefault(err => err.RuleInError is UIValidationRule);
            
            var bindingExpression = (BindingExpression) args.Error.BindingInError;
            string propertyPath = bindingExpression.ParentBinding.Path.Path;

            if (error == null)
            {
                sink.RemoveUIValidationError(propertyPath);
            }
            else
            {
                var errorMessage = (string) error.ErrorContent;
                sink.AddUIValidationError(propertyPath, errorMessage);
            }

            args.Handled = true;
        }

Now we need to write last piece – UIValidationRule which will actually validate the input. The goal here is to make it as easy to use and extensible as possible: first because we don’t won’t to use clumsy long XAML code pieces and second because we may want to write many slightly different rules: for ints, doubles, DateTimes etc. Also we want error messages to be customizable and localizable. Hence UIValidationRule will be an abstract base class which will be extended then.

My first idea was to make one descendant of this abstract class for each rule I needed. Though it turned out these descendants were pretty much the same and had almost everything duplicated – the only real difference was use of Int32.TryParse vs Double.TryParse. So I’ve decided to have single descendant of UIValidationRule which takes something akin to TryParse delegate as an argument:

    public abstract class UIValidationRule : ValidationRule
    {
        public string DefaultErrorMessage { get; set; }
    }

    public sealed class RelayUIValidationRule : UIValidationRule
    {
        private readonly Func<string, CultureInfo, string> _validator;

        /// <param name="validator">Mapper from string value to error or null if validation succeeded</param>
        public RelayUIValidationRule(Func<string, CultureInfo, string> validator)
        {
            _validator = validator;
        }

        public override ValidationResult Validate(object value, CultureInfo cultureInfo)
        {
            var str = (string) value;
            string err = _validator(str, cultureInfo);
            return err == null ? ValidationResult.ValidResult : new ValidationResult(false, err);
        }
    }

Here Func<string, CultureInfo, string> takes string representation and culture info and returns null if validation succeeds and not null string with error message otherwise.

Also ‘normal’ way of using binding validation rules usually looks like this:

<TextBox>
    <Binding Path="Value" NotifyOnValidationError="True">
        <Binding.ValidationRules>
            <local:MyBindingValidationRule />
        </Binding.ValidationRules>
    </Binding>
</TextBox>

Ahem, really? 7 lines of XAML code to bind one value, not mentioning boilerplate NotifyOnValidationError="True"? No-no-no, it is totally unacceptable. Lets see if we could find a better, more terse way. Actually it is not that hard, if we assume control can’t have two UI validation rules at once (hey, this poor string shouldn’t be parsable as double and DateTime at the same time!):

    public class ValidatedBinding : Binding
    {
        private UIValidationRule _uiValidation;

        public ValidatedBinding()
        {
            NotifyOnValidationError = true;
            UpdateSourceTrigger = UpdateSourceTrigger.PropertyChanged;
        }

        public UIValidationRule UIValidation
        {
            get { return _uiValidation; }
            set
            {
                if (_uiValidation != null)
                    ValidationRules.Remove(_uiValidation);
                _uiValidation = value;
                if (_uiValidation != null)
                    ValidationRules.Add(_uiValidation);
            }
        }
    }

And final step: make instantiation of validation rules driven by MarkupExtensions:

    internal delegate bool TryParseFunc<T>(string source, CultureInfo cultureInfo, out T result);

    public abstract class UIValidationExtensionBase<T> : MarkupExtension
    {
        private readonly string _customErrorMessage;
        private readonly TryParseFunc<T> _tryParseFunc;

        internal UIValidationExtensionBase(string customErrorMessage, TryParseFunc<T> tryParseFunc)
        {
            _customErrorMessage = customErrorMessage;
            _tryParseFunc = tryParseFunc;
        }

        /// <summary>
        /// Specifies that empty string and null values should be allowed by the validation mechanism. 
        /// False by default.
        /// </summary>
        public bool AllowNulls { get; set; }

        public override object ProvideValue(IServiceProvider serviceProvider)
        {
            Func<string, CultureInfo, string> validation =
                (s, c) =>
                    {
                        if (AllowNulls && String.IsNullOrEmpty(s))
                            return null;
                        T result;
                        bool success = _tryParseFunc(s, c, out result);
                        return success ? null : _customErrorMessage;
                    };

            return new RelayUIValidationRule(validation)
                       {
                           DefaultErrorMessage = _customErrorMessage
                       };
        }
    }

    [MarkupExtensionReturnType(typeof (UIValidationRule))]
    public class IntUIValidationExtension : UIValidationExtensionBase<int>
    {
        public IntUIValidationExtension()
            : this("Incorrect number format")
        {
        }

        public IntUIValidationExtension(string customErrorMessage)
            : base(customErrorMessage, (string s, CultureInfo c, out int result) => int.TryParse(s, NumberStyles.Integer, c, out result))
        {
        }
    }

    [MarkupExtensionReturnType(typeof (UIValidationRule))]
    public class DateTimeUIValidationExtension : UIValidationExtensionBase<DateTime>
    {
        public DateTimeUIValidationExtension()
            : this("Incorrect date/time format")
        {
        }

        public DateTimeUIValidationExtension(string customErrorMessage)
            : base(customErrorMessage, (string s, CultureInfo c, out DateTime result) => DateTime.TryParse(s, c, DateTimeStyles.None, out result))
        {
        }
    }

    // I think you could come up with other extensions as well 🙂

And finally, the first clumsy syntax with 7 lines transformed into beautiful one-liner:

<TextBox Text="{WpfValidation:ValidatedBinding UIValidation={WpfValidation:IntUIValidation}, Path=Value" />

Well, don’t forget to put UIValidation.IsUIValidationErrorSink="True" somewhere in your XAML.

17/04/2013

WeakEvents catch with tests

Filed under: c#, Unit testing — Ivan Danilov @ 19:01

Some time ago I wrote about weak events including non-standard ones. While this post is not totally dependent on weak events, I will assume you’re acquainted with MakeWeak method.

So, suppose you have some service that publishes an event, like that:

        public class SomeService
        {
            public event EventHandler TestEvent;

            internal void RaiseTestEvent()
            {
                var handler = TestEvent;
                if (handler != null) handler(this, EventArgs.Empty);
            }
        }

And you’re writing new class that will subscribe to this event weakly and re-raise it (possibly under some specific conditions). Extremely simplified it could look like this:

        public class Sut
        {
            public Sut(SomeService helper)
            {
                helper.TestEvent += new EventHandler(OnEvent).MakeWeak(eh => helper.TestEvent -= eh);
            }

            public event EventHandler TestEventReraised = delegate { };

            private void OnEvent(object s, EventArgs args)
            {
                TestEventReraised(this, EventArgs.Empty);
            }
        }

So far, so good. Now naturally you want to test if your Sut does what it should:

        [Test]
        public void SutReRaisesEventCorrectly()
        {
            var helper = new SomeService();
            var sut = new Sut(helper);
            bool raised = false;
            sut.TestEventReraised += (s, e) => raised = true;
            helper.RaiseTestEvent();
            Assert.IsTrue(raised);
        }

Seems logical, isn’t it? You may be surprised, but if you run this test in Release mode, it will fail randomly. Most of the time it will pass, but if you use build server and have many tests in your project – this one will fail eventually. If you have been reading my blog for some time, you may recognize the symptoms as they are very similar to this case. Really, if you change test like below it will fail consistently in Release mode (in Debug it will still succeed always):

        [Test]
        public void SutReRaisesEventCorrectly()
        {
            var helper = new SomeService();
            var sut = new Sut(helper);
            bool raised = false;
            sut.TestEventReraised += (s, e) => raised = true;

            GC.Collect();

            helper.RaiseTestEvent();
            Assert.IsTrue(raised);
        }

Really, sut variable’s value is eligible for collection at highlighted line, so it is collected (our event has weak subscription, remember?), and event is never raised.

As it takes much attention to not make such errors, I’d suggest to make sut variable not local for test method but rather class variable. Then it will never be eligible for collection in such cases and you will not write randomly failing tests. Like that:

    [TestFixture]
    public class WeakEventsCheck
    {
        private Sut _sut;

        [Test]
        public void SutReRaisesEventCorrectly()
        {
            var helper = new SomeService();
            _sut = new Sut(helper);
            bool raised = false;
            _sut.TestEventReraised += (s, e) => raised = true;

            GC.Collect();

            helper.RaiseTestEvent();
            Assert.IsTrue(raised);
        }
    }

25/12/2012

EnumerateChain extension method

Filed under: c#, patterns — Ivan Danilov @ 19:12

Very small, simple and extremely useful method:

        public static IEnumerable<T> EnumerateChain<T>(this T start, Func<T, T> next)
            where T : class
        {
            while (start != null)
            {
                yield return start;
                start = next(start);
            }
        }

How to use it you may ask. Suppose (just for the sake of example) you have some tree structure, with Name and Parent properties in each node. And you want to get path in tree of some given node in form of node names separated by slashes. Nothing could be easier:

public static string GetPathInTree(Node node)
{
    var names = node.EnumerateChain(n => n.Parent).Reverse().Select(n => n.Name);
    return String.Join("/", names);
}

Basically with this method you can traverse any structure as if it was normal IEnumerable. Very convenient, really 🙂

12/09/2012

Property Injection for infrastructural dependencies in practice, part 2

Filed under: c# — Ivan Danilov @ 21:41

In the first part I introduced RequiredPropertyAttribute and integrated it with Windsor. And the problem of preventing manual instantiation of such classes (because manually nobody is forcing us to treat such properties as required in any way) remained open.

So, the best thing possible from my point of view would be compiler emitting warning each time we create such classes. Then you will have a clear sign you’re doing something not-intended. Or, if you have “Treat compiler warnings as errors” – it will not allow you to even compile such code.

I believe with Roslyn it is not so hard to write custom inspection for this and warn user whenever required. But for now it is not that easy unfortunately. At least I haven’t found better way than described below (if you know one – let me know, please).

The closest thing (actually, surprisingly close to our needs) is ObsoleteAttribute on the class constructor. It has several advantages:

  • compiler emits warning only if it is used, so we don’t need to analyze code ourselves;
  • it is totally ignored by IoCC and other reflection-based approaches;
  • we may customize warning text to let user know that instantiating of a class should be done via IoCC rather then manually;
  • compiler enforces “viral propagating” of the attribute.

Let me elaborate a bit on the last point. Suppose you have this class:

    public class A
    {
        [Obsolete("You should not call this contructor manually. It should be called only by IoC container to avoid missing dependencies")]
        public A() {}
    }

And someone writes that one:

    public class B : A {}

Oops. Immediate warning: “‘A.A()’ is obsolete”. In order to have such class without warning you have to write that code:

    public class B : A
    {
        [Obsolete]
        public B() {}
    }

And that is a good thing, obviously.

Now, some disadvantages clearly present as well. First off, it is a hack and using Obsolete in not-intended way. Also you have no means to avoid setting this long string line if you want clear warning message (ObsoleteAttribute class is sealed unfortunately). And finally, if you’re developing some library and making it backward-compatible – there’re good chances you need Obsolete for your own legitimate use. Thus said, if you want to incorporate this approach – you will be marking as Obsolete some internal implementations which probably will not be seen by user of your library anyway.

And obviously, this approach doesn’t defend you against reflection. But reflection is absolutely another story where you have much less guarantees by default, so I think it is ok.

Though I’m not absolutely content with this solution – it works and I can’t think out something better right now. And I can move my app to property injection more or less sure that required properties will be really required, and these requirements would be checked as early as possible.

Property Injection for infrastructural dependencies in practice, part 1

Filed under: c# — Ivan Danilov @ 21:41

Inspired by this post I tried this approach myself. If you haven’t read it yet – it’s time to spend several minutes there.

The main reason I was reluctant to use “infrastructure dependencies as properties, application-level dependencies as ctor arguments” is that it is very easy to miss the required property and find oneself one day in debugging because of some silly configuration error or forgetting to set a property. I strive to use static checking as much as possible and detect errors in compile-time or build-time (on build server where some additional checks are taking place).

On Inversion of Control Container (IoCC) side it can be solved – as Krzysztof suggested – via configuring so that it treats them as required. Here I will show things using Castle Windsor 3.0, but the approach overall is not specific to concrete container.

The default way in Windsor to make property required is at configuration time shown here. Personally I don’t like that very much, because in this way you need to find registration and look through it in order to understand if some property will be filled always or not. It is much cleaner and easier to read if it is immediately obvious right in the property declaration place, thus making custom attributes natural choice. As far as I’m aware Windsor don’t have built-in attribute for that, so let’s make our own.

It is easy if you know where to look. I have to admit I spent about an hour to find the place and another 15 minutes to write implementation and tests for it.

Here you go:

    [AttributeUsage(AttributeTargets.Property)]
    public sealed class RequiredPropertyAttribute : Attribute
    {
    }

    public class RespectRequiredPropertiesContributor : IContributeComponentModelConstruction
    {
        public void ProcessModel(IKernel kernel, ComponentModel model)
        {
            var markedProperties = model.Properties
                .Where(p => p.Property
                                .GetCustomAttributes(typeof(RequiredPropertyAttribute), true)
                                .FirstOrDefault() != null);
            foreach (var p in markedProperties)
            {
                p.Dependency.IsOptional = false;
            }
        }
    }

And here are the tests:

    public class Dependency
    {
    }

    public class ClassWithoutRequiredProperties
    {
        public string StringProperty { get; set; }
    }

    public class ClassWithRequiredProperty
    {
        [RequiredProperty]
        public Dependency Dependency { get; set; }
    }

    public class ClassWithInheritedRequiredProperty : ClassWithRequiredProperty
    {
    }

    [TestFixture]
    public class WindsorRespectRequirementPropertiesTests
    {
        private IWindsorContainer _windsor;

        [SetUp]
        public void Setup()
        {
            _windsor = new WindsorContainer();
            _windsor.Kernel.ComponentModelBuilder.AddContributor(new RespectRequiredPropertiesContributor());
        }

        [Test]
        public void NoPropertiesAreRequired_ComponentCreated()
        {
            _windsor.Register(Component.For<ClassWithoutRequiredProperties>());
            var obj = _windsor.Resolve<ClassWithoutRequiredProperties>();
            Assert.NotNull(obj);
            Assert.IsNull(obj.StringProperty);
        }

        [Test]
        public void SomePropertiesAreRequired_DependencyMissing_Throws()
        {
            _windsor.Register(Component.For<ClassWithRequiredProperty>());
            Assert.Throws<HandlerException>(() => _windsor.Resolve<ClassWithRequiredProperty>());
        }

        [Test]
        public void SomePropertiesAreRequired_DependencyExists_ResolveSuccessfull()
        {
            _windsor.Register(Component.For<ClassWithRequiredProperty>());
            _windsor.Register(Component.For<Dependency>());
            var obj = _windsor.Resolve<ClassWithRequiredProperty>();
            Assert.IsNotNull(obj);
            Assert.IsNotNull(obj.Dependency);
        }

        [Test]
        public void SomePropertiesAreRequiredInBaseClass_DependencyMissing_Throws()
        {
            _windsor.Register(Component.For<ClassWithInheritedRequiredProperty>());
            Assert.Throws<HandlerException>(() => _windsor.Resolve<ClassWithInheritedRequiredProperty>());
        }

        [Test]
        public void SomePropertiesAreRequiredInBaseClass_DependencyExists_ResolveSuccessfull()
        {
            _windsor.Register(Component.For<ClassWithInheritedRequiredProperty>());
            _windsor.Register(Component.For<Dependency>());
            var obj = _windsor.Resolve<ClassWithInheritedRequiredProperty>();
            Assert.IsNotNull(obj);
            Assert.IsNotNull(obj.Dependency);
        }
    }

Now, you may use this attribute and Windsor will treat both constructor dependencies and marked property dependencies equally required.

But still, there’s one problem. What if someone will create some class by manually calling the constructor? Well, maybe he even will be attentive enough and checks which properties are required and sets them properly. But still such code would be defenseless against change in the class being created. What happens if later another (not so attentive) developer adds some required property? Clearly, he needs to check every site where class is created manually and satisfy that new dependency. In other words, allowing to create such classes manually is a way to make IoCC’s work by hand. Clearly not a good thing – if we have IoCC – we should use it, otherwise why it clutters the code?

But how one can prevent creating some classes in code, but still allow IoCC to create them? I will describe one way in part 2 shortly. It is not perfect, but it works and makes unintended error much less likely.

29/08/2012

Bug in C#/IL optimizations logic? o_O

Filed under: c# — Ivan Danilov @ 16:59

To be short: after half a day of minimizing bug case I got this code:

    [TestFixture]
    public class Tests
    {
        private class Relay
        {
            public Action Do { get; set; }
        }
        
        [Test]
        public void OptimizerStrangeness()
        {
            var relay = new Relay();
            var indicator = 0;
            relay.Do = () => indicator++;
            var weak = new WeakReference(relay);
            
            GC.Collect();

            var relayNew = weak.Target as Relay;
            if (relayNew == null) Assert.Fail();
            relayNew.Do();
            Assert.AreEqual(1, indicator);
        }
    }

Here is StackOverflow question on this topic.

Compiled under Debug mode test passes. Under Release mode it fails on highlighted line. relayNew variable is null there. But we still have relay variable in scope and haven’t assigned null to it. We still have strong reference! So, what’s happening?

I can only speculate, but my assumption is that optimizer was designed to follow such logic:

  1. relay variable is not accessed after assigning weak variable on line 15 (it could be easily proven via analyzing IL);
  2. we are about to call GC.Collect() manually, so probably we want to reclaim as much memory as possible;
  3. optimizer can “help” us by marking relay as not accessible and GC it with other garbage;

But it changes the visible behavior of the compiled code, so I assume it is a bug.

I checked generated IL and it is the same for Debug and Release mode (at least in regard to variable eligibility for GC). The only difference seems to be assembly-level attribute:

  • Debug mode:
    [assembly: Debuggable(DebuggableAttribute.DebuggingModes.Default
                        | DebuggableAttribute.DebuggingModes.DisableOptimizations
                        | DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints 
                        | DebuggableAttribute.DebuggingModes.EnableEditAndContinue)]
    
  • Release mode:
    [assembly: Debuggable(DebuggableAttribute.DebuggingModes.IgnoreSymbolStoreSequencePoints)]
    

According to documentation it is related to JIT-optimizations, so my final assumption is that JIT-compiler makes things he thinks of as safe, but which are not.

Another indirect proof is that if you change code like below (only highlighted line added), test will pass in both Debug and Release modes:

        [Test]
        public void OptimizerStrangeness()
        {
            var relay = new Relay();
            var indicator = 0;
            relay.Do = () => indicator++;
            var weak = new WeakReference(relay);
            
            GC.Collect();

            var relayNew = weak.Target as Relay;
            if (relayNew == null) Assert.Fail();
            relayNew.Do();
            Assert.AreEqual(1, indicator);
            Assert.IsNotNull(relay);
        }

Here, optimizer can’t deduce that relay‘s use is trivial and cannot optimize it away before last line passed.

The problem is reproducible on .NET 2.0, .NET 4.0 and .NET 4.5 equally.

P.S. Actually I found the problem in real code, where I have weak event handlers that treated this dead weak reference as a message that event’s target has been GCed already and were unsubscribing from event source. And on top of it some ObservableCollection<T> that were using these weak events. It wasn’t an easy task to localize the problem. In fact even reproducing test failure was not trivial, because I usually build in Debug mode at my box and good long time was wondering why tests pass on my machine, but were failing on three different build agents.
Phew. At least it can be fixed without re-architecturing half of the app. 🙂

Update. Here is the comment I found in .NET Framework sources for GC.KeepAlive(object obj) method:

    // This method DOES NOT DO ANYTHING in and of itself.  It's used to
    // prevent a finalizable object from losing any outstanding references 
    // a touch too early.  The JIT is very aggressive about keeping an 
    // object's lifetime to as small a window as possible, to the point
    // where a 'this' pointer isn't considered live in an instance method 
    // unless you read a value from the instance.  So for finalizable
    // objects that store a handle or pointer and provide a finalizer that
    // cleans them up, this can cause subtle ----s with the finalizer
    // thread.  This isn't just about handles - it can happen with just 
    // about any finalizable resource.
    // 
    // Users should insert a call to this method near the end of a 
    // method where they must keep an object alive for the duration of that
    // method, up until this method is called.  Here is an example: 
    //
    // "...all you really need is one object with a Finalize method, and a
    // second object with a Close/Dispose/Done method.  Such as the following
    // contrived example: 
    //
    // class Foo { 
    //    Stream stream = ...; 
    //    protected void Finalize() { stream.Close(); }
    //    void Problem() { stream.MethodThatSpansGCs(); } 
    //    static void Main() { new Foo().Problem(); }
    // }
    //
    // 
    // In this code, Foo will be finalized in the middle of
    // stream.MethodThatSpansGCs, thus closing a stream still in use." 
    // 
    // If we insert a call to GC.KeepAlive(this) at the end of Problem(), then
    // Foo doesn't get finalized and the stream says open. 
    [System.Security.SecuritySafeCritical]  // auto-generated
    [ResourceExposure(ResourceScope.None)]
    [MethodImplAttribute(MethodImplOptions.InternalCall)]
    [ReliabilityContract(Consistency.WillNotCorruptState, Cer.Success)] 
    public static extern void KeepAlive(Object obj);

Also you may be interested in this excerpt from C# Language Specification, section 3.9:

If the object, or any part of it, cannot be accessed by any possible continuation of execution, other than the running of destructors, the object is considered no longer in use, and it becomes eligible for destruction. The C# compiler and the garbage collector may choose to analyze code to determine which references to an object may be used in the future. For instance, if a local variable that is in scope is the only existing reference to an object, but that local variable is never referred to in any possible continuation of execution from the current execution point in the procedure, the garbage collector may (but is not required to) treat the object as no longer in use.

So, now I’m convinced now that it is intended behavior. Good for performance, bad for clarity. Well, now I know that in such situations I have to remember to put GC.KeepAlive(relay) in the end (instead of NotNull assertion, that is meaningless in production code of course). Better I won’t forget this 🙂

Hail the unit-tests and hail the build server running them in Release mode!

17/08/2012

Implicit closures in C# lambdas, part 2

Filed under: c# — Ivan Danilov @ 17:59

Wow. I’ve just found that not all I said in first part is true (I’ve already marked wrong place so readers would not be confused). Shame on me.

My mistake was to rely on ReSharper’s warnings totally. So, in the ending of last post I had changed code to make copy of variable for closure like that:

        public static void DoSomething(object a, object b, Observer observer, Receiver r)
        {
            var receiverCopy = r;
            EventHandler action1 = (s, e) => receiverCopy.Method(a);
            EventHandler action2 = (s, e) => r.Method(b);
            observer.X += action1;
            observer.X += action2;
        }

And assumed that it is enough to convince compiler to capture variables for two lambdas in two different compiler-generated classes – based on the fact that ReSharper warning is gone. It turned out (by experiencing very real memory leak in unit-test first and then looking at generated IL to understand) absence of warning means that ReSharper is just unable to find such implicit closure, not that implicit closure is gone. Compiler is still generates single class for all these closures. And at this time it is probably an optimization from compiler’s side – namely, try to decrease number of types in the assembly by generating one less.

What I ended up with is hand-writing explicit class very similar to that in previous post. This solved problem with memory leak, and looking at IL proved that it helped. But code is now ugly. And worse – I don’t know how to spot such problems easily without thorough analysis of the code.

UPD: Pretty much the same thing in another blog post.

UPD2: As Eric answered at SO, this behavior was implemented because it was easier (for compiler developers) and part about lambda rewriting was already pretty complicated at the time. And, hopefully with Roslyn we may have better.

Also, good alternative was offered as well.

Implicit closures in C# lambdas, part 1

Filed under: c# — Ivan Danilov @ 16:38

In my day-to day work I was alerted once by R# warning saying ‘Implicitly captured closure: variableName’. After checking actual IL generated by compiler I found that R# warning is pretty much correct. So I did some ground-work and wrote small sample demonstrating this:

    public class Observer
    {
        public event EventHandler X = delegate { };
    }

    public class Receiver
    {
        public void Method(object o) {}
    }

    public class Program
    {
        public static void DoSomething(object a, object b, Observer observer, Receiver r)
        {
            EventHandler action1 = (s, e) => r.Method(a); //Implicitly captured closure: b
            EventHandler action2 = (s, e) => r.Method(b); //Implicitly captured closure: a
            observer.X += action1;
            observer.X += action2;
        }

        public static void Main(string[] args)
        {
            var observer = new Observer();
            var receiver = new Receiver();
            DoSomething(new object(), new object(), observer, receiver);
        }
    }

It seems the cause is simple: when several lambdas are sharing variables-need-to-be-captured (in this case r is present in both) – compiler makes only one class with unspeakable name that contains both methods and all of their variables.

Below is generated IL code. It is there just for completeness, because below I will explain these in terms of equivalent C#, so you don’t need to understand IL to get the idea. It is fairly long, so you most probably don’t want to see it all.

.class public auto ansi beforefieldinit ConsoleApplication1.Program
	extends [mscorlib]System.Object
{
	// Nested Types
	.class nested private auto ansi sealed beforefieldinit '<>c__DisplayClass2'
		extends [mscorlib]System.Object
	{
		.custom instance void [mscorlib]System.Runtime.CompilerServices.CompilerGeneratedAttribute::.ctor() = (
			01 00 00 00
		)
		// Fields
		.field public object a
		.field public object b
		.field public class ConsoleApplication1.Receiver r

		// Methods
		.method public hidebysig specialname rtspecialname 
			instance void .ctor () cil managed 
		{
			// Method begins at RVA 0x2104
			// Code size 7 (0x7)
			.maxstack 8

			IL_0000: ldarg.0
			IL_0001: call instance void [mscorlib]System.Object::.ctor()
			IL_0006: ret
		} // end of method '<>c__DisplayClass2'::.ctor

		.method public hidebysig 
			instance void '<DoSomething>b__0' (
				object s,
				class [mscorlib]System.EventArgs e
			) cil managed 
		{
			// Method begins at RVA 0x210c
			// Code size 19 (0x13)
			.maxstack 8

			IL_0000: ldarg.0
			IL_0001: ldfld class ConsoleApplication1.Receiver ConsoleApplication1.Program/'<>c__DisplayClass2'::r
			IL_0006: ldarg.0
			IL_0007: ldfld object ConsoleApplication1.Program/'<>c__DisplayClass2'::a
			IL_000c: callvirt instance void ConsoleApplication1.Receiver::Method(object)
			IL_0011: nop
			IL_0012: ret
		} // end of method '<>c__DisplayClass2'::'<DoSomething>b__0'

		.method public hidebysig 
			instance void '<DoSomething>b__1' (
				object s,
				class [mscorlib]System.EventArgs e
			) cil managed 
		{
			// Method begins at RVA 0x2120
			// Code size 19 (0x13)
			.maxstack 8

			IL_0000: ldarg.0
			IL_0001: ldfld class ConsoleApplication1.Receiver ConsoleApplication1.Program/'<>c__DisplayClass2'::r
			IL_0006: ldarg.0
			IL_0007: ldfld object ConsoleApplication1.Program/'<>c__DisplayClass2'::b
			IL_000c: callvirt instance void ConsoleApplication1.Receiver::Method(object)
			IL_0011: nop
			IL_0012: ret
		} // end of method '<>c__DisplayClass2'::'<DoSomething>b__1'

	} // end of class <>c__DisplayClass2


	// Methods
	.method public hidebysig static 
		void DoSomething (
			object a,
			object b,
			class ConsoleApplication1.Observer observer,
			class ConsoleApplication1.Receiver r
		) cil managed 
	{
		// Method begins at RVA 0x2134
		// Code size 72 (0x48)
		.maxstack 2
		.locals init (
			[0] class [mscorlib]System.EventHandler action1,
			[1] class [mscorlib]System.EventHandler action2,
			[2] class ConsoleApplication1.Program/'<>c__DisplayClass2' 'CS$<>8__locals3'
		)

		IL_0000: newobj instance void ConsoleApplication1.Program/'<>c__DisplayClass2'::.ctor()
		IL_0005: stloc.2
		IL_0006: ldloc.2
		IL_0007: ldarg.0
		IL_0008: stfld object ConsoleApplication1.Program/'<>c__DisplayClass2'::a
		IL_000d: ldloc.2
		IL_000e: ldarg.1
		IL_000f: stfld object ConsoleApplication1.Program/'<>c__DisplayClass2'::b
		IL_0014: ldloc.2
		IL_0015: ldarg.3
		IL_0016: stfld class ConsoleApplication1.Receiver ConsoleApplication1.Program/'<>c__DisplayClass2'::r
		IL_001b: nop
		IL_001c: ldloc.2
		IL_001d: ldftn instance void ConsoleApplication1.Program/'<>c__DisplayClass2'::'<DoSomething>b__0'(object, class [mscorlib]System.EventArgs)
		IL_0023: newobj instance void [mscorlib]System.EventHandler::.ctor(object, native int)
		IL_0028: stloc.0
		IL_0029: ldloc.2
		IL_002a: ldftn instance void ConsoleApplication1.Program/'<>c__DisplayClass2'::'<DoSomething>b__1'(object, class [mscorlib]System.EventArgs)
		IL_0030: newobj instance void [mscorlib]System.EventHandler::.ctor(object, native int)
		IL_0035: stloc.1
		IL_0036: ldarg.2
		IL_0037: ldloc.0
		IL_0038: callvirt instance void ConsoleApplication1.Observer::add_X(class [mscorlib]System.EventHandler)
		IL_003d: nop
		IL_003e: ldarg.2
		IL_003f: ldloc.1
		IL_0040: callvirt instance void ConsoleApplication1.Observer::add_X(class [mscorlib]System.EventHandler)
		IL_0045: nop
		IL_0046: nop
		IL_0047: ret
	} // end of method Program::DoSomething

	.method public hidebysig static 
		void Main (
			string[] args
		) cil managed 
	{
		// Method begins at RVA 0x2188
		// Code size 32 (0x20)
		.maxstack 4
		.entrypoint
		.locals init (
			[0] class ConsoleApplication1.Observer observer,
			[1] class ConsoleApplication1.Receiver receiver
		)

		IL_0000: nop
		IL_0001: newobj instance void ConsoleApplication1.Observer::.ctor()
		IL_0006: stloc.0
		IL_0007: newobj instance void ConsoleApplication1.Receiver::.ctor()
		IL_000c: stloc.1
		IL_000d: newobj instance void [mscorlib]System.Object::.ctor()
		IL_0012: newobj instance void [mscorlib]System.Object::.ctor()
		IL_0017: ldloc.0
		IL_0018: ldloc.1
		IL_0019: call void ConsoleApplication1.Program::DoSomething(object, object, class ConsoleApplication1.Observer, class ConsoleApplication1.Receiver)
		IL_001e: nop
		IL_001f: ret
	} // end of method Program::Main

	.method public hidebysig specialname rtspecialname 
		instance void .ctor () cil managed 
	{
		// Method begins at RVA 0x21b4
		// Code size 7 (0x7)
		.maxstack 8

		IL_0000: ldarg.0
		IL_0001: call instance void [mscorlib]System.Object::.ctor()
		IL_0006: ret
	} // end of method Program::.ctor

} // end of class ConsoleApplication1.Program

So, basically, what C# compiler did overall is something like this:

        // actually named <>c__DisplayClass2 in IL
        private class CompilerGeneratedClass 
        {
            public object a;
            public object b;
            public Receiver r;

            // actually named <DoSomething>b__0 in IL
            public void Method1(object s, EventArgs e)
            {
                 r.Method(a);
            }
            // actually named <DoSomething>b__1 in IL
            public void Method2(object s, EventArgs e)
            {
                r.Method(b);
            }
        }

        public static void DoSomething(object a, object b, Observer observer, Receiver r)
        {
            // actually named CS$<>8__locals3 in IL
            CompilerGeneratedClass helper = new CompilerGeneratedClass();
            helper.a = a;
            helper.b = b;
            helper.r = r;
            EventHandler action1 = helper.Method1;
            EventHandler action2 = helper.Method2;
            observer.X += action1;
            observer.X += action2;
        }

So, you see, no lambdas here. Compiler replaced every variable used in closure with nested class’ variable. That’s fine, but “what about subject and these implicit closures?” you may ask. Well, that’s simple from that point. In the original code you have two unconnected lambdas and might assume that as long as first one along with a object are not used anymore – they will be GCed regardless of current state of second lambda, right? But now it is clear that as long as CompilerGeneratedClass is alive – both a and b will live. And generated class will live at least as long as someone needs any of these two lambdas.

Essentially, we can count it as both lambdas have taken closure around both a and b variables. Thus ReSharper’s warning.

What problems can it bring to you as a developer? Suppose a is cheap object and action1 is passed to long-living service class; meanwhile b is costly object and action2 is used here once and forgotten. One assumes that latter one will be called and GCed along with costly b object almost immediately. But it will live as long as service would hold former lambda causing hard-to-find memory leak.

Why compiler can’t do better? I can’t say for certainty, so these are only my speculations on the subject. Nevertheless. Why compiler can’t create two generated classes so that one would capture r and a, while second would capture r and b? It could, and in this simple case it will be ok. But in more complicated cases it’ll lead to problems. These problems would be more severe that implicit captures. For one example, suppose method with lambdas is not only captures essentially read-only arguments, but also writes to them. How would you capture variable that has writes to it to two different generated classes? And even if you manage to do that – how would you synchronize these two values? And what about multi-threaded code? You see, a bunch of hard questions arises the very moment you allow one variable to be ‘distributed’ between several generated classes.

So, what can I do to avoid leaks and problems? Well, ReSharper has pretty good eye for these things, so watch out for its warnings. And if one arises – try to solve it. [The following advice is not correct! See part 2 for details] General solution is to copy captured variable to another variable manually – it is what compiler can’t do safely for you because of reasons above, but you know context better and in most cases can do this easily. For example, for my code it would be this:

        public static void DoSomething(object a, object b, Observer observer, Receiver r)
        {
            var receiverCopy = r;
            EventHandler action1 = (s, e) => receiverCopy.Method(a);
            EventHandler action2 = (s, e) => r.Method(b);
            observer.X += action1;
            observer.X += action2;
        }

That’s it. In some cases, though, you can eliminate some captures without much efforts at all. For example, sender in events references what you need – use it, not captures if it helps!

Create a free website or blog at WordPress.com.