C# – Justification for Nullable Behavior with Implicit Conversion

.netc++implicit-conversionnullable

I encountered some interesting behavior in the interaction between Nullable and implicit conversions. I found that providing an implicit conversion for a reference type from a value type it permits the Nullable type to be passed to a function requiring the reference type when I instead expect a compilation error. The below code demonstrates this:

static void Main(string[] args)
{
    PrintCatAge(new Cat(13));
    PrintCatAge(12);
    int? cat = null;
    PrintCatAge(cat);
}

private static void PrintCatAge(Cat cat)
{
    if (cat == null)
        System.Console.WriteLine("What cat?");
    else
        System.Console.WriteLine("The cat's age is {0} years", cat.Age);
}

class Cat
{
    public int Age { get; set; }
    public Cat(int age)
    {
        Age = age;
    }

    public static implicit operator Cat(int i)
    {
        System.Console.WriteLine("Implicit conversion from " + i);
        return new Cat(i);
    }
}

Output:

The cat's age is 13 years
Implicit conversion from 12
The cat's age is 12 years
What cat?

If the conversion code is removed from Cat then you get the expected errors:

Error 3 The best overloaded method match for 'ConsoleApplication2.Program.PrintCatAge(ConsoleApplication2.Program.Cat)' has some invalid arguments

Error 4 Argument 1: cannot convert from 'int?' to 'ConsoleApplication2.Program.Cat

If you open the executable with ILSpy the code that was generated is as follows

int? num = null;
Program.PrintCatAge(num.HasValue ? num.GetValueOrDefault() : null);

In a similar experiment I removed the conversion and added an overload to PrintCatAge that takes an int (not nullable) to see if the compiler would perform a similar operation, but it does not.

I understand what is happening, but I don't understand the justification for it. This behavior is unexpected to me and seems odd. I did not have any success finding any reference to this behavior on MSDN in the documentation for conversions or Nullable<T>.

The question I pose then is, is this intentional and is there a explanation why this is happening?

Best Answer

I said earlier that (1) this is a compiler bug and (2) it is a new one. The first statement was accurate; the second was me getting confused in my haste to get to the bus on time. (The bug I was thinking of that is new to me is a much more complicated bug involving lifted conversions and lifted increment operators.)

This is a known compiler bug of long standing. Jon Skeet first brought it to my attention some time ago and I believe there's a StackOverflow question about it somewhere; I do not recall where offhand. Perhaps Jon does.

So, the bug. Let's define a "lifted" operator. If an operator converts from a non-nullable value type S to a non-nullable value type T then there is also a "lifted" operator that converts from S? to T?, such that a null S? converts to a null T? and a non-null S? converts to T? by unwrapping S? to S, converting S to T, and wrapping T to T?.

The specification says that (1) the only situation in which there is a lifted operator is when S and T are both non-nullable value types, and (2) that the lifted and non-lifted conversion operators are both considered as to whether they are applicable candidates for the conversion and if both applicable, then the source and target types of the applicable conversions, lifted or unlifted, are used to determine the best source type, best target type, and ultimately, best conversion of all the applicable conversions.

Unfortunately, the implementation thoroughly violates all of these rules, and does so in a way that we cannot change without breaking many existing programs.

First off, we violate the rule about the existence of lifted operators. A lifted operator is considered by the implementation to exist if S and T are both non-nullable value types, or if S is a non-nullable value type and T is any type to which a null could be assigned: reference type, nullable value type, or pointer type. In all those cases we produce a lifted operator.

In your particular case, we lift to nullable by saying that we convert a nullable type to the reference type Cat by checking for null. If the source is not null then we convert normally; if it is, then we produce a null Cat.

Second, we violate thoroughly the rule about how to determine the best source and target types of applicable candidates when one of those candidates is a lifted operator, and we also violate the rules about determining which is the best operator.

In short, it is a big mess that cannot be fixed without breaking real customers, and so we will likely enshrine the behaviour in Roslyn. I will consider documenting the exact behaviour of the compiler in my blog at some point, but I would not hold my breath while waiting for that day if I were you.

And of course, many apologies for the errors.

Related Solutions

C# – Curious Null-Coalescing Operator Custom Implicit Conversion Behavior

Thanks to everyone who contributed to analyzing this issue. It is clearly a compiler bug. It appears to only happen when there is a lifted conversion involving two nullable types on the left-hand side of the coalescing operator.

I have not yet identified where precisely things go wrong, but at some point during the "nullable lowering" phase of compilation -- after initial analysis but before code generation -- we reduce the expression

result = Foo() ?? y;

from the example above to the moral equivalent of:

A? temp = Foo();
result = temp.HasValue ? 
    new int?(A.op_implicit(Foo().Value)) : 
    y;

Clearly that is incorrect; the correct lowering is

result = temp.HasValue ? 
    new int?(A.op_implicit(temp.Value)) : 
    y;

My best guess based on my analysis so far is that the nullable optimizer is going off the rails here. We have a nullable optimizer that looks for situations where we know that a particular expression of nullable type cannot possibly be null. Consider the following naive analysis: we might first say that

result = Foo() ?? y;

is the same as

A? temp = Foo();
result = temp.HasValue ? 
    (int?) temp : 
    y;

and then we might say that

conversionResult = (int?) temp

is the same as

A? temp2 = temp;
conversionResult = temp2.HasValue ? 
    new int?(op_Implicit(temp2.Value)) : 
    (int?) null

But the optimizer can step in and say "whoa, wait a minute, we already checked that temp is not null; there's no need to check it for null a second time just because we are calling a lifted conversion operator". We'd them optimize it away to just

new int?(op_Implicit(temp2.Value))

My guess is that we are somewhere caching the fact that the optimized form of (int?)Foo() is new int?(op_implicit(Foo().Value)) but that is not actually the optimized form we want; we want the optimized form of Foo()-replaced-with-temporary-and-then-converted.

Many bugs in the C# compiler are a result of bad caching decisions. A word to the wise: every time you cache a fact for use later, you are potentially creating an inconsistency should something relevant change. In this case the relevant thing that has changed post initial analysis is that the call to Foo() should always be realized as a fetch of a temporary.

We did a lot of reorganization of the nullable rewriting pass in C# 3.0. The bug reproduces in C# 3.0 and 4.0 but not in C# 2.0, which means that the bug was probably my bad. Sorry!

I'll get a bug entered into the database and we'll see if we can get this fixed up for a future version of the language. Thanks again everyone for your analysis; it was very helpful!

UPDATE: I rewrote the nullable optimizer from scratch for Roslyn; it now does a better job and avoids these sorts of weird errors. For some thoughts on how the optimizer in Roslyn works, see my series of articles which begins here: https://ericlippert.com/2012/12/20/nullable-micro-optimizations-part-one/

C# – Implicit Casting of Null-Coalescing Operator Result

To make the second case work with the ternary operator, you could use the following:

int result = input != null ? input.Value : 10;

The Value property of the Nullable<T> type returns the T value (in this case, the int).

Another option is to use Nullable<T>.HasValue:

int result = input.HasValue ? input.Value : 10;

The myNullableInt != null construct is only syntactic sugar for the above HasValue call.

Best Answer

Related Solutions

C# – Curious Null-Coalescing Operator Custom Implicit Conversion Behavior

C# – Implicit Casting of Null-Coalescing Operator Result

Related Question