I really like C#. I wanted to say that up front because this blog entry is about a feature of C# I don't like
much, and I'd hate you to get the wrong idea... And if you're afflicted by the all too common disorder of
disliking anything produced by Microsoft regardless of merit, then I'd better point out that Java has a closely
related inconsistency, and that the C# and Java manifestations of this issue are but shadows of the
equivalent problem that afflicts standard C++. (And yes, I know about the new(ish) cast operators in
C++ like static_cast and so on, and they only offer a partial solution.)
Also, my apologies to anyone subscribed both to this blog and also to the DevelopMentor DOTNET-WEB mailing list, where I just posted something very similar. It occurred to me after I posted there that this topic seems to come up pretty often in various places. I reckoned that if I blog about it, next time it comes up all I will need to write is the URL of this entry!
The C# language provides a cast operator which, superficially, seems pretty straightforward. You use it when you
have a variable of one kind, and you'd like to store its value in a variable of a different kind. You don't always
need it - sometimes the C# compiler will happily let you perform certain kinds of assignments without a cast.
The rule it uses is, roughly speaking: if the assignment is safe, no cast is required. For example, it's always
safe to assign the value of an int (32-bit signed integer) into a variable of type long (64-bit signed integer)
because there can never be any loss of information. (A long can accurately represent any value that an int can hold.)
Another example of a 'safe' conversion is where you have a reference to an object and assign that into
a variable of 'compatible' type. Compatible types are any interfaces the source type implements,
its base class, and any types its base class is compatible with. For example, everything derives
(directly or indirectly) from object, so it's always safe to assign something into a variable of
type object. Such assignments require no cast.
These two examples are really quite different. The first involves a change in
representation - a conversion from int to long. The compiler has to generate a special IL
instruction to do this: conv.i8. The second example required no such conversion. Consider the following code:
Foo fooRef = new Foo();
object objRef = fooRef;
All I'm doing is changing the type of reference I'm using to refer to a particular object - I have a reference of
type Foo and I am assigning it into a variable of type object. The
underlying representation does not change - the two variables refer to the exact same object. (And as it happens,
in the current CLR implementation, the references look identical too - they are both pointers pointing to the same
address.) The C# compiler doesn't have to generate any code to convert the reference here - it just stores
a copy without conversion (typically using the same stloc instruction it would have used
if the two variables were of the same type).
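The two kinds of implicit assignment can be put side by side in a minimal sketch. The class and method names here (ImplicitConversionDemo, Widen, SameObjectAfterUpcast) are made up for illustration, and Foo is just an empty placeholder class:

```csharp
using System;

// Placeholder type, standing in for the Foo used in the text.
public class Foo { }

public static class ImplicitConversionDemo
{
    // Numeric widening: the 32-bit value is re-encoded as a 64-bit value.
    // No cast is needed because no information can be lost.
    public static long Widen(int value)
    {
        long widened = value;
        return widened;
    }

    // Reference conversion: only the static type of the variable changes.
    // Both variables refer to the very same object afterwards.
    public static bool SameObjectAfterUpcast()
    {
        Foo fooRef = new Foo();
        object objRef = fooRef;
        return ReferenceEquals(fooRef, objRef);
    }
}
```

Calling ImplicitConversionDemo.Widen(int.MaxValue) returns 2147483647 as a long, and SameObjectAfterUpcast() returns true, since the assignment copied the reference and not the object.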
So it seems that assignment can mean two different things. It might mean "make an exact copy", or it might mean "convert to a different representation". The same ambiguity applies to casts.
Casts are used for assignments which are not necessarily safe. For example, if I have a
long value, and wish to store it in a variable of type int, the compiler will not allow me
to do so with a simple assignment, because information may be lost. I have to opt in to the
lossy conversion by casting the value:
int i = (int) someLongValue;
This causes the compiler to generate another conversion instruction (conv.i4).
Alternatively, I may have a variable of some reference type, and I would like to store that reference
in a variable of some more specific type, such as a class derived from the existing variable's type.
For example, if I have a variable of type object, I may have good
reason to believe that the variable refers to an object which is really of type Foo. I
have to use the cast operator to declare this belief in order to assign the reference into a variable
of the appropriate type:
Foo f = (Foo) someVariableOfTypeObject;
This assignment is not implicitly safe - it's possible that I'm wrong, and that the variable in fact
contains a reference to some unrelated type, Bar. To store a reference to an
object in an incompatible variable is against the .NET type safety rules. The C# compiler
therefore keeps me honest - before performing the assignment, it generates a
castclass opcode that checks that the object really is compatible with
Foo. If this check fails, it will throw an exception, preventing the assignment from
occurring, and maintaining the type safety of the system. (The CLR verifies code when it loads
it and will check that the necessary castclass operations are present whenever
such assignments are performed.)
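Here is a small sketch of those two explicit casts in action. The names ExplicitCastDemo, Narrow and DowncastSucceeds are invented for this example, Foo and Bar are empty placeholder classes, and the unchecked keyword is only there so the truncation behaves the same whatever overflow-checking option the project is compiled with:

```csharp
using System;

// Placeholder types, standing in for the Foo and Bar used in the text.
public class Foo { }
public class Bar { }

public static class ExplicitCastDemo
{
    // Numeric narrowing: the cast opts in to a conversion that may lose information.
    // Only the low 32 bits of the long survive.
    public static int Narrow(long value)
    {
        return unchecked((int) value);
    }

    // Downcast: the cast asks for a runtime type check.
    // If the object is not compatible with Foo, an InvalidCastException is thrown.
    public static bool DowncastSucceeds(object candidate)
    {
        try
        {
            Foo f = (Foo) candidate;
            // On success nothing was converted: f refers to the same object.
            return ReferenceEquals(f, candidate);
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }
}
```

Narrow(42L) gives back 42, while Narrow(4294967297L) (that is, 2^32 + 1) gives 1 because the upper bits are discarded. DowncastSucceeds(new Foo()) returns true and DowncastSucceeds(new Bar()) returns false.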
In other words, this cast has caused a runtime type check to be performed. That's a very different kind of a thing from the previous cast, which performed a numeric conversion. In fact there are several different things that the cast operator might mean. Depending on the context, it could mean any of the following:
Do nothing. (For example, casting an int to an int.) The C# compiler won't complain - it just ignores the cast.

Perform a runtime type check (the castclass opcode).

Specify scope. Sometimes it is necessary to specify the scope of a member as well as its name. For example, a class that implements a particular interface may choose not to place the methods for that interface in the class's public API - it can instead use explicit interface implementation. The methods of the interface are then usually only available when you access the object through a variable whose type is that interface. However, you can also use this syntax:
((IDisposable) someObj).Dispose();
The meaning of this cast depends on the type of someObj. If the
type of someObj is compatible with IDisposable, the
compiler will not generate a runtime type check (i.e. it will not generate the IL
castclass instruction) because the check is unnecessary. It will just
generate the appropriate IL to call the method. (In IL, method calls are always scoped
with the defining type. In this case it would appear as callvirt instance void IDisposable::Dispose().) If
the compiler cannot be sure that the object is compatible with IDisposable,
then this syntax causes a runtime type check - it then becomes equivalent to the
previous item in this list.

Unbox. Convert a reference of type object (or of any interface type) to a value type (such as int). A
cast in this context will cause the compiler to generate an unbox opcode.

Perform a numeric conversion (such as the long to int conversion shown earlier, which generates conv.i4).

Invoke a user-defined conversion operator. (These can even be declared implicit, meaning that even a simple assignment
can invoke the conversion function!)

This wide array of context-sensitive behaviours can lead to some confusing results. For example this compiles and runs just fine:
long longVal = 42L;
int intVal = (int) longVal;
but this almost identical code does not work - it compiles, but fails at runtime with an InvalidCastException:
object longVal = 42L;
int intVal = (int) longVal;
This tends to confuse beginners. They look at the second line and believe that they have
asked the compiler to change longVal into an int, and do not
understand why it doesn't work even though the first example does. The reason is that these
two examples invoke different cast behaviours. The first invokes the numeric conversion
behaviour, while the second invokes the unbox behaviour. Unboxing will never perform
numeric conversions - if you attempt to unbox to the wrong type, you simply get an
InvalidCastException. The 'right' thing to do here looks even more odd:
object longVal = 42L;
int intVal = (int) ((long) longVal);
Here the two casts are operating in different modes. The (long) cast
is performing an unbox, and the (int) cast is performing a numeric conversion
on the resulting unboxed long.
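The difference between the two modes can be checked with a short sketch. UnboxDemo, TryUnboxInt and UnboxThenConvert are hypothetical names chosen for this example:

```csharp
using System;

public static class UnboxDemo
{
    // A single (int) cast on an object is an unbox, not a numeric conversion.
    // It only succeeds if the boxed value really is an int; otherwise it throws.
    // Returns null when the unbox fails.
    public static int? TryUnboxInt(object boxed)
    {
        try
        {
            return (int) boxed;
        }
        catch (InvalidCastException)
        {
            return null;
        }
    }

    // Two casts, two meanings: (long) unboxes, then (int) converts numerically.
    public static int UnboxThenConvert(object boxedLong)
    {
        return (int) ((long) boxedLong);
    }
}
```

With object boxed = 42L, TryUnboxInt(boxed) returns null because the box holds a long, while UnboxThenConvert(boxed) returns 42. Passing a boxed int (object boxed = 42) to TryUnboxInt returns 42.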
So why on earth does C# pack so many different meanings into a single operator? Mainly because
that's what C++ does. (As does Java.) It's much worse in C++ because there are yet more things that a
cast might mean. In fact C++ introduced four new keywords into the language to replace the old-style
cast. (And even one of those can mean different things in different contexts! static_cast
can be used for disambiguation, unsafe downcasts (without a runtime type check) and numeric
conversions.) The fact that these new keywords were deemed necessary, and that the old-style
cast operator is discouraged in C++, makes it all the more odd that C# chose to stick with the old
C-style cast syntax and overload its meanings so heavily.
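To see how heavily the one syntax is overloaded, here is a rough sketch of two of the less obvious meanings from the list above: specifying scope for an explicitly implemented interface method, and invoking user-defined conversion operators. Resource, Celsius and CastMeaningsDemo are hypothetical types made up for this illustration:

```csharp
using System;

// Hypothetical class: Dispose is implemented explicitly, so it is not
// part of the class's public API.
public sealed class Resource : IDisposable
{
    public bool Disposed { get; private set; }

    void IDisposable.Dispose()
    {
        Disposed = true;
    }
}

// Hypothetical value type with user-defined conversions in both directions.
public struct Celsius
{
    public readonly double Degrees;

    public Celsius(double degrees)
    {
        Degrees = degrees;
    }

    // implicit: a plain assignment from double calls this method.
    public static implicit operator Celsius(double degrees)
    {
        return new Celsius(degrees);
    }

    // explicit: a cast to double calls this method.
    public static explicit operator double(Celsius c)
    {
        return c.Degrees;
    }
}

public static class CastMeaningsDemo
{
    public static bool DisposeThroughInterface()
    {
        Resource r = new Resource();
        // r.Dispose() would not compile here. The cast only selects the scope,
        // because the compiler already knows Resource implements IDisposable.
        ((IDisposable) r).Dispose();
        return r.Disposed;
    }

    public static double RoundTrip(double degrees)
    {
        Celsius c = degrees;   // no cast in sight, yet a conversion function runs
        return (double) c;     // looks like a numeric cast, but calls a method
    }
}
```

DisposeThroughInterface() returns true, and RoundTrip(21.5) returns 21.5. Neither cast here involves a runtime type check or a conv instruction, even though both look just like the casts shown earlier.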
*sigh*
But apart from that I really like C#. :-)