User-Defined Casts
Earlier, this chapter examined how you can convert
values between predefined data types. You saw that this is done
through a process of casting. You also saw
that C# allows two different types of casts: implicit and
explicit.
For an explicit cast, you explicitly mark the cast in your code by writing the
destination data type inside parentheses:
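The listing that belongs here is not reproduced in this text. As a minimal sketch (the variable names are purely illustrative), an explicit cast between two predefined types might look like this:

```csharp
using System;

class Program
{
    static void Main()
    {
        long total = 30000;        // value held in a 64-bit integer
        int count = (int)total;    // explicit cast: destination type in parentheses
        Console.WriteLine(count);  // 30000
    }
}
```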
For the predefined data types, explicit casts are
required where there is a risk that the cast might fail or some
data might be lost. The following are some examples:
- When converting from an int to a short, because the short might
  not be large enough to hold the value of the int.
- When converting from signed to unsigned data types, because the
  result will be incorrect if the signed variable holds a negative
  value.
- When converting from floating-point to integer data types, because
  the fractional part of the number will be lost.
- When converting from a nullable type to a non-nullable type,
  because a value of null will cause an exception.
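The four cases in the list can be observed directly. The following is a small sketch with made-up values; the unchecked keyword is written out only to make the default behavior explicit.

```csharp
using System;

class Program
{
    static void Main()
    {
        int big = 100000;
        short narrowed = unchecked((short)big);          // high-order bits are discarded
        Console.WriteLine(narrowed);                     // -31072, not 100000

        int negative = -1;
        uint reinterpreted = unchecked((uint)negative);  // bit pattern is reinterpreted
        Console.WriteLine(reinterpreted);                // 4294967295

        double price = 9.99;
        int whole = (int)price;                          // fractional part is dropped
        Console.WriteLine(whole);                        // 9

        int? maybe = null;
        try
        {
            int definite = (int)maybe;                   // there is no value to unwrap
            Console.WriteLine(definite);
        }
        catch (InvalidOperationException)
        {
            Console.WriteLine("Casting null to int failed");
        }
    }
}
```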
By requiring you to make the cast explicit in your code, C# forces
you to affirm that you understand there is a risk of data loss, and
therefore presumably that you have written your code to take this into
account.
Because C# allows you to define your own data types
(structs and classes), it follows that you will need the facility
to support casts to and from those data types. The mechanism is
that you can define a cast as a member operator of one of the
relevant classes. Your cast operator must be marked as either
implicit or explicit to indicate how you are intending it to be
used. The expectation is that you follow the same guidelines as for
the predefined casts: if you know the cast is always safe whatever
the value held by the source variable, then you define it as
implicit. If, on the other hand, you
know there is a risk of something going wrong for certain values -
perhaps some loss of data or an exception being thrown - then you
should define the cast as explicit.
Important: You should define any custom casts you write
as explicit if there are any source data values for which the cast
will fail or if there is any risk of an exception being thrown.
The syntax for defining a cast is similar to that
for overloading operators discussed earlier in this chapter. This
is not a coincidence, because a cast is regarded as an operator
whose effect is to convert from the source type to the destination type. To illustrate the
syntax, the following is taken from an example struct named Currency,
which is introduced later in this section:
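The excerpt itself is missing from this text. A sketch of what such a declaration looks like, with the field types and the body of the operator assumed rather than quoted, is:

```csharp
using System;

struct Currency
{
    public uint Dollars;    // assumed field
    public ushort Cents;    // assumed field

    // public static     : required for every cast operator
    // implicit          : callers do not have to write the cast
    // float             : the destination type of the conversion
    // (Currency value)  : the source object being converted
    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }
}

class Program
{
    static void Main()
    {
        Currency price = new Currency();
        price.Dollars = 2;
        price.Cents = 50;
        float asFloat = price;                // the operator runs here
        Console.WriteLine(asFloat == 2.5f);   // True
    }
}
```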
The return type of the operator defines the target
type of the cast operation, and the single parameter is the source
object for the conversion. The cast defined here allows you to
implicitly convert the value of a Currency into a float.
Note that if a conversion has been declared as implicit, the compiler will permit its use either
implicitly or explicitly. If it has been declared as explicit, the compiler will only permit it to be
used explicitly. In common with other operator overloads, casts
must be declared as both public and
static.
Tip: C++ developers will notice that this is
different from what they are used to with C++, in which casts are
instance members of classes.
Implementing User-Defined Casts
This section illustrates the use of implicit
and explicit user-defined casts in an example called SimpleCurrency (which, as usual, is found in the
code download). In this example, you define a struct, Currency, which holds a non-negative USD ($) monetary
value. C# provides the decimal type for
this purpose, but it is possible you might still want to write your
own struct or class to represent monetary values if you want to
perform sophisticated financial processing and therefore want to
implement specific methods on such a class.
Tip: The syntax for casting is the same for
structs and classes. This example happens to be for a struct, but
would work just as well if you declared Currency as a class.
Initially, the definition of the Currency struct is:
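The original listing is not included in this text. The sketch below is one definition consistent with the rest of the section: the uint and ushort field types are inferred from the later discussion, while the constructor and the ToString() format are assumptions.

```csharp
using System;

struct Currency
{
    public uint Dollars;    // whole dollars
    public ushort Cents;    // cents, expected to be in the range 0-99

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    // Formats the amount as, for example, $50.35.
    public override string ToString()
    {
        return string.Format("${0}.{1:00}", Dollars, Cents);
    }
}

class Program
{
    static void Main()
    {
        Currency balance = new Currency(50, 35);
        Console.WriteLine(balance.ToString());   // $50.35
    }
}
```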
The use of unsigned data types for the Dollars and Cents fields
ensures that a Currency instance can
only hold non-negative values. It is restricted this way in order to
illustrate some points about explicit casts later on. You might want to use a
type like this to hold, for example, salary information for
employees of a company (people’s salaries tend not to be
negative!). To keep the struct simple, the fields are public, but
usually, you would make them private and
define corresponding properties for the dollars and cents.
Start off by assuming that you want to be able to
convert Currency instances to
float values, where the integer part of
the float represents the dollars. In
other words, you would like to be able to write code like this:
To be able to do this, you need to define a cast.
Hence, you add the following to your Currency definition:
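Neither the usage snippet nor the operator listing survives in this text. A minimal sketch of both together, assuming the struct layout described above and using illustrative values, could be:

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    // Implicit, because every Currency value has a float counterpart.
    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }
}

class Program
{
    static void Main()
    {
        Currency balance = new Currency(10, 50);
        float f = balance;           // implicit use: f is set to 10.5
        float g = (float)balance;    // explicit use of the same operator is also allowed
        Console.WriteLine(f == g);   // True
    }
}
```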
The cast above is implicit. It is a sensible choice
in this case, because, as should be clear from the definition of
Currency, any value that can be stored
in the currency can also be stored in a float. There’s no way that anything should ever go
wrong in this cast.
Tip: There is a slight cheat here - in fact, when
converting a uint to a float, there can be a loss in precision, but
Microsoft has deemed this error sufficiently marginal to count the
uint-to-float cast as implicit.
However, if you have a float that you would like to be converted to a
Currency, the conversion is not
guaranteed to work: a float can store
negative values, which Currency
instances can’t, and a float can store
numbers of a far higher magnitude than can be stored in the
(uint) Dollars
field of Currency. So if a float contains an inappropriate value, converting it
to a Currency could give unpredictable
results. As a result of this risk, the conversion from float to Currency should
be defined as explicit. Here is the first attempt, which won’t give
quite the correct results, but it is instructive to examine
why:
The following code will now successfully
compile:
However, the following code, if you tried it, would
generate a compilation error, because it attempts to use an
explicit cast implicitly:
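The three listings referred to above are missing from this text. The sketch below reconstructs their likely shape under the same assumptions about the struct as before: a first-attempt explicit operator that simply truncates, a use that compiles, and (commented out) a use that does not. The amount 45.63f is an illustrative value.

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    // First attempt: both casts truncate, and nothing here is checked.
    public static explicit operator Currency(float value)
    {
        uint dollars = (uint)value;
        ushort cents = (ushort)((value - dollars) * 100);
        return new Currency(dollars, cents);
    }
}

class Program
{
    static void Main()
    {
        float amount = 45.63f;
        Currency converted = (Currency)amount;   // compiles: the cast is written out
        // Currency wrong = amount;              // compile-time error: no implicit conversion
        Console.WriteLine(converted.Dollars);    // 45
    }
}
```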
By making the cast explicit, you warn the developer
to be careful because data loss might occur. However, as you will soon
see, this isn’t how you want your Currency struct to behave. Try writing a test
harness and running the sample. Here is the Main() method, which instantiates a Currency struct and attempts a few conversions. At
the start of this code, you write out the value of balance in two
different ways (because this will be needed to illustrate something
later on in the example):
Notice that the entire code is placed in a
try block to catch any exceptions that
occur during your casts. Also, the lines that test converting an
out-of-range value to Currency are
placed in a checked block in an attempt
to trap negative values. Running this code gives this output:
SimpleCurrency
This output shows that the code didn’t quite work
as expected. First, converting back from float to Currency gave a
wrong result of $50.34 instead of
$50.35. Second, no exception was
generated when you tried to convert an obviously out-of-range
value.
The first problem is caused by rounding errors. If
a cast is used to convert from a float
to a uint, the computer will truncate the number rather than rounding it. The computer stores numbers in binary
rather than decimal, and the fraction 0.35 cannot be exactly
represented as a binary fraction (just like 1/3 cannot be
represented exactly as a decimal fraction; it comes out as 0.3333
recurring). The computer ends up storing the nearest value that can be
represented exactly in binary format, which is very slightly lower
than 0.35. Multiply by 100 and you get a number fractionally less than
35, which gets truncated to 34 cents. Clearly in this situation,
such errors caused by truncation are serious, and the way to avoid
them is to ensure that some intelligent rounding is performed in
numerical conversions instead. Luckily, Microsoft has written a
class that will do this: System.Convert.
The System.Convert class contains a
large number of static methods to perform various numerical
conversions, and the one that we want is Convert.ToUInt16(). Note that the extra care taken
by the System.Convert methods does come
at a performance cost. You should only use them when you need
them.
Let’s examine why the expected overflow exception
wasn’t thrown. The problem here is this: the place where the
overflow really occurs isn’t actually in the Main() routine at all - it is inside the code for
the cast operator, which is called from the Main() method. The code in this method wasn’t marked
as checked.
The solution here is to ensure that the cast itself
is computed in a checked context too.
With both of these changes, the revised code for the conversion
looks like the following:
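The revised listing is not present in this text. A sketch that applies the two changes just described (a checked block inside the operator and Convert.ToUInt16() for the cents) might look like the following; the struct layout is the same assumption as before, and the sketch does not guard against the cents rounding up to 100.

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    public static explicit operator Currency(float value)
    {
        checked
        {
            uint dollars = (uint)value;   // OverflowException if out of range
            ushort cents = Convert.ToUInt16((value - dollars) * 100);   // rounds to nearest
            return new Currency(dollars, cents);
        }
    }

    public override string ToString()
    {
        return string.Format("${0}.{1:00}", Dollars, Cents);
    }
}

class Program
{
    static void Main()
    {
        Currency balance = (Currency)50.35f;
        Console.WriteLine(balance.ToString());   // $50.35

        try
        {
            balance = (Currency)(-50.5f);        // negative value: not representable
        }
        catch (OverflowException e)
        {
            Console.WriteLine("Exception occurred: " + e.Message);
        }
    }
}
```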
Note that you use Convert.ToUInt16() to calculate the cents, as
described earlier, but you do not use it for calculating the dollar
part of the amount. System.Convert is
not needed when working out the dollar amount because truncating
the float value is what you want
there.
Tip: It is worth noting that the System.Convert methods also carry out their own
overflow checking. Hence, for the particular case we are
considering, there is no need to place the call to Convert.ToUInt16() inside the checked context. The
checked context is still required, however, for the explicit
casting of value to dollars.
You won’t see a new set of results with this new
checked cast just yet, because you have
some more modifications to make to the SimpleCurrency example later in this section.
Tip: If you are defining a cast that will be used
very often, and for which performance is at an absolute premium,
you may prefer not to do any error checking. That’s also a
legitimate solution, provided that the behavior of your cast and
the lack of error checking are very clearly documented.
Casts between Classes
The Currency
example involves only casts that convert to or from float, one of the predefined data types. However,
it is not necessary to involve any of the simple data types. It is
perfectly legitimate to define casts to convert between instances
of different structs or classes that you have defined. You need to
be aware of a couple of restrictions, however:
- You cannot define a cast if one of the classes is derived from the
  other (these types of cast already exist, as you will see).
- The cast must be defined inside the definition of either the
  source or destination data type.
To illustrate these requirements, suppose that you
have the class hierarchy shown in Figure 6-1.
In other words, classes C and D are indirectly
derived from A. In this case, the only
legitimate user-defined cast between A,
B, C, or
D would be to convert between classes
C and D,
because these classes are not derived from each other. The code to
do so might look like this (assuming that you want the casts to be
explicit, which is usually the case when defining casts between
user-defined types):
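Neither the figure nor the listing is available in this text, so the following is only a sketch. It assumes a hierarchy in which B derives from A and both C and D derive from B, and it invents a Value field so that the casts have something to copy.

```csharp
using System;

class A { }
class B : A { }

class C : B
{
    public int Value;   // hypothetical data carried by the cast

    // Both casts are declared inside C here. Either could live in D
    // instead, but each conversion may be declared only once.
    public static explicit operator D(C source)
    {
        D result = new D();
        result.Value = source.Value;
        return result;
    }

    public static explicit operator C(D source)
    {
        C result = new C();
        result.Value = source.Value;
        return result;
    }
}

class D : B
{
    public int Value;   // hypothetical data carried by the cast
}

class Program
{
    static void Main()
    {
        C original = new C();
        original.Value = 7;
        D converted = (D)original;           // user-defined cast from C to D
        C roundTrip = (C)converted;          // user-defined cast from D to C
        Console.WriteLine(roundTrip.Value);  // 7
    }
}
```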
For each of these casts, you have a choice of where
you place the definitions - inside the class definition of
C or inside the class definition of
D, but not anywhere else. C# requires
you to put the definition of a cast inside either the source class
(or struct) or the destination class (or struct). A side effect of
this is that you can’t define a cast between two classes unless you
have access to edit the source code for at least one of them. This
is sensible because it prevents third parties from introducing
casts into your classes.
Once you have defined a cast inside one of the
classes, you can’t also define the same cast inside the other
class. Obviously, there should be only one cast for each conversion
- otherwise, the compiler wouldn’t know which one to pick.
Casts between Base and Derived Classes
To see how these casts work, start by
considering the case where the source and destination are both
reference types, and consider two classes, MyBase and MyDerived,
where MyDerived is derived directly or
indirectly from MyBase.
First, consider the cast from MyDerived to
MyBase. It is always possible (assuming
the constructors are available) to write:
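The snippet is missing here; a minimal sketch using empty placeholder classes is:

```csharp
using System;

class MyBase { }
class MyDerived : MyBase { }

class Program
{
    static void Main()
    {
        MyDerived derivedObject = new MyDerived();
        MyBase baseCopy = derivedObject;   // implicit: no cast syntax is needed

        // Both references point at the same object; nothing was copied.
        Console.WriteLine(object.ReferenceEquals(baseCopy, derivedObject));   // True
    }
}
```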
In this case, you are casting implicitly from
MyDerived to MyBase. This works because of the rule that any
reference to a type MyBase is allowed to
refer to objects of class MyBase or to
objects of anything derived from MyBase.
In OO programming, instances of a derived class are, in a real
sense, instances of the base class, plus something extra. All the
functions and fields defined on the base class are defined in the
derived class too.
Alternatively, you can write:
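Again the listing is not reproduced in this text. A sketch of a downcast that succeeds and one that fails, using the same placeholder classes, might be:

```csharp
using System;

class MyBase { }
class MyDerived : MyBase { }

class Program
{
    static void Main()
    {
        MyBase derivedObject = new MyDerived();   // really a MyDerived
        MyBase baseObject = new MyBase();         // only a MyBase

        MyDerived derivedCopy1 = (MyDerived)derivedObject;   // OK
        Console.WriteLine(derivedCopy1 != null);             // True

        try
        {
            MyDerived derivedCopy2 = (MyDerived)baseObject;  // fails at run time
            Console.WriteLine(derivedCopy2);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("baseObject is not a MyDerived");
        }
    }
}
```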
This code is perfectly legal C# (in a syntactic
sense, that is) and illustrates casting from a base class to a
derived class. However, the final statement will throw an exception
when executed. What happens when you perform the cast is that the
object being referred to is examined. Because a base class
reference can in principle refer to a derived class instance, it is
possible that this object is actually an instance of the derived
class that you are attempting to cast to. If that’s the case, the
cast succeeds, and the derived reference is set to refer to the
object. If, however, the object in question is not an instance of
the derived class (or of any class derived from it), the cast fails
and an InvalidCastException is thrown.
Notice that the casts the compiler has supplied,
which convert between base and derived class, do not actually do
any data conversion on the object in question. All they do is set
the new reference to refer to the object if it is legal for that
conversion to occur. To that extent, these casts are very different
in nature from the ones that you will normally define yourself. For
example, in the SimpleCurrency example
earlier, you defined casts that convert between a Currency struct and a float. In the float-to-Currency cast,
you actually instantiated a new Currency
struct and initialized it with the required values. The predefined
casts between base and derived classes do not do this. If you
actually want to convert a MyBase
instance into a real MyDerived object
with values based on the contents of the MyBase instance, you would not be able to use the
cast syntax to do this. The most sensible option is usually to
define a derived class constructor that takes a base class instance
as a parameter and have this constructor perform the relevant
initializations:
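The listing is missing from this text. As a sketch, with the Id and Label members invented purely for illustration, such a constructor could look like this:

```csharp
using System;

class MyBase
{
    public int Id;          // hypothetical base-class data
}

class MyDerived : MyBase
{
    public string Label;    // hypothetical extra data in the derived class

    // Builds a genuinely new MyDerived from the contents of a MyBase.
    public MyDerived(MyBase source)
    {
        Id = source.Id;
        Label = "created from base";
    }
}

class Program
{
    static void Main()
    {
        MyBase original = new MyBase();
        original.Id = 42;

        MyDerived converted = new MyDerived(original);
        Console.WriteLine(converted.Id);   // 42
    }
}
```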
Boxing and Unboxing Casts
The previous discussion focused on casting
between base and derived classes where both participants were
reference types. Similar principles apply when casting value types,
although in this case it is not possible to simply copy references
- some copying of data must take place.
It is not, of course, possible to derive from
structs or primitive value types. Casting between base and derived
structs invariably means casting between a primitive type or a
struct and System.Object (theoretically,
it is possible to cast between a struct and System.ValueType, though it is hard to see why you
would want to do this).
The cast from any struct (or primitive type) to
object is always available as an
implicit cast - because it is a cast from a derived to a base type
- and is just the familiar process of boxing. For example, with the
Currency struct:
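The snippet itself is not included in this text. A short sketch with illustrative values, assuming the struct layout used earlier, shows the implicit boxing cast and also that the box holds its own copy of the data:

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }
}

class Program
{
    static void Main()
    {
        Currency balance = new Currency(40, 0);
        object baseCopy = balance;                    // implicit cast: boxing
        Console.WriteLine(baseCopy.GetType().Name);   // Currency

        balance.Dollars = 99;                         // change the original struct
        Console.WriteLine(((Currency)baseCopy).Dollars);   // 40: the box is unaffected
    }
}
```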
When this implicit cast is executed, the contents
of balance are copied onto the heap into
a boxed object, and the baseCopy object
reference is set to this object. What actually happens behind the
scenes is this: When you originally defined the Currency struct, the .NET Framework implicitly
supplied another (hidden) class, a boxed Currency class, which contains all the same fields
as the Currency struct, but is a
reference type, stored on the heap. This happens whenever you
define a value type - whether it is a struct or enum, and
similar boxed reference types exist corresponding to all the
primitive value types of int,
double, uint,
and so on. It is not possible, or necessary, to gain direct
programmatic access to any of these boxed classes in source code,
but they are the objects that are working behind the scenes
whenever a value type is cast to object.
When you implicitly cast Currency to
object, a boxed Currency instance gets instantiated and initialized
with all the data from the Currency
struct. In the preceding code, it is this boxed Currency instance that baseCopy will refer to. By these means, it is
possible for casting from derived to base type to work
syntactically in the same way for value types as for reference
types.
Casting the other way is known as unboxing. Just as for casting between a base
reference type and a derived reference type, it is an explicit
cast, because an exception will be thrown if the object being cast
is not of the correct type:
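The listing is missing here as well. A sketch that mirrors the earlier reference-type example, with illustrative values and the assumed struct layout, is:

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }
}

class Program
{
    static void Main()
    {
        object derivedObject = new Currency(40, 0);   // a boxed Currency
        object baseObject = new object();             // not a boxed Currency

        Currency derivedCopy1 = (Currency)derivedObject;   // OK: unboxing
        Console.WriteLine(derivedCopy1.Dollars);           // 40

        try
        {
            Currency derivedCopy2 = (Currency)baseObject;  // fails at run time
            Console.WriteLine(derivedCopy2.Dollars);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("baseObject is not a boxed Currency");
        }
    }
}
```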
The code above works analogously to the similar
code presented earlier for reference types. Casting derivedObject to Currency
works fine because derivedObject
actually refers to a boxed Currency
instance - the cast will be performed by copying the fields out of
the boxed Currency object into a new
Currency struct. The second cast fails
because baseObject does not refer to a
boxed Currency object.
When using boxing and unboxing, it is important to
understand that both processes actually copy the data into the new boxed
or unboxed object. Hence, manipulations on the boxed object, for
example, will not affect the contents of the original value
type.
Multiple Casting
One thing you will have to watch for when you
are defining casts is that if the C# compiler is presented with a
situation in which no direct cast is available to perform a
requested conversion, it will attempt to find a way of combining
casts to do the conversion. For example, with the Currency struct, suppose the compiler encounters a
couple of lines of code like this:
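The lines in question are not reproduced in this text. The sketch below, which assumes a Currency struct that defines only the implicit cast to float, shows the kind of code being discussed:

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }
}

class Program
{
    static void Main()
    {
        Currency balance = new Currency(10, 50);
        long amount = (long)balance;   // Currency to float, then float to long
        double amountD = balance;      // Currency to float, then float to double
        Console.WriteLine(amount);             // 10
        Console.WriteLine(amountD == 10.5);    // True
    }
}
```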
You first initialize a Currency instance, and then you attempt to convert
it to a long. The trouble is that you
haven’t defined the cast to do that. However, this code will still
compile successfully. What will happen is that the compiler will
realize that you have defined an implicit cast to get from
Currency to float, and the compiler already knows how to
explicitly cast a float to a
long. Hence, it will compile that line
of code into IL code that converts balance first to a float,
and then converts that result to a long.
The same thing happens in the final line of the code, when you
convert balance to a double. However, because the cast from Currency to float and the
predefined cast from float to
double are both implicit, you can write
this conversion in your code as an implicit cast. If you
preferred, you could specify the casting route
explicitly:
However, in most cases, this would be seen as
needlessly complicating your code. The following code by contrast
would produce a compilation error:
The reason is that the best match for the
conversion that the compiler can find is still to convert first to
float then to long. The conversion from float to long needs to be
specified explicitly, though.
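As a sketch of the two cases just described, under the same assumptions about Currency, the explicit route can be spelled out in full, while the implicit assignment to a long is rejected (it is left commented out so that the example compiles):

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }
}

class Program
{
    static void Main()
    {
        Currency balance = new Currency(10, 50);

        long amount = (long)(float)balance;        // every step written out
        double amountD = (double)(float)balance;   // legal, but needlessly verbose

        // long wrong = balance;   // compile-time error: float to long must be explicit

        Console.WriteLine(amount);             // 10
        Console.WriteLine(amountD == 10.5);    // True
    }
}
```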
All this by itself shouldn’t give you too much
trouble. The rules are, after all, fairly intuitive and designed to
prevent any data loss from occurring without the developer knowing
about it. However, the problem is that if you are not careful when
you define your casts, it is possible for the compiler to figure
out a path that leads to unexpected results. For example, suppose
that it occurs to someone else in the group writing the
Currency struct that it would be useful
to be able to convert a uint containing
the total number of cents in an amount into a Currency (cents not dollars because the idea is not
to lose the fractions of a dollar). So, this cast might be written
to try to achieve this:
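The listing is not present in this text. A sketch of the ill-advised cast, with the struct layout assumed as before, could read:

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    // Interprets the integer as a number of cents.
    public static implicit operator Currency(uint value)
    {
        return new Currency(value / 100u, (ushort)(value % 100));
    }   // Don't do this!

    public override string ToString()
    {
        return string.Format("${0}.{1:00}", Dollars, Cents);
    }
}

class Program
{
    static void Main()
    {
        uint totalCents = 350;
        Currency balance = totalCents;           // uses the cents-based cast
        Console.WriteLine(balance.ToString());   // $3.50
    }
}
```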
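Note the u after the
first 100 in this code, which makes the divisor an unsigned literal so that value/100u is clearly a uint division. Had you written value/100, the compiler would still have converted the constant 100 to a uint,
but with an int variable as the divisor, both operands would have been promoted to long.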
A “Don’t do this” warning is clearly
commented in this code, and here’s why. Look at the following code
snippet; all it does is convert a uint
containing 350 into a Currency and back again. What do you think
bal2 will contain after executing
this?
The answer is not 350
but 3! And it all follows logically. You
convert 350 implicitly to a Currency, giving the result balance.Dollars = 3, balance.Cents = 50. Then the compiler does its usual
figuring out of the best path for the conversion back. The balance variable ends up getting implicitly converted to a
float (value 3.5), and this is converted explicitly to a
uint with value 3.
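The round trip can be reproduced with a sketch like the following, which assumes a Currency struct containing only the implicit cast to float and the cents-based cast from uint:

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }

    // The problematic cast: treats the integer as cents.
    public static implicit operator Currency(uint value)
    {
        return new Currency(value / 100u, (ushort)(value % 100));
    }
}

class Program
{
    static void Main()
    {
        uint bal = 350;
        Currency balance = bal;        // 350 cents becomes 3 dollars, 50 cents
        uint bal2 = (uint)balance;     // Currency to float (3.5), then float to uint
        Console.WriteLine(bal2);       // 3
    }
}
```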
Of course, other instances exist in which
converting to another data type and back again causes data loss.
For example, converting a float
containing 5.8 to an int and back to a float
again will lose the fractional part, giving you a result of
5, but there is a slight difference in
principle between losing the fractional part of a number and
dividing an integer by more than 100! Currency has suddenly become a rather dangerous
class that does strange things to integers!
The problem is that there is a conflict between how
your casts interpret integers. The casts between Currency and float
interpret an integer value of 1 as
corresponding to one dollar, but the latest uint-to-Currency cast
interprets this value as one cent. This is an example of very poor
design. If you want your classes to be easy to use, you should make
sure all your casts behave in a way that is mutually compatible, in
the sense that they intuitively give the same results. In this
case, the solution is obviously to rewrite the uint-to-Currency cast so
that it interprets an integer value of 1
as one dollar:
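The corrected listing is missing from this text. A sketch of a dollar-based version, under the same assumptions as before, is shown below; with it, the round trip through uint returns the original number.

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }

    // Consistent with the float casts: 1 means one dollar.
    public static implicit operator Currency(uint value)
    {
        return new Currency(value, 0);
    }
}

class Program
{
    static void Main()
    {
        uint bal = 350;
        Currency balance = bal;        // 350 dollars, 0 cents
        uint bal2 = (uint)balance;     // via float, as before
        Console.WriteLine(bal2);       // 350
    }
}
```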
Incidentally, you might wonder whether this new
cast is necessary at all. The answer is that it could be useful.
Without this cast, the only way for the compiler to carry out a
uint-to-Currency conversion would be via a float. Converting directly is a lot more efficient
in this case, so having this extra cast provides performance
benefits, but you need to make sure it gives the same result as you
would get going via a float, which you
have now done. In other situations, you may also find that
separately defining casts for different predefined data types
allows more conversions to be implicit rather than explicit, though
that’s not the case here.
A good test of whether your casts are compatible is
to ask whether a conversion will give the same results (other than
perhaps a loss of accuracy as in float-to-int
conversions), irrespective of which path it takes. The Currency class provides a good example of this. Look
at this code:
At present, there is only one way that the compiler
can achieve this conversion: by converting the Currency to a float
implicitly, then to a ulong explicitly.
The float-to-ulong conversion requires an explicit conversion,
but that’s fine because you have specified one here.
Suppose, however, that you then added another cast,
to convert implicitly from a Currency to
a uint. You will actually do this by
modifying the Currency struct by adding
the casts both to and from uint. This
code is available as the SimpleCurrency2
example:
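The SimpleCurrency2 listing is not reproduced in this text. The following sketch shows what the pair of uint casts might look like alongside the existing float cast, together with the ulong conversion discussed above; it is an approximation of the example, not the downloadable code.

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }

    public static implicit operator Currency(uint value)
    {
        return new Currency(value, 0);
    }

    // Drops the cents, just as truncating the float would.
    public static implicit operator uint(Currency value)
    {
        return value.Dollars;
    }
}

class Program
{
    static void Main()
    {
        Currency balance = new Currency(50, 35);
        ulong bal = (ulong)balance;   // both candidate routes yield 50
        uint whole = balance;         // implicit use of the new cast
        Console.WriteLine(bal);       // 50
        Console.WriteLine(whole);     // 50
    }
}
```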
Now the compiler has another possible route to
convert from Currency to ulong: to convert from Currency to uint
implicitly then to ulong implicitly.
Which of these two routes will it take? C# does have some precise
rules (which are not detailed here; if you are interested,
details are in the MSDN documentation) to say how the compiler
decides which is the best route if there are several possibilities.
The best answer is that you should design your casts so that all
routes give the same answer (other than possible loss of
precision), in which case it doesn’t really matter which one the
compiler picks. (As it happens in this case, the compiler picks the
Currency-to-uint-to-ulong route in
preference to Currency-to-float-to-ulong.)
To test the SimpleCurrency2 sample, add this code to the test
code for SimpleCurrency:
Running the sample now gives you these results:
SimpleCurrency2
The output shows that the conversion to
uint has been successful, though as
expected, you have lost the cents part of the Currency in making this conversion. Casting a
negative float to Currency has also produced the expected overflow
exception now that the float-to-Currency cast
itself defines a checked context.
However, the output also demonstrates one last
potential problem that you need to be aware of when working with
casts. The very first line of output has not displayed the balance
correctly, displaying 50 instead of
$50.35. Consider these lines:
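The lines are missing from this text. Based on the description that follows, they are probably close to the three Console.WriteLine() calls in this sketch, which assumes the SimpleCurrency2 casts and the ToString() format used earlier:

```csharp
using System;

struct Currency
{
    public uint Dollars;
    public ushort Cents;

    public Currency(uint dollars, ushort cents)
    {
        Dollars = dollars;
        Cents = cents;
    }

    public static implicit operator float(Currency value)
    {
        return value.Dollars + (value.Cents / 100.0f);
    }

    public static implicit operator uint(Currency value)
    {
        return value.Dollars;
    }

    public override string ToString()
    {
        return string.Format("${0}.{1:00}", Dollars, Cents);
    }
}

class Program
{
    static void Main()
    {
        Currency balance = new Currency(50, 35);

        // No overload takes a Currency, so the compiler picks WriteLine(uint).
        Console.WriteLine(balance);                     // 50

        // String concatenation calls ToString() implicitly.
        Console.WriteLine("balance is " + balance);     // balance is $50.35

        // ToString() is called explicitly.
        Console.WriteLine("balance is (using ToString()) " + balance.ToString());
    }
}
```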
Only the last two lines correctly display the
Currency as a string. So what’s going
on? The problem here is that when you combine casts with method
overloads, you get another source of unpredictability. We will look
at these lines in reverse order.
The third Console.WriteLine() statement explicitly calls the
Currency.ToString() method ensuring that
the Currency is displayed as a string.
The second does not do so. However, the string literal “balance is” passed to Console.WriteLine() makes it clear to the compiler
that the parameter is to be interpreted as a string. Hence, the
Currency.ToString() method will be
called implicitly.
The very first Console.WriteLine() call, however, simply passes a
raw Currency struct to Console.WriteLine(). Now, Console.WriteLine() has many overloads, but none of
them takes a Currency struct. So the
compiler will start fishing around to see what it can cast the
Currency to in order to make it match up
with one of the overloads of Console.WriteLine(). As it happens, one of the
Console.WriteLine() overloads is
designed to display uints quickly and
efficiently, and it takes a uint as a
parameter, and you have now supplied a cast that converts
Currency implicitly to uint.
In fact, Console.WriteLine() has another overload that takes
a double as a parameter and displays the
value of that double. If you look
closely at the output from the first SimpleCurrency example, you will find the very first
line of output displayed Currency as a
double, using this overload. In that
example, there wasn’t a direct cast from Currency to uint, so the
compiler picked Currency-to-float-to-double as its
preferred way of matching up the available casts to the available
Console.WriteLine() overloads. However,
now that there is a direct cast to uint
available in SimpleCurrency2, the
compiler has opted for this route.
The upshot of this is that if you have a
method that has several overloads, and you attempt to pass
it a parameter whose data type doesn’t match any of the overloads
exactly, then you are forcing the compiler to decide not only what
casts to use to perform the data conversion, but which overload,
and hence which data conversion, to pick. The compiler always works
logically and according to strict rules, but the results may not be
what you expected. If there is any doubt, you are better off
specifying which cast to use explicitly.