Lazer Lazer - 3 months ago 13
C++ Question

Why can't a derived class pointer point to a base class object without casting?

I have seen a few Pet-and-Dog examples for this kind of basic question here and here, but they do not make sense to me. Here is why.

Suppose we have the following class structure

class Pet {};
class Dog : public Pet {};


then the following statement


a (Dog) is a (Pet)



might be true in real life, but is NOT true in C++, in my opinion. Just look at the logical representation of a Dog object; it looks like this:

[Image: diagram of the logical representation of a Dog object]

It is more appropriate to say


a (Dog) has a (Pet)



or


a (Pet) is a subset of (Dog)



which, if you notice, is the logical opposite of "a Dog is a Pet".




Now the problem is that #1 below is allowed while #2 is not:

Pet* p = new Dog; // [1] - allowed!
Dog* d = new Pet; // [2] - not allowed without explicit casting!


My understanding is that [1] should not be allowed without warnings because there is no way a pointer should be able to point to an object of its superset's type (a Dog object is a superset of Pet), simply because Pet does not know anything about the new members that Dog might have declared (the Dog - Pet subset in the diagram above).

[1] is the equivalent of an int* trying to point to a double object!

Very obviously, I am missing a key point here which would turn my whole reasoning upside down. Can you please tell me what it is?

I believe drawing parallels to real-world examples only complicates things. I would prefer to understand this in terms of technical details. Thanks!

Answer

Edit: Re-reading your question and my answer leads me to say this at the top:

Your understanding of "is a" in C++ (and of polymorphism in general) is wrong.

"A is a B" means that A has at least the properties of B, and possibly more, by definition.

This is compatible with your statements that a Dog has a Pet and that [the attributes of] a Pet is[are] a subset of [attributes] of Dog.


It's a matter of definition of polymorphism and inheritance. The diagrams you draw are aligned with the in-memory representation of instances of Pet and Dog, but are misleading in the way you interpret them.

Pet* p = new Dog;

The pointer p is defined to point to any Pet-compatible object, which in C++ is any subtype of Pet (note: Pet is a subtype of itself by definition). This guarantees that, when the object behind p is accessed, it contains whatever a Pet is expected to contain, and possibly more. The "possibly more" part is the Dog-specific part in your diagram. The way you draw your diagram lends itself to a misleading interpretation.

Think of the layout of class-specific members in memory:

Pet: [pet data]
Dog: [pet data][dog data]
Cat: [pet data][cat data]

Now, whatever Pet *p points to is required to have the [pet data] part and, optionally, anything else. From the above listing, Pet *p may point to any of the three. As long as you use Pet *p to access the objects, you may only access the [pet data], because you don't know what, if anything, comes afterwards. It's a contract that says "This is at least a Pet, maybe more."

Whatever Dog *d points to, must have the [pet data] and [dog data]. So the only object in memory it may point to, above, is the dog. Conversely, through Dog *d, you may access both [pet data] and [dog data]. Similar for the Cat.
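The layout above can be mirrored in a minimal sketch. The member names (age, tricks, lives) and the helpers age_of, dog_total and demo_layout are hypothetical, added only for illustration, and struct is used instead of class just to keep the members public. The point is that code working through a Pet* can touch only the [pet data] part, whichever of the three objects it is handed.

```cpp
#include <cassert>

// Hypothetical members standing in for [pet data], [dog data], [cat data].
struct Pet { int age = 1; };
struct Dog : Pet { int tricks = 5; };
struct Cat : Pet { int lives = 9; };

// Accepts anything that is at least a Pet; only [pet data] is visible here.
int age_of(const Pet* p) {
    return p->age;
    // return p->tricks;  // would not compile: Pet has no member 'tricks'
}

// A Dog* sees both [pet data] and [dog data].
int dog_total(const Dog* d) {
    return d->age + d->tricks;
}

// Usage: the same Pet* parameter accepts all three kinds of object.
void demo_layout() {
    Pet pet;
    Dog dog;
    Cat cat;
    dog.age = 3;

    assert(age_of(&pet) == 1);
    assert(age_of(&dog) == 3);   // Dog* converts to Pet* implicitly
    assert(age_of(&cat) == 1);   // so does Cat*
    assert(dog_total(&dog) == 8);
    // dog_total(&pet);          // would not compile: a Pet has no [dog data]
}
```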


Let's interpret the declarations you are confused about:

Pet* p = new Dog;  // [1] - allowed!
Dog* d = new Pet;  // [2] - not allowed without explicit casting!

"My understanding is that [1] should not be allowed without warnings because there is no way a pointer should be able to point to an object of its superset's type (Dog object is a superset of Pet) simply because Pet does not know anything about the new members that Dog might have declared (the Dog - Pet subset in the diagram above)."

The pointer p expects to find [pet data] at the location it points to. Since the right-hand side is a Dog, and every Dog object has [pet data] in front of its [dog data], pointing to an object of type Dog is perfectly okay.

The compiler doesn't know what else is behind the pointer, and this is why you cannot access [dog data] through p.

The declaration is allowed because the presence of [pet data] can be guaranteed by the compiler at compile-time. (this statement is obviously simplified from reality, to fit your problem description)
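One way to watch the compiler make this decision is to ask it directly with std::is_convertible from the standard <type_traits> header. The following is only a sketch: the helper as_pet is a hypothetical name, and the static_assert messages are mine.

```cpp
#include <type_traits>

class Pet {};
class Dog : public Pet {};

// [1]: Dog* converts to Pet* implicitly, because every Dog contains a Pet part.
static_assert(std::is_convertible<Dog*, Pet*>::value,
              "Dog* -> Pet* is an implicit conversion");

// [2]: Pet* does not convert to Dog* implicitly; an explicit cast is required.
static_assert(!std::is_convertible<Pet*, Dog*>::value,
              "Pet* -> Dog* needs an explicit cast");

// Hypothetical helper: the return statement relies on the implicit conversion.
Pet* as_pet(Dog* d) {
    return d;
}
```

Both checks are evaluated at compile time, so if either relationship did not hold, the file would simply fail to compile.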

"[1] is equivalent of an int* trying to point to a double object!"

There is no such subtype relationship between int and double as there is between Dog and Pet in C++. Try not to mix these into the discussion, because they are different: values of int and double convert to each other (both conversions are implicit in C++, although double to int can lose information), but pointers to them do not, and reading a double through an int* is undefined behavior. Just forget this comparison.
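As a small illustration of the difference, the sketch below contrasts the two pairs of types using std::is_base_of and std::is_convertible from <type_traits>. The static_assert messages are mine.

```cpp
#include <type_traits>

class Pet {};
class Dog : public Pet {};

// Dog derives from Pet, so a Dog* may stand in for a Pet*.
static_assert(std::is_base_of<Pet, Dog>::value, "Pet is a base of Dog");
static_assert(std::is_convertible<Dog*, Pet*>::value, "Dog* -> Pet*");

// An int value converts to a double value...
static_assert(std::is_convertible<int, double>::value, "int -> double");

// ...but int and double are unrelated types, so neither pointer converts
// to the other.
static_assert(!std::is_convertible<double*, int*>::value, "double* -> int*");
static_assert(!std::is_convertible<int*, double*>::value, "int* -> double*");
```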

As to [2]: the declaration states "d points to an object that has [pet data] and [dog data], possibly more." But you are allocating only [pet data], so the compiler tells you you cannot do this.

In fact, the compiler cannot guarantee whether this is okay and it refuses to compile. There are legitimate situations where the compiler refuses to compile, but you, the programmer, know better. That's what static_cast and dynamic_cast are for. The simplest example in our context is:

d = p; // won't compile
d = static_cast<Dog *>(p); // [3]
d = dynamic_cast<Dog *>(p); // [4]

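[3] always compiles, but leads to undefined behavior, and possibly hard-to-track bugs, if p does not really point to a Dog.
[4] returns NULL if p does not really point to a Dog. Note that dynamic_cast requires Pet to be polymorphic (to have at least one virtual function, such as a virtual destructor), which the empty Pet from the question is not.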

I warmly suggest trying these casts out to see what you get. With a plain Pet behind p, you will typically see garbage for [dog data] from the static_cast (strictly speaking, the behavior is undefined) and a NULL pointer from the dynamic_cast, assuming RTTI is enabled.
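A minimal sketch for trying this out safely follows. It makes two assumptions that are not in the question: Pet is given a virtual destructor, because dynamic_cast only works on polymorphic types, and Dog gets a hypothetical member tricks to stand in for [dog data]. The helper names are also made up. The unchecked cast is deliberately never applied to a plain Pet, since that would be undefined behavior.

```cpp
#include <cassert>

// Assumption: a virtual destructor makes Pet polymorphic, which dynamic_cast needs.
struct Pet { virtual ~Pet() = default; };
struct Dog : Pet { int tricks = 5; };  // hypothetical [dog data]

// [4] Checked downcast: yields nullptr when p does not point to a Dog.
Dog* checked_downcast(Pet* p) {
    return dynamic_cast<Dog*>(p);
}

// [3] Unchecked downcast: only valid when p is known to point to a Dog.
Dog* unchecked_downcast(Pet* p) {
    return static_cast<Dog*>(p);
}

void demo_casts() {
    Pet pet;
    Dog dog;

    assert(checked_downcast(&pet) == nullptr);      // not a Dog: rejected at run time
    assert(checked_downcast(&dog) == &dog);         // really a Dog: same object
    assert(unchecked_downcast(&dog)->tricks == 5);  // fine, p really is a Dog
    // unchecked_downcast(&pet) would be undefined behavior, so it is not run here.
}
```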