meJustAndrew - 20 days ago
C# Question

What exactly is a reference in C#

From what I understand by now, I can say that a reference in C# is a kind of pointer to an object which has reference count and knows about the type compatibility. My question is not about how a value type is different than a reference type, but more about how a reference is implemented.

I have read this post about the differences between references and pointers, but it does not say much about what a reference is; it mostly describes its properties compared with a pointer in C++. I also understand the difference between passing by reference and passing by value (in C#, arguments are passed by value by default, even references), but I found it hard to say what a reference really is when I tried to explain to my colleagues why a parameter passed by reference cannot be stored inside a closure, as described in Eric Lippert's blog entry about the stack as an implementation detail.

Can somebody provide me with a complete, but hopefully simple explanation about what references really are in C# and a bit about how they are implemented?

Edit: this is not a duplicate. The "Reference type in C#" question explains how a reference works and how it differs from a value, but what I am asking is how a reference is defined at a low level.

Answer

From what I understand by now, I can say that a reference in C# is a kind of pointer to an object

If by "kind of" you mean "is conceptually similar to", yes. If you mean "could be implemented by", yes. If you mean "has the is-a-kind-of relationship to", as in "a string is a kind of object" then no. The C# type system does not have a subtyping relationship between reference types and pointer types.

which has reference count

Implementations of the CLR are permitted to use reference counting semantics but are not required to do so, and most do not.

and knows about the type compatibility.

I'm not sure what this means. Objects know their own actual type. References have a static type which is compatible with the actual type in verifiable code. Compatibility checking is implemented by the runtime's verifier when the IL is analyzed.
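
The difference between a reference's static type and its referent's actual type can be seen in a small sketch. The class and method names below are made up for illustration. Note that the `as` conversion shown here is checked at run time, which is a separate mechanism from the verifier's analysis of the IL.

```csharp
using System;

public static class StaticVersusActualType
{
    // The parameter's static type is object; GetType() reports the actual
    // type of whatever object the reference refers to.
    public static string ActualTypeName(object reference) => reference.GetType().Name;

    // A downcast attempt: "as" yields null when the actual type is not compatible.
    public static bool TryAsString(object reference, out string result)
    {
        result = reference as string;
        return result != null;
    }

    public static void Main()
    {
        object o = "hello";                                  // static type object, actual type string
        Console.WriteLine(ActualTypeName(o));                // String
        Console.WriteLine(TryAsString(o, out _));            // True
        Console.WriteLine(TryAsString(new object(), out _)); // False
    }
}
```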

My question is not about how a value type is different than a reference type, but more about how a reference is implemented.

How references are implemented is, not surprisingly, an implementation detail.

Can somebody provide me with a complete, but hopefully simple explanation about what references really are in C#

References are things that act as references are specified to act by the C# language specification. That is:

  • objects (of reference type) have identity independent from the values of their fields
  • any object may have a reference to it
  • such a reference is a value which may be passed around like any other value
  • equality comparison is implemented for those values
  • two references are equal if and only if they refer to the same object; that is, references reify object identity
  • there is a unique null reference which refers to no object and is unequal to any valid reference to an object
  • A static type is always known for any reference value, including the null reference
  • If the reference is non-null then the static type of the reference is always compatible with the actual type of the referent. So for example, if we have a reference to a string, the static type of the reference could be string or object or IEnumerable, but it cannot be Giraffe. (Obviously if the reference is null then there is no referent to have a type.)
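
A few of these rules can be observed directly. The following is a minimal sketch; `Point` and `ReferenceIdentityDemo` are hypothetical names used only for this illustration.

```csharp
using System;

public sealed class Point
{
    public int X;
    public Point(int x) { X = x; }
}

public static class ReferenceIdentityDemo
{
    // True only when both references refer to the very same object (or both are null).
    public static bool SameObject(object a, object b) => object.ReferenceEquals(a, b);

    public static void Main()
    {
        var p = new Point(1);
        var q = new Point(1);   // same field values, but a different object
        var r = p;              // a copy of the reference, so the same object

        Console.WriteLine(SameObject(p, q));    // False: identity is independent of field values
        Console.WriteLine(SameObject(p, r));    // True

        r.X = 42;               // mutate through one reference...
        Console.WriteLine(p.X); // 42: ...and observe it through the other

        Point n = null;         // the null reference still has a static type (Point)
        Console.WriteLine(SameObject(n, null)); // True
        Console.WriteLine(SameObject(n, p));    // False

        // Static types that are compatible with an actual string:
        object o = "text";
        System.Collections.IEnumerable e = "text";
        Console.WriteLine(o.GetType() == e.GetType()); // True: both referents are strings
    }
}
```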

There are probably a few rules that I've missed, but that gets across the idea. References are anything that behaves like a reference. That's what you should be concentrating on. References are a useful abstraction because they are the abstraction which enables object identity independent of object value.

and a bit about how they are implemented?

In practice, objects of reference type in C# are implemented as blocks of memory which begin with a small header that contains information about the object, and references are implemented as pointers to that block. This simple scheme is then made more complicated by the fact that we have a multigenerational mark-and-sweep compacting collector; it must somehow know the graph of references so that it can move objects around in memory when compacting the heap, without losing track of referential identity.

As an exercise you might consider how you would implement such a scheme. It builds character to try to figure out how you would build a system where references are pointers and objects can move in memory. How would you do it?
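
One possible way to approach the exercise is sketched below. It is a toy model, not how the CLR actually works: "addresses" are array indices, -1 stands in for null, each object has a single reference field, and all names (`ToyObject`, `ToyHeap`, `Collect`) are invented for the illustration. The point it tries to show is that if the collector knows every place a reference is stored, it can move objects and then rewrite those references so that identity is preserved.

```csharp
using System;
using System.Collections.Generic;

// A toy object: a name plus one reference field holding an "address".
public sealed class ToyObject
{
    public string Name;
    public int Next = -1;   // -1 plays the role of the null reference
    public ToyObject(string name) { Name = name; }
}

public sealed class ToyHeap
{
    public ToyObject[] Slots;
    public List<int> Roots = new List<int>();   // references held outside the heap
    private int _top;                           // next free address

    public ToyHeap(int size) { Slots = new ToyObject[size]; }

    public int Allocate(string name)
    {
        Slots[_top] = new ToyObject(name);
        return _top++;
    }

    public void Collect()
    {
        // Mark: walk the reference graph starting from the roots.
        var live = new bool[Slots.Length];
        foreach (int root in Roots)
            for (int a = root; a != -1 && !live[a]; a = Slots[a].Next)
                live[a] = true;

        // Compact: slide live objects toward address 0 and record where each one went.
        var forward = new int[Slots.Length];
        int next = 0;
        for (int old = 0; old < _top; old++)
        {
            if (!live[old]) continue;
            forward[old] = next;
            Slots[next] = Slots[old];
            next++;
        }
        for (int i = next; i < _top; i++) Slots[i] = null;
        _top = next;

        // Fix up: rewrite every stored reference (roots and fields) to the new address.
        for (int i = 0; i < Roots.Count; i++) Roots[i] = forward[Roots[i]];
        for (int i = 0; i < _top; i++)
            if (Slots[i].Next != -1) Slots[i].Next = forward[Slots[i].Next];
    }

    public static void Main()
    {
        var heap = new ToyHeap(8);
        heap.Allocate("garbage");        // address 0, never reachable from a root
        int a = heap.Allocate("a");      // address 1
        int b = heap.Allocate("b");      // address 2
        heap.Slots[a].Next = b;
        heap.Roots.Add(a);

        heap.Collect();

        int movedA = heap.Roots[0];
        Console.WriteLine(movedA);                                   // 0: "a" moved down
        Console.WriteLine(heap.Slots[heap.Slots[movedA].Next].Name); // b: the link survived the move
    }
}
```

The design choice here is to update every reference in place after moving objects. An alternative would be an extra level of indirection (a handle table), which avoids rewriting references at the cost of a second lookup on every access.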