Dmitrii Bundin - 1 month ago
C++ Question

Is a segmentation fault actually undefined behavior when we access a non-static data member?

I read the following rule and have been trying to write an example that reflects it.
The rule is from 3.8/5 N3797:


Before the lifetime of an object has started but after the storage
which the object will occupy has been allocated or, after the lifetime
of an object has ended and before the storage which the object
occupied is reused or released, any pointer that refers to the storage
location where the object will be or was located may be used but only
in limited ways. For an object under construction or destruction, see
12.7. Otherwise, such a pointer refers to allocated storage
(3.7.4.2), and using the pointer as if the pointer were of type void*
is well-defined. Indirection through such a pointer is permitted but
the resulting lvalue may only be used in limited ways, as described
below. The program has undefined behavior if:

[...]

— the pointer is used to access a non-static data member or call a
non-static member function of the object, or

[...]


The example I've written for it:

#include <iostream>
#include <typeinfo>

using std::cout;
using std::endl;

struct A
{
    int b = 5;
    static const int a = 5;
};

int main()
{
    A *p = (A*)0xa31a3442;
    cout << p->a; //1, Well-formed, there is no compile-time error
    cout << p->b; //2, Segmentation fault is produced
}


Is it true that the line marked //1 is well-formed and doesn't cause any UB, but the line marked //2 produces a segmentation fault, which is UB?

Answer

Undefined behavior means that anything can happen with a standard-conforming implementation. Really anything. (And your point 2 is UB.)

An implementation could

  • explode your computer and harm you physically
  • create a black hole which swallows the entire solar system
  • do nothing serious
  • light some LED on your keyboard
  • travel back in time and kill all your grandparents before the birth of your own parents
  • etc.

and be conforming (in the event of UB); read also about the more familiar idea of nasal demons.

So what happens on UB is not predictable and is not reproducible (in general).

More seriously, think a bit about what UB could mean in the computer connected to the ABS brakes of your car, or in some artificial heart, or driving some nuclear power plant.

In particular, it might work sometimes. Since most OSes have ASLR, your code has a tiny chance of working (e.g. if 0xa31a3442 happens to point to some valid location, such as on the stack), but you won't reproduce that on the next run!

UB is a way to give freedom to implementors (e.g. of compilers or of OSes) and to computers to do whatever they "want", in other words to not care about consequences. This enables, for example, clever optimizations or nice implementation tricks. But you should care (and the consequences are different if you are coding the embedded flight control system of an airplane, or just some hacky demo lighting LEDs with a Raspberry Pi, or a simple example for some C++ course running on Linux).

Recall that language standards don't even require any computer (or any hardware) in the implementation: you might "run" your C++ code with a team of human slaves, but that would be highly unethical (and costly, and unreliable).

See also here for more references.


(added in December 2015 & June 2016)

NB. The valgrind tool and various -fsanitize= debugging options for recent GCC or Clang/LLVM are quite useful. Also, enable all warnings and debug info in your compiler (e.g. g++ -Wall -Wextra -g), and use appropriate instrumentation options such as -fsanitize=undefined. Be aware that it is impossible to detect statically and exhaustively at compile time all cases of UB (that would be equivalent to the Halting Problem).

PS. The above answer is not specific to C++; it also applies to C!