SimplyKnownAsG SimplyKnownAsG - 2 months ago 11
C++ Question

C++ static unordered_map used in static method is not initialized

I have some code where a static method is called, and the static std::unordered_map within the same file is not initialized. I understand the static initialization between two compile units is "undefined" and there are many SO questions on the topic; however, when I use an std::vector the issue does not occur. Also, the code can execute, but I am confused as to why these specific compile orders do not work. So, my questions are:


  1. There is another SO question (which I've been unable to find!) about static initialization and dynamic initialization of static variables. Is this error due to std::unordered_map actually being a dynamic initialization?

  2. Is there a way to get this code to initialize the std::unordered_map as I expected? I'm actually trying to create a static library (.lib or .a). When I link the static library, it generally needs to come last, and so the error occurs.

  3. Are there any workarounds for this? One option I've thought of is to create both an std::vector and an std::unordered_map. Use the std::vector while the std::unordered_map is uninitialized (tracked via bool _map_is_initialized). Change the initialization of the std::unordered_map to be explicitly dynamic by calling a function which iterates over the values in the std::vector to produce the std::unordered_map.



Linux

g++ -std=c++1y -g -c thing.cpp
g++ -std=c++1y -g -c main.cpp
g++ -g main.o thing.o -o main
./main


This results in a Floating point exception (core dumped) error. Through gdb, I was able to figure out that hashtable_policy.h tries __num % __den; where __den==0. Also using gdb, it appears as though Thing::Things is uninitialized.

(gdb) break thing.cpp:12
(gdb) run
(gdb) print Thing::Things
No symbol "Things" in specified context.
(gdb) print thing
$1 = (Thing *) 0x618c20


Windows

cl /EHsc /Zi /c main.cpp
cl /EHsc /Zi /c thing.cpp
link /debug main.obj thing.obj
main


In my actual code, this resulted in a very clear segmentation fault; however, this example just opens a popup that says the application failed. ... I have not done better diagnostics.

Code


thing.cpp


#include<iostream>

#include "thing.hpp"

std::vector<Thing*> Thing::Before; // EDIT: added

std::unordered_map<std::string, Thing*> Thing::Things;

std::vector<Thing*> Thing::After; // EDIT: added

Thing::Thing(std::string name) : name(name) {

}

bool Thing::Register(Thing *thing) {
    std::cout << "no worries, vectors initialized..." << std::endl;
    Thing::Before.push_back(thing); // EDIT: added
    Thing::After.push_back(thing); // EDIT: added
    std::cout << "added to vectors, about to fail..." << std::endl;
    Thing::Things[thing->name] = thing;
    return true;
}


thing.hpp


#pragma once
#include <string>
#include <unordered_map>
#include <vector>

class Thing {
public:
    static std::vector<Thing*> Before; // EDIT: added

    static std::unordered_map<std::string, Thing*> Things;

    static std::vector<Thing*> After; // EDIT: added

    static bool Register(Thing* thing);

    std::string name;

    Thing(std::string name);
};

#define ADD_THING(thing_name) \
    static bool thing_name## _is_defined = Thing::Register(new Thing( #thing_name ));


main.cpp


#include "thing.hpp"
#include <iostream>

ADD_THING(obligatory);
ADD_THING(foo);
ADD_THING(bar);

int main(int argc, char* argv[]) {
    std::cout << "before loop" << std::endl;
    for (auto thing : Thing::Things) {
        std::cout << "thing.name: " << thing.first << std::endl;
    }
    return 0;
}



EDIT

If the order within a given compile unit is guaranteed, why do static std::vector<Thing*> Thing::Before and static std::vector<Thing*> Thing::After get initialized, but static std::unordered_map<std::string, Thing*> Thing::Things does not?

Answer

Static initialization is tricky. As this answer states, the standard provides some guarantees as to the order of initialization within a single "translation unit" (normally a .cpp source file), but none whatsoever concerning what order initializations in different translation units will follow.

When you added the Before and After vectors to the code, you observed that, unlike the calls to unordered_map::operator[], the calls to vector::push_back() did not crash the process, and you concluded that the objects were being initialized out of order within a single translation unit, contrary to the standard's guarantees. There is a hidden assumption there, namely that since push_back() did not cause a crash, the vector must have been initialized. This turns out not to be the case: calling a method on an object whose constructor has not yet run is undefined behavior, and it may well corrupt memory somewhere without necessarily causing a crash. One likely explanation for the difference is that objects with static storage duration are zero-initialized before any dynamic initialization takes place; a zeroed vector can happen to behave like an empty one, whereas a zeroed unordered_map has a bucket count of zero, which is consistent with the division by zero you saw in hashtable_policy.h. A better way of checking whether the constructor has been called is to run the code in a debugger and set breakpoints on the lines which contain the objects' definitions, for instance the definition of Before in thing.cpp. This will show that initialization occurs in the order the standard predicts.
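If a debugger is not at hand, the within-one-translation-unit guarantee can also be checked with a small logging type. The following is only a sketch: Tracer, InitLog and InitializedInDefinitionOrder are hypothetical names invented for this illustration and are not part of the question's code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns the log of constructor calls. The log is a
// function-local static, so it is constructed the first time it is needed,
// before any of the Tracer objects below writes to it.
std::vector<std::string>& InitLog() {
    static std::vector<std::string> log;
    return log;
}

// Hypothetical type whose constructor records when it runs.
struct Tracer {
    explicit Tracer(const char* label) {
        assert(label != nullptr);  // guard against misuse
        InitLog().push_back(label);
    }
};

// Three non-local objects with dynamic initialization, all defined in the
// same translation unit. They are initialized in order of definition.
Tracer before("Before");
Tracer things("Things");
Tracer after("After");

// Returns true if the constructors ran in definition order.
bool InitializedInDefinitionOrder() {
    const std::vector<std::string> expected = {"Before", "Things", "After"};
    return InitLog() == expected;
}
```

Calling InitializedInDefinitionOrder() from main should return true. Note that this says nothing about the order relative to objects defined in other translation units, which is exactly where the question's code goes wrong.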

The best option for avoiding the "fiasco", as described here, is "construct on first use". In the case of your example code, this would involve changing any direct use of Thing::Things, such as this line:

Thing::Things[thing->name] = thing;

to a call to a method, say Thing::GetThings(), which initializes the object and returns a reference to it. lcs' answer provides an example of this, but beware: although it solves the static initialization problem, using a function-local static object may introduce an even more pernicious problem: crashes on program exit due to the order in which static objects are destroyed. For that reason, allocating the object with the new keyword is preferred:

std::unordered_map<std::string, Thing*>& Thing::GetThings()
{
    static std::unordered_map<std::string, Thing*>* pThings =
        new std::unordered_map<std::string, Thing*>();
    return *pThings;
}

That instance will of course never be deleted, which feels an awful lot like a memory leak. But even if it weren't a pointer, destruction would only occur at program shutdown. So, unless the object's destructor performs some important function like flushing a file's contents to disk, the only difference that matters is that using a pointer avoids the possibility of a crash on exit.
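Putting the pieces together, here is a minimal single-file sketch of how the question's registration macro could work with construct on first use. It makes a few assumptions for illustration: the Before and After vectors are dropped, Register() goes through GetThings() instead of touching a static member, and the macro omits its trailing semicolon so that each use supplies its own.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>

class Thing {
public:
    explicit Thing(std::string name) : name(std::move(name)) {}

    // Construct on first use: the map is created the first time this function
    // is called, so it is ready no matter which static initializer runs first.
    // It is deliberately never deleted, which sidesteps destruction-order
    // problems at program exit.
    static std::unordered_map<std::string, Thing*>& GetThings() {
        static auto* things = new std::unordered_map<std::string, Thing*>();
        return *things;
    }

    // Registers a Thing under its name; always goes through GetThings().
    static bool Register(Thing* thing) {
        assert(thing != nullptr);
        GetThings()[thing->name] = thing;
        return true;
    }

    std::string name;
};

// Same idea as the question's macro, minus the trailing semicolon.
#define ADD_THING(thing_name) \
    static bool thing_name##_is_defined = Thing::Register(new Thing(#thing_name))

// These run during static initialization, possibly before the statics of
// other translation units, yet the map is guaranteed to exist by then.
ADD_THING(obligatory);
ADD_THING(foo);
ADD_THING(bar);
```

With this arrangement, a loop over Thing::GetThings() in main should see all three registered names, regardless of the order in which the object files are linked.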