frost frost - 2 months ago 8
C++ Question

String split into vector<char*> overwrites vector elements

Following the approach in http://stackoverflow.com/a/236803/6361644, I wrote the following code to split a string on whitespace and store each token as an element of a vector.

std::string line = "ls -l -a";
std::string cmd;
std::vector<char*> argv;
std::stringstream ss;
ss.str(line);
std::string tmp;
getline(ss, cmd, ' ');
argv.push_back( const_cast<char*>(cmd.c_str() ) );
while(getline(ss, tmp, ' '))
    argv.push_back( const_cast<char*>(tmp.c_str() ) );
argv.push_back(NULL);


Printing argv after this code gives

(gdb) print argv
$22 = std::vector of length 3, capacity 4 = {0x26014 "ls", 0x2602c "-a", 0x2602c "-a", 0x0}


I'm not sure why the second element is being overwritten. Any tips would be appreciated.

Answer

You're storing dangling pointers, and you needed a const_cast to do it, which is a warning sign: c_str() returns const char*, so that is the type to store, not char*.

In this (const-corrected) loop:

std::vector<const char*> argv;
// ...
while(getline(ss, tmp, ' '))
    argv.push_back(tmp.c_str());

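each call to getline overwrites tmp, which invalidates the pointer you stored in the previous iteration. The string may reuse its buffer (which is why two entries in your output share the address 0x2602c and both read "-a") or reallocate it; either way, any later access through a stored pointer is undefined behavior. The pointer from cmd.c_str() only appears to work because cmd is never modified afterwards.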

You have to take ownership of all the strings. You can do so by storing the full strings instead:

std::vector<std::string> argv;
// ...
while(getline(ss, tmp, ' '))
    argv.push_back(std::move(tmp));

And now argv actually owns all of its own resources.
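
If the end goal is a NULL-terminated char* array for an exec-style call such as POSIX execvp (an assumption, based on the name argv and the trailing NULL in the question), one possible sketch is to keep the strings in an owning vector and build the pointer array from it afterwards. The helper names split_args and make_argv are hypothetical, chosen for illustration only:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Split a line on single spaces; the returned vector owns every token.
std::vector<std::string> split_args(const std::string& line) {
    std::vector<std::string> tokens;
    std::stringstream ss(line);
    std::string tmp;
    while (std::getline(ss, tmp, ' ')) {
        if (!tmp.empty())            // skip empty tokens from repeated spaces
            tokens.push_back(tmp);   // copy, so tmp can be safely reused
    }
    return tokens;
}

// Build a NULL-terminated array of pointers into `tokens`.
// The pointers are valid only while `tokens` is alive and unmodified.
std::vector<char*> make_argv(std::vector<std::string>& tokens) {
    std::vector<char*> argv;
    argv.reserve(tokens.size() + 1);
    for (std::string& s : tokens)
        argv.push_back(&s[0]);       // mutable pointer, no const_cast needed
    argv.push_back(nullptr);
    return argv;
}
```

Calling split_args("ls -l -a") and then make_argv on the result gives four entries: pointers to "ls", "-l" and "-a", followed by a null pointer. The vector of strings must outlive every use of the pointer array, since the pointers refer to storage owned by those strings.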
