FrozenHeart - 2 months ago
C++ Question

Will C++11 std::string::operator[] return null-terminated buffer

I have an object of the std::string class that I need to pass to a C function that operates on a char* buffer by iterating over it and searching for the null terminator.

So, I have something like this:

// C function
void foo(char* buf);

// C++ code
std::string str("str");
foo(&str[0]); // pass the string's internal buffer to the C function

Suppose that we use C++11, so we have a guarantee that the std::string representation will have contiguously stored characters.

But I wonder: is there any guarantee that &str[0] will point to a buffer that ends with \0? Yeah, there's the c_str() member function, but I'm talking about operator[].

Can somebody quote the standard please?


In practice, yes. There are exactly zero standards-conforming implementations of std::string that do not store a NUL character at the end of the buffer.

So if you aren't wondering for wondering's sake, you are done.

However, if you are wondering about the standard being abstruse:

In C++14, yes. There is a clear requirement that [] refer to a contiguous set of elements, that str[str.size()] refer to a NUL character, and that const methods may not modify state. So *((&str[0]) + str.size()) must be the same as str[str.size()], and str[str.size()] must be a NUL; thus, game over.

In C++11, almost certainly. There are rules that const methods may not modify state. There are guarantees that data() and c_str() return a null-terminated buffer that agrees with [] at each point.

A convoluted reading of the C++11 standard would state that, prior to any call of data() or c_str(), [size()] doesn't return the NUL terminator at the end of the buffer but rather a static const CharT that is stored separately, and the buffer has an uninitialized value (or even a trap value) where the NUL should be. Due to the requirement that const methods not modify state, I believe this reading is incorrect.

This reading requires &str[str.size()] to change across a call to .data(), which is an observable change in the state of the string over a const call, and which I would read as being illegal.

An alternative way to get around the standard might be to not initialize str[str.size()] until you legally access it via calling .data(), .c_str() or actually passing str.size() to operator[]. As there are no defined ways to access that element other than those 3 in the standard, you could stretch things and say lazy initialization of the NUL is legal.

I'd question this, as the definition of .data() implies that the elements returned by [] are contiguous, so &str[0] is the same address as .data(). Since .data() + .size() is guaranteed to point to a NUL CharT, so must (&str[0]) + .size(), and with no non-const methods called, the state of the std::string may not change between the calls.

But what if the compiler can look and see that you'll never call .data() or .c_str()? Does the requirement of contiguity still hold if it can be proven that you never call them?

At which point I'd throw my hands up and shoot the hostile compiler.

The standard is written in a very passive voice about this. So there may be a way to make an arguably standards-conforming std::string that doesn't follow these rules. But because the guarantees get closer and closer to explicitly requiring the NUL terminator to be there, the odds of a new compiler showing up that uses a tortured reading of C++ to claim this is standards compliant are low.