Jens - 2 months ago
C++ Question

Check if a stream ends with a newline

I want to check whether a stream (in practice an ifstream) ends with a newline. I have come up with this:

#include <cstdio>   // EOF
#include <istream>
#include <string>

bool StreamEndsWithNewline(std::basic_istream<char> & the_stream)
{
    if (the_stream.peek() == EOF) {
        the_stream.clear(); // clear flags set by peek()
        return false;       // an empty stream does not end with a newline
    }
    std::string line = "blah";
    while (std::getline(the_stream, line)) {
        // ...
    }
    return line.empty();
}


The idea is that if the last line of the stream ends with a \n character, the while loop will do one additional iteration (because eof has not been reached) in which the empty string will be assigned to the line argument.

The special case of an "empty" stream has to be treated separately.

It seems to work on Windows (VS2010). Can I rely on this in general?

Answer

tl;dr: Yes, this is guaranteed to work, unless the stream is initially empty.


There are two bits to consider: the fail bit and the eof bit. std::getline does, from [string.io]:

After constructing a sentry object, if the sentry converts to true, calls str.erase() and then extracts characters from is and appends them to str as if by calling str.append(1, c) [...] If the function extracts no characters, it calls is.setstate(ios::failbit)

And sentry does, from [istream::sentry]:

Effects: If is.good() is false, calls is.setstate(failbit). Otherwise, prepares for formatted or unformatted input. [...] If is.rdbuf()->sbumpc() or is.rdbuf()->sgetc() returns traits::eof(), the function calls setstate(failbit | eofbit)

So given all of that, let's walk through three examples:


Case 1: "hello\n". On the first call to getline(), the_stream.good() is true, so we extract characters up through the \n. The stream is still good(), and we enter the body of the loop with line set to "hello".

On the second call to getline(), the stream is still good(), so the sentry object converts to true and we call str.erase(). Attempting to extract further characters fails because the stream is exhausted, so no characters are extracted and the failbit is set (along with the eofbit). This causes the return value of getline() to convert to false, so we don't enter the body of the loop a second time. At the end of the loop, line is empty.


Case 2: "goodbye", no newline. On the first call to getline(), the_stream.good() is true, and we extract characters until we hit the end of the stream, which sets the eofbit. The failbit isn't set, because characters were extracted, so we still enter the body of the loop, with line set to "goodbye".

On the second call to getline(), the construction of the sentry object fails because is.good() is false (is.good() checks the eofbit as well as the failbit). Because of this failure, we never reach the first step of getline(), which calls str.erase(). The sentry also sets the failbit, so we again do not enter the body of the loop.

At the end of the loop, line is still "goodbye".
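
The two walk-throughs above can be reproduced with a small sketch. It assumes std::istringstream as a stand-in for the ifstream in the question; LoopResult and RunGetlineLoop are hypothetical names introduced only for this illustration.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper type for illustration: a snapshot of the stream state
// and of `line` once the getline() loop has finished.
struct LoopResult {
    int iterations;    // number of times the loop body ran
    bool eof;          // eofbit set after the loop
    bool fail;         // failbit set after the loop
    std::string line;  // contents of `line` after the loop
};

// Runs the same loop as in the question over an in-memory stream.
LoopResult RunGetlineLoop(const std::string& contents) {
    std::istringstream stream(contents);
    std::string line = "blah";  // non-empty sentinel, as in the question
    int iterations = 0;
    while (std::getline(stream, line)) {
        ++iterations;
    }
    return {iterations, stream.eof(), stream.fail(), line};
}
```

Calling RunGetlineLoop("hello\n") should report one iteration and an empty line, while RunGetlineLoop("goodbye") should report one iteration with line still "goodbye". In both cases the eofbit and the failbit are set once the loop has finished.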


Case 3: "". Here the stream is still good() before the first call, so the sentry succeeds and str.erase() is called. getline() then extracts no characters, so the failbit is set and the loop is never entered. line is therefore always empty, whatever it was initialized to. There are a couple of ways to differentiate this case from case 1:

  • You could, up front, peek() to see if the first character is traits::eof() before doing anything else.
  • You could count how many times you enter the loop and check that it's nonzero.
  • Initializing line to a non-empty sentinel value does not help on its own, since str.erase() clears it here too. A sentinel only survives if the stream is already in a non-good state before the first call to getline().