syed saad - 4 months ago
C++ Question

Unusual behaviour of get() (reading from a file in c++)

// Print the last n lines of a file, i.e. implement your own tail command
#include <iostream>
#include <fstream>
#include <string>

int main()
{
    std::ifstream rd("D:\\BigFile.txt");
    int cnt = 0;
    char c;
    std::string data;
    rd.seekg(0, rd.end);
    int pos = rd.tellg();

    while (1)
    {
        rd.seekg(--pos, std::ios_base::beg);
        rd.get(c);
        if (c == '\n')
        {
            cnt++;
            // std::cout << pos << "\t" << rd.tellg() << "\n";
        }

        if (cnt == 10)
            break;
    }
    rd.seekg(pos + 1);
    while (std::getline(rd, data))
    {
        std::cout << data << "\n";
    }
}


So, I wrote this program to print the last 10 lines of a text file. However, it prints only the last 5. For some reason, every time it encounters an actual '\n', the next get() also gives a '\n', which leads to incorrect output. Here is my input file:

Hello
Trello
Capello
Morsello
Odello
Othello
HelloTrello
sdasd
qerrttt
mkoilll
qwertyfe


I am using Notepad on Windows, and this is my output:

HelloTrello
sdasd
qerrttt
mkoilll
qwertyfe


I can't figure out why this is happening. Please help.

Answer

Do not do arithmetic on file positions if the file is opened in text mode. It will not give you a correct result.

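If a file is opened in text mode, one character does not always mean one byte, and what a file position actually denotes (a specific character or a specific byte) is unspecified. The only positions you can reliably seek to are the beginning of the file and values previously returned by tellg().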
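If you want to stay in text mode, one way around this is to never compute a position yourself: read forward, remember what tellg() returned at the start of each line, and later seek only to one of those remembered values. Here is a minimal sketch of that idea. The function name tail_by_saved_positions is made up for illustration, and it scans the whole file, so it trades speed for portability.

```cpp
#include <cstddef>
#include <deque>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns up to the last n lines of a text-mode file.
// It seeks only to positions that tellg() itself returned.
std::vector<std::string> tail_by_saved_positions(const std::string& path, std::size_t n)
{
    std::vector<std::string> lines;
    std::ifstream rd(path);                 // text mode on purpose
    if (!rd || n == 0)
        return lines;

    std::deque<std::streampos> starts;      // start positions of the last n lines seen
    std::string line;
    while (true) {
        std::streampos here = rd.tellg();   // position before this line is read
        if (!std::getline(rd, line))
            break;
        starts.push_back(here);
        if (starts.size() > n)
            starts.pop_front();             // keep only the n most recent starts
    }
    if (starts.empty())
        return lines;

    rd.clear();                             // drop eofbit/failbit so seekg works again
    rd.seekg(starts.front());               // a saved value, no arithmetic involved
    while (lines.size() < n && std::getline(rd, line))
        lines.push_back(line);
    return lines;
}
```

Calling tail_by_saved_positions("D:\\BigFile.txt", 10) and printing each returned string should give the last ten lines, regardless of how the platform stores line endings.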

In your case the problem is that on Windows a newline is stored as two bytes: a carriage return followed by a line feed ("\r\n"). A text stream converts that pair into the single character '\n' so that you don't need to worry about differences between platforms or the actual byte sequence used.
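To see the two bytes for yourself, you can read the file with translation switched off. This is only a small sketch; raw_bytes is a hypothetical helper name, not something from the question.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: returns the file's bytes exactly as stored on disk.
// std::ios::binary disables the "\r\n" -> '\n' translation of text mode.
std::string raw_bytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

For a file that Notepad saved with the two lines Hello and Trello, the returned string should contain '\r' immediately followed by '\n' between them, so it is one byte longer than what a text-mode read on Windows hands you.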

So your first read returns the last byte of the two-byte line ending, which is '\n' itself. The next read lands at the beginning of the two-byte line ending, and the stream correctly converts the whole sequence into '\n'. Every line break is therefore counted twice, which is why you get 5 lines instead of 10.
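If you want to keep the backwards scan, open the file in binary mode, where one get() is one byte and position arithmetic is meaningful. Below is a sketch of how that could look. The name tail_lines and the decision to strip a trailing '\r' from each line are my own choices for illustration. It also stops at the start of the file instead of looping forever when there are fewer than n lines, and it does not count a newline that is the very last byte.

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: returns up to the last n lines of the file at path.
std::vector<std::string> tail_lines(const std::string& path, std::size_t n)
{
    std::vector<std::string> lines;
    std::ifstream rd(path, std::ios::binary);   // no newline translation
    if (!rd || n == 0)
        return lines;

    rd.seekg(0, std::ios::end);
    const std::streamoff end = rd.tellg();      // file size in bytes
    std::streamoff start = 0;                   // default: print the whole file
    std::size_t seen = 0;
    char c;

    // Walk backwards one byte at a time, counting '\n' bytes.
    for (std::streamoff i = end - 1; i >= 0; --i) {
        rd.seekg(i, std::ios::beg);
        if (!rd.get(c))
            break;
        // A '\n' that is the final byte only terminates the last line.
        if (c == '\n' && i != end - 1 && ++seen == n) {
            start = i + 1;                      // first byte of the n-th last line
            break;
        }
    }

    rd.clear();
    rd.seekg(start, std::ios::beg);
    std::string line;
    while (std::getline(rd, line)) {
        // In binary mode the '\r' of a "\r\n" ending stays in the line.
        if (!line.empty() && line.back() == '\r')
            line.pop_back();
        lines.push_back(line);
    }
    return lines;
}
```

With the input file from the question, tail_lines("D:\\BigFile.txt", 10) should return the ten lines from Trello through qwertyfe.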