FelisPhasma - 1 year ago 128
C++ Question

Quickly parse tab-separated strings and ints in C++

I have a file that is a couple of gigabytes in size and has millions of lines. Each line has its data separated like so:

string TAB int TAB int TAB int NEWLINE

My previous attempts to read this file line by line were bottlenecked by the CPU rather than by my SSD's read speed.

How can I quickly parse a massive file line by line?


Note: The files can't be parsed into a vector all at once because they are too large.

In my original code I was parsing the data into a vector of structs like this:

#include <string>
#include <vector>

struct datastruct {
    std::string name;
    int year;
    int occurences;
    int volcount;
};
std::vector<datastruct> data;

Answer Source

Using your datastruct, you could read one record at a time:

#include <fstream>

std::ifstream file("data.tsv");  // placeholder path; substitute your own file
datastruct data;
while (file >> data.name >> data.year >> data.occurences >> data.volcount) {
    // do what you want with data; its contents will be replaced during the next iteration
}

Is that really too slow?
