Dinesh Dinesh - 3 months ago 24
C++ Question

Decompress multiple files into one single file using Boost

I have a set of compressed files. I need to decompress all of them and write the result into one big file. The code below works, but I don't want to use std::stringstream, because the files are big and I don't want to create intermediate copies of their contents.

If I try to use

boost::iostreams::copy(inbuf, tempfile);
directly, it closes the file (tempfile) automatically. Is there a better way to copy the content? Or, at least, can I prevent this file from being closed automatically?

std::ofstream tempfile("/tmp/extmpfile", std::ios::binary);
for (std::set<std::string>::iterator it = files.begin(); it != files.end(); ++it)
{
    std::string filename(*it);
    std::ifstream gzfile(filename.c_str(), std::ios::binary);

    boost::iostreams::filtering_streambuf<boost::iostreams::input> inbuf;
    inbuf.push(boost::iostreams::gzip_decompressor());
    inbuf.push(gzfile);

    // closes tempfile automatically!!
    // boost::iostreams::copy(inbuf, tempfile);

    // current approach: buffers the whole decompressed file in memory first
    std::stringstream out;
    boost::iostreams::copy(inbuf, out);
    tempfile << out.str();
}
tempfile.close();

Answer

I know there are ways to let Boost IOStreams know it shouldn't close streams. I suppose that requires using boost::iostreams::stream<> instead of std::ostream, though.

My simple workaround, which appears to work, is to use a temporary std::ostream for each input file, all associated with a single std::filebuf object:

#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/filtering_streambuf.hpp>
#include <boost/iostreams/filter/gzip.hpp>
#include <fstream>
#include <ostream>
#include <set>
#include <string>

int main() {
    // A single filebuf owns the output file for the whole run.
    std::filebuf tempfilebuf;
    if (!tempfilebuf.open("/tmp/extmpfile", std::ios::binary | std::ios::out))
        return 1;

    std::set<std::string> files { "a.gz", "b.gz" };
    for (std::set<std::string>::iterator it = files.begin(); it != files.end(); ++it)
    {
        std::ifstream gzfile(it->c_str(), std::ios::binary);

        boost::iostreams::filtering_streambuf<boost::iostreams::input> inbuf;
        inbuf.push(boost::iostreams::gzip_decompressor());
        inbuf.push(gzfile);

        // Fresh std::ostream per input file on the shared buffer.
        // std::ostream has no close(), so the file itself stays open.
        std::ostream tempfile(&tempfilebuf);
        boost::iostreams::copy(inbuf, tempfile);
    }
    tempfilebuf.close();
}


With sample data created like this:

echo a > a
echo b > b
gzip a b

the program generates /tmp/extmpfile containing:

a
b
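
The workaround relies on a property of the standard library rather than of Boost: a std::ostream does not own the stream buffer it writes to, so destroying the stream leaves the underlying std::filebuf open. A minimal sketch of just that mechanism, without Boost or gzip, might look like the following. The helper names append_chunks and read_file are hypothetical and exist only for illustration.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <ostream>
#include <string>
#include <vector>

// Hypothetical helper: writes each chunk to `path` through its own
// short-lived std::ostream, all of them sharing one std::filebuf.
// Returns true if the filebuf stayed open after every write and
// was closed successfully at the end.
bool append_chunks(const std::string& path,
                   const std::vector<std::string>& chunks) {
    std::filebuf buf;
    if (!buf.open(path.c_str(), std::ios::binary | std::ios::out))
        return false;

    bool stayed_open = true;
    for (const std::string& chunk : chunks) {
        {
            std::ostream out(&buf);  // non-owning view of buf
            out << chunk;
        }  // out is destroyed here; buf is neither flushed nor closed
        stayed_open = stayed_open && buf.is_open();
    }
    // close() flushes pending output and returns nullptr on failure.
    return buf.close() != nullptr && stayed_open;
}

// Hypothetical helper: reads a whole file back into a string.
std::string read_file(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Calling append_chunks with the two chunks "a\n" and "b\n" and then reading the file back with read_file should yield both chunks concatenated in order, which mirrors what the Boost version does with the decompressed contents of a.gz and b.gz.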