zpontikas zpontikas - 7 months ago 33
Java Question

Read tar.gz in Java with Commons Compress

I want to read the contents of a tar.gz file (or an xz-compressed one, which should work the same way).
This is more or less what I am doing:

TarArchiveInputStream tarInput = new TarArchiveInputStream(new GzipCompressorInputStream(new FileInputStream("c://temp//test.tar.gz")));
TarArchiveEntry currentEntry = tarInput.getNextTarEntry();
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
while (currentEntry != null) {
    File f = currentEntry.getFile();
    br = new BufferedReader(new FileReader(f));
    System.out.println("For File = " + currentEntry.getName());
    String line;
    while ((line = br.readLine()) != null) {
        System.out.println("line=" + line);
    }
}
if (br != null) {
    br.close();
}


But I get null when I call the getFile method of TarArchiveEntry.

I am using Apache Commons Compress 1.8.1.

Answer

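You can't use getFile on a TarArchiveEntry that was read from an archive. That getter is only meaningful for the opposite operation, when you create an entry from a File on disk in order to write it into a tar file; for entries read from an archive it returns null.

Instead, read directly from the TarArchiveInputStream. After each call to getNextTarEntry, the stream is positioned at the content of that entry, decompresses it on the fly, and reports end-of-stream when the entry ends.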

For example (untested code, YMMV):

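import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

// try-with-resources closes tarInput, which also closes the wrapped gzip and file streams
try (TarArchiveInputStream tarInput = new TarArchiveInputStream(
        new GzipCompressorInputStream(new FileInputStream("c://temp//test.tar.gz")))) {
    TarArchiveEntry currentEntry = tarInput.getNextTarEntry();
    while (currentEntry != null) {
        // Read directly from tarInput; do not close br here, that would close tarInput too
        BufferedReader br = new BufferedReader(new InputStreamReader(tarInput));
        System.out.println("For File = " + currentEntry.getName());
        String line;
        while ((line = br.readLine()) != null) {
            System.out.println("line=" + line);
        }
        currentEntry = tarInput.getNextTarEntry(); // You forgot to iterate to the next file
    }
}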
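
If you want to try this without a file on disk, here is a minimal sketch that writes a small tar.gz into memory and reads it back, entry by entry, straight from the TarArchiveInputStream. The class name TarGzRoundTrip and the helpers writeTarGz and readTarGz are made up for illustration, and it assumes the entries are UTF-8 text; adapt as needed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;

public class TarGzRoundTrip {

    // Hypothetical helper: packs (entry name -> text) pairs into an in-memory tar.gz.
    static byte[] writeTarGz(Map<String, String> files) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (TarArchiveOutputStream tarOut =
                 new TarArchiveOutputStream(new GzipCompressorOutputStream(bytes))) {
            for (Map.Entry<String, String> file : files.entrySet()) {
                byte[] data = file.getValue().getBytes(StandardCharsets.UTF_8);
                TarArchiveEntry entry = new TarArchiveEntry(file.getKey());
                entry.setSize(data.length); // size must be set before the entry is written
                tarOut.putArchiveEntry(entry);
                tarOut.write(data);
                tarOut.closeArchiveEntry();
            }
        }
        return bytes.toByteArray();
    }

    // Hypothetical helper: reads every regular entry's content as UTF-8 text.
    static Map<String, String> readTarGz(InputStream in) throws IOException {
        Map<String, String> contents = new LinkedHashMap<>();
        try (TarArchiveInputStream tarIn =
                 new TarArchiveInputStream(new GzipCompressorInputStream(in))) {
            TarArchiveEntry entry;
            while ((entry = tarIn.getNextTarEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content
                }
                // entry.getFile() is null here; the content comes from tarIn itself
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int n;
                while ((n = tarIn.read(chunk)) != -1) { // -1 marks the end of this entry
                    buf.write(chunk, 0, n);
                }
                contents.put(entry.getName(),
                        new String(buf.toByteArray(), StandardCharsets.UTF_8));
            }
        }
        return contents;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> files = new LinkedHashMap<>();
        files.put("a.txt", "hello\nworld\n");
        files.put("dir/b.txt", "second file");

        byte[] archive = writeTarGz(files);
        for (Map.Entry<String, String> e : readTarGz(new ByteArrayInputStream(archive)).entrySet()) {
            System.out.println("For File = " + e.getKey());
            System.out.println(e.getValue());
        }
    }
}
```

To read a real archive, pass a FileInputStream for your tar.gz to readTarGz instead of the ByteArrayInputStream. Collecting each entry into a String is only reasonable for small text files; for large entries, process the stream in chunks instead.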