kukuk1de - 1 month ago
R Question

Check existence of file in archive (zip)

I'm using unz() to read data from a file within a zip archive. This works well, but I have a lot of zip files and need to check whether a specific file exists in each archive. I could not get a working solution using if, exists, or else.

Does anyone know how to check whether a file exists in an archive without extracting the whole archive first?

Example:

read.table(unz("D:/Data/Test.zip", "data.csv"), sep = ";")[-1,]


This works well if data.csv exists, but gives an error if the file is not in the archive Test.zip:

Error in open.connection(file, "rt") : cannot open the connection
In addition: Warning message:
In open.connection(file, "rt") :
cannot locate file 'data.csv' in zip file 'D:/Data/Test.zip'


Any comments are welcome!

Answer

You could use unzip(file, list = TRUE)$Name to get the names of the files in the zip without having to unzip it. Then you can check to see if the files you need are in the list.

## character vector of all file names in the zip
fileNames <- unzip("D:/Data/Test.zip", list = TRUE)$Name

## check if any of those are 'data.csv' (or others)
check <- basename(fileNames) %in% "data.csv"

## extract only the matching files
if(any(check)) {
    unzip("D:/Data/Test.zip", files = fileNames[check], junkpaths = TRUE)
}

You could add another if() statement that calls unz() when exactly one file name matches, since reading the file through unz() is likely faster than running unzip() on a single file.
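A minimal sketch of that idea, assuming you want NULL back when the file is missing or when more than one entry matches. The helper name read_from_zip is hypothetical and not part of base R; only unzip(), unz(), basename() and read.table() are real functions here.

```r
## Hypothetical helper: read one file from a zip archive,
## or return NULL if it is absent or the name is ambiguous.
read_from_zip <- function(zipfile, target, ...) {
    ## list the archive contents without extracting anything
    fileNames <- unzip(zipfile, list = TRUE)$Name
    matches <- fileNames[basename(fileNames) %in% target]

    if (length(matches) == 1) {
        ## exactly one match: read it straight from the archive;
        ## read.table() opens and closes the unz() connection itself
        read.table(unz(zipfile, matches), ...)
    } else {
        ## zero or several matches: let the caller decide what to do
        NULL
    }
}

## usage with the paths from the question (commented out, the file may not exist)
## dat <- read_from_zip("D:/Data/Test.zip", "data.csv", sep = ";")
## if (!is.null(dat)) dat <- dat[-1, ]
```

Because the helper returns NULL instead of raising an error, it can be applied to many archives with lapply() and the missing ones filtered out afterwards.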
