Faber Faber - 1 month ago 6
Python Question

Can't copy compressed files

I need to copy various files several times under a specific name, and I wanted to automate the process. This is my Python code:

number_of_copies = int(raw_input("enter number of copies "))

copy_number = 1

infile = raw_input("file to be copied ")
new_file = raw_input("What's the name of the new file?")
extension = ".fastq"
indata = open(infile)

file_to_copy = str(indata.read())

while copy_number < number_of_copies:
    copy = open(new_file + "-" + str(copy_number) + extension, 'w')
    copy.write(file_to_copy)
    copy_number = copy_number + 1

indata.close()
copy.close()


In this case I know the extension of my file, so I have hard-coded it and I just change that variable in the script as needed.
The script works fine with my .fastq files (basically text files), but as soon as I try it on a .fastq.gz file (compressed) the copy is only 1 KB, while the original is over 300 MB. I believe the problem is that the .gz file is compressed, but I don't know how to solve this. Any help is greatly appreciated.

P.S. Of course, when I try it with the .gz files I change the "extension" variable as well.

Thank you in advance!

Answer

As noted in the comments, using shutil is more efficient.
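
A minimal sketch of the shutil approach, assuming the same naming scheme as the question (new_file-1.fastq.gz, new_file-2.fastq.gz, ...). The function name `make_copies` and its parameters are hypothetical, chosen only for illustration. `shutil.copyfile` copies the bytes without interpreting them, so it should behave the same for compressed and plain-text files.

```python
import shutil


def make_copies(infile, new_file, extension, number_of_copies):
    """Copy infile to new_file-1<extension> ... new_file-N<extension>.

    Hypothetical helper; returns the list of paths it created.
    """
    created = []
    for copy_number in range(1, number_of_copies + 1):
        target = "{0}-{1}{2}".format(new_file, copy_number, extension)
        shutil.copyfile(infile, target)  # byte-for-byte copy, no text handling
        created.append(target)
    return created
```

For example, `make_copies("sample.fastq.gz", "run", ".fastq.gz", 3)` would write run-1.fastq.gz, run-2.fastq.gz and run-3.fastq.gz. Note that the loop in the question tests `copy_number < number_of_copies` and therefore writes one copy fewer than requested; the sketch counts from 1 to N inclusive.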

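The copy comes out wrong because the files are opened in text mode, and your OS does newline translation on text files. (On Windows, text mode can also stop reading at a Ctrl-Z byte, 0x1A, which would explain a 1 KB copy of a 300 MB file.) So to use the above code correctly on all files, you need to open them in binary mode, e.g.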

open(infile, 'rb') 

and

open(new_file + "-" + str(copy_number) + extension, 'wb')
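
Put together, a binary-mode version of the copy step might look like the sketch below. The helper name `copy_binary` and the 1 MiB chunk size are assumptions made for illustration, not part of the question's script. Reading in chunks avoids holding a 300 MB file in memory at once, and the `with` statement closes both files even if an error occurs.

```python
def copy_binary(infile, outfile, chunk_size=1024 * 1024):
    """Copy infile to outfile in binary mode, one chunk at a time.

    Hypothetical helper; returns the number of bytes written.
    """
    total = 0
    with open(infile, "rb") as src, open(outfile, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty read means end of file
                break
            dst.write(chunk)
            total += len(chunk)
    return total
```

Inside the question's loop this would be called as `copy_binary(infile, new_file + "-" + str(copy_number) + extension)`, once per copy.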

Here are the Python 2 docs for open. And this answer has a handy table of the standard file modes.
