Roman Podlinov - 19 days ago
Python Question

How to download a big file in Python via FTP (with monitoring & reconnect)?

UPDATE #1

The code in the question works pretty well on a stable connection (such as a local network or intranet).

UPDATE #2

I implemented the FTPClient class with ftplib, which can:


  1. monitor download progress

  2. reconnect after a timeout or disconnect

  3. make several attempts to download a file

  4. show the current download speed.



After a reconnect it continues the download from the point where the connection dropped (if the FTP server supports it). For details see my answer below.
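The resume-and-retry behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the actual FTPClient code: the names `download_with_resume`, `connect`, `max_attempts` and `retry_delay` are hypothetical, and it assumes the server honours the REST command. The only ftplib feature it relies on is the `rest` argument of `retrbinary()`.

```python
import ftplib
import os
import time


def download_with_resume(connect, remote_name, local_path,
                         max_attempts=5, retry_delay=1.0):
    """Download remote_name to local_path, resuming after failures.

    `connect` is a hypothetical zero-argument callable that returns a
    connected, logged-in ftplib.FTP instance (or a compatible object).
    Returns True on success, False once all attempts are used up.
    """
    for attempt in range(1, max_attempts + 1):
        # Bytes already on disk tell us where to resume from.
        offset = os.path.getsize(local_path) if os.path.exists(local_path) else 0
        ftp = None
        try:
            ftp = connect()
            with open(local_path, 'ab') as f:
                # rest=offset makes ftplib send "REST <offset>" before RETR;
                # this only works if the server supports REST.
                ftp.retrbinary('RETR ' + remote_name, f.write,
                               rest=offset or None)
            return True
        except (OSError, EOFError, ftplib.error_temp):
            # Timeouts, dropped connections and temporary (4xx) replies.
            if attempt < max_attempts:
                time.sleep(retry_delay)
        finally:
            if ftp is not None:
                ftp.close()
    return False
```

Because the file is opened in append mode, a stale partial file left over from an earlier run would be treated as the start of the new download, so the caller should remove it first.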




Question

I have to implement a task in Python that downloads a batch of big files every day (0.3-1.5 GB per file, 200-300 files) via FTP and then processes them. I did it with ftplib, but from time to time it hangs and cannot complete the download for some files. To fix the issue I started experimenting with KEEPALIVE settings, but I still haven't gotten a good result:

    import ftplib
    import logging
    import os
    import socket
    from contextlib import closing

    # Body of a method: self.host, self.port, self.login, self.passwd and
    # self.storage come from the enclosing class.
    with closing(ftplib.FTP()) as ftp:
        try:
            ftp.connect(self.host, self.port, 30*60)  # 30 mins timeout
            # print(ftp.getwelcome())
            ftp.login(self.login, self.passwd)
            ftp.set_pasv(True)
            # Keepalive on the control connection. TCP_KEEPINTVL and
            # TCP_KEEPIDLE are not defined on every platform.
            ftp.sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
            ftp.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, 75)
            ftp.sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, 60)
            with open(local_filename, 'w+b') as f:
                res = ftp.retrbinary('RETR %s' % orig_filename, f.write)

            # Check only the reply code; the text after it varies by server.
            if not res.startswith('226'):
                logging.error('Download of file {0} is not complete.'.format(orig_filename))
                os.remove(local_filename)
                return None

            os.rename(local_filename, self.storage + filename + file_ext)
            ftp.rename(orig_filename, orig_filename + '.copied')

            return filename + file_ext

        except Exception:
            logging.exception('Error during download from FTP')
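The keepalive calls in the snippet above use constants that exist on Linux but may be missing on other platforms. A more portable variant might look like the sketch below. The helper name `enable_keepalive` and the `probes` value are assumptions added for illustration; the `idle` and `interval` defaults are the values from the snippet. Keepalive targets the control connection here because that connection sits idle while the data connection carries the file.

```python
import socket


def enable_keepalive(sock, idle=60, interval=75, probes=5):
    """Turn on TCP keepalive, applying only the options this platform has."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The TCP_KEEP* constants are platform-specific (present on Linux,
    # possibly missing elsewhere), so look each one up before using it.
    for name, value in (('TCP_KEEPIDLE', idle),
                        ('TCP_KEEPINTVL', interval),
                        ('TCP_KEEPCNT', probes)):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
```

With ftplib it would be called as `enable_keepalive(ftp.sock)` after `ftp.connect()` has succeeded.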


Details


  • Usually it takes 7-15 minutes to download a file.

  • The FTP server logs always show that the files were fully downloaded, but the client hangs. Not every time, but from time to time.



Questions


  • Could this be caused by a disconnect?

  • How can I monitor the download process and reconnect if the connection drops?


Answer

Because I couldn't find any good suggestions or code samples, I implemented my own solution. Many thanks to the Stack Overflow community for the ideas I used in my code. I put the code on GitHub (pyFTPclient) because of its size (~120 lines).

I tested the solution on a poor-quality network (including 3G mobile internet) and it worked fine for me. But of course it may have some bugs.

I would appreciate any comments or suggestions. Thank you in advance.
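For readers who only need the monitoring idea, here is a minimal sketch of a progress and speed monitor. It is not the pyFTPclient code: the class name `ProgressMonitor` and its methods are hypothetical, and the clock is injectable only so the class can be exercised without a real transfer. An instance is meant to be passed as the callback argument of ftplib's `retrbinary()`.

```python
import time


class ProgressMonitor:
    """Counts received bytes and reports speed; all names are illustrative."""

    def __init__(self, sink, clock=time.monotonic):
        self._sink = sink              # e.g. the write method of an open file
        self._clock = clock            # injectable so the class can be tested
        self._start = clock()
        self._last_data = self._start  # time the last chunk arrived
        self.received = 0              # total bytes seen so far

    def __call__(self, chunk):
        # Forward the chunk to the sink, then update the counters.
        self._sink(chunk)
        self.received += len(chunk)
        self._last_data = self._clock()

    def speed(self):
        """Average speed in bytes per second since the monitor was created."""
        elapsed = self._clock() - self._start
        return self.received / elapsed if elapsed > 0 else 0.0

    def stalled(self, max_idle):
        """True if no data has arrived for more than max_idle seconds."""
        return self._clock() - self._last_data > max_idle
```

Since `retrbinary()` blocks until the transfer ends, `stalled()` would have to be polled from a separate thread. A simpler option may be to pass a much shorter timeout than 30 minutes to `ftp.connect()`, so that a stalled read raises an exception and the download can be retried.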