Nikolai Nikolai - 2 months ago
Python Question

Ignore missing file while downloading with Python urllib2

Issue: As the title states, I am downloading data via FTP from NOAA based on the year and the day. I have configured my script to go through a range of years and download data for each day. However, the script gets hung up on days where no file exists: it just keeps printing the same line over and over saying that the file does not exist. Without the time.sleep(5) the script prints to the log like crazy.

Solution: Somehow skip the missing day and move on to the next one. I have explored continue (maybe I am placing it in the wrong spot) and creating an empty file (not elegant, and it still will not move past the missing day). I am at a loss; what have I overlooked?

Here is the script:

##Working 24km
import urllib2
import time
import os
import os.path

flink = '{year}/ims{year}{day}_24km_v1.1.asc.gz'
days = [str(d).zfill(3) for d in range(1,365,1)]
years = range(1998,1999)
flinks = [flink.format(year=year,day=day) for year in years for day in days]

from urllib2 import Request, urlopen, URLError

for fname in flinks:
    dl = False
    while dl == False:
        try:
            # req = urllib2.Request(fname)
            req = urllib2.urlopen(fname)
            with open('/Users/username/Desktop/scripts_hpc/scratch/'+fname.split('/')[-1], 'w') as dfile:
                print 'file downloaded'
                dl = True
        except URLError, e:
            #print 'sleeping'
            print e.reason
            print 'skipping day: ', fname.split('/')[-1],' was not processed for ims'
            if not os.path.isfile(fname):
                f = open('/Users/username/Desktop/scripts_hpc/empty/'+fname.split('/')[-1], 'w')
                print 'day was skipped'

#everything is fine

Research: I have browsed through other questions and they get close, but don't seem to hit the nail on the head: "Ignore missing files Python ftplib", "How to skip over lines of a file if they are empty". Any help would be greatly appreciated!

Thank you!


In the except block, use pass instead of continue, since continue can only be used inside loops (for, while).

With that you won't need to handle the missing files: Python will just ignore the error and keep going.