SpicyClubSauce - 2 months ago
Python Question

understanding multiple python 'with open' file functions

i'm having a difficult time understanding what the second 'with open' function does here.

so, in the first 'with open' part, we've essentially said out = open(save_as_file, 'wb+'), right? (still new to using 'with open'). we later write to it and then 'with open' automatically closes the 'out' file. That part i get - we're writing this response object from Requests as a binary in a specified save_as_file location until we hit the 81920th character aka our buffer #.

what's going on in the second 'with open'? breaking it down the same way as above, it's pretty much fp = open(save_as_file, 'r'), right? What does that make fp, which was already assigned the request response object earlier? We're just opening the save_as_file to use it for reading but not reading or extracting anything from it, so I don't see the reason for it. If someone could explain in english just what's taking place and the purpose of the second 'with open' part, that would be much appreciated.

(don't worry about the load_from_file function at the end, that's just another function under the class)

def load_from_url(self, url, save_as_file=None):

    fp = requests.get(url, stream=True,
                      headers={'Accept-Encoding': None}).raw

    if save_as_file is None:
        return self.load_from_file(fp)

    else:
        with open(save_as_file, 'wb+') as out:
            while True:
                buffer = fp.read(81920)
                if not buffer:
                    break
                out.write(buffer)
        with open(save_as_file) as fp:
            return self.load_from_file(fp)

Answer

You are correct that the second with statement opens the file for reading.

What happens is this:

  1. Load the response from the URL
  2. If save_as_file is None:
    1. Call load_from_file on the response and return the result
  3. Else:
    1. Store the contents of the response to save_as_file
    2. Call load_from_file on the contents of the file and return the result

So essentially, if save_as_file is set, it stores the response body in a file, reads that file back, processes it and returns the processed result. Otherwise it just processes the response body directly and returns the result.
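
To make the flow concrete, here is a minimal runnable sketch of the same logic. It rests on a few assumptions: io.BytesIO stands in for the raw Requests response, load_from_file is a hypothetical stand-in that simply reads everything, and the file is read back in binary mode ('rb') rather than the default text mode used in the question. Note that the loop does not stop at the 81920th byte; it reads up to 81920 bytes per pass and keeps going until read() returns an empty result.

```python
import io
import os
import tempfile

CHUNK_SIZE = 81920  # bytes read per call, same number as in the question


def load_from_file(fp):
    # Hypothetical stand-in for the class's load_from_file:
    # it just reads everything from the file-like object.
    return fp.read()


def load_from_stream(fp, save_as_file=None):
    if save_as_file is None:
        # No file requested: process the stream directly.
        return load_from_file(fp)

    # First 'with': copy the stream to disk, one chunk at a time.
    with open(save_as_file, 'wb') as out:
        while True:
            buffer = fp.read(CHUNK_SIZE)
            if not buffer:  # empty bytes means the stream is exhausted
                break
            out.write(buffer)

    # Second 'with': reopen the saved file for reading and process it.
    with open(save_as_file, 'rb') as fp:
        return load_from_file(fp)


# io.BytesIO plays the role of the raw response here (an assumption).
payload = b"x" * 200000  # larger than one chunk, so the loop runs 3 times
path = os.path.join(tempfile.mkdtemp(), "body.bin")
result = load_from_stream(io.BytesIO(payload), save_as_file=path)
```

The second 'with' is what actually hands the saved data to load_from_file; without it, the function would write the file and return nothing.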

The way it is implemented here is likely because load_from_file expects a file-like object and the easiest way the programmer saw of obtaining that was to read the file back.

It could instead be done by keeping the response body in memory and wrapping it in a file-like object, using io.BytesIO from Python 3's io module (or Python 2's StringIO), thereby avoiding the need to read the file back from disk.
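
A rough sketch of that in-memory alternative might look like the following. The function name load_in_memory and the stand-in load_from_file are hypothetical, and body is assumed to be the full response as bytes (with Requests this would typically come from the response's content attribute). The trade-off is that the whole body is held in memory at once, which may not suit very large downloads.

```python
import io
import os
import tempfile


def load_from_file(fp):
    # Hypothetical stand-in for the class's load_from_file.
    return fp.read()


def load_in_memory(body, save_as_file=None):
    # Optionally keep a copy on disk, as the original function does.
    if save_as_file is not None:
        with open(save_as_file, 'wb') as out:
            out.write(body)

    # Wrap the bytes in a file-like object instead of rereading the file.
    return load_from_file(io.BytesIO(body))


body = b"response body"  # assumed to be the downloaded bytes
path = os.path.join(tempfile.mkdtemp(), "body.bin")
result = load_in_memory(body, save_as_file=path)
```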

fp is reassigned in the second with statement in the same way as any other variable would be if you assigned it another value.
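
A small illustration of that rebinding, using io.BytesIO objects as stand-ins for the response and the reopened file (both are assumptions made for the sake of a self-contained example):

```python
import io

fp = io.BytesIO(b"first")   # stands in for the raw response object
first = fp                  # keep a second name so we can inspect it later

# 'as fp' binds the name fp to the newly opened object,
# exactly like a plain assignment would.
with io.BytesIO(b"second") as fp:
    inside = fp.read()

# After the block, fp still names the second object, which is now closed.
# The first object is untouched; it is simply no longer called fp.
```

In the question's code the response object is no longer needed by that point, because its contents have already been copied to the file, so reusing the name is harmless.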