Anirudh Ganesh - 3 months ago
Python Question

Does read(size) have a built-in pointer?

I found this code here, which monitors the progress of a download:

import urllib2

url = "http://download.thinkbroadband.com/10MB.zip"

file_name = url.split('/')[-1]
u = urllib2.urlopen(url)
f = open(file_name, 'wb')
meta = u.info()
file_size = int(meta.getheaders("Content-Length")[0])
print "Downloading: %s Bytes: %s" % (file_name, file_size)

file_size_dl = 0
block_sz = 8192
while True:
    buffer = u.read(block_sz)
    if not buffer:
        break

    file_size_dl += len(buffer)
    f.write(buffer)
    status = r"%10d [%3.2f%%]" % (file_size_dl, file_size_dl * 100. / file_size)
    status = status + chr(8)*(len(status)+1)
    print status,

f.close()


I do not see the block size being modified at any point in the while loop, so, to me, buffer = u.read(block_sz) should keep reading the same thing over and over again.

We know this doesn't happen. Is it because read() in the while loop has a built-in pointer that starts reading from where it left off last time?

What about write()? Does it keep appending after where it last left off, even though the file is not opened in append mode?

Answer

File objects, network sockets, and other forms of I/O communication are data streams. Reading from them always produces the next section of data; calling .read() with a buffer size does not restart the stream from the beginning.

So yes, there is a virtual 'position' in streams where you are reading from.
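To make the idea of a stream position concrete, here is a minimal sketch in Python 3 (the question's code is Python 2, so this is an adaptation, not the original). It uses an in-memory io.BytesIO object as a stand-in for the HTTP response; the helper name read_in_blocks and the sample bytes are made up for illustration.

```python
import io

# An in-memory byte stream standing in for the HTTP response (illustrative only).
stream = io.BytesIO(b"abcdefghij")


def read_in_blocks(source, block_size):
    """Collect successive chunks until read() returns an empty bytes object."""
    chunks = []
    while True:
        chunk = source.read(block_size)  # same argument every time
        if not chunk:                    # empty result means end of stream
            break
        chunks.append(chunk)
    return chunks


chunks = read_in_blocks(stream, 4)
print(chunks)  # [b'abcd', b'efgh', b'ij']
```

The block size never changes, yet each call returns the next slice, because the stream itself remembers how far it has been read.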

For files, that position is called the file pointer, and this pointer advances for both reading and writing. You can alter the position by seeking (seek()), or by simply re-opening the file. You can also ask a file object to tell you the current position (tell()).
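The following sketch, written in Python 3 and using a throwaway temporary file (the path and contents are assumptions for illustration), shows the file pointer moving during writes and reads. It also covers the write() part of the question: 'wb' truncates the file when it is opened and starts at position 0, and each write() then continues from the current position.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo.bin")

with open(path, "wb") as f:       # 'wb' truncates on open; position starts at 0
    f.write(b"hello ")
    after_first = f.tell()        # position after the first write
    f.write(b"world")             # continues where the first write ended
    after_second = f.tell()

with open(path, "rb") as f:
    first = f.read(5)             # reads from position 0
    pos = f.tell()                # position is now 5
    f.seek(0)                     # move the pointer back to the start
    again = f.read(5)             # same bytes as the first read
    f.seek(6)                     # jump past "hello "
    rest = f.read()               # everything from position 6 to the end

print(after_first, after_second, first, pos, again, rest)
```

Two consecutive writes end up side by side in the file without append mode, since the pointer moved forward after the first one.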

For network sockets, however, you can only go forward; the data is received from outside your computer, and reading consumes that data.
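As a rough illustration of the forward-only behaviour, the sketch below (Python 3) uses socket.socketpair() to create two connected local sockets instead of a real network connection; the payload and variable names are assumptions for the example.

```python
import socket

# Two connected local sockets stand in for a remote server and a client.
left, right = socket.socketpair()
left.sendall(b"hello world")
left.close()                       # closing the sender marks the end of the stream

reader = right.makefile("rb")      # file-like wrapper around the receiving socket
can_seek = reader.seekable()       # sockets report that they cannot seek
first = reader.read(5)             # consumes the first five bytes
rest = reader.read()               # returns only what has not been consumed yet

reader.close()
right.close()

print(can_seek, first, rest)
```

Once the first five bytes have been read, they are gone from the stream; the only way to get them again is to ask the remote side to send them again.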