Hinton - 5 months ago 16
Python Question

Python 3: write method vs. os.write number of bytes returned

I wanted to create a text file containing a number of "pages" and log the byte offset of each page in a separate file. To do that, I printed strings to the main output file and counted bytes using bytes_written += file.write(str). However, the byte offset was often wrong.

I switched to bytes_written += os.write(fd, bytes(str, 'UTF-8')) and it works now. What is the difference between write() and os.write()? Or is the difference in the return value simply due to my manual conversion of the string to UTF-8?

Aya
Answer

What is the difference between write() and os.write()?

It's analogous to the difference between the C functions fwrite(3) and write(2).

The latter is a thin wrapper around an OS-level system call, whereas the former is part of the standard C library, which does some additional buffering, and ultimately calls the latter when it actually needs to write its buffered data to a file descriptor.
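To make the buffering difference concrete, here is a minimal sketch. The function name buffered_vs_unbuffered and the file names are made up for illustration, and the "nothing on disk before flush()" behaviour is what CPython's default buffering typically does for small writes, not something guaranteed in every configuration.

```python
import os
import tempfile


def buffered_vs_unbuffered():
    """Compare on-disk file sizes after file.write() and after os.write()."""
    with tempfile.TemporaryDirectory() as tmp:
        # Buffered file object: a small write usually stays in the
        # user-space buffer until flush() or close().
        buffered_path = os.path.join(tmp, "buffered.bin")
        f = open(buffered_path, "wb")
        f.write(b"hello")
        size_before_flush = os.path.getsize(buffered_path)  # typically 0
        f.flush()
        size_after_flush = os.path.getsize(buffered_path)
        f.close()

        # os.write() hands the bytes straight to the operating system.
        raw_path = os.path.join(tmp, "raw.bin")
        fd = os.open(raw_path, os.O_WRONLY | os.O_CREAT)
        os.write(fd, b"hello")
        size_after_os_write = os.path.getsize(raw_path)
        os.close(fd)

    return size_before_flush, size_after_flush, size_after_os_write
```

If you mix file.write() and os.write() on the same descriptor, call flush() on the file object first, otherwise the output can end up out of order.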

Python 3.x adds some additional logic to the write() method of a file object opened in text mode: it automatically encodes Python str objects using the file's character encoding, whereas Python 2.x does not.

Or is the difference in the return value simply due to my manual conversion of the string to UTF-8?

In Python 3.x, the difference is more related to the way in which you opened the file.

If you opened the file in binary mode, e.g. f = open(filename, 'wb') then f.write() expects a bytes object, and will return the number of bytes written.
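For the page-offset use case in the question, one possible approach is to open the output in binary mode, encode each page yourself, and add up the return values. The sketch below assumes UTF-8 and uses a hypothetical helper named write_pages; it takes any buffered binary stream, so io.BytesIO stands in for a real file here.

```python
import io


def write_pages(stream, pages, encoding="utf-8"):
    """Write each page to a binary stream and return each page's byte offset."""
    offsets = []
    bytes_written = 0
    for page in pages:
        offsets.append(bytes_written)
        # In binary mode, write() returns the number of bytes written.
        bytes_written += stream.write(page.encode(encoding))
    return offsets


buf = io.BytesIO()
page_offsets = write_pages(buf, ["caf\u00e9\n", "second\n"])
```

With a real file you would pass open(filename, 'wb') instead of io.BytesIO(). Note that an unbuffered raw file (buffering=0) may write fewer bytes than requested, so this sketch assumes the default buffered mode.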

If, instead, you opened the file in text mode, e.g. f = open(filename, 'w') then f.write() expects a str object, and will return the number of characters written, which for multi-byte encodings such as UTF-8 may not match the number of bytes written.
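A small sketch of the mismatch, assuming UTF-8 and a string with two non-ASCII characters (the function name count_text_vs_binary is just for illustration). newline="" is passed so that newline translation, which on Windows turns "\n" into "\r\n" in text mode, does not add a second source of discrepancy.

```python
import os
import tempfile


def count_text_vs_binary(text="na\u00efve caf\u00e9\n"):
    """Return (chars reported in text mode, bytes reported in binary mode, file size)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pages.txt")

        # Text mode: write() reports characters, not bytes.
        with open(path, "w", encoding="utf-8", newline="") as f:
            chars = f.write(text)

        # Binary mode (appending the same text): write() reports bytes.
        with open(path, "ab") as f:
            nbytes = f.write(text.encode("utf-8"))

        size = os.path.getsize(path)
    return chars, nbytes, size
```

For the default string this gives 11 characters but 13 bytes per copy, because each of the two accented letters takes two bytes in UTF-8.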

Note that the os.write() function always expects a bytes-like object, regardless of whether or not the (Windows-only) O_BINARY flag was used when calling os.open().
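As a quick illustration, passing a str to os.write() fails with a TypeError, while the encoded bytes go through and the return value is a byte count. The function name os_write_demo is hypothetical, and the temporary file comes from tempfile.mkstemp().

```python
import os
import tempfile


def os_write_demo():
    """Return (whether a str argument was rejected, bytes written for an encoded str)."""
    fd, path = tempfile.mkstemp()
    try:
        try:
            os.write(fd, "text")  # str is not accepted
            rejected = False
        except TypeError:
            rejected = True

        # One character, two bytes in UTF-8.
        written = os.write(fd, "\u00e9".encode("utf-8"))
    finally:
        os.close(fd)
        os.remove(path)
    return rejected, written
```

Keep in mind that os.write() may in general write fewer bytes than it was given, so adding up its return value (as the question does) is the right way to track the offset.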
