mastro35 - 1 month ago
Python Question

Windows: iterate over a file in Python using the Win32 API

I'm trying to read a file in Python using the Win32 API, so that I can open the file without locking it on a Windows system.

I've been able to open the file and even read from it, but when I try to implement the iterator protocol I get an error message that I can't understand.

Here's an example script that reproduces the problem:

#!/usr/bin/env python

import os


class FileTail(object):
    def __init__(self, file):
        self.open(file)

    def open(self, file):
        """Open the file to tail and initialize our state."""
        fh = None

        import win32file
        import msvcrt

        handle = win32file.CreateFile(file,
                                      win32file.GENERIC_READ,
                                      win32file.FILE_SHARE_DELETE |
                                      win32file.FILE_SHARE_READ |
                                      win32file.FILE_SHARE_WRITE,
                                      None,
                                      win32file.OPEN_EXISTING,
                                      0,
                                      None)
        file_descriptor = msvcrt.open_osfhandle(
            handle, os.O_TEXT | os.O_RDONLY)

        fh = open(file_descriptor, encoding='utf-8',
                  errors='ignore', newline="\n")

        self.reopen_check = "time"

        self.fh = fh
        self.file = file

        # Uncommenting this code demonstrate that there's no problem reading the file!!!!
        # -------------------------------------------------------------------------------
        # line = None
        # self.wait_count = 0

        # while not line:
        #     line = self.fh.readline()

    def __iter__(self):
        return self

    def __next__(self):
        line = None
        self.wait_count = 0

        while not line:
            line = self.fh.readline()

        return line

# ##############################
# ENTRY POINT
# ##############################
if __name__ == "__main__":
    my_file = FileTail('C:\LOGS\DANNI.WEB\PROVA.LOG')

    for line in my_file:
        print(line)


Now, if you try to execute this script, you will receive this error message:

Traceback (most recent call last):
  File "C:\Users\me\Desktop\prova.py", line 63, in <module>
    for line in my_file:
  File "C:\Users\me\Desktop\prova.py", line 53, in __next__
    line = self.fh.readline()
OSError: [Errno 9] Bad file descriptor


If I uncomment the commented code in the "open" method I can read the whole file, so I don't think the problem is in how I use the Win32 API to open the file. So what am I missing?

Why do I get the error message when I use the iterator protocol? Is it a thread-related problem? How can I fix it?

I know there are probably a thousand workarounds, but I want to understand why this code is not working.

Thank you all for any help you can provide, and sorry for my very bad English. :(

Dave

Answer

The problem is that the objects handle and file_descriptor may be garbage collected once the open method returns. By the time you call __next__, they may already have been freed, which raises OSError: [Errno 9] Bad file descriptor. That is also why it works when you read the file inside open itself: at that point the objects are still alive.
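The same failure mode can be reproduced without pywin32, which may make it easier to see. The sketch below is only an illustration under stated assumptions: OwnedHandle is a hypothetical stand-in for an object that closes its OS resource when it is destroyed, the function names are made up for this example, and it relies on CPython freeing local objects as soon as the function returns.

```python
import errno
import os
import tempfile


class OwnedHandle:
    """Hypothetical stand-in for an object that owns an OS-level handle."""

    def __init__(self, fd):
        self.fd = fd

    def __del__(self):
        # Release the resource on destruction, like an owning handle wrapper.
        try:
            os.close(self.fd)
        except OSError:
            pass


def open_and_drop_owner(path):
    """Return a file object, but let the owner die when the function returns."""
    owner = OwnedHandle(os.open(path, os.O_RDONLY))
    # closefd=False: the file object only borrows the descriptor.
    return open(owner.fd, encoding="utf-8", closefd=False)


def open_and_keep_owner(path):
    """Return the owner together with the file object so both stay alive."""
    owner = OwnedHandle(os.open(path, os.O_RDONLY))
    return owner, open(owner.fd, encoding="utf-8", closefd=False)


def demo():
    """Return (errno of the broken read, line from the working read)."""
    with tempfile.NamedTemporaryFile("w", suffix=".log", delete=False) as tmp:
        tmp.write("first line\n")
    try:
        broken = open_and_drop_owner(tmp.name)
        try:
            broken.readline()
            broken_errno = None
        except OSError as exc:
            # The owner already closed the descriptor behind our back.
            broken_errno = exc.errno
        finally:
            broken.close()  # closefd=False, so this never touches the fd

        owner, working = open_and_keep_owner(tmp.name)
        line = working.readline()
        working.close()
        del owner  # the descriptor is closed only now
    finally:
        os.remove(tmp.name)
    return broken_errno, line
```

Calling demo() should give errno.EBADF for the first read and the actual line for the second one, which mirrors "reading inside open works, reading later from __next__ does not".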

To solve this, simply store the objects as instance attributes so that there is always at least one reference to them:

def open(self, file):
    ...
    self.handle = win32file.CreateFile(...)
    ...
    self.file_descriptor = msvcrt.open_osfhandle(self.handle, ...)
    ...
    self.fh = open(self.file_descriptor, ...)
    ...

It might be sufficient to store only one of them, but I am not sure which one. Storing both is the safe way.
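As far as I know, handle is the one that matters here: win32file.CreateFile returns a PyHANDLE object that closes the underlying Windows handle when it is destroyed, while file_descriptor is a plain int. An alternative to keeping the reference is to call Detach() on the handle, which hands ownership over to the file descriptor so the handle is closed only once, when the file object is closed. The helper below is a minimal sketch of that idea, not a tested drop-in: open_shared is a hypothetical name, and the non-Windows branch is an assumption added only so the function can be tried on other systems, where a plain open() does not stop other processes from renaming or deleting the file.

```python
import os
import sys


def open_shared(path, encoding="utf-8"):
    """Open path for reading without locking it (hypothetical helper)."""
    if sys.platform != "win32":
        # Assumption: outside Windows the built-in open() is good enough.
        return open(path, encoding=encoding, errors="ignore", newline="\n")

    import msvcrt
    import win32file  # third-party package: pywin32

    handle = win32file.CreateFile(
        path,
        win32file.GENERIC_READ,
        win32file.FILE_SHARE_DELETE
        | win32file.FILE_SHARE_READ
        | win32file.FILE_SHARE_WRITE,
        None,
        win32file.OPEN_EXISTING,
        0,
        None,
    )
    # Detach() returns the raw handle value and stops the PyHANDLE object
    # from closing it when 'handle' is garbage collected.
    raw_handle = handle.Detach()
    fd = msvcrt.open_osfhandle(raw_handle, os.O_TEXT | os.O_RDONLY)
    # The file object now owns the descriptor and closes it on close().
    return open(fd, encoding=encoding, errors="ignore", newline="\n")
```

In the FileTail class from the question this would replace the body of open with something like self.fh = open_shared(file), and __next__ can then keep calling self.fh.readline() as before.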