splinter splinter - 25 days ago 8
Python Question

Incorrect file reading when using os.walk in python3

I am crawling through folders using the

os.walk()
method. In one of the folders, there is a large number of files, around 100,000 of them. The files look like:
p_123_456.zip
. But they are read as
p123456.zip
. Indeed, when I open windows explorer to browse the folder, for the first several seconds the files look like
p123456.zip
, but then change their appearance to
p_123_456.zip
. This is a strange scenario.

Now, I can't use
time.sleep()
because all folders and and files are being read into python variables in the looping line. Here is a snippet of the code:

for root, dirs, files in os.walk(srcFolder):
os.chdir(root)
for file in files:
shutil.copy(file, storeFolder)


In the last line, I get a file not found exception, saying that the file
p123456.zip
does not exist. Has anyone run into this mysterious issue? Anyway to bypass this? What is the cause of this? Thank you.

Answer

You don't seem to be concatenating the actual folder name with the filenames. Try changing your code to:

for root, dirs, files in os.walk(srcFolder):
    for file in files:
        shutil.copy(os.path.join(root, file), storeFolder)

os.chdir should be avoided like the plague. For one thing - if the changes suceeeds, it won't be the directory from which you are running your os.walk anymore - and then, a second chdir on another folder will fail (either stop your porgram or change you to an unexpected folder).

Just add the folder name as prefixes, and don't try using chdir.

Moreover, as for the comment from ShadowRanger above, os.walk officially breaks if you chdir inside its iteration - https://docs.python.org/3/library/os.html#os.walk - that is likely the root of the problem you had.

Comments