Baf Baf - 3 months ago 23
Python Question

Do I understand os.walk right?

Does the loop for root, dir, file in os.walk(startdir) work through these steps?

for root in os.walk(startdir)
    for dir in root
        for files in dir



  1. get root of start dir : C:\dir1\dir2\startdir

  2. get folders in C:\dir1\dir2\startdir and return list of folders "dirlist"

  3. get files in first dirlist item and return list of files "filelist" as the first item of a list of filelists.

  4. move to second item in dirlist and return list of files in this folder "filelist2" as the second item of a list of filelists. etc.

  5. move to next root in foldertree and start from 2. etc.



Right? Or does it just get all roots first, then all dirs second and all files third?

Answer

os.walk returns a generator that yields one tuple per directory: (current_path, directories in current_path, files in current_path). Each time the generator is advanced it moves on to the next directory, descending recursively until every sub-directory below the directory that walk was called on has been explored. So it does not collect all roots first, then all dirs, then all files; by default it visits a directory before its sub-directories (top-down).
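The traversal order can be checked on a small throwaway tree. The following is a minimal sketch, not part of the original answer: the helper names walk_order and build_demo_tree and the file names are made up for illustration. It records the tuples in the order os.walk yields them, which should show one tuple per directory, with a parent listed before its children.

```python
import os
import tempfile


def walk_order(startdir):
    """Return the (root, dirs, files) tuples in the order os.walk yields them.

    Paths are made relative to startdir so the result is easy to read.
    """
    result = []
    for root, dirs, files in os.walk(startdir):
        # Sorting dirs in place also fixes the order in which os.walk
        # descends into the sub-directories (top-down mode only).
        dirs.sort()
        result.append((os.path.relpath(root, startdir), list(dirs), sorted(files)))
    return result


def build_demo_tree():
    """Create a small temporary directory tree (hypothetical names) and return its path."""
    base = tempfile.mkdtemp()
    os.makedirs(os.path.join(base, "a", "a1"))
    os.makedirs(os.path.join(base, "b"))
    for rel in ("top.txt",
                os.path.join("a", "x.txt"),
                os.path.join("a", "a1", "deep.txt"),
                os.path.join("b", "y.txt")):
        with open(os.path.join(base, rel), "w") as handle:
            handle.write("")
    return base


if __name__ == "__main__":
    for entry in walk_order(build_demo_tree()):
        print(entry)
```

Each printed entry corresponds to one pass of the for root, dir, file loop from the question: the start directory first, then a, then a/a1, then b.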

As such:

next(os.walk(r'C:\dir1\dir2\startdir'))[0] # returns 'C:\dir1\dir2\startdir'
next(os.walk(r'C:\dir1\dir2\startdir'))[1] # returns all the dirs in 'C:\dir1\dir2\startdir'
next(os.walk(r'C:\dir1\dir2\startdir'))[2] # returns all the files in 'C:\dir1\dir2\startdir'

import os.path
....
# file is the name being searched for, assumed to be defined earlier
for path, directories, files in os.walk(r'C:\dir1\dir2\startdir'):
    if file in files:
        print('found %s' % os.path.join(path, file))

or this

import os

def search_file(directory = None, file = None):
    assert os.path.isdir(directory)
    for cur_path, directories, files in os.walk(directory):
        if file in files:
            # cur_path already starts with directory, so do not join it again
            return os.path.join(cur_path, file)
    return None

Or, if you want to look for a file by recursing through the sub-directories yourself, you can do this:

import os

def search_file(directory = None, file = None):
    assert os.path.isdir(directory)
    # Take only the first tuple: the contents of directory itself
    current_path, directories, files = next(os.walk(directory))
    if file in files:
        return os.path.join(directory, file)
    elif not directories:
        # directories is a list, so test for emptiness rather than == ''
        return None
    else:
        for new_directory in directories:
            result = search_file(directory = os.path.join(directory, new_directory), file = file)
            if result:
                return result
        return None
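
To try the walk-based search without touching real data, something like the following sketch could be used. It is an illustration only: find_file and make_sample_tree are hypothetical names, and the sample tree with target.txt is invented for the demonstration.

```python
import os
import tempfile


def find_file(directory, filename):
    """Return the path of the first file named filename under directory, else None."""
    for cur_path, directories, files in os.walk(directory):
        if filename in files:
            # cur_path already starts with directory, so join only these two parts
            return os.path.join(cur_path, filename)
    return None


def make_sample_tree():
    """Create a temporary tree with one nested file (hypothetical layout)."""
    base = tempfile.mkdtemp()
    nested = os.path.join(base, "sub", "deeper")
    os.makedirs(nested)
    with open(os.path.join(nested, "target.txt"), "w") as handle:
        handle.write("")
    return base


if __name__ == "__main__":
    base = make_sample_tree()
    print(find_file(base, "target.txt"))   # path ending in sub/deeper/target.txt
    print(find_file(base, "missing.txt"))  # None
```

Because os.walk does the recursion itself, this version needs no explicit recursive call, which is the main difference from the manual variant above.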