Balint Balint - 5 months ago 47
Python Question

Python: how could I access tarfile.add()'s 'name' parameter in add()'s filter method?

I would like to skip certain subdirectories while creating a tar(.gz) file with tarfile (Python 3.4).

Files on disk:

  • /home/myuser/temp/test1/

  • /home/myuser/temp/test1/home/foo.txt

  • /home/myuser/temp/test1/thing/bar.jpg

  • /home/myuser/temp/test1/lemon/juice.png

I tried to compress /home/myuser/temp/test1/ with tarfile.add().

I use with- and without-path modes. With the full path it's OK, but with the short path I have this problem: directory exclusion doesn't work because tarfile.add() passes the arcname parameter to the filter method, not the name parameter.

archive.add(entry, arcname=os.path.basename(entry), filter=filter_general)

Example: for the file /home/myuser/temp/test1/thing/bar.jpg, arcname = test1/thing/bar.jpg

So because of the /home/myuser/temp/test1/thing element in exclude_dir_fullpath, the filter method should exclude this file, but it cannot, because the filter method gets test1/thing/bar.jpg.

How could I access tarfile.add()'s 'name' parameter in filter method?

import os
import tarfile

def filter_general(item):
    # item is a tarfile.TarInfo; item.name is built from arcname, not from the on-disk path
    exclude_dir_fullpath = ['/home/myuser/temp/test1/thing', '/home/myuser/temp/test1/lemon']
    if any(dirname in item.name for dirname in exclude_dir_fullpath):
        print("Exclude fullpath dir matched at: %s" % item.name)  # DEBUG
        return None
    return item

def compress_tar():
    filepath = '/tmp/test.tar.gz'
    include_dir = '/home/myuser/temp/test1/'
    archive = tarfile.open(filepath, mode="w:gz")
    archive.add(include_dir, arcname=os.path.basename(include_dir), filter=filter_general)
    archive.close()  # flush and finalize the gzip stream
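
The mismatch described in the question can be reproduced with a small sketch. It assumes a temporary directory as a stand-in for /home/myuser/temp/test1 (without a trailing slash), and the helper name collect_filter_names is hypothetical. It only records the names the filter callback receives:

```python
import os
import tarfile
import tempfile


def collect_filter_names(include_dir, archive_path):
    """Archive include_dir under its base name and record each name the filter sees."""
    seen = []

    def record(item):
        # item is a tarfile.TarInfo; its name is built from arcname, not the disk path
        seen.append(item.name)
        return item

    with tarfile.open(archive_path, mode="w:gz") as archive:
        archive.add(include_dir, arcname=os.path.basename(include_dir), filter=record)
    return seen


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for /home/myuser/temp/test1
    include_dir = os.path.join(tmp, "test1")
    os.makedirs(os.path.join(include_dir, "thing"))
    with open(os.path.join(include_dir, "thing", "bar.jpg"), "w") as handle:
        handle.write("dummy")

    names = collect_filter_names(include_dir, os.path.join(tmp, "test.tar.gz"))
    print(names)
```

Every recorded name is relative (test1, test1/thing, test1/thing/bar.jpg), so a comparison against absolute directory paths inside the filter never matches.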



You want a general, reusable function that filters out files by their absolute path. Filtering on the archive name alone is not enough, because whether a file should be included can depend on where it originates.

First, add a root_dir parameter to your filter function and use it to rebuild the full path:

def filter_general(item, root_dir):
    full_path = os.path.join(root_dir, item.name)

Then, replace your "add to archive" line with:

archive.add(include_dir, arcname=os.path.basename(include_dir), filter=lambda x: filter_general(x, os.path.dirname(include_dir)))

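The filter is now a lambda that passes the parent directory of the include directory as root_dir. The parent is used because the archive name already starts with the include directory's base name; passing include_dir itself would repeat that component in the joined path.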

Now your filter function knows the root dir and you can filter by absolute path, allowing you to reuse your filter function in several locations in your code.
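
Putting the pieces together, a minimal end-to-end sketch might look like the following. Several things here are assumptions rather than part of the answer: the exclude_dirs parameter, the compress_tar signature, and the temporary directory standing in for /home/myuser/temp/test1. The sketch also swaps the question's substring test for a prefix comparison on normalized paths, and strips a trailing separator from include_dir, since os.path.basename returns an empty string for a path that ends in a slash.

```python
import os
import tarfile
import tempfile


def filter_general(item, root_dir, exclude_dirs):
    """Return None (skip the member) if its rebuilt absolute path lies in an excluded dir."""
    # Rebuild the on-disk path from the archive-relative TarInfo name
    full_path = os.path.normpath(os.path.join(root_dir, item.name))
    for dirname in exclude_dirs:
        dirname = os.path.normpath(dirname)
        if full_path == dirname or full_path.startswith(dirname + os.sep):
            return None
    return item


def compress_tar(include_dir, archive_path, exclude_dirs):
    # Assumption: strip a trailing separator so basename/dirname split as expected
    include_dir = include_dir.rstrip(os.sep)
    root_dir = os.path.dirname(include_dir)
    with tarfile.open(archive_path, mode="w:gz") as archive:
        archive.add(
            include_dir,
            arcname=os.path.basename(include_dir),
            filter=lambda item: filter_general(item, root_dir, exclude_dirs),
        )


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for the directory tree in the question
    include_dir = os.path.join(tmp, "test1")
    for sub, fname in [("home", "foo.txt"), ("thing", "bar.jpg"), ("lemon", "juice.png")]:
        os.makedirs(os.path.join(include_dir, sub))
        with open(os.path.join(include_dir, sub, fname), "w") as handle:
            handle.write("dummy")

    archive_path = os.path.join(tmp, "test.tar.gz")
    exclude = [os.path.join(include_dir, "thing"), os.path.join(include_dir, "lemon")]
    compress_tar(include_dir, archive_path, exclude)

    with tarfile.open(archive_path, mode="r:gz") as archive:
        archived_names = sorted(archive.getnames())
    print(archived_names)
```

Because the filter returns None for an excluded directory itself, tarfile.add() does not descend into it, so only test1, test1/home and test1/home/foo.txt end up in the archive.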