Luke H Luke H - 3 months ago 20
Python Question

How to see if a string contains all substrings from a list? [Python]

Here's the scenario:

I have a long list of time-stamped file names with characters before and after the time-stamp.

Something like this:

prefix_20160817_suffix


What I want is a list (which will ultimately be a subset of the original list) that contains file names with specific prefixes, suffixes, and parts of the timestamp. These specific strings are already given in a list. Note: this "contains" list might vary in size.

For example:
['prefix1', '2016', 'suffix']
or
['201608', 'suffix']


How can I easily get a list of file names that contain every element in the "contains" array?

Here's some pseudo code to demonstrate what I want:

for each fileName in the master list:
if the fileName contains EVERY element in the "contains" array:
add fileName to filtered list of filenames

Answer

I'd compile the list into a fnmatch pattern:

import fnmatch

pattern = '*'.join(contains)
filetered_filenames = fnmatch.filter(master_list, pattern)

This basically concatenates all strings in contains into a glob pattern with * wildcards in between. This assumes the order of contains is significant. Given that you are looking for prefixes, suffixes and (parts of) dates in between, that's not that much of a stretch.

It is important to note that if you run this on an OS that has a case-insensitive filesystem, that fnmatch matching is also case-insensitive. This is usually exactly what you'd want in that case.