Thejesh PR - 4 months ago
Python Question

Python Regular expression to find latest file in directory

I have a directory that contains, for example, the files below.

Directory:
ERROR_AM_INMAG_Export_2016-07-25.csv
AM_INMAG_Export_2016-07-26_done.csv
ERROR_AM_INMAG_Export_2016-07-27.csv
AM_INMAG_Export_2016-07-28_done.csv
AM_INMAG_Export_2016-07-29.csv
file1
file2
fileN


How can I retrieve, using Python, the file whose name starts with "AM_INMAG_Export_" and has the latest timestamp in its name?
For example, "AM_INMAG_Export_2016-07-29.csv" is the file I want to retrieve,
even though "fileN" is the most recently modified file in the directory.

Answer

Filter the files that match your desired prefix and then sort. Because the dates are written as YYYY-MM-DD, sorting the names as strings also sorts them by date.

>>> files = """ERROR_AM_INMAG_Export_2016-07-25.csv
... AM_INMAG_Export_2016-07-26_done.csv
... ERROR_AM_INMAG_Export_2016-07-27.csv
... AM_INMAG_Export_2016-07-28_done.csv
... AM_INMAG_Export_2016-07-29.csv
... file1
... file2
... fileN""".split('\n')
>>> files
['ERROR_AM_INMAG_Export_2016-07-25.csv', 'AM_INMAG_Export_2016-07-26_done.csv', 'ERROR_AM_INMAG_Export_2016-07-27.csv', 'AM_INMAG_Export_2016-07-28_done.csv', 'AM_INMAG_Export_2016-07-29.csv', 'file1', 'file2', 'fileN']
>>> filtered_files = [x for x in files if x.startswith('AM_INMAG_Export_')]
>>> sorted_files = sorted(filtered_files, reverse=True)
>>> sorted_files[0]
'AM_INMAG_Export_2016-07-29.csv'
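
The same idea can be applied to a real directory instead of a hard-coded list. The following is only a minimal sketch: the function name latest_export and its parameters are made up for illustration, and it assumes the date in the name is always in YYYY-MM-DD form so that plain string comparison is enough.

```python
import os


def latest_export(directory, prefix='AM_INMAG_Export_'):
    """Return the name of the newest export file in `directory`, or None.

    Hypothetical helper: "newest" means the greatest name in string order,
    which matches date order only for zero-padded YYYY-MM-DD dates.
    """
    names = [
        name for name in os.listdir(directory)
        if name.startswith(prefix)
        and os.path.isfile(os.path.join(directory, name))
    ]
    # max() avoids sorting the whole list when only one name is needed.
    return max(names) if names else None
```

Calling latest_export('/path/to/dir') returns only the file name; join it with the directory using os.path.join if the full path is needed.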

Update

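To match only names of the exact form AM_INMAG_Export_YYYY-MM-DD.csv (which also excludes the "_done" files), filter the filenames with a regexp and then sort.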

>>> import re
>>>
>>> files = [
...   'ERROR_AM_INMAG_Export_2016-07-25.csv',
...   'AM_INMAG_Export_2016-07-26_done.csv',
...   'ERROR_AM_INMAG_Export_2016-07-27.csv',
...   'AM_INMAG_Export_2016-07-28_done.csv',
...   'AM_INMAG_Export_2016-07-21.csv',
...   'AM_INMAG_Export_2016-07-25.csv',
...   'AM_INMAG_Export_2016-07-29.csv',
...   'file1',
...   'file2',
...   'fileN'
... ]
>>>
>>> file_re = re.compile(r'^AM_INMAG_Export_\d{4}-\d{2}-\d{2}\.csv$')
>>> filtered_files = [x for x in files if file_re.match(x)]
>>> sorted_files = sorted(filtered_files, reverse=True)
>>> sorted_files[0]
'AM_INMAG_Export_2016-07-29.csv'
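
If you would rather compare real dates than strings, the date can be captured with a group and parsed. This is a sketch under assumptions, not part of the original answer: the helper name latest_by_date is hypothetical, and names whose date part is not a valid calendar date are simply skipped.

```python
import re
from datetime import datetime

# Same pattern as above, with a capture group around the date part.
FILE_RE = re.compile(r'^AM_INMAG_Export_(\d{4}-\d{2}-\d{2})\.csv$')


def latest_by_date(names):
    """Return the name with the most recent embedded date, or None."""
    dated = []
    for name in names:
        m = FILE_RE.match(name)
        if m is None:
            continue  # not an export file of the expected form
        try:
            day = datetime.strptime(m.group(1), '%Y-%m-%d').date()
        except ValueError:
            continue  # digits matched but the date is invalid, e.g. 2016-13-45
        dated.append((day, name))
    # Tuples compare by date first, then by name.
    return max(dated)[1] if dated else None
```

For zero-padded YYYY-MM-DD names this picks the same file as the string sort; parsing mainly adds validation and makes it easier to adapt if the date format in the name ever changes.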