David Page - 23 days ago
Python Question

How to get file extension correctly?

I know this question has been asked many times on this website, but the answers I found miss an important point: they only consider file extensions with a single period, like *.png or *.mp3. How do I deal with filenames that contain two periods, like .tar.gz?

The basic code is:

filename = '/home/lancaster/Downloads/a.ppt'
# keep only the last path component, then the text after its last period: 'ppt'
extension = filename.split('/')[-1].split('.')[-1]


But obviously, this code does not work for a file like a.tar.gz. How do I deal with that? Thanks.

Answer

The role of a file extension is to tell the viewer (and sometimes the computer) which application to use to handle the file.

Taking the worst-case example from your comments (a.ppt.tar.gz), this is a PowerPoint file that has been tar-balled and then gzipped. So you need a gzip-handling program to open it; PowerPoint or a tarball-handling program would not work. OK, a clever program that knew how to handle both .tar and .gz files could understand both operations and work with a .tar.gz file - but note that it would do that even if the extension was simply .gz.
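To make the layering concrete, here is a minimal sketch that peels extensions off one at a time with os.path.splitext, outermost first, which is the order in which the file would have to be unwrapped. The helper name extension_layers is made up for this illustration and is not part of any library.

```python
import os.path

def extension_layers(filename):
    """Hypothetical helper: return the extensions of a filename, outermost first."""
    layers = []
    base = os.path.basename(filename)
    while True:
        # splitext only ever splits off the LAST extension
        base, ext = os.path.splitext(base)
        if not ext:
            break
        layers.append(ext)
    return layers

layers = extension_layers('/home/lancaster/Downloads/a.ppt.tar.gz')
print(layers)  # ['.gz', '.tar', '.ppt']
```

Note that this treats every period as an extension boundary, so a name like report.v1.2.tar.gz would also report '.2' and '.v1' as layers. Only you know which of those are real extensions.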

The fact that both tar and gzip add their extensions to the original filename, rather than replacing the existing one (as zip does), is a convenience. But the base name of the gzip file is still a.ppt.tar.
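The standard library follows the same view, as far as I can tell. A short sketch using the path from the question (with the worst-case filename substituted in): os.path.splitext and pathlib both treat only the last suffix as the extension, and pathlib additionally exposes the full list if you want to inspect it yourself.

```python
import os.path
from pathlib import PurePosixPath

path = '/home/lancaster/Downloads/a.ppt.tar.gz'

# splitext splits off only the final extension
base, ext = os.path.splitext(os.path.basename(path))
print(base, ext)            # a.ppt.tar .gz

p = PurePosixPath(path)
last_suffix = p.suffix      # '.gz'
all_suffixes = p.suffixes   # ['.ppt', '.tar', '.gz']
stem = p.stem               # 'a.ppt.tar'
```

If you really want '.tar.gz' as a single unit, you would have to decide yourself which suffix combinations count, for example by joining the tail of all_suffixes against a list you maintain.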