Avión Avión - 1 month ago 8
Java Question

Removing duplicated file extensions in Java

I have a folder with the following files:

myoutput.pdf
hello.pdf.pdf
byebye.txt
hey.txt.txt


How can I remove the extra extension from ONLY the files whose extension is duplicated? The example only contains pdf and txt, but in the real folder it can be any type of extension.

I got this far:

String dir = "C:/Users/test";

File[] files = new File(dir).listFiles();

for (File file : files) {
    if (file.isFile()) {
        System.out.println(file.getName());
    }
}


But I don't know how to detect the files with a "duplicated extension" and rename them so they have a "single extension".

My desired output would be:

myoutput.pdf
hello.pdf
byebye.txt
hey.txt


Thanks in advance.

Answer

You may use

file.getName().replaceAll("(\\.\\w+)\\1+$", "$1")

See the regex demo

Pattern details:

  • (\\.\\w+) - Group 1 capturing a dot and 1 or more word chars (\\w+ might be replaced with [^.]+ to match 1 or more chars other than a dot)
  • \\1+ - one or more occurrences of the same value captured in Group 1
  • $ - end of string.
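
To make the difference between the two variants concrete, here is a small sketch that runs both patterns over a few names. The class name ExtensionPatternDemo, the helper collapse, and the sample name archive.tar-gz.tar-gz are made up for illustration and are not part of the question.

```java
import java.util.regex.Pattern;

class ExtensionPatternDemo {
    // Pattern from the answer: the extension consists of word characters only.
    static final Pattern WORD_EXT = Pattern.compile("(\\.\\w+)\\1+$");
    // Variant mentioned above: the extension may contain anything except a dot.
    static final Pattern ANY_EXT = Pattern.compile("(\\.[^.]+)\\1+$");

    // Collapses a repeated trailing extension into a single one (hypothetical helper).
    static String collapse(Pattern pattern, String name) {
        return pattern.matcher(name).replaceAll("$1");
    }

    public static void main(String[] args) {
        String[] names = {"hello.pdf.pdf", "myoutput.pdf", "archive.tar-gz.tar-gz"};
        for (String name : names) {
            System.out.println(name
                    + " -> \\w+: " + collapse(WORD_EXT, name)
                    + " | [^.]+: " + collapse(ANY_EXT, name));
        }
    }
}
```

Both variants turn hello.pdf.pdf into hello.pdf and leave myoutput.pdf alone. They differ only for extensions containing non-word characters: archive.tar-gz.tar-gz is left unchanged by the \\w+ version and becomes archive.tar-gz with [^.]+.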

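The replacement pattern is just the backreference to Group 1 ($1), so only a single occurrence of the extension remains. Note that replaceAll only returns the corrected name as a new String; the file on disk still has to be renamed separately, for example with File.renameTo.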
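
Putting the regex together with the directory listing from the question, a minimal sketch of the full rename could look like the following. It uses java.nio.file instead of the java.io.File loop from the question, which is my substitution rather than anything the question asked for; the class and method names are hypothetical, and skipping a file when the target name already exists is an assumption about the desired behaviour.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class DuplicateExtensionRenamer {
    // Same regex as in the answer: collapse a repeated trailing extension.
    static String singleExtension(String fileName) {
        return fileName.replaceAll("(\\.\\w+)\\1+$", "$1");
    }

    // Renames every regular file in dir whose extension is duplicated.
    static void renameAll(Path dir) throws IOException {
        List<Path> files;
        // Take a snapshot of the listing first so we do not rename while iterating.
        try (Stream<Path> stream = Files.list(dir)) {
            files = stream.filter(Files::isRegularFile).collect(Collectors.toList());
        }
        for (Path file : files) {
            String oldName = file.getFileName().toString();
            String newName = singleExtension(oldName);
            if (newName.equals(oldName)) {
                continue; // extension is not duplicated, nothing to do
            }
            Path target = file.resolveSibling(newName);
            if (Files.exists(target)) {
                // Assumption: never overwrite an existing file, just report it.
                System.out.println("Skipping " + oldName + ": " + newName + " already exists");
                continue;
            }
            Files.move(file, target);
        }
    }

    public static void main(String[] args) throws IOException {
        // Directory taken from the question; adjust as needed.
        renameAll(Paths.get("C:/Users/test"));
    }
}
```

With the four files from the question, this should leave myoutput.pdf and byebye.txt untouched and rename the other two to hello.pdf and hey.txt.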