I'm trying to replace the last occurrence of a substring in a string using re.sub in Python, but I'm stuck on the regex pattern. Can someone help me find the correct pattern?
String = "cr US TRUMP DE NIRO 20161008cr_x080b.wmv"
or String = "crcrUS TRUMP DE NIRO 20161008cr.xml"
I want to remove the last occurrence of "cr" together with everything that follows it, up to (but not including) the extension.
The desired output strings are:
"cr US TRUMP DE NIRO 20161008.wmv"
"crcrUS TRUMP DE NIRO 20161008.xml"
I'm using re.sub to replace it.
re.sub('pattern', '', String)
You can use this negative lookahead regex:
repl = re.sub(r"cr(?:(?!cr)[^.])*(?=\.[^.]+$)", "", String)
cr            # match cr
(?:           # non-capturing group start
  (?!         # negative lookahead start
    cr        # match cr
  )           # negative lookahead end
  [^.]        # match anything but DOT
)             # non-capturing group end
*             # match 0 or more characters, none of which starts another cr
(?=           # positive lookahead start
  \.          # match DOT
  [^.]+       # followed by 1 or more of anything but DOT
  $           # end of input
)             # positive lookahead end
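To check the pattern against both strings from the question, here is a minimal runnable sketch. It compiles the same regex with re.VERBOSE so the comments can live inside the pattern. The names LAST_CR and strip_last_cr are made up for this example and are not part of the original answer.

```python
import re

# Same pattern as above, compiled in verbose mode (hypothetical name LAST_CR).
LAST_CR = re.compile(
    r"""
    cr                 # literal cr
    (?:(?!cr)[^.])*    # non-dot characters, none of which starts another cr
    (?=\.[^.]+$)       # must be followed by the extension: DOT, then non-DOT to the end
    """,
    re.VERBOSE,
)


def strip_last_cr(name):
    """Remove the last 'cr' and everything after it, up to the extension."""
    return LAST_CR.sub("", name)


if __name__ == "__main__":
    for sample in (
        "cr US TRUMP DE NIRO 20161008cr_x080b.wmv",
        "crcrUS TRUMP DE NIRO 20161008cr.xml",
    ):
        print(strip_last_cr(sample))
```

Only the last "cr" can match because any earlier "cr" is followed by another "cr" before the dot, which the negative lookahead refuses to step over. A name that contains no "cr" is returned unchanged.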