xralf xralf - 1 month ago 10
Python Question

re.sub correcting slashes in URL

The input is a URL that may contain two or more successive slashes. I corrected it with the following two commands, which seems to be a satisfying and readable solution.

I wonder if you could achieve the same with only one re.sub() command.

import re

url = re.sub("/[/]+", "/", url) # replace two or more slashes with one slash
url = re.sub("http:/", "http://", url) # restore the "//" after the scheme, which the previous command collapsed

Answer

Yes, you can. Use a negative lookbehind assertion, (?<!...), so that two or more slashes are collapsed only when they do not directly follow "http:":

import re

print(re.sub('(?<!http:)//+', '/', 'http://httpbin.org//ip'))
# http://httpbin.org/ip
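The pattern above only protects the http: scheme. As a minimal sketch, assuming you also want to handle https: or other schemes, the lookbehind can check for the colon alone. The helper name collapse_slashes and the example.com URL are hypothetical and do not come from the question.

```python
import re

def collapse_slashes(url):
    # Hypothetical helper: collapse any run of two or more slashes into one,
    # unless the run starts right after a colon (the "//" following a scheme).
    return re.sub(r'(?<!:)/{2,}', '/', url)

print(collapse_slashes('http://httpbin.org//ip'))
print(collapse_slashes('https://example.com//a///b'))
```

Note the trade-off: this variant leaves any "//" that follows a colon untouched, even one inside the path, and it would collapse the leading "//" of a scheme-relative URL, so it is only suitable when the input always starts with a scheme.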