numpy numpy - 1 month ago 8
Python Question

Why is this regular expression not working?

I want to pick out the links in a string that start with '/wiki/', contain no ':' anywhere, and do not end with '.jpg' or '.svg'.

So these should be rejected:

"https://boomerrang.com"
"/wiki/sbsbs:kjanw"
"/wiki/aswaa:asawsa.jpg"
"/wiki/awssa.random.jpg"
"/wiki/boom.jpg"


What the final result should look like:

"/wiki/justthis"


What I tried:

r'^/wiki/.*[^:](?!jpg|svg)$'


But it's not evaluating properly; in fact, it matches all the results I do not want. I'm kind of new to regex, so please tell me why this is not working and how I should correct it.

Thanks

Answer

You can try this:

r'^/wiki/[^:]*(?<!\.jpg)(?<!\.svg)$'

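The two negative lookbehinds at the end ensure that the string doesn't end with .svg or .jpg.

[^:]* matches only characters other than ':', so no ':' can appear anywhere after /wiki/.

Your pattern fails for two reasons. The .* matches any character, including ':', and [^:] only constrains the last character. And (?!jpg|svg) is a lookahead placed right before $, where nothing follows, so it always succeeds; to inspect the characters just before the end of the string you need a lookbehind.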
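If you want to check this against the sample strings from your question, something like the following should do. It is only a minimal sketch using Python's standard re module; the names ORIGINAL, FIXED, and links are illustrative and not part of your code.

```python
import re

# Pattern from the question and pattern from the answer.
ORIGINAL = re.compile(r'^/wiki/.*[^:](?!jpg|svg)$')
FIXED = re.compile(r'^/wiki/[^:]*(?<!\.jpg)(?<!\.svg)$')

# Sample strings taken from the question.
links = [
    "https://boomerrang.com",
    "/wiki/sbsbs:kjanw",
    "/wiki/aswaa:asawsa.jpg",
    "/wiki/awssa.random.jpg",
    "/wiki/boom.jpg",
    "/wiki/justthis",
]

# Keep every string that each pattern accepts.
original_hits = [s for s in links if ORIGINAL.match(s)]
fixed_hits = [s for s in links if FIXED.match(s)]

print(original_hits)  # every '/wiki/...' string, wanted or not
print(fixed_hits)     # ['/wiki/justthis']
```

The original pattern accepts all five '/wiki/' strings, while the corrected one keeps only "/wiki/justthis".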
