thumbtackthief, 4 years ago
Python Question

Regex for deleting two patterns in a string

I'm using regex to parse HTML. So, confessing that sin right off the bat. If you have a better way, answer it here because I feel dirty and wrong.

Nonetheless, I can't find an answer to this regex question, which applies to non-HTML strings as well.

I have a string like:

tag ='style="width: 2010px; background-color: red; height: 200px; font-size: 12px"'


and want to remove the width and height elements only, so I tried:

r = r'style="(width:\s?\d+px;?)|(height:\s?\d+px;?)'
tag = re.sub(r, "", tag)


The pattern seems to match in regex101, but I'm getting TypeError: expected string or buffer.

Answer Source

Try using the following regex:

(?:width|height):\s?\d+px;?\s?
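
It may help to see why this differs from the pattern in the question. In the original, the top-level | splits the whole expression, so the first branch is style="(width:...) and the leading style=" gets deleted together with the width. A minimal sketch comparing the two on the question's string (the names original, suggested, broken and fixed are only for illustration):

```python
import re

tag = 'style="width: 2010px; background-color: red; height: 200px; font-size: 12px"'

# Pattern from the question: the "|" separates two complete alternatives,
# so style=" belongs to the first branch only.
original = r'style="(width:\s?\d+px;?)|(height:\s?\d+px;?)'

# Pattern from this answer: the alternation is confined to a non-capturing group.
suggested = r"(?:width|height):\s?\d+px;?\s?"

broken = re.sub(original, "", tag)
fixed = re.sub(suggested, "", tag)

print(repr(broken))  # the style=" prefix is gone
print(repr(fixed))   # only the width and height declarations are removed
```

Grouping the alternatives keeps everything outside the group intact.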

python

import re

# Match a width or height declaration, plus an optional trailing semicolon and space.
regex = r"(?:width|height):\s?\d+px;?\s?"
test_str = '<div id="attachment_9565" class="wp-caption aligncenter" style="width: 2010px;background-color:red;height:200px">'
subst = ""
# count=0 replaces every match.
result = re.sub(regex, subst, test_str, count=0)
if result:
    print(result)
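
As for the TypeError itself: the question does not show where tag comes from, so this is a guess, but re.sub raises that error whenever its third argument is not a string, for example an object returned by an HTML parser. A minimal sketch under that assumption, where FakeTag and strip_dimensions are hypothetical names standing in for your own objects and helper:

```python
import re

# Pattern taken from the answer above.
DIMENSION = re.compile(r"(?:width|height):\s?\d+px;?\s?")


class FakeTag:
    """Stand-in for a parser object; not part of any real library."""

    def __str__(self):
        return '<div style="width: 2010px; color: red; height: 200px">'


def strip_dimensions(tag):
    """Hypothetical helper: drop width/height declarations from a tag's text."""
    # re.sub only accepts str (or bytes), so convert explicitly first.
    return DIMENSION.sub("", str(tag))


try:
    re.sub(DIMENSION, "", FakeTag())  # passing the object itself fails
except TypeError as exc:
    print("TypeError:", exc)

cleaned = strip_dimensions(FakeTag())
print(cleaned)
```

If tag can be None, check for that first, because str(None) is the text "None" rather than an empty string.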