Basically I have html similar to this:
If your general pattern is to ignore <b> tags, you could use w3lib to remove those tags and construct a new response from the result. Something like:
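    import scrapy
    import w3lib.html

    # which_ones takes a tuple of tag names (note the trailing comma)
    new_body = w3lib.html.remove_tags(response.text, which_ones=('b',))
    # new_body is a str, so HtmlResponse needs an explicit encoding
    new_response = scrapy.http.HtmlResponse(url=response.url, body=new_body, encoding='utf-8')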
new_response now contains the original response but with the <b> tags removed (the text inside them is kept). You can then write your extraction logic without having to account for them.