Basically I have html similar to this:
<div>
<p>
<b>1</b> Communication
</p>
<p>
<b>2</b> Errors
</p>
...
</div>
Selecting the text nodes with
response.xpath("//div//p//text()")
gives me each piece separately:
[
"1",
"Communication",
"2",
"Errors"
]
but what I want is one string per paragraph:
[
"1 Communication",
"2 Errors"
]
If your general pattern is to ignore <b> tags, you could use w3lib to remove those tags and construct a new response from the result. Something like:
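from scrapy.http import HtmlResponse
from w3lib.html import remove_tags

# which_ones must be a tuple, hence the trailing comma
new_body = remove_tags(response.text, which_ones=('b',))
# remove_tags returns str, so HtmlResponse needs an explicit encoding
new_response = HtmlResponse(url=response.url, body=new_body, encoding='utf-8')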
new_response now contains the original body with the <b> tags removed (their text is kept). You can then write your extraction logic without having to account for them.