I'm trying to add a '&nbsp;' into a BeautifulSoup tag, but BS converts the string to '&amp;nbsp;' on output instead of leaving it as '&nbsp;'.
from bs4 import BeautifulSoup
html = "<td><span></span></td>"
soup = BeautifulSoup(html, "html.parser")
tag = soup.find("td")
tag.string = "&nbsp;"
By default, BS uses the "minimal" output formatter, which escapes bare ampersands, so '&nbsp;' comes out as '&amp;nbsp;'.
The solution is to set the output formatter to None. Quoting from the BS source:
# There are five possible values for the "formatter" argument passed in
# to methods like encode() and prettify():
#
# "html" - All Unicode characters with corresponding HTML entities
#   are converted to those entities on output.
# "minimal" - Bare ampersands and angle brackets are converted to
#   XML entities: &amp; &lt; &gt;
# None - The null formatter. Unicode characters are never
#   converted to entities. This is not recommended, but it's
#   faster than "minimal".
from bs4 import BeautifulSoup
html = "<td><span></span></td>"
soup = BeautifulSoup(html, 'html.parser')
tag = soup.find("span")
tag.string = '&nbsp;'
# formatter=None disables entity substitution, so the string is written as-is
print(soup.prettify(formatter=None))
Output:
<td>
 <span>
  &nbsp;
 </span>
</td>
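To see the difference between the formatters side by side, here is a small sketch. The helper name `render` is made up for illustration and is not part of BS. The last call shows an alternative I believe also works: store the real non-breaking space character (U+00A0) and let the "html" formatter turn it into the entity, which avoids switching escaping off altogether.

```python
from bs4 import BeautifulSoup


def render(text, formatter):
    """Hypothetical helper: put `text` inside the <span> and serialize the <td>."""
    soup = BeautifulSoup("<td><span></span></td>", "html.parser")
    soup.find("span").string = text
    # Tag.decode() accepts the same "formatter" argument as prettify()
    return soup.find("td").decode(formatter=formatter)


escaped = render("&nbsp;", "minimal")  # default behaviour: the ampersand is escaped
raw = render("&nbsp;", None)           # null formatter: the string is written as-is
entity = render("\u00a0", "html")      # real NBSP character converted to its entity

print(escaped)
print(raw)
print(entity)
```

Keep in mind that with formatter=None nothing is escaped, so any literal '&' or '<' in your strings also goes out unescaped.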
Hope that helps.