I am processing HTML using Python and the BeautifulSoup 4 library, and I can't find an obvious way to replace &nbsp; with a space.
See Entities in the documentation. BeautifulSoup 4 produces proper Unicode for all entities:
An incoming HTML or XML entity is always converted into the corresponding Unicode character.
&nbsp; is turned into a non-breaking space character (U+00A0). If you really want those to be ordinary space characters instead, you'll have to do a Unicode string replace.
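
A minimal sketch of that replacement, assuming the built-in html.parser backend; the sample markup and variable names are made up for illustration:

```python
from bs4 import BeautifulSoup

# Illustrative input only; any markup containing &nbsp; behaves the same way.
html = "<p>Hello&nbsp;world</p>"
soup = BeautifulSoup(html, "html.parser")

text = soup.get_text()
# The entity has already been decoded to U+00A0 (non-breaking space).
has_nbsp = "\xa0" in text

# A plain string replace turns each U+00A0 into an ordinary space.
cleaned = text.replace("\xa0", " ")
print(repr(text), repr(cleaned))
```

The replace runs on the extracted string, not on the parse tree, so the soup object itself still holds the non-breaking space.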