Python Question

Google Cloud Storage API write files with special characters vs regular python files

I am using Google App Engine to write a new file to a Google Cloud Storage bucket for eventual serving in the browser. On my local computer, the following code writes a text file that I can open and see the test character as expected:

with open('new_file.txt', 'w') as f:
    f.write(u'é'.encode('utf-8'))


When I open new_file.txt in Notepad, it is displayed correctly as é.

But when I try the analogous process on Google Cloud Storage:

with gcs.open('/mybucket/newfile.txt', 'w', content_type='text/html') as f:
    f.write(u'é'.encode('utf-8'))


My files are served in the browser with the special characters garbled; in this case the output is Ã©.


Answer

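Under HTTP/1.1 (RFC 2616), text/* content served without an explicit charset defaults to ISO-8859-1, so the browser decodes your UTF-8 bytes as ISO-8859-1.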
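The garbling can be reproduced without a browser or a bucket. The following is a minimal sketch, assuming the browser falls back to ISO-8859-1: it encodes the test character as UTF-8 and then decodes the same bytes with both charsets. The variable names are illustrative only.

```python
# Encode the test character the same way the question does.
text = u'\u00e9'                      # the character é
utf8_bytes = text.encode('utf-8')     # two bytes: 0xC3 0xA9

# Decoding as ISO-8859-1 maps each byte to its own character,
# which is what a browser does when it assumes that charset.
misread = utf8_bytes.decode('iso-8859-1')

# Decoding as UTF-8 recovers the original character.
correct = utf8_bytes.decode('utf-8')

print(repr(misread))
print(repr(correct))
```

Here misread is the two-character string Ã©, while correct is the original é, so the stored bytes are fine and only the declared charset is wrong.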

If you want the browser to interpret your text as UTF-8, you should set the content-type header to include the charset, like this:

with gcs.open('/mybucket/newfile.txt', 'w', content_type='text/html; charset=utf-8') as f:
    f.write(u'é'.encode('utf-8'))
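If files are written from several places, it may be convenient to add the charset in one spot. The helper below is a hypothetical sketch, not part of the cloudstorage library: with_charset is an invented name, and it only appends a charset parameter when the content type does not already carry one.

```python
def with_charset(content_type, charset='utf-8'):
    """Return content_type with a charset parameter appended if none is present.

    Hypothetical helper for illustration; not part of any Google library.
    """
    # Everything after the first ';' is a list of media type parameters.
    params = [p.strip().lower() for p in content_type.split(';')[1:]]
    if any(p.startswith('charset=') for p in params):
        return content_type  # an explicit charset is already set
    base = content_type.strip().rstrip(';').rstrip()
    return '{0}; charset={1}'.format(base, charset)


print(with_charset('text/html'))
print(with_charset('text/plain; charset=iso-8859-1'))
```

The result could then be passed as the content_type argument, for example content_type=with_charset('text/html').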