Krowar - 29 days ago
Python Question

How to solve this encoding issue with Spyder in Anaconda (Python 3)?

I'm trying to run the following:

import json
path = 'ch02/usagov_bitly_data2012-03-16-1331923249.txt'
records = [json.loads(line) for line in open(path)]


But I get the following error :


UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 6987: ordinal not in range(128)


From what I've found online, this should mean the encoding needs to be set to UTF-8, but my issue is that it already is UTF-8:

sys.getdefaultencoding()
Out[43]: 'utf-8'


Also, it looks like my file is in UTF-8, so I'm really confused.
The following code also works:

In [15]: path = 'ch02/usagov_bitly_data2012-03-16-1331923249.txt'
In [16]: open(path).readline()


Is there a way to solve this ?

Thanks !

EDIT:

When I run the code in my console it works, but not when I run it in the Spyder IDE provided by Anaconda (https://www.continuum.io/downloads).

Do you know what could be going wrong?

Answer

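The text file contains non-ASCII characters on at least one line. On your setup, the default encoding that open() uses for files is ASCII rather than UTF-8. Note that sys.getdefaultencoding() does not report this setting: when no encoding is given, open() falls back to locale.getpreferredencoding(False), which depends on the environment the interpreter was started in. That would explain why the console and Spyder behave differently. To fix it, specify the file's encoding explicitly: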

import json
path = 'ch02/usagov_bitly_data2012-03-16-1331923249.txt'
records = [json.loads(line.strip()) for line in open(path, encoding="utf-8")]

(Doing this is a good idea anyway, even when the default works.)
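
To check which encoding open() would pick on a given setup, and to see the difference an explicit encoding makes, here is a minimal sketch. The helper name load_records and the temporary sample file are made up for illustration and are not part of the original data set; the sample contains a right single quotation mark, whose UTF-8 encoding starts with the byte 0xe2 from the error message.

```python
import json
import locale
import sys
import tempfile
from pathlib import Path

# Default for str/bytes conversions; this is 'utf-8' on Python 3 and
# says nothing about how open() decodes files.
print(sys.getdefaultencoding())

# What open() falls back to when no encoding argument is passed.
# This depends on the environment (locale variables, how the IDE was launched).
print(locale.getpreferredencoding(False))


def load_records(path, encoding="utf-8"):
    """Hypothetical helper: parse one JSON object per non-empty line."""
    with open(path, encoding=encoding) as f:
        return [json.loads(line) for line in f if line.strip()]


# Hypothetical sample file with one non-ASCII character (U+2019).
sample = Path(tempfile.mkdtemp()) / "sample.txt"
sample.write_bytes('{"t": "it\u2019s"}\n'.encode("utf-8"))

records = load_records(sample)  # explicit UTF-8: decodes fine

try:
    load_records(sample, encoding="ascii")  # mimics the failing setup
except UnicodeDecodeError as exc:
    print("ascii failed:", exc)
```

If the second print shows something like 'ANSI_X3.4-1968' or 'US-ASCII' inside Spyder but 'UTF-8' in the console, that difference is the likely cause of the error. Using a with block also closes the file once the records are read.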
