Robert - 2 months ago
JSON Question

JSON parsing encoding causes a unicode encode error

I need to parse some simple JSON in bash that contains non-ASCII characters, without external dependencies, so I used a Python solution from this answer:

cat $JSON_FILE | python -c "import sys, json; print json.load(sys.stdin)['$KEY']"


This works for ASCII values, but other values throw this error:


'ascii' codec can't encode character u'\u2019' in position 1212: ordinal not in range(128)


Looking at this answer, I think I need to cast to the unicode type, but I don't know how.

Answer

You already have a unicode value; it is the encoding step when printing that fails.

That's either because you don't have a locale set, because your locale is set to ASCII, or because you are piping the Python output to something else (which you did not include in your question). In the last case, Python 2 refuses to guess which codec to use when connected to a pipe (otherwise it can use your terminal locale).
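To see the failure in isolation, here is a minimal sketch that reproduces the encode step by hand. The helper name try_encode and the sample JSON document are made up for illustration; only the character U+2019 comes from the error message in the question.

```python
import json

def try_encode(text, codec):
    """Return the encoded bytes, or None if the codec cannot represent the text.

    Hypothetical helper, used only to make the failure visible without a traceback.
    """
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None

# json already returns decoded text; U+2019 is the character from the error message.
value = json.loads(u'{"quote": "\\u2019"}')["quote"]

ascii_result = try_encode(value, "ascii")   # ASCII cannot represent U+2019
utf8_result = try_encode(value, "utf-8")    # UTF-8 can

print(repr(ascii_result))
print(repr(utf8_result))
```

The decoding done by json.load is not the problem; the error only appears when the text has to be turned back into bytes with a codec that is too narrow.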

Set the PYTHONIOENCODING environment variable to a suitable codec; if your terminal uses UTF-8 for example:

cat $JSON_FILE | PYTHONIOENCODING=UTF-8 python -c "import sys, json; print json.load(sys.stdin)['$KEY']"
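If you want to check what the variable does on your own machine, the following sketch starts a child interpreter whose output goes to a pipe and asks it which codec it chose for stdout. The function name stdout_encoding_in_pipe is hypothetical, and the value reported when the variable is unset depends on your Python version and locale, so treat that part as something to observe rather than a guaranteed result.

```python
import os
import subprocess
import sys

def stdout_encoding_in_pipe(io_encoding=None):
    """Report sys.stdout.encoding of a child interpreter whose stdout is a pipe.

    io_encoding, if given, is passed to the child as PYTHONIOENCODING.
    """
    env = dict(os.environ)
    env.pop("PYTHONIOENCODING", None)  # start from a clean state
    if io_encoding is not None:
        env["PYTHONIOENCODING"] = io_encoding
    out = subprocess.check_output(
        [sys.executable, "-c",
         "import sys; sys.stdout.write(str(sys.stdout.encoding))"],
        env=env,
    )
    return out.decode("ascii").strip()

default = stdout_encoding_in_pipe()        # varies by Python version and locale
forced = stdout_encoding_in_pipe("UTF-8")  # the child should report UTF-8

print("without PYTHONIOENCODING:", default)
print("with PYTHONIOENCODING=UTF-8:", forced)
```

The exact spelling of the reported codec name can differ between versions (for example "UTF-8" versus "utf-8"), so compare it case-insensitively if you script this check.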