Cooper Cooper - 1 month ago 5
Python Question

How to get the position where UnicodeDecodeError occurred?

How can I get the position where a UnicodeDecodeError occurred?
I found some material on this and tried to implement it below, but I just get this error:

NameError: name 'err' is not defined


I have already searched all over the internet and here on Stack Overflow, but cannot find any hint on how to use it. The Python docs say that this particular exception has a start attribute, so it must be possible.

Thank you.

data = buffer + data
try:
    data = data.decode("utf-8")
except UnicodeDecodeError:
    # Identify where the error occurred, then split the data there:
    # copy the troubled piece into the buffer and decode the good one,
    # then go back, receive the next chunk of data and concatenate it
    # to the buffer.

    buffer = err.data[err.start:]
    data = data[0:err.start]
    data = data.decode("utf-8")

Answer

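That information is stored in the exception itself. You can bind the exception object to a name with the as keyword and use its start attribute. The loop below drops the undecodable bytes (from start up to end) and retries until the rest decodes: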

while True:
    try:
        data = data.decode("utf-8")
    except UnicodeDecodeError as e:
        data = data[:e.start] + data[e.end:]
    else:
        break
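
The loop above discards the bad bytes. If the goal is instead the one described in the question (keep the undecodable tail in a buffer and prepend it to the next chunk), the same start attribute can be used to split the data. The following is only a sketch under assumptions: decode_chunk is a hypothetical helper name, and it assumes a decode error means a multi-byte character was cut off at the end of a chunk. Genuinely invalid bytes would stay in the buffer indefinitely, so real code would need to handle that case separately.

```python
def decode_chunk(buffer, chunk):
    """Decode as much of buffer + chunk as possible (hypothetical helper).

    Returns a (text, remainder) pair: the decoded text and the bytes
    that could not be decoded yet and should be carried over.
    """
    data = buffer + chunk
    try:
        return data.decode("utf-8"), b""
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that failed to decode;
        # everything before it is valid UTF-8.
        return data[:err.start].decode("utf-8"), data[err.start:]


# Usage: feed a UTF-8 payload in 3-byte chunks, so that some
# multi-byte characters are split across chunk boundaries.
payload = "héllo wörld €".encode("utf-8")
buffer = b""
parts = []
for i in range(0, len(payload), 3):
    text, buffer = decode_chunk(buffer, payload[i:i + 3])
    parts.append(text)

result = "".join(parts)
```

For streamed data, the standard library's incremental decoder, obtained with codecs.getincrementaldecoder("utf-8")(), does this buffering internally and may be a simpler choice than handling the exception by hand.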