Lenin Mishra - 7 months ago
Python Question

Adding delimiters to a text file using python

I recently started a job as an ETL developer and, as part of an exercise, I am extracting data from a text file containing raw data. My raw data looks like the sample shown in the image.
[Image: My Raw Data]

Now I want to add delimiters to my data file. Basically, at the end of every line, I want to add a comma (,). My code in Python looks like this.

with open('new_locations.txt', 'w') as output:
    with open('locations.txt', 'r') as input:
        for line in input:
            new_line = line+','
            output.write(new_line)

new_locations.txt is the output text file and locations.txt is the raw data.

However, it throws this error every time.

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 3724: character maps to <undefined>

Where exactly am I going wrong?

Note: The characters in raw data are not all ASCII characters. Some are Latin characters as well.


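When you open a file in Python 3 in "text" mode, reading and writing convert between the bytes in the file and Python (Unicode) strings. The default encoding is platform dependent: it is UTF-8 on most Linux and macOS systems, but on Windows it is typically a legacy code page such as cp1252, which is what the 'charmap' codec in your traceback points to.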
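To see which default applies on your machine, and why byte 0x81 fails under a Windows code page but decodes under Latin-1, a small check like the one below may help. The sample bytes are made up for illustration, and cp1252 is only an assumption about which code page your system is using.

```python
import locale

# The encoding open() normally falls back to when no encoding= is given.
print(locale.getpreferredencoding(False))

# Made-up sample: "café " in Latin-1 followed by the problem byte 0x81.
raw = b"caf\xe9 \x81"

error_message = None
try:
    # cp1252 (assumed here) has no character assigned to 0x81.
    raw.decode("cp1252")
except UnicodeDecodeError as exc:
    error_message = str(exc)
print(error_message)

# Latin-1 maps every byte value to a code point, so this never fails.
text = raw.decode("latin_1")
print(repr(text))
```

Note that Latin-1 decoding always succeeds, so a clean run only proves the bytes were read, not that Latin-1 is the file's real encoding.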

If your file uses Latin-1 encoding, you should open it with

with open('locations.txt', 'r', encoding='latin_1') as input:

You should probably also do this with the output if you want the output also to be in latin-1.
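Putting the two together, a minimal sketch might look like the following. It assumes the file really is Latin-1; the function name add_trailing_commas and the sample data are hypothetical, and the demo writes to a temporary directory instead of touching your real locations.txt. It also strips the newline before appending the comma, because line + ',' would put the comma at the start of the next line.

```python
import tempfile
from pathlib import Path


def add_trailing_commas(src, dst, encoding="latin_1"):
    """Copy src to dst, appending a comma to the end of every line."""
    with open(src, "r", encoding=encoding) as infile, \
            open(dst, "w", encoding=encoding) as outfile:
        for line in infile:
            # Drop the trailing newline first so the comma stays on this line.
            outfile.write(line.rstrip("\n") + ",\n")


# Demo on made-up Latin-1 data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "locations.txt"
    dst = Path(tmp) / "new_locations.txt"
    src.write_bytes(b"Z\xfcrich\nS\xe3o Paulo\n")
    add_trailing_commas(src, dst)
    result = dst.read_text(encoding="latin_1")

print(result)
```

For your files, the call would be add_trailing_commas('locations.txt', 'new_locations.txt').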

In the longer term, you should probably consider converting all your data files to a Unicode encoding such as UTF-8.
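One way to do that conversion is sketched below: read with the old encoding and write with UTF-8. The helper name convert_to_utf8 and the sample file names are hypothetical, and the source encoding is assumed to be Latin-1.

```python
import tempfile
from pathlib import Path


def convert_to_utf8(src, dst, source_encoding="latin_1"):
    """Re-encode a text file from source_encoding to UTF-8."""
    # newline="" disables newline translation so line endings are kept as-is.
    with open(src, "r", encoding=source_encoding, newline="") as infile, \
            open(dst, "w", encoding="utf-8", newline="") as outfile:
        for line in infile:
            outfile.write(line)


# Demo on made-up Latin-1 data in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "locations_latin1.txt"
    dst = Path(tmp) / "locations_utf8.txt"
    src.write_bytes(b"Z\xfcrich\r\nS\xe3o Paulo\r\n")
    convert_to_utf8(src, dst)
    converted_bytes = dst.read_bytes()

print(converted_bytes)
```

After conversion, every later open() call on the new file should pass encoding='utf-8' explicitly so it does not depend on the platform default.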