flowerflower flowerflower - 4 months ago 27
Python Question

How to convert csv to list of dictionaries (UTF-8)?

I have a csv file (in.csv)

col1, col2, col3
Kapitän, Böse, Füller

and I want to create a list of dictionaries:

a = [{'col1': 'Kapitän', 'col2': 'Böse', 'col3': 'Füller'},{...}]

With Python 3 it works with:

import codecs
import csv

with codecs.open('in.csv', encoding='utf-8') as f:
    a = [{k: v for k, v in row.items()}
         for row in csv.DictReader(f, skipinitialspace=True)]

(I got this code from "convert csv file to list of dictionaries".)

Unfortunately I need this for Python 2, and I can't get it to work there.

I tried to understand https://docs.python.org/2.7/howto/unicode.html, but I think I'm too stupid, because

import codecs
f = codecs.open('in.csv', encoding='utf-8')
for line in f:
    print repr(line)

gives me


Do you have a solution for Python 2?

There is a similar problem solved here: "Creating a dictionary from a csv file?". But with the accepted solution I get

('K\xc3\xa4pten', 'B\xc3\xb6se', 'F\xc3\xbcller')

Maybe it is easy to adapt it to get

[{u'col1': u'K\xe4pten', u'col2': u'B\xf6se', u'col3': u'F\xfcller'}]


For printing, use print line instead of print repr(line).
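The two outputs in the question differ only in representation: repr shows escape sequences, and a byte string such as 'K\xc3\xa4pten' is simply the UTF-8 encoding of the text u'K\xe4pten'. A minimal sketch of that relationship, using the sample value from the question (the variable names are only illustrative, not part of any library):

```python
# UTF-8 bytes, as the Python 2 csv module returns them
raw = b'K\xc3\xa4pten'

# Decoding turns the two bytes \xc3\xa4 into the single character \xe4
text = raw.decode('utf-8')

same_text = (text == u'K\xe4pten')            # True: same string, different spelling
round_trip = (text.encode('utf-8') == raw)    # True: encoding restores the bytes
lengths = (len(raw), len(text))               # 7 bytes versus 6 characters
```

So getting from the first form to the second only needs a decode('utf-8') on each cell.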

For the list of dictionaries, I use this solution:


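In Python 2 the csv module doesn't directly support reading and writing Unicode, so the lines are encoded as UTF-8 before parsing and each cell is decoded again afterwards: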

import codecs
import csv

def utf_8_encoder(unicode_csv_data):
    for line in unicode_csv_data:
        yield line.encode('utf-8')

def unicode_csv_reader(unicode_csv_data, dialect=csv.excel, **kwargs):
    # csv.py doesn't do Unicode; encode temporarily as UTF-8:
    csv_reader = csv.reader(utf_8_encoder(unicode_csv_data),
                            dialect=dialect, **kwargs)
    for row in csv_reader:
        # decode UTF-8 back to Unicode, cell by cell:
        yield [unicode(cell, 'utf-8') for cell in row]

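with codecs.open('in.csv', encoding='utf-8') as f:
    # skipinitialspace drops the blank after each comma in the sample file
    reader = unicode_csv_reader(f, skipinitialspace=True)
    keys = [k.strip() for k in next(reader)]
    result = []
    for row in reader:
        # one dictionary per data row, keyed by the header names
        result.append(dict(zip(keys, row)))

    for d in result:
        for k, v in d.iteritems():
            print k, v
    print result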
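If the same script has to run under both Python 2 and Python 3, the idea can probably be wrapped in one helper. This is only a sketch under some assumptions: read_dicts is a made-up name, the file is UTF-8, and every row has as many cells as the header (a short row would yield None values, which the Python 2 branch cannot decode).

```python
import csv
import io
import sys

PY2 = sys.version_info[0] == 2


def read_dicts(path, encoding='utf-8'):
    """Return the rows of a CSV file as a list of dicts of text strings."""
    if PY2:
        # Python 2: csv works on byte strings, so read in binary mode
        # and decode every key and value afterwards.
        with open(path, 'rb') as f:
            reader = csv.DictReader(f, skipinitialspace=True)
            return [{k.decode(encoding): v.decode(encoding)
                     for k, v in row.items()}
                    for row in reader]
    # Python 3: csv works on text, so let the file object do the decoding.
    with io.open(path, encoding=encoding, newline='') as f:
        return [dict(row) for row in csv.DictReader(f, skipinitialspace=True)]
```

Calling read_dicts('in.csv') on the file from the question should then give one dictionary per data row, with unicode keys and values.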