xblackbytesx xblackbytesx - 4 months ago 20
Python Question

Python: Transform a CSV with one value per line to multiple values per line with first value as key

I've been searching all over the interwebs for an answer to this one.
I have found answers pretty similar and close to this one but not similar enough for me to get the job done.

I've been trying to transform and/or re-order a CSV file that has a single value per line into a CSV with multiple values (comma separated) per line.

The input CSV looks something like this:

id,image
001,han.jpg
001,leia.jpg
001,chewie.jpg
002,stewie.jpg
002,meg.jpg
003,marty.jpg
003,doc_brown.jpg
003,biff.jpg


I need the output to be like this:

id,image
001,han.jpg,leia.jpg,chewie.jpg
002,stewie.jpg,meg.jpg
003,marty.jpg,doc-brown.jpg,biff.jpg


I know this might end up as an invalid CSV format but this isn't the final form of my CSV, this is just a step in between. There'll be more magic after this ;-).

I'd like a way to do this using Python for this will be part of a script I'm writing that's already written in Python.

I've been researching this for quite some time now so if anyone here is willing to help out that'd be awesome!

Answer

Here you go,

import csv
lines =[]
with open(inputFile,'rb') as f:
    for line in csv.reader(f):
        lines.append(line)
def groupUp (lines):
    finalList = []
    for row in lines:
        for i,otherRow in enumerate(finalList):
            if row[0]==otherRow[0]:
                finalList[i] += row[1:]
                break
        else:
            finalList.append(row)
    return finalList

with open(newFile,'wb') as f:
    a = csv.writer(f)
    a.writerows(groupUp(lines))