My data is a csv that looks like:
First, your input csv file is not really a csv. It's more a whitespace-separated text file that can be parsed using str.split().
Now, I'll get the tokens and use itertools.groupby, with the first column as the key, to group items that share the same first column.
Once you have that, filter out the groups with only one item, and apply itertools.combinations to the rest.
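To make the grouping step concrete, here is a minimal in-memory sketch. The helper name pairs_by_first_column and the sample lines are hypothetical, made up for illustration (they are not your actual file), and the sketch assumes each line holds exactly two whitespace-separated columns:

```python
import itertools

def pairs_by_first_column(lines):
    """Return every pair of second-column values that share a first column."""
    rows = (line.split() for line in lines)
    result = []
    # groupby only merges consecutive rows, so equal keys must be adjacent
    for _key, group in itertools.groupby(rows, key=lambda row: row[0]):
        values = [row[1] for row in group]
        # a group with a single item cannot form a pair, so skip it
        if len(values) > 1:
            result.extend(itertools.combinations(values, 2))
    return result

# hypothetical sample lines, assumed already sorted by first column
sample = ["1 abc", "1 def", "2 ghi", "3 jkl", "3 mno", "3 pqr"]
print(pairs_by_first_column(sample))
```

Since itertools.groupby only groups adjacent items, an input that is not already ordered by its first column would need to be sorted on that column first.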
Write as a proper csv file:
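    import csv
    import itertools

    with open("test.csv") as f:
        with open("output.csv", "w", newline="") as f2:
        # with open("output.csv", "wb") as f2:  # uncomment for python 2 (comment above!)
            cw = csv.writer(f2, delimiter=";")
            # group rows by their first column (rows with the same key must be adjacent)
            for key, group in itertools.groupby((l.split() for l in f), lambda x: x[0]):
                # keep the second column of each row in the group
                grouped = [x[1] for x in group]
                if len(grouped) > 1:
                    # write every pair of items from the group as one row
                    for c in itertools.combinations(grouped, 2):
                        cw.writerow(c)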
result (corrected, yours is not correct):
    abc;def
    jkl;mno
    jkl;pqr
    mno;pqr