S. 16 - 9 days ago
Python Question

For loop in Python to combine the data from 2 columns

I have a large file (20,000 rows) with 2 columns (id and value). Some ids appear more than once, with different values. I want to write a for loop that gives me all the values for each id.

By the way, I am using pandas and importing the data as a DataFrame.

For example, the file is:

id value
a 2
a 3
b 2
c 4
b 5


I want the result to be like:

a 2,3
b 2,5
c 4


Thanks

Answer

If I understand correctly, you want a list of values for each id:

df.groupby('id').value.apply(list)

id
a    [2, 3]
b    [2, 5]
c       [4]
Name: value, dtype: object
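
For anyone who wants to run this end to end, here is a minimal sketch. It assumes the frame has exactly the columns `id` and `value` from the question; the names `lists` and `by_loop` are only illustrative. The second statement shows roughly what an explicit loop over the groups would look like, since the question asked for one.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"id": ["a", "a", "b", "c", "b"],
                   "value": [2, 3, 2, 4, 5]})

# Collect each id's values into a list (same idea as the one-liner above)
lists = df.groupby("id")["value"].apply(list)

# Explicit loop over the groups, for comparison:
# iterating a grouped Series yields (id, Series of values) pairs
by_loop = {key: group.tolist() for key, group in df.groupby("id")["value"]}
```

The loop works, but the groupby form is usually shorter and keeps the result as a Series indexed by id.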

If you want strings instead, this is @jezrael's answer, just modified to my taste:

df.astype(str).groupby('id').value.apply(','.join)

id
a    2,3
b    2,5
c      4
Name: value, dtype: object
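
A self-contained variant of the string version, as a sketch only. It converts just the `value` column to strings rather than the whole frame, and then calls `reset_index()` to get back a two-column frame close to the layout asked for in the question. The names `joined` and `result` are made up for this example.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"id": ["a", "a", "b", "c", "b"],
                   "value": [2, 3, 2, 4, 5]})

# Convert only the value column to strings, then join them per id
joined = df["value"].astype(str).groupby(df["id"]).agg(",".join)

# Turn the id index back into a regular column
result = joined.reset_index()
```

Converting only `value` should leave `id` with its original dtype, which may matter if the ids are numeric.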

An experimental NumPy solution:

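import numpy as np
import pandas as pd

# u: sorted unique ids; i: position in u of each row's id
u, i = np.unique(df.id.values, return_inverse=True)
# g[k] is a boolean mask selecting the rows whose id is u[k]
g = np.arange(len(u))[:, None] == i

def slc(r):
    return df.value.values[r].tolist()

pd.Series(list(map(slc, g)), u)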

a    [2, 3]
b    [2, 5]
c       [4]
dtype: object
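
One thing to keep in mind: the boolean matrix `g` has one row per unique id and one column per data row, so it can get large when there are many distinct ids. A different NumPy approach, sketched below, sorts the rows by id with a stable argsort and cuts the sorted values with `np.split`. This is a swapped-in technique and not the method shown above; the names `order`, `chunks` and `result` are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({"id": ["a", "a", "b", "c", "b"],
                   "value": [2, 3, 2, 4, 5]})

# Unique ids, each row's position among them, and how often each id occurs
u, i, counts = np.unique(df["id"].to_numpy(),
                         return_inverse=True, return_counts=True)

# Stable sort keeps the original row order within each id
order = np.argsort(i, kind="stable")

# Cut the sorted values at the boundaries between consecutive ids
chunks = np.split(df["value"].to_numpy()[order], np.cumsum(counts)[:-1])

result = pd.Series([c.tolist() for c in chunks], index=u)
```

This avoids building the full ids-by-rows mask, at the cost of one sort.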