Kemal Diri - 5 months ago
Python Question

Concatenate pandas DataFrame rows side by side or top to bottom at the same time

I have a problem. I want to build a new DataFrame from another one while avoiding duplicate rows: if a row's email already exists, it should be concatenated side by side with the existing row; otherwise it should be appended below. However, I get an index error every time:

pandas.indexes.base.InvalidIndexError: Reindexing only valid with uniquely valued Index objects


Here is what I did:

if not self.data.empty:
    if data_frame_['Email'][0] in self.data['Email'].get_values():
        self.data = pd.concat([self.data, data_frame_], axis=1)
    else:
        self.data = pd.concat([self.data, data_frame_], axis=0)
else:
    self.data = data_frame_.copy()

end = time.time()


data_frame_ has only one row, which is why I'm using

data_frame_['Email'][0]


Example of data (each row arrives as its own data_frame_):

Email                 Project1  Target1  Project2  Target2
-------------------------------------------------------------
kemaldiri@gmail.com   1         5000     NaN       NaN
abc@abc.com           7         5000     NaN       NaN
kemaldiri@gmail.com   7         4000     NaN       NaN


What I want is:

Email                 Project1  Target1  Project2  Target2
-------------------------------------------------------------
kemaldiri@gmail.com   1         5000     7         4000
abc@abc.com           7         5000     NaN       NaN


PS: I could do this with dicts, but to keep the code consistent I'd like to use DataFrames.

Thank you in advance.

Answer

You can use pivot_table, but first number the rows within each Email group with cumcount:

#rename columns
df.rename(columns={'Project1':'Project','Target1':'Target'}, inplace=True)

print (df)
                 Email  Project  Target
0  kemaldiri@gmail.com        1    5000
1          abc@abc.com        7    5000
2  kemaldiri@gmail.com        7    4000

df['g'] = (df.groupby('Email').cumcount() + 1).astype(str)

df1 = df.pivot_table(index='Email', columns='g', values=['Project', 'Target'])
#Sort multiindex in columns 
df1 = df1.sort_index(axis=1, level=1)
#'reset' multiindex in columns
df1.columns = [''.join(col) for col in df1.columns]
print (df1)
                     Project1  Target1  Project2  Target2
Email                                                    
abc@abc.com               7.0   5000.0       NaN      NaN
kemaldiri@gmail.com       1.0   5000.0       7.0   4000.0
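
For convenience, the steps above can be collected into one self-contained script. This is only a sketch: it assumes the rows have already been gathered into a single DataFrame with the columns renamed to Project and Target, and the helper name widen_by_email is made up for illustration.

```python
import pandas as pd

# Sample rows from the question, with columns already renamed to Project/Target.
df = pd.DataFrame({
    "Email": ["kemaldiri@gmail.com", "abc@abc.com", "kemaldiri@gmail.com"],
    "Project": [1, 7, 7],
    "Target": [5000, 5000, 4000],
})


def widen_by_email(frame):
    """Hypothetical helper: one row per Email, repeats spread into numbered columns."""
    out = frame.copy()
    # Number each row within its Email group: '1', '2', ...
    out["g"] = (out.groupby("Email").cumcount() + 1).astype(str)
    wide = out.pivot_table(index="Email", columns="g", values=["Project", "Target"])
    # Order the columns by occurrence number, then by name.
    wide = wide.sort_index(axis=1, level=1)
    # Flatten ('Project', '1') into 'Project1'.
    wide.columns = ["".join(col) for col in wide.columns]
    return wide


result = widen_by_email(df)
print(result)
```

Note that pivot_table aggregates with the mean by default, which is why the values come back as floats. Also, because the group numbers are strings, they sort lexicographically, so the column order may need adjusting if one email can occur ten or more times.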