elksie5000 - 1 month ago
Python Question

How to form tuple column from two columns in Pandas

I've got a Pandas DataFrame and I want to combine the 'lat' and 'long' columns into a single column of tuples.

<class 'pandas.core.frame.DataFrame'>
Int64Index: 205482 entries, 0 to 209018
Data columns:
Month 205482 non-null values
Reported by 205482 non-null values
Falls within 205482 non-null values
Easting 205482 non-null values
Northing 205482 non-null values
Location 205482 non-null values
Crime type 205482 non-null values
long 205482 non-null values
lat 205482 non-null values
dtypes: float64(4), object(5)


The code I tried to use was:

def merge_two_cols(series):
    return (series['lat'], series['long'])

sample['lat_long'] = sample.apply(merge_two_cols, axis=1)


However, this returned the following error:

---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
<ipython-input-261-e752e52a96e6> in <module>()
2 return (series['lat'], series['long'])
3
----> 4 sample['lat_long'] = sample.apply(merge_two_cols, axis=1)
5


...

AssertionError: Block shape incompatible with manager


How can I solve this problem?

Answer

Get comfortable with zip. It comes in handy when dealing with column data.

df['new_col'] = list(zip(df.lat, df.long))

It's simpler and faster than using apply or map. Something like np.dstack is roughly twice as fast as zip, but it returns a NumPy array rather than tuples.
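
As a rough, self-contained sketch of the above: the three-row frame below is made-up sample data, and the names sample and lat_long are borrowed from the question rather than from the real dataset. It builds the tuple column with zip and shows what np.dstack returns instead.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
sample = pd.DataFrame({
    "lat": [51.5, 52.2, 53.4],
    "long": [-0.12, 0.12, -2.98],
})

# zip pairs the two columns element by element; list() materialises the
# pairs so pandas stores one tuple per row in an object column.
sample["lat_long"] = list(zip(sample["lat"], sample["long"]))

# np.dstack stacks the columns into a single 3-D array instead of tuples.
stacked = np.dstack((sample["lat"].to_numpy(), sample["long"].to_numpy()))

print(sample["lat_long"].tolist())
print(stacked.shape)
```

Each entry in lat_long is a plain Python tuple, while stacked is one array, so the dstack route only helps if tuples are not actually required downstream.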