W R - 1 year ago 87
Python Question

# Pandas Latitude-Longitude to distance between successive rows

I have the following in a Pandas DataFrame in Python 2.7:

``````
Ser_Numb        LAT       LONG
       1  74.166061  30.512811
       2  72.249672  33.427724
       3  67.499828  37.937264
       4  84.253715  69.328767
       5  72.104828  33.823462
       6  63.989462  51.918173
       7  80.209112  33.530778
       8  68.954132  35.981256
       9  83.378214  40.619652
      10  68.778571   6.607066
``````

I am looking to calculate the distance between successive rows in the dataframe. The output should look something like this:

``````
Ser_Numb        LAT       LONG  Distance
       1  74.166061  30.512811  0
       2  72.249672  33.427724  d_between_Ser_Numb2 and Ser_Numb1
       3  67.499828  37.937264  d_between_Ser_Numb3 and Ser_Numb2
       4  84.253715  69.328767  d_between_Ser_Numb4 and Ser_Numb3
       5  72.104828  33.823462  d_between_Ser_Numb5 and Ser_Numb4
       6  63.989462  51.918173  d_between_Ser_Numb6 and Ser_Numb5
       7  80.209112  33.530778  .
       8  68.954132  35.981256  .
       9  83.378214  40.619652  .
      10  68.778571   6.607066  .
``````

Attempt

This post looks somewhat similar but it is calculating the distance between fixed points. I need the distance between successive points.

I tried to adapt this as follows:

``````
df['LAT_rad'], df['LON_rad'] = np.radians(df['LAT']), np.radians(df['LONG'])
df['distance'] = 6367 * 2 * np.arcsin(np.sqrt(np.sin(df['dLAT']/2)**2 + math.cos(df['LAT_rad'].astype(float).shift(-1)) * np.cos(df['LAT_rad']) * np.sin(df['dLON']/2)**2))
``````

However, I get the following error:

``````
Traceback (most recent call last):
  File "C:\Python27\test.py", line 115, in <module>
    df['distance'] = 6367 * 2 * np.arcsin(np.sqrt(np.sin(df['dLAT']/2)**2 + math.cos(df['LAT_rad'].astype(float).shift(-1)) * np.cos(df['LAT_rad']) * np.sin(df['dLON']/2)**2))
  File "C:\Python27\lib\site-packages\pandas\core\series.py", line 78, in wrapper
    "{0}".format(str(converter)))
TypeError: cannot convert the series to <type 'float'>
[Finished in 2.3s with exit code 1]
``````

MaxU's comment fixed this error. With the fix applied, though, the output does not make sense: the distances come out at nearly 8000 km:

``````
   Ser_Numb        LAT       LONG   LAT_rad   LON_rad      dLON      dLAT     distance
0         1  74.166061  30.512811  1.294442  0.532549       NaN       NaN          NaN
1         2  72.249672  33.427724  1.260995  0.583424  0.574129  1.238402  8010.487211
2         3  67.499828  37.937264  1.178094  0.662130  0.651947  1.156086  7415.364469
3         4  84.253715  69.328767  1.470505  1.210015  1.198459  1.449943  9357.184623
4         5  72.104828  33.823462  1.258467  0.590331  0.569212  1.232802  7992.087820
5         6  63.989462  51.918173  1.116827  0.906143  0.895840  1.094862  7169.812123
6         7  80.209112  33.530778  1.399913  0.585222  0.569407  1.380421  8851.558260
7         8  68.954132  35.981256  1.203477  0.627991  0.617777  1.179044  7559.609520
8         9  83.378214  40.619652  1.455224  0.708947  0.697986  1.434220  9194.371978
9        10  68.778571   6.607066  1.200413  0.115315  0.102942  1.175014          NaN
``````

According to:

• this online calculator: with Latitude1 = 74.166061, Longitude1 = 30.512811, Latitude2 = 72.249672, Longitude2 = 33.427724, I get 233 km

• the haversine function found here: `print haversine(30.512811, 74.166061, 33.427724, 72.249672)` gives 232.55 km
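The linked haversine function is not reproduced in this post. As a rough illustration only, a scalar haversine with the same argument order as the call above (longitude first, then latitude) might look like the sketch below. The function body and the Earth radius of 6367 km are assumptions, not a copy of the linked code.

```python
from math import radians, sin, cos, asin, sqrt

def haversine(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between two points given in decimal degrees."""
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    # haversine formula
    a = sin(dlat / 2)**2 + cos(lat1) * cos(lat2) * sin(dlon / 2)**2
    return 6367 * 2 * asin(sqrt(a))  # 6367 km is an assumed Earth radius

# first two rows of the DataFrame
d = haversine(30.512811, 74.166061, 33.427724, 72.249672)
print(round(d, 2))
```

With these inputs the sketch should land on roughly the 232.55 km quoted above.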

The answer should be 233 km, but my approach is giving ~8000 km. I think there is something wrong with how I am trying to iterate between successive rows.

Question:
Is there a way to do this in Pandas? Or do I need to loop through the dataframe one row at a time?

To recreate the DataFrame above, select it and copy it to the clipboard. Then:

``````
import pandas as pd
df = pd.read_clipboard()
print(df)
``````

You can use this great solution (c) @ballsatballsdotballs (don't forget to upvote it ;-):

``````
import numpy as np

def haversine_np(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)

    All args must be of equal length.

    """
    # convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    # haversine formula
    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2

    c = 2 * np.arcsin(np.sqrt(a))
    km = 6367 * c  # 6367 km: approximate Earth radius
    return km

# shift() moves each value down one row, pairing every row with the previous one
df['dist'] = \
    haversine_np(df.LONG.shift(), df.LAT.shift(),
                 df.loc[1:, 'LONG'], df.loc[1:, 'LAT'])
``````

Result:

``````
In [566]: df
Out[566]:
   Ser_Numb        LAT       LONG         dist
0         1  74.166061  30.512811          NaN
1         2  72.249672  33.427724   232.549785
2         3  67.499828  37.937264   554.905446
3         4  84.253715  69.328767  1981.896491
4         5  72.104828  33.823462  1513.397997
5         6  63.989462  51.918173  1164.481327
6         7  80.209112  33.530778  1887.256899
7         8  68.954132  35.981256  1252.531365
8         9  83.378214  40.619652  1606.340727
9        10  68.778571   6.607066  1793.921854
``````
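To reproduce this result without the clipboard step, here is a minimal self-contained sketch. It assumes Python 3 with NumPy and pandas installed and rebuilds the question's data inline. Passing the unsliced `LONG` and `LAT` columns as the second point is a choice made for this sketch; row 0 still comes out as NaN because its shifted partner is NaN.

```python
import numpy as np
import pandas as pd

def haversine_np(lon1, lat1, lon2, lat2):
    """Great-circle distance in km between points given in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = np.sin(dlat / 2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2.0)**2
    return 6367 * 2 * np.arcsin(np.sqrt(a))  # 6367 km, the radius used above

# data copied from the question
df = pd.DataFrame({
    'Ser_Numb': list(range(1, 11)),
    'LAT': [74.166061, 72.249672, 67.499828, 84.253715, 72.104828,
            63.989462, 80.209112, 68.954132, 83.378214, 68.778571],
    'LONG': [30.512811, 33.427724, 37.937264, 69.328767, 33.823462,
             51.918173, 33.530778, 35.981256, 40.619652, 6.607066],
})

# shift() moves each value down one row, so row i is paired with row i-1
df['dist'] = haversine_np(df['LONG'].shift(), df['LAT'].shift(),
                          df['LONG'], df['LAT'])
print(df)
```

Because the whole calculation is vectorized over the columns, no explicit loop over rows is needed.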

UPDATE: this shows which values get paired up. Column 0 holds the previous row's latitude and column 1 the current row's:

``````
In [573]: pd.concat([df['LAT'].shift(), df.loc[1:, 'LAT']], axis=1, ignore_index=True)
Out[573]:
           0          1
0        NaN        NaN
1  74.166061  72.249672
2  72.249672  67.499828
3  67.499828  84.253715
4  84.253715  72.104828
5  72.104828  63.989462
6  63.989462  80.209112
7  80.209112  68.954132
8  68.954132  83.378214
9  83.378214  68.778571
``````
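A note on the ~8000 km figures in the question: the code that produced `dLAT` and `dLON` is not shown, but the printed values look consistent with the previous row's radian value having been passed through `np.radians` a second time before the subtraction. This is an inference from the printed numbers only. A quick check on rows 0 and 1, using the `LAT_rad` values from the question's output:

```python
import numpy as np

# LAT_rad for rows 0 and 1, copied from the question's printed output
lat_rad_prev, lat_rad_curr = 1.294442, 1.260995

# what dLAT should be: a plain difference of two values already in radians
plain_diff = lat_rad_curr - lat_rad_prev

# hypothetical mistake: converting the previous row to radians a second time
double_converted = lat_rad_curr - np.radians(lat_rad_prev)

print(plain_diff)        # roughly -0.0334
print(double_converted)  # roughly 1.2384, close to the dLAT printed for row 1
```

Separately, `shift(-1)` in the posted formula looks at the next row rather than the previous one, which would explain why the last row's distance is NaN.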