user1718097 - 2 months ago
Python Question

Converting between projections using pyproj in Pandas dataframe

This is undoubtedly a bit of a "can't see the wood for the trees" moment. I've been staring at this code for an hour and can't see what I've done wrong. I know it's staring me in the face but I just can't see it!

I'm trying to convert between two geographical co-ordinate systems using Python.

I have longitude (x-axis) and latitude (y-axis) values and want to convert to OSGB 1936. For a single point, I can do the following:

import numpy as np
import pandas as pd
import shapefile
import pyproj

inProj = pyproj.Proj(init='epsg:4326')
outProj = pyproj.Proj(init='epsg:27700')

x1,y1 = (-2.772048, 53.364265)

x2,y2 = pyproj.transform(inProj,outProj,x1,y1)

print(x1,y1)
print(x2,y2)


This produces the following:

-2.772048 53.364265
348721.01039783185 385543.95241055806


Which seems reasonable and suggests that longitude of -2.772048 is converted to a co-ordinate of 348721.0103978.

In fact, I want to do this in a Pandas dataframe. The dataframe contains columns containing longitude and latitude and I want to add two additional columns that contain the converted co-ordinates (called newLong and newLat).

An exemplar dataframe might be:

    latitude  longitude
0  53.364265  -2.772048
1  53.632481  -2.816242
2  53.644596  -2.970592


And the code I've written is:

import numpy as np
import pandas as pd
import shapefile
import pyproj

inProj = pyproj.Proj(init='epsg:4326')
outProj = pyproj.Proj(init='epsg:27700')

df = pd.DataFrame({'longitude':[-2.772048,-2.816242,-2.970592],'latitude':[53.364265,53.632481,53.644596]})

def convertCoords(row):
    x2,y2 = pyproj.transform(inProj,outProj,row['longitude'],row['latitude'])
    return pd.Series({'newLong':x2,'newLat':y2})

df[['newLong','newLat']] = df.apply(convertCoords,axis=1)

print(df)


Which produces:

    latitude  longitude        newLong         newLat
0  53.364265  -2.772048  385543.952411  348721.010398
1  53.632481  -2.816242  415416.003113  346121.990302
2  53.644596  -2.970592  416892.024217  335933.971216


But now it seems that the newLong and newLat values have been mixed up (compared with the results of the single point conversion shown above).

Where have I got my wires crossed to produce this result? (I apologise if it's completely obvious!)

Answer

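When you do df[['newLong','newLat']] = df.apply(convertCoords,axis=1), the columns of the df.apply output are matched to the names on the left by position, not by label. The column order of that output is not guaranteed to be newLong, newLat, because your Series was defined using a dictionary: dictionaries were unordered before Python 3.7, and older pandas versions sorted the keys, which puts newLat ahead of newLong. That is why the two sets of values end up swapped.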
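
The effect of column order on that assignment can be reproduced without pyproj. This is a minimal sketch, assuming a hand-built frame (the name converted is made up for illustration) that stands in for the df.apply output, with its columns deliberately in the order newLat, newLong and the values taken from the question:

```python
import pandas as pd

df = pd.DataFrame({'longitude': [-2.772048], 'latitude': [53.364265]})

# Stand-in for the df.apply(convertCoords, axis=1) result, with the
# columns in the order newLat, newLong (an assumption for illustration).
converted = pd.DataFrame({'newLat': [385543.952411],
                          'newLong': [348721.010398]})

# Assigning a DataFrame to a list of column names pairs them up by
# position: the first name on the left gets the first column on the right.
df[['newLong', 'newLat']] = converted

print(df)
```

Here df['newLong'] ends up holding the values that were labelled newLat, which is the same swap seen in the question.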

You can opt to return a Series with a fixed column ordering:

return pd.Series([x2, y2])
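
Put together, the list-based version might look like the sketch below. To keep it runnable without pyproj, it assumes a hypothetical stand-in called fake_transform that simply scales its inputs; in the real code that call would be the pyproj conversion from the question.

```python
import pandas as pd

def fake_transform(lon, lat):
    # Hypothetical stand-in for the pyproj conversion (not a real
    # projection); it only scales the inputs so the result is easy to check.
    return lon * 10.0, lat * 10.0

df = pd.DataFrame({'longitude': [1.0, 2.0], 'latitude': [50.0, 51.0]})

def convertCoords(row):
    x2, y2 = fake_transform(row['longitude'], row['latitude'])
    # A list fixes the order: x2 is always first, y2 always second.
    return pd.Series([x2, y2])

# The apply result has columns 0 and 1, so the positional assignment
# now lines up with ['newLong', 'newLat'].
df[['newLong', 'newLat']] = df.apply(convertCoords, axis=1)

print(df)
```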

Alternatively, if you want to keep the convertCoords output labelled, then you can use .join to combine results instead:

return pd.Series({'newLong':x2,'newLat':y2})
...
df = df.join(df.apply(convertCoords, axis=1))
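
As a complete sketch of the .join variant, again assuming a hypothetical fake_transform in place of the pyproj call so that it runs on its own:

```python
import pandas as pd

def fake_transform(lon, lat):
    # Hypothetical stand-in for the pyproj conversion (not a real
    # projection); it only scales the inputs so the result is easy to check.
    return lon * 10.0, lat * 10.0

df = pd.DataFrame({'longitude': [1.0, 2.0], 'latitude': [50.0, 51.0]})

def convertCoords(row):
    x2, y2 = fake_transform(row['longitude'], row['latitude'])
    # Keys deliberately listed with newLat first: join keeps the labels,
    # so the order of the keys does not affect which column gets which value.
    return pd.Series({'newLat': y2, 'newLong': x2})

# join aligns on the row index and carries the column labels across.
df = df.join(df.apply(convertCoords, axis=1))

print(df)
```

Because the new columns arrive with their own labels, nothing depends on the order of the Series. One thing to watch: running the join a second time on the same frame should fail, since newLong and newLat would then already exist.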