dedpo dedpo - 7 months ago 11
Python Question

writing out after comparing columns from 2 csv/dataframes

I feel like there is a much intuitive way to do this then with for loops but i am currently stuck. i got a portion of it but filled the rest with psuedocode. Would greatly appreciate some help with this.

import pandas as pd
first_df= pd.read_csv('city_states.csv')

second_df = pd.read_csv('locations.csv);

for i in first_df[:,0]: #city column
for j in first_df[:,1] #state column


psuedocode

if i == second_df[:1]: # state column in second data frame
and if j == second_df[:2]: # city column in second data frame


write the longitude and latitude if a match occures to text file delimted by tab

This is what i am trying to do Currently i am trying to generate longitude and latitude from one csv if it's contained in another csv. Maybe data frames. i am looking in to best option

location.csv

zip state city lat lng
35004 AL Acmar 33.584132 -86.51557
35005 AL Adamsville 33.588437 -86.959727
35006 AL Adger 33.434277 -87.167455
35007 AL Keystone 33.236868 -86.812861
35010 AL New Site 32.941445 -85.951086
5014 AL Alpine 33.331165 -86.208934
35016 AL Arab 34.328339 -86.489638


if city and state match the city and state of this second csv. We need to create a txt file with those long and latitides delimited by tab, as written in the above psudeocode. First data frame refers to city_state.csv and second data frame referes to locations.csv which contains the long and lat we are trying to extract if there is match

city_state.csv

City State
Burlington VT
Minneapolis MN
Bloomington IN
Irvine CA


expected output

95.64652295 36.70384646
72.6535921 23.0519796
-86.2651028 43.222406
55.29835975 25.25033149
-117.87059452 34.11669299
-80.26491717 25.80180169

Answer

I think you need merge and to_csv:

print df1
     zip state        city        lat        lng
0  35004    AL       Acmar  33.584132 -86.515570
1  35005    AL  Adamsville  33.588437 -86.959727
2  35006    AL       Adger  33.434277 -87.167455
3  35007    AL    Keystone  33.236868 -86.812861
4  35010    AL     NewSite  32.941445 -85.951086
5   5014    AL      Alpine  33.331165 -86.208934
6  35016    AL        Arab  34.328339 -86.489638

print df2
          City State
0       Alpine    AL
1  Minneapolis    MN
2  Bloomington    IN
3       Irvine    CA

df = pd.merge(df1, df2, left_on=['state','city'], right_on=['State','City'])
print df
    zip state    city        lat        lng    City State
0  5014    AL  Alpine  33.331165 -86.208934  Alpine    AL

df[['lat','lng']].to_csv('filename.txt', sep='\t', header=False, index=False)