Trexion Kameha Trexion Kameha - 2 months ago 13
Python Question

Python: Lookup value between two other columns in another dataframe

I have a dataframe with IP addresses:

user_id ip_address
0 22058 7.327584e+08
1 333320 3.503114e+08
2 1359 2.621474e+09
3 150084 3.840542e+09
4 221365 4.155831e+08

I have a lookup dataframe where the ip_address as a low and a high, mapped to a country:

lower_bound_ip_address upper_bound_ip_address country
0 16777216.0 16777471 Australia
1 16777472.0 16777727 China
2 16777728.0 16778239 China
3 16778240.0 16779263 Australia
4 16779264.0 16781311 China

I would like to add 'country' into the above dataframe as a new column by looking up the ip_address between the lower and upper bound.

for i in range(len(ip_data)):
tmp=df_ip[(df_ip['lower_bound_ip_address']<=ip) & (df_ip['upper_bound_ip_address']>=ip)]

However this returns:

user_id ip_address country
0 22058 7.327584e+08 [Japan]
1 333320 3.503114e+08 [United States]
2 1359 2.621474e+09 [United States]
3 150084 3.840542e+09 []
4 221365 4.155831e+08 [United States]

The country column has those brackets in there. What am I doing wrong here?

Anyone have any idea what I am doing wrong? In SQL this is a very easy join with "between" as the operator. Thank you.


apply is my usual go to solution in cases like this:

def lookup_country(ip):

    return df_ip[(df_ip['lower_bound_ip_address']<ip) & (df_ip['upper_bound_ip_address']>=ip)]['country']

df_users['country'] = df_users.apply(lambda row: lookup_country(row['ip_address']), axis=1)

There is probably a join, merge or concatenation solution to this, but this is the more obvious way to solve the problem, I think.

Note: this solution relies on the conditions returning only a single entry.