Deepak M - 27 days ago
Python Question

Filtering multiple conditions from a Dataframe in Python

I want to filter rows of a DataFrame using conditions on multiple columns. I tried doing so like this:

arrival_delayed_weather = [[[flight_data_finalcopy["ArrDelay"] > 0]] & [[flight_data_finalcopy["WeatherDelay"]>0]]]
arrival_delayed_weather_filter = arrival_delayed_weather[["UniqueCarrier", "AirlineID"]]
print arrival_delayed_weather_filter


However, I get this error message:

TypeError: unsupported operand type(s) for &: 'list' and 'list'


How do I solve this?
Thanks in advance.

Answer

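You need () instead of []. Square brackets create Python lists, and lists do not support &; parentheses only group each comparison. The outer parentheses let the expression continue on the next line:

arrival_delayed_weather = ((flight_data_finalcopy["ArrDelay"] > 0) &
                           (flight_data_finalcopy["WeatherDelay"] > 0))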
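To make the difference between the two bracket styles concrete, here is a minimal sketch. The small DataFrame `df` and its values are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
df = pd.DataFrame({"ArrDelay": [0, 2, 3], "WeatherDelay": [0, 0, 6]})

# Square brackets wrap each comparison in a Python list, and lists
# do not implement the & operator, so this raises TypeError.
try:
    [df["ArrDelay"] > 0] & [df["WeatherDelay"] > 0]
    list_error = None
except TypeError as exc:
    list_error = str(exc)

# Parentheses only group the comparisons, so & combines the two
# boolean Series element by element.
mask = (df["ArrDelay"] > 0) & (df["WeatherDelay"] > 0)

print(list_error)
print(mask.tolist())
```

The parentheses around each comparison are not optional: in Python, & binds more tightly than >, so leaving them out changes how the expression is parsed.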

But it seems you also want to select the columns UniqueCarrier and AirlineID for the rows matching the mask. Use loc for that, a slightly modified form of boolean indexing (older answers use ix, which was deprecated in pandas 0.20 and removed in 1.0):

mask = ((flight_data_finalcopy["ArrDelay"] > 0) &
        (flight_data_finalcopy["WeatherDelay"] > 0))
arrival_delayed_weather_filter = flight_data_finalcopy.loc[mask, ["UniqueCarrier", "AirlineID"]]

Sample:

import pandas as pd

flight_data_finalcopy = pd.DataFrame({'ArrDelay':[0,2,3],
                                      'WeatherDelay':[0,0,6],
                                      'UniqueCarrier':['s','a','w'],
                                      'AirlineID':[1515,3546,5456]})

print (flight_data_finalcopy)
   AirlineID  ArrDelay UniqueCarrier  WeatherDelay
0       1515         0             s             0
1       3546         2             a             0
2       5456         3             w             6

mask = (flight_data_finalcopy["ArrDelay"] > 0) & (flight_data_finalcopy["WeatherDelay"]>0)
print (mask)
0    False
1    False
2     True
dtype: bool

arrival_delayed_weather_filter = flight_data_finalcopy.loc[mask, ["UniqueCarrier", "AirlineID"]]
print (arrival_delayed_weather_filter)
  UniqueCarrier  AirlineID
2             w       5456
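
The sample above can also be run end to end as a single script. This is only a sketch on the same sample data; the helper name `select_weather_delayed` is hypothetical and not part of pandas. It uses `.loc`, which also works on pandas versions where `.ix` is no longer available.

```python
import pandas as pd


def select_weather_delayed(df):
    """Return the carrier columns for rows with both an arrival and a weather delay.

    Hypothetical helper for illustration; not part of pandas.
    """
    mask = (df["ArrDelay"] > 0) & (df["WeatherDelay"] > 0)
    # .loc takes the boolean row mask first, then the list of column labels.
    return df.loc[mask, ["UniqueCarrier", "AirlineID"]]


# Same sample data as in the answer.
flight_data_finalcopy = pd.DataFrame({
    "ArrDelay": [0, 2, 3],
    "WeatherDelay": [0, 0, 6],
    "UniqueCarrier": ["s", "a", "w"],
    "AirlineID": [1515, 3546, 5456],
})

result = select_weather_delayed(flight_data_finalcopy)
print(result)
```

Keeping the mask in its own variable, as the answer does, makes it easy to inspect which rows matched before selecting columns.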