Eduardo Zacour - 2 months ago
Python Question

I'm having trouble understanding how this line of code filters a pandas DataFrame's rows

I understand that first I'm assigning the CSV file's content to the DataFrame, but I don't understand what exactly the lambda function is doing to exclude the rows that have the value 'None' in the 'Fat' column.

import pandas as pd

data = pd.read_csv('data.csv', delimiter=';')

filtered_data = data[lambda row: row.Fat != 'None']


It is using the selection-by-callable feature of DataFrames. You can pass a callable (such as a function) inside the square brackets; pandas calls it with the DataFrame and uses the return value to select a subset.
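To see what pandas actually hands to the callable, here is a minimal sketch. The sample data and the helper name keep_rows are made up for illustration (they are not from the question). It records the argument the callable receives, which suggests it is invoked once with the whole DataFrame rather than once per row.

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv
data = pd.DataFrame({
    'Food': ['Apple', 'Butter', 'Water'],
    'Fat': ['0.2', '81', 'None'],
})

received = []  # records every argument pandas passes to the callable

def keep_rows(df):
    # Hypothetical helper: same test as the lambda in the question
    received.append(df)
    return df.Fat != 'None'

filtered_data = data[keep_rows]
calls_during_indexing = len(received)

# How many times the callable ran, and whether it got the DataFrame itself
print(calls_during_indexing, received[0] is data)

# Calling the function by hand and indexing with its result gives the same rows
print(filtered_data.equals(data[keep_rows(data)]))
```

In other words, data[keep_rows] behaves like data[keep_rows(data)].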

The lambda is just a shorthand to create a function, i.e. you could also write:

def is_fat(df):
    # pandas passes in the whole DataFrame; the result is a boolean Series
    return df.Fat != 'None'

and use that function for indexing:

filtered_data = data[is_fat]

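As you can see, pandas calls the lambda once with the whole DataFrame, so the argument named row is really the DataFrame, not a single row. The expression row.Fat != 'None' compares the entire Fat column at once and returns a boolean Series: False where Fat is the string 'None', and True otherwise. Only the rows marked True are kept.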
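The True/False values can be inspected directly. This is a small sketch using made-up sample data (an assumption, not the asker's file): it prints the boolean values the comparison produces and checks that indexing with them yourself gives the same rows as the lambda.

```python
import pandas as pd

# Hypothetical sample data standing in for data.csv
data = pd.DataFrame({
    'Food': ['Apple', 'Butter', 'Water'],
    'Fat': ['0.2', '81', 'None'],
})

# The comparison runs on the whole column and yields one boolean per row
mask = data.Fat != 'None'
print(mask.tolist())  # [True, True, False]

# Indexing with the mask directly selects the same rows as the lambda form
via_mask = data[mask]
via_lambda = data[lambda row: row.Fat != 'None']
print(via_mask.equals(via_lambda))
```

Note that the comparison is against the string 'None', so it would not drop rows where the value is actually missing (NaN).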