covariance covariance - 6 months ago 23
JSON Question

read_json populates cells with empty lists; how to remove those rows

I've got a Pandas dataframe created with pd.read_json(). When I read it in, a few cells contain just an empty list or None, and I want to detect the rows that have [] or None in certain columns. For example:

  feat 1 feat 2  feat 3
0     []     []       5
1      6      8       3
2   None     10     NaN


I want to remove rows 0 and 2 because they have None/NaN/empty lists. How can I do this with Pandas?

Answer

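You can use applymap to build a boolean mask of the cells that hold [] or None, then use where to turn those cells into NaN:

Note: replace works for None but not for []. This solution seems to be a little sensitive. where keeps the values where the condition is True and puts NaN everywhere else, hence the negation (~) of the mask.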

In [11]: df.applymap(lambda x: x == [] or x is None)
Out[11]:
  feat 1 feat 2 feat 3
0   True   True  False
1  False  False  False
2   True  False  False

In [12]: df.where(~df.applymap(lambda x: x == [] or x is None))
Out[12]:
  feat 1 feat 2  feat 3
0    NaN    NaN       5
1      6      8       3
2    NaN     10     NaN
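
The output above only marks the bad cells as NaN; the question also asks for those rows to be removed. A minimal sketch of that last step follows, assuming a frame rebuilt by hand to look like the one in the question (the real one would come from pd.read_json()). The helper name is_empty is made up for illustration, and the hasattr check is there because applymap was deprecated in pandas 2.1 in favour of DataFrame.map.

```python
import pandas as pd

# Hand-built stand-in for the frame in the question (assumption: the real
# data would come from pd.read_json()).
df = pd.DataFrame({
    "feat 1": [[], 6, None],
    "feat 2": [[], 8, 10],
    "feat 3": [5, 3, float("nan")],
})

def is_empty(x):
    # Hypothetical helper: True for None and for empty lists only.
    return x is None or (isinstance(x, list) and len(x) == 0)

# pandas >= 2.1 has DataFrame.map; older versions only have applymap.
elementwise = df.map if hasattr(df, "map") else df.applymap
mask = elementwise(is_empty)

# where() keeps cells whose condition is True, so negate the mask;
# dropna() then removes every row that still contains a NaN.
cleaned = df.where(~mask).dropna()
print(cleaned)
```

Only row 1 should be left. To check just certain columns, dropna takes a subset argument, for example dropna(subset=["feat 1", "feat 2"]).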