hdy hdy - 2 years ago 134
SQL Question

pyspark sql dataframe keep only null

I have a SQL dataframe with a column user_id. How do I filter the dataframe to keep only the rows where user_id is actually null, for further analysis? From the pyspark module page here, one can drop NA rows easily, but it does not say how to do the opposite.

I tried

df.filter(df.user_id == 'null')

but the result is 0 rows; it is presumably looking for the string "null". Also,

df.filter(df.user_id == null)

won't work, as Python looks for a variable named null.

Answer Source

Comparing against None does not work either: in Spark SQL, any comparison with NULL evaluates to NULL rather than true, so the filter matches no rows. Use the Column.isNull() method instead:

df.filter(df.user_id.isNull())