GBR24 - 3 months ago
Python Question

Iteratively concatenate columns in pandas with NaN values

I have a pandas.DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({"x": ["hello there you can go home now", "why should she care", "please sort me appropriately"],
                   "y": [np.nan, "finally we were able to go home", "but what about meeeeeeeeeee"],
                   "z": ["", "alright we are going home now", "ok fine shut up already"]})

cols = ["x", "y", "z"]


I want to iteratively concatenate these columns, as opposed to writing something like:

df["concat"] = df["x"].str.cat(df["y"], sep = " ").str.cat(df["z"], sep = " ")


I know that three columns are trivial to put together, but I actually have 30, so I would like to do something like:

df["concat"] = df[cols[0]]
for i in range(1, len(cols)):
    df["concat"] = df["concat"].str.cat(df[cols[i]], sep = " ")


Right now, the initial df["concat"] = df[cols[0]] line works fine, but the NaN value at df.loc[0, "y"] messes up the concatenation. Ultimately, the entire first row ends up as NaN in df["concat"] because of this one null value. How can I get around this? Is there some option of pd.Series.str.cat that I need to specify?

Answer

Option 1

pd.Series(df.fillna('').values.tolist()).str.join(' ')

0                    hello there you can go home now  
1    why should she care finally we were able to go...
2    please sort me appropriately but what about me...
dtype: object
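
For reference, here is a slightly expanded sketch of Option 1. It assumes you only want the columns listed in cols (rather than every column of df) and that you want to keep the original index; both are additions of mine, not part of the one-liner above. Note that empty or missing cells still contribute a separator, so the first row ends with trailing spaces.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": ["hello there you can go home now", "why should she care", "please sort me appropriately"],
                   "y": [np.nan, "finally we were able to go home", "but what about meeeeeeeeeee"],
                   "z": ["", "alright we are going home now", "ok fine shut up already"]})
cols = ["x", "y", "z"]

# Replace NaN with empty strings so every cell is a str.
filled = df[cols].fillna("")

# Turn each row into a list of strings, then join each list with a space.
# Passing index=df.index keeps the result aligned with the original frame.
joined = pd.Series(filled.values.tolist(), index=df.index).str.join(" ")
```

If the trailing spaces are a problem, chaining .str.strip() onto the result removes them.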

Option 2

df.fillna('').add(' ').sum(axis=1).str.strip()

0                      hello there you can go home now
1    why should she care finally we were able to go...
2    please sort me appropriately but what about me...
dtype: object
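
As for the question of whether str.cat itself has an option for this: it takes an na_rep argument, which is the string substituted for missing values before concatenation. Below is a minimal sketch of the original loop with na_rep="" added, assuming an empty string is an acceptable stand-in for NaN. The trailing .str.strip() is my addition to tidy up the leftover separators. The one_shot variant relies on str.cat accepting a DataFrame for others, which should hold on reasonably recent pandas versions.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"x": ["hello there you can go home now", "why should she care", "please sort me appropriately"],
                   "y": [np.nan, "finally we were able to go home", "but what about meeeeeeeeeee"],
                   "z": ["", "alright we are going home now", "ok fine shut up already"]})
cols = ["x", "y", "z"]

# Iterative version: na_rep="" makes str.cat treat NaN as an empty string
# instead of propagating NaN to the whole row.
concat = df[cols[0]]
for col in cols[1:]:
    concat = concat.str.cat(df[col], sep=" ", na_rep="")
df["concat"] = concat.str.strip()

# Same idea without the loop: pass the remaining columns in a single call.
one_shot = df[cols[0]].str.cat(df[cols[1:]], sep=" ", na_rep="").str.strip()
```

Either form scales to 30 columns by changing only the cols list.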