xApple - 1 year ago 203

Python Question

The simple task of adding a row to a `pandas.DataFrame`

Here is what I'm trying to do. I have a DataFrame of which I already know the shape as well as the names of the rows and columns.

```
>>> df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'])
>>> df
     a    b    c    d
x  NaN  NaN  NaN  NaN
y  NaN  NaN  NaN  NaN
z  NaN  NaN  NaN  NaN
```

Now, I have a function to compute the values of the rows iteratively. How can I fill in one of the rows with either a dictionary or a `pandas.Series`?

```
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df['y'] = y
AssertionError: Length of values does not match length of index
```

Apparently it tried to add a column instead of a row.

```
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.join(y)
AttributeError: 'builtin_function_or_method' object has no attribute 'is_unique'
```

Very uninformative error message.

```
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.set_value(index='y', value=y)
TypeError: set_value() takes exactly 4 arguments (3 given)
```

Apparently that is only for setting individual values in the dataframe.

```
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.append(y)
Exception: Can only append a Series if ignore_index=True
```

Well, I don't want to ignore the index, otherwise here is the result:

```
>>> df.append(y, ignore_index=True)
     a    b    c    d
0  NaN  NaN  NaN  NaN
1  NaN  NaN  NaN  NaN
2  NaN  NaN  NaN  NaN
3    1    5    2    3
```

It did align the column names with the values, but lost the row labels.

```
>>> y = {'a':1, 'b':5, 'c':2, 'd':3}
>>> df.ix['y'] = y
>>> df
                                  a                                 b  \
x                               NaN                               NaN
y  {'a': 1, 'c': 2, 'b': 5, 'd': 3}  {'a': 1, 'c': 2, 'b': 5, 'd': 3}
z                               NaN                               NaN

                                  c                                 d
x                               NaN                               NaN
y  {'a': 1, 'c': 2, 'b': 5, 'd': 3}  {'a': 1, 'c': 2, 'b': 5, 'd': 3}
z                               NaN                               NaN
```

That also failed miserably.

So how do you do it?


Answer Source

`df['y']` will set a column. Since you want to set a row, use `.loc`.

Note that `.ix` is equivalent here. Yours failed because you tried to assign a dictionary to each element of the row `y`, which is probably not what you want. Converting to a Series tells pandas that you want to align the input (for example, you then don't have to specify all of the elements).

```
In [7]: df = pandas.DataFrame(columns=['a','b','c','d'], index=['x','y','z'])
In [8]: df.loc['y'] = pandas.Series({'a':1, 'b':5, 'c':2, 'd':3})
In [9]: df
Out[9]:
     a    b    c    d
x  NaN  NaN  NaN  NaN
y    1    5    2    3
z  NaN  NaN  NaN  NaN
```
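To illustrate the alignment point, here is a minimal sketch of two things: assigning a Series that names only some of the columns, and filling the remaining rows in a loop. It uses `.loc` throughout, since `.ix` has been removed from recent pandas releases. The `compute_row` function is a hypothetical stand-in for whatever computes your row values; it is not part of pandas.

```python
import pandas as pd

# Same empty frame as in the question.
df = pd.DataFrame(columns=['a', 'b', 'c', 'd'], index=['x', 'y', 'z'])

# A Series that names only some of the columns: pandas aligns on the
# labels, so 'b' and 'd' in row 'y' are set to NaN.
df.loc['y'] = pd.Series({'a': 1, 'c': 2})


def compute_row(label):
    # Hypothetical placeholder for the function that computes a row.
    return {'a': 1, 'b': 5, 'c': 2, 'd': 3}


# Fill the other rows one at a time, converting each dict to a Series
# so that the values are aligned with the column names.
for label in ['x', 'z']:
    df.loc[label] = pd.Series(compute_row(label))

print(df)
```

Keep in mind that the assignment replaces the whole row, so any column missing from the Series is overwritten with NaN rather than left as it was.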