mane mane - 4 months ago 43
Python Question

python pandas add column in dataframe from list

[EDIT: Wrong subject/title of post corrected]

I have a dataframe with some columns like this:

A B C
0
4
5
6
7
7
6
5


The possible range of values in A are only from 0 to 7.

Also, I have a list of 8 elements like this:

List=[2,5,6,8,12,16,26,32] //There are only 8 elements in this list


If the element in column A is n, I need to insert the n th element from the List in a new column, say 'D'.

How can I do this in one go without looping over the whole dataframe?

The resulting dataframe would look like this:

A B C D
0 2
4 12
5 16
6 26
7 32
7 32
6 26
5 16


(Note: The dataframe is huge and iteration is the last option option. But I can also arrange the elements in 'List' in any other data structure like dict if necessary)

DSM DSM
Answer

IIUC, if you make your (unfortunately named) List into an ndarray, you can simply index into it naturally.

>>> m = np.arange(16)*10
>>> m[df.A]
array([  0,  40,  50,  60, 150, 150, 140, 130])
>>> df["D"] = m[df.A]
>>> df
    A   B   C    D
0   0 NaN NaN    0
1   4 NaN NaN   40
2   5 NaN NaN   50
3   6 NaN NaN   60
4  15 NaN NaN  150
5  15 NaN NaN  150
6  14 NaN NaN  140
7  13 NaN NaN  130

Here I built a new m, but if you use m = np.asarray(List), the same thing should work: the values in df.A will pick out the appropriate elements of m.


Note that if you're using an old version of numpy, you might have to use m[df.A.values] instead-- in the past, numpy didn't play well with others, and some refactoring in pandas caused some headaches. Things have improved now.