clg4 - 2 months ago 23

Python Question

I want to create a column in a pandas DataFrame that is a function of a variable (dynamic) list of column names.

Typical column creation would be:

`df['new']=(df['one']*x)+(df['two']*y)+(df['three']*z)`

where x,y,z are variables from another df.

`x 1.1`

`y 1.2`

`z 1.3`

`a 1.4`

`b 1.5`

`c 1.6`

I want to create a column which would be a function of a variable list of columns.

So for instance if:

`cols=['one','two']`

then the formula would be created as:

`df['new']=(df['one']*x)+(df['two']*y)`

But if cols changes to:

`cols=['one','two','three','four']`

then the formula would change to:

`df['new']=(df['one']*x)+(df['two']*y)+(df['three']*z)+(df['four']*a)`

I know I must be missing something easy here.

Answer

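`zip` stops at the end of its shortest input, so `[(a, b) for a, b in zip([1, 2], [3, 4, 5, 6])]` returns `[(1, 3), (2, 4)]`. Zipping `cols` with the full list of multipliers therefore pairs each column with its multiplier and ignores the multipliers left over.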

```
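import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(5, 5), columns=list('ABCDE'))
x = 1.1
y = 1.2
z = 1.3
a = 1.4
b = 1.5
c = 1.6
var = [x, y, z, a, b, c]
cols = ['A', 'B', 'C']

# zip stops after len(cols) pairs, so only x, y and z are used here.
result = sum(df[col] * v for col, v in zip(cols, var))
print(result)
# Example output (values differ between runs because the data is random):
# 0    0.729284
# 1    2.671124
# 2    1.804285
# 3    0.791489
# 4    1.818327
# dtype: float64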
```
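Since the question says the multipliers come from another DataFrame, one alternative is to key them by column name rather than rely on list order. The following is only a sketch under assumptions not stated in the original: the multipliers are held in a Series named `weights` whose index matches the column names, the sample values are made up, and `weighted_sum` is a hypothetical helper name.

```python
import pandas as pd

# Hypothetical data: column names follow the question, the values are made up.
df = pd.DataFrame({
    "one": [1.0, 2.0, 3.0],
    "two": [10.0, 20.0, 30.0],
    "three": [100.0, 200.0, 300.0],
    "four": [1000.0, 2000.0, 3000.0],
})

# Assumption: each multiplier is keyed by the column it applies to
# (x -> 'one', y -> 'two', z -> 'three', a -> 'four').
weights = pd.Series({"one": 1.1, "two": 1.2, "three": 1.3, "four": 1.4})


def weighted_sum(frame, weights, cols):
    """Return the row-wise sum of frame[c] * weights[c] over c in cols."""
    # weights.loc[cols] raises KeyError if a column has no multiplier,
    # instead of silently dropping it as zip would.
    return frame[cols].dot(weights.loc[cols])


df["new"] = weighted_sum(df, weights, ["one", "two"])
df["new_all"] = weighted_sum(df, weights, ["one", "two", "three", "four"])
```

Changing `cols` changes the formula without rewriting it, and a column name that has no multiplier fails loudly rather than being skipped.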

Source (Stackoverflow)
