user36729 user36729 - 3 months ago 18
Python Question

How to add a column of random numbers based on two conditions?

I have a pandas data frame in Python containing the following information:

Day Type
Weekday 1
Weekday 2
Weekday 3
Weekday 1
Weekend 2
Weekend 1


I want to add a new column of Weibull random numbers, where each pair of "Day" and "Type" has its own Weibull distribution.

For example, I have tried the following code, but it did not work:

df['Duration'][ (df['Day'] == "Weekend") & (df['Type'] == 1) ] = int(random.weibullvariate(5.6/math.gamma(1+1/6),6))

df['Duration'] = df['Day','Type'].map(lambda x,y: int(random.weibullvariate(5.6/math.gamma(1+1/10),10)) if x == "Weekday" and y == 1 if x == "Weekend" and y == 1 int(random.weibullvariate(5.6/math.gamma(1+1/6),6)))

Answer

Define a function that generates the random number you want for a single row, then apply it to every row with axis=1.

import io
import random
import math
import pandas as pd

data = io.StringIO('''\
Day         Type    
Weekday      1    
Weekday      2    
Weekday      3    
Weekday      1    
Weekend      2    
Weekend      1
''')
# sep=r'\s+' splits on runs of whitespace (delim_whitespace is deprecated in recent pandas)
df = pd.read_csv(data, sep=r'\s+')

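def duration(row):
    # random.weibullvariate(alpha, beta): alpha is the scale, beta is the shape.
    # Dividing 5.6 by gamma(1 + 1/beta) gives a distribution whose mean is 5.6.
    if row['Day'] == 'Weekend' and row['Type'] == 1:
        return int(random.weibullvariate(5.6/math.gamma(1+1/6),6))
    if row['Day'] == 'Weekday' and row['Type'] == 1:
        return int(random.weibullvariate(5.6/math.gamma(1+1/10),10))
    # Any other (Day, Type) pair falls through and returns None (NaN in the column).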

df['Duration'] = df.apply(duration, axis=1)
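
If every (Day, Type) pair needs its own distribution, a lookup table may be easier to maintain than a chain of `if` statements. The following is only a sketch under assumptions: `PARAMS` and `draw_duration` are hypothetical names, and only the two Type 1 entries come from the question; the other (mean, shape) values are placeholders to be replaced with real ones.

```python
import math
import random

import pandas as pd

# (Day, Type) -> (mean, shape). Only the two Type 1 rows come from the
# question; the remaining values are made-up placeholders.
PARAMS = {
    ('Weekday', 1): (5.6, 10),
    ('Weekend', 1): (5.6, 6),
    ('Weekday', 2): (4.0, 8),
    ('Weekday', 3): (3.0, 8),
    ('Weekend', 2): (4.0, 5),
}

def draw_duration(day, type_):
    """Draw one Weibull value for a (Day, Type) pair, truncated to an int."""
    mean, shape = PARAMS[(day, type_)]
    # Choose the scale so that the distribution's mean equals `mean`.
    scale = mean / math.gamma(1 + 1 / shape)
    return int(random.weibullvariate(scale, shape))

df = pd.DataFrame({
    'Day': ['Weekday', 'Weekday', 'Weekday', 'Weekday', 'Weekend', 'Weekend'],
    'Type': [1, 2, 3, 1, 2, 1],
})

# One independent draw per row, using that row's own parameters.
df['Duration'] = [draw_duration(d, t) for d, t in zip(df['Day'], df['Type'])]
```

A pair that is missing from `PARAMS` raises a `KeyError` here instead of silently producing NaN, which makes gaps in the table easy to spot.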