Thirst for Knowledge Thirst for Knowledge - 1 month ago 6
R Question

R - allocate a share of a number over different columns using an ifelse statement

I have the following data set:

observation <- c(1:10)
pop.d.rank <- c(1:10)
cost.1 <- c(101:110)
cost.2 <- c(102:111)
cost.3 <- c(103:112)
all <- data.frame(observation,pop.d.rank,cost)


And I want to allocate the following amount of money over three years:

annual.investment <- 500


I can do this for the first year with the following script:

library(dplyr)

all <- all %>%
mutate(capital_allocated.5G = diff(c(0, pmin(cumsum(cost), annual.investment)))) %>%
mutate(capital_percentage.5G = capital_allocated.5G / cost * 100) %>%
mutate(year = ifelse(capital_percentage.5G >= 50, "Year.1",0))


But when I try to do this for the second year, taking into account the previous year's investment, the code does not work. Here is my attempt at putting an ifelse statement in the mutate loop so that it does not overwrite the money allocated in the previous year:

all <- all %>%
mutate(capital_allocated.5G = ifelse(year == 0, diff(c(0, pmin(cumsum(cost), annual.investment))), 0) %>%
mutate(capital_percentage.5G = capital_allocated.5G / cost * 100) %>%
mutate(year = ifelse(capital_percentage.5G >= 50, "Year.2",0))


I want the data to look like the following, where the amount allocated goes first to any row that hasn't been 100% completed from the previous year.

capital_allocated.5G <- c(101, 102, 103, 104, 105, 106, 107, 108, 109, 55)
capital_percentage.5G <- c(100, 100, 100, 100, 100, 100, 100, 100, 100, 50)
year <- c("Year.1", "Year.1","Year.1", "Year.1","Year.1", "Year.2", "Year.2","Year.2", "Year.2","Year.2")
example.output <- data.frame(observation,pop.d.rank,cost, capital_allocated.5G, capital_percentage.5G, year)


Edit: cost.1 is the cost variable for year 1, cost.2 is the variable for year 2 and cost.3 is the cost variable for year 3

Answer

The original issue with your code is that ifelse just provide a switch on the output based on the condition and not the input cost used within the TRUE branch of the ifelse. Therefore, cumsum(cost) computes the cumsum over all cost and not only on the portion of the TRUE branch of the ifelse. To fix this, we can define the following function that can then be executed for each year in turn.

library(dplyr)
alloc.invest <- function(df, ann.invest, y) {
  df %>% mutate(not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
                capital_allocated.5G = capital_allocated.5G + ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
                capital_percentage.5G = capital_allocated.5G / cost * 100,
                year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
         select(-not.yet.alloc)
}

Note:

  1. Create a new temporary column not.yet.alloc from which we compute the resulting cumsum for the year's allocation.
  2. Don't need separate mutate statements.
  3. Need to also check is.na(year) before setting year. Otherwise, previous year already labelled will be overwritten.

To use this function, we must first augment the input data with some initial values for capital_allocated.5G, capital_percentage.5G, and year:

capital_allocated.5G <- rep(0,10)   ## initialize to zero
capital_percentage.5G <- rep(0,10)  ## initialize to zero
year <- rep(NA,10)                  ## initialize to NA
all <- data.frame(observation,pop.d.rank,cost,capital_allocated.5G,capital_percentage.5G,year) 

Then for Year 1:

annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
##   observation pop.d.rank cost capital_allocated.5G capital_percentage.5G   year
##1            1          1  101                  101             100.00000 Year.1
##2            2          2  102                  102             100.00000 Year.1
##3            3          3  103                  103             100.00000 Year.1
##4            4          4  104                  104             100.00000 Year.1
##5            5          5  105                   90              85.71429 Year.1
##6            6          6  106                    0               0.00000   <NA>
##7            7          7  107                    0               0.00000   <NA>
##8            8          8  108                    0               0.00000   <NA>
##9            9          9  109                    0               0.00000   <NA>
##10          10         10  110                    0               0.00000   <NA>

and for Year 2:

all <- alloc.invest(all,annual.investment,2)
print(all)
##   observation pop.d.rank cost capital_allocated.5G capital_percentage.5G   year
##1            1          1  101                  101                   100 Year.1
##2            2          2  102                  102                   100 Year.1
##3            3          3  103                  103                   100 Year.1
##4            4          4  104                  104                   100 Year.1
##5            5          5  105                  105                   100 Year.1
##6            6          6  106                  106                   100 Year.2
##7            7          7  107                  107                   100 Year.2
##8            8          8  108                  108                   100 Year.2
##9            9          9  109                  109                   100 Year.2
##10          10         10  110                   55                    50 Year.2  

Update to new requirement of changing costs per year

If costs are different per year, then the function needs to readjust the capital_percentage.5G and possibly the year columns first:

library(dplyr)
alloc.invest <- function(df, ann.invest, y) {
  df %>% mutate_(cost=paste0("cost.",y)) %>%
         mutate(capital_percentage.5G = capital_allocated.5G / cost * 100,
                year = ifelse(capital_percentage.5G < 50, NA, year),
                not.yet.alloc = ifelse(capital_percentage.5G < 100,cost-capital_allocated.5G,0),
                capital_allocated.5G = capital_allocated.5G + ifelse(capital_percentage.5G < 100,diff(c(0, pmin(cumsum(not.yet.alloc), ann.invest))), 0),
                capital_percentage.5G = capital_allocated.5G / cost * 100,
                year = ifelse(is.na(year) & capital_percentage.5G >= 50, paste0("Year.",y), year)) %>%
         select(-cost,-not.yet.alloc)
}

Note that creating another temporary column cost using mutate_ is only for convenience as the cost column needs to be dynamically selected based on the input y (otherwise, we need to use mutate_ for all computations, which will be somewhat messier).

With the updated data similarly augmented with initial values for capital_allocated.5G, capital_percentage.5G, and year, Year 1:

annual.investment <- 500
all <- alloc.invest(all,annual.investment,1)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    102    103                  101             100.00000 Year.1
##2            2          2    102    103    104                  102             100.00000 Year.1
##3            3          3    103    104    105                  103             100.00000 Year.1
##4            4          4    104    105    106                  104             100.00000 Year.1
##5            5          5    105    106    107                   90              85.71429 Year.1
##6            6          6    106    107    108                    0               0.00000   <NA>
##7            7          7    107    108    109                    0               0.00000   <NA>
##8            8          8    108    109    110                    0               0.00000   <NA>
##9            9          9    109    110    111                    0               0.00000   <NA>
##10          10         10    110    111    112                    0               0.00000   <NA>

Year 2: Note that last asset has less than 50% allocated so its year is still NA.

all <- alloc.invest(all,annual.investment,2)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    102    103                  102             100.00000 Year.1
##2            2          2    102    103    104                  103             100.00000 Year.1
##3            3          3    103    104    105                  104             100.00000 Year.1
##4            4          4    104    105    106                  105             100.00000 Year.1
##5            5          5    105    106    107                  106             100.00000 Year.1
##6            6          6    106    107    108                  107             100.00000 Year.2
##7            7          7    107    108    109                  108             100.00000 Year.2
##8            8          8    108    109    110                  109             100.00000 Year.2
##9            9          9    109    110    111                  110             100.00000 Year.2
##10          10         10    110    111    112                   46              41.44144   <NA>

Year 3:

all <- alloc.invest(all,annual.investment,3)
print(all)
##   observation pop.d.rank cost.1 cost.2 cost.3 capital_allocated.5G capital_percentage.5G   year
##1            1          1    101    102    103                  103                   100 Year.1
##2            2          2    102    103    104                  104                   100 Year.1
##3            3          3    103    104    105                  105                   100 Year.1
##4            4          4    104    105    106                  106                   100 Year.1
##5            5          5    105    106    107                  107                   100 Year.1
##6            6          6    106    107    108                  108                   100 Year.2
##7            7          7    107    108    109                  109                   100 Year.2
##8            8          8    108    109    110                  110                   100 Year.2
##9            9          9    109    110    111                  111                   100 Year.2
##10          10         10    110    111    112                  112                   100 Year.3
Comments