jeromeResearch jeromeResearch -4 years ago 136
R Question

R data table: replace subset of row values across multiple columns using conditional with another column

This is my first post in stack overflow so forgive any mistakes. I'm also very new to R syntax and data tables.

Specifically for a data table, I want to conditionally test and replace row values across four columns in comparison with values in a fifth column. Example data is the following:

head(loadProfiles)
load_ev_ag load_ev_res load_ev_res_tou load_ev_workplace maxICA
1: 8469.231 2317.895 36700.00 220200.000 8808
2: 8768.000 2609.524 36533.33 36533.333 8768
3: 8744.000 3168.116 27325.00 10409.524 8744
4: 7006.452 3810.526 24133.33 3620.000 8688
5: 5794.595 4660.870 19490.91 2144.000 8576
6: 6057.143 5888.889 16307.69 2208.333 8480
7: 7036.667 7279.310 14073.33 2814.667 8444
8: 8107.692 8107.692 14053.33 3634.483 8432
9: 8138.462 9200.000 11755.56 3992.453 8464
10: 8173.077 10625.000 10119.05 4427.083 8500


What I would like to do is loop the following action over each of the first 4 columns, comparing each column to values in the fifth column.

loadProfiles[load_ev_ag >= maxICA, load_ev_ag := maxICA]


The result I want should look like the following:

head(loadProfiles)
load_ev_ag load_ev_res load_ev_res_tou load_ev_workplace maxICA
1: 8469.231 2317.895 8808 8808 8808
2: 8768.000 2609.524 8768 8768 8768
3: 8744.000 3168.116 8744 8744 8744
4: 7006.452 3810.526 8688 3620.000 8688
5: 5794.595 4660.870 8576 2144.000 8576
6: 6057.143 5888.889 8480 2208.333 8480
7: 7036.667 7279.310 8444 2814.667 8444
8: 8107.692 8107.692 8432 3634.483 8432
9: 8138.462 8464 8464 3992.453 8464
10: 8173.077 8500 8500 4427.083 8500


I've tried the following with no luck:

loadProfileNames <- colnames(loadProfiles)[1:4]
loadProfiles[i = (loadProfileNames) >= maxICA,j = (loadProfileNames) := maxICA]


This produces the following warning and also changes all values in the first four columns equal to values in the fifth column

Warning message:
In (loadProfileNames) >= maxICA :
longer object length is not a multiple of shorter object length


I've also tried the following which changes the subset of x rows that meet the criteria
i = (loadProfileNames) >= maxICA
to the first x entries in maxICA rather than to the value in maxICA corresponding to row i in the subset of x rows

for(j in loadProfileNames) { set(loadProfiles,i=which(loadProfiles[[j]] >= loadProfiles[["maxICA"]]),j=j,value=loadProfiles[["maxICA"]]) }


and produces the following warning

Warning messages:
1: In set(loadProfiles, i = which(loadProfiles[[j]] >= loadProfiles[["maxICA"]]), :
Supplied 288 items to be assigned to 24 items of column 'load_ev_ag' (264 unused)
2: In set(loadProfiles, i = which(loadProfiles[[j]] >= loadProfiles[["maxICA"]]), :
Supplied 288 items to be assigned to 108 items of column 'load_ev_res' (180 unused)
3: In set(loadProfiles, i = which(loadProfiles[[j]] >= loadProfiles[["maxICA"]]), :
Supplied 288 items to be assigned to 156 items of column 'load_ev_res_tou' (132 unused)
4: In set(loadProfiles, i = which(loadProfiles[[j]] >= loadProfiles[["maxICA"]]), :
Supplied 288 items to be assigned to 156 items of column 'load_ev_workplace' (132 unused)


I'm pretty much stuck at this point. Any guidance would be much appreciated.

Answer Source

Your first attempt was almost right:

profilenames <- names(loadProfiles)[1:4]
for (i in profilenames) { 
  loadProfiles[get(i) >= maxICA, eval(i) := as.double(maxICA)]
}
Recommended from our users: Dynamic Network Monitoring from WhatsUp Gold from IPSwitch. Free Download