MaruB - 6 months ago
R Question

Grouping by row: data.table type change

This is related to the question "Group by in data.table in R which only keep non NA values from columns".

I have

df <- data.frame(x = c('a', 'a', 'b', 'b' ), y = c(1,NA,2,NA), z = c(NA, 3, NA, 4))


x y z
1 a 1 NA
2 a NA 3
3 b 2 NA
4 b NA 4

and I want

df2 <- data.frame(x = c('a', 'b' ), y = c(1,2), z = c(3,4))


x y z
1 a 1 3
2 b 2 4

I am having the same issue as in the above question. I tried the accepted answer and it worked, but it changed the type of the contents of my data frame. I need them to stay numeric for downstream analysis, and converting them back afterwards did not work. I also tried solving the initial question using dplyr, but that didn't work either, so I guess I am misunderstanding the function (still a beginner in R and data analysis in general!).

Sorry for the very basic question but I have been stuck trying to solve this for a while! Any suggestions are welcome.



We can do this with data.table

# for each group of x, keep only the non-NA values of every other column
dt1 <- setDT(df)[, lapply(.SD, function(x) x[!is.na(x)]), x]
str(dt1)
#Classes ‘data.table’ and 'data.frame':  2 obs. of  3 variables:
#$ x: Factor w/ 2 levels "a","b": 1 2
#$ y: num  1 2
#$ z: num  3 4

# setDT() converted 'df' itself to a data.table by reference; its columns are unchanged
str(df)
#Classes ‘data.table’ and 'data.frame':  4 obs. of  3 variables:
#$ x: Factor w/ 2 levels "a","b": 1 1 2 2
#$ y: num  1 NA 2 NA
#$ z: num  NA 3 NA 4
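
The approach above keeps every non-NA value per group, so it relies on each group having exactly one non-NA value in each column, as in the example data. If a group could have none or several, the shape of the result may change or the call may fail. A minimal sketch of a more defensive variant, assuming that taking the first non-NA value is acceptable (the helper name first_non_na is hypothetical, not part of data.table):

```r
library(data.table)

df <- data.frame(x = c('a', 'a', 'b', 'b'), y = c(1, NA, 2, NA), z = c(NA, 3, NA, 4))

# Hypothetical helper: first non-NA value of v, or NA of the same type if there is none
first_non_na <- function(v) v[!is.na(v)][1L]

# as.data.table() works on a copy, so 'df' itself stays a plain data.frame here
dt2 <- as.data.table(df)[, lapply(.SD, first_non_na), by = x]

# the value columns keep their numeric type
sapply(dt2[, .(y, z)], class)
```

Because the helper only subsets the original vector, y and z are never coerced to another type.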

If needed, we can convert 'dt1' back to a 'data.frame' with setDF.
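
A short sketch of that conversion, using a stand-in 'dt1' built directly from the expected result so the snippet runs on its own:

```r
library(data.table)

# stand-in for the grouped result
dt1 <- data.table(x = c('a', 'b'), y = c(1, 2), z = c(3, 4))

# setDF() converts by reference, so no assignment is needed
setDF(dt1)

class(dt1)          # now a plain data.frame
is.numeric(dt1$y)   # the column types are untouched by the conversion
```

Since setDF only changes the class of the object, the numeric columns are still available as numeric for downstream analysis.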