Jake Jake - 4 months ago 9
R Question

Find the related group

I have data like this:

num group
0 433
0 433
0 433
0 211
0 211
0 211
1 309
1 309
1 309
0 424
0 947
1 309
0 433


I would like to check whether a specific group always has the same value (0 or 1) in the num column. My data frame has many rows, and a group can appear in several places (e.g. group 433 is at the start but also shows up again in later rows of df). How can I check this?

Answer

Here is an option using data.table that checks whether the number of unique elements in the 'num' column is 1 within each 'group'. Convert the 'data.frame' to a 'data.table' (setDT(df1)), group by 'group', and if the number of unique elements in 'num' is 1, return the Subset of Data.table (.SD) for that group. Groups that fail the check return nothing and are dropped.

library(data.table)
setDT(df1)[, if(uniqueN(num) == 1) .SD, by = group]

NOTE: In the example provided by the OP, every 'group' has only a single unique 'num' value, so this returns all rows of the dataset.
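
Since every group in the OP's data passes the check, a small reproducible sketch may make the filtering easier to see. It rebuilds 'df1' from the data in the question and adds a hypothetical 'df2' (not in the original post) in which group 433 gets one extra row with num = 1, so that group should be dropped.

```r
library(data.table)

# Data from the question; 'df1' is the name the answer uses.
df1 <- data.frame(
  num   = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0),
  group = c(433, 433, 433, 211, 211, 211, 309, 309, 309, 424, 947, 309, 433)
)

# Hypothetical variant (an assumption for illustration only):
# group 433 gets one row with num = 1, so it now mixes 0 and 1.
df2 <- rbind(df1, data.frame(num = 1, group = 433))

# Keep only the groups whose 'num' has a single distinct value.
res1 <- setDT(df1)[, if (uniqueN(num) == 1) .SD, by = group]
res2 <- setDT(df2)[, if (uniqueN(num) == 1) .SD, by = group]

nrow(res1)          # every row of the original data is kept
unique(res2$group)  # group 433 no longer appears
```

Note that setDT() converts the data frame by reference, so 'df1' and 'df2' are data.tables after these calls.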

If we only need the ids of the groups that have a single unique 'num' value:

setDT(df1)[, if(uniqueN(num) == 1)  group, by = group]$V1
#[1] 433 211 309 424 947

If we need the 'num' value along with 'group' whenever there is only a single unique 'num' per 'group':

setDT(df1)[, if (uniqueN(num) == 1) .(num = num[1L]), by = group]
#   group num
#1:   433   0
#2:   211   0
#3:   309   1
#4:   424   0
#5:   947   0
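
The question also asks about checking one specific group. A minimal sketch of that, assuming the same data as above, is to subset on the group id and apply the same uniqueN() test. The helper name 'is_constant' is hypothetical and not part of data.table.

```r
library(data.table)

# Data from the question, built directly as a data.table.
df1 <- data.table(
  num   = c(0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 0),
  group = c(433, 433, 433, 211, 211, 211, 309, 309, 309, 424, 947, 309, 433)
)

# Hypothetical helper: TRUE when every row of group 'g' has the same 'num'.
# It looks at all rows of the group, wherever they sit in the table.
is_constant <- function(dt, g) {
  dt[group == g, uniqueN(num) == 1]
}

is_constant(df1, 433)  # group 433 appears at the start and again at the end
is_constant(df1, 309)
```

Because the subset is done on the 'group' value rather than on row positions, it does not matter that the rows of a group are scattered through the table.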