user2014 user2014 - 2 months ago 6
R Question

How to replace NULL/? with 'None' or '0' in r

DF1 is

ID CompareID Distance
1 256 0
1 834 0
1 946 0
2 629 0
2 735 1
2 108 1


Expected output should be DF2 as below (Condition for generating DF2 -> In DF1, For any ID if 'Distance'==1, put the corresponding 'CompareID' into 'SimilarID' column, for 'Distance'==0, ignore the corresponding 'CompareID')

ID SimilarID
1 None
2 735,108


Comparison is done correctly , but i got below output

ID SimilarID
1 ?
2 735,108


I understood that, as there are no 'CompareID' to put in 'SimilarID' - ? mark is displayed.
I want to replace this '?' with 'None' or '0'. Kindly help
In some cases, i observe that instead of '?' i can also see 'NULL' value.

Thanks !

Answer

Using the data.table package, where df is your original data ...

library(data.table)
setDT(df)[, .(SimilarID = if(all(Distance == 0)) "None" 
              else toString(CompareID[Distance == 1])), by = ID]
#    ID SimilarID
# 1:  1      None
# 2:  2  735, 108

This follows your expected output by returning, by ID

  • "None" when all of the Distance column is zero
  • the CompareID values for when Distance is 1, as a comma-delimited string

Data:

df <- structure(list(ID = c(1L, 1L, 1L, 2L, 2L, 2L), CompareID = c(256L, 
834L, 946L, 629L, 735L, 108L), Distance = c(0L, 0L, 0L, 0L, 1L, 
1L)), .Names = c("ID", "CompareID", "Distance"), class = "data.frame", row.names = c(NA, 
-6L))
Comments