Emil Filipov - 1 month ago
R Question

Creating a new variable for an outlier tag

I am doing outlier detection on a data set in R.

With boxplot.stats(x)$out I get the outliers for the variable I am checking, right? It returns the values of the observations that are considered outliers.

What I want to do is create a new binary column in the data set, with 1 for the observations that are outliers and 0 for the ones that are not.

Example:

Var1 Var2
asd 111
dsa 15
ssa 10
aas 9
dad 10
dda 95


Let's say observations 1 and 6 are detected as outliers:

Var1 Var2
asd 111
dda 95


When I use:

outlier <- boxplot.stats(Var2)$out


I only receive the values of the outliers: I get 111 and 95 in the console.
So, after I have detected these outliers, I want to end up with the following:

Var1 Var2 Outlier
asd 111 1
dsa 15 0
ssa 10 0
aas 9 0
dad 10 0
dda 95 1


It is probably really easy to do, but I don't know how. Any ideas?

Answer

Say your data.frame is named "data" and the outlier values are stored in "outlier".

Then do this:

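# start with every row marked as a non-outlier
data$Outlier <- 0

# set the flag to 1 where Var2 matches one of the outlier values
data[which(data$Var2 %in% outlier), "Outlier"] <- 1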
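The same flag can also be built in one step, because %in% already returns TRUE/FALSE for every row. Below is a minimal sketch under some assumptions: the data frame is called "data" as above, and the values are made up for illustration. With only the six values from the question, the default boxplot rule (coef = 1.5) flags nothing, so a few extra hypothetical rows are added to make 111 and 95 actually come out as outliers.

```r
# Hypothetical example data (not the exact rows from the question), chosen
# so that boxplot.stats() flags 111 and 95 under its default coef = 1.5.
data <- data.frame(
  Var1 = c("asd", "dsa", "ssa", "aas", "dad", "dda", "sds", "ada", "sad"),
  Var2 = c(111, 15, 10, 9, 10, 95, 12, 11, 13)
)

# Values that boxplot.stats() considers outliers
outlier <- boxplot.stats(data$Var2)$out

# %in% gives TRUE/FALSE per row; as.integer() turns that into 1/0
data$Outlier <- as.integer(data$Var2 %in% outlier)

print(data)
```

Note that this matches by value, so every row whose Var2 equals an outlier value gets a 1. That is usually what you want here, since identical values are equally far from the rest of the data.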