Emil Filipov - 8 months ago 54

R Question

I am doing outlier detection on a data set in R.

With `boxplot.stats(x)$out` I get the outliers for the variable I am checking, right? That is, I get the values of the observations that are considered outliers.

What I want to do is add a new binary column to the data set that holds 1 for observations that are outliers and 0 for those that are not.

Example:

`Var1 Var2`

asd 111

dsa 15

ssa 10

aas 9

dad 10

dda 95

Let's say observations 1 and 6 are detected as outliers:

`Var1 Var2`

asd 111

dda 95

When I use:

`outlier <- boxplot.stats(Var2)$out`

I only get the values of the outliers: 111 and 95 are printed in the console.

So, after detecting these outliers, I want to end up with the following:

`Var1 Var2 Outlier`

asd 111 1

dsa 15 0

ssa 10 0

aas 9 0

dad 10 0

dda 95 1

It is probably really easy to do, but I don't know how. Any ideas?

Answer

Say your data frame is named `data` and the outlier values are stored in `outlier`.

Then do this:

```
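# Start with every row marked as a non-outlier
data$outlier <- 0
# Mark rows whose Var2 value is among the detected outlier values
data[data$Var2 %in% outlier, "outlier"] <- 1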
```
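For reference, here is a minimal end-to-end sketch of the same idea. It is only an illustration: the data frame and its `Var2` values are made up (chosen so that one value is clearly extreme), and the column is named `Outlier` to match the layout asked for in the question. It builds the flag in one step by converting the logical result of `%in%` to 0/1.

```r
# Hypothetical data: these values are invented for illustration only
data <- data.frame(
  Var1 = c("asd", "dsa", "ssa", "aas", "dad", "dda", "sad", "ads"),
  Var2 = c(111, 15, 10, 9, 10, 12, 11, 13)
)

# Values that boxplot.stats() considers outliers (default coef = 1.5)
outlier <- boxplot.stats(data$Var2)$out

# TRUE/FALSE membership test converted to a 1/0 column
data$Outlier <- as.integer(data$Var2 %in% outlier)

print(data)
```

Note that `%in%` matches by value, so every row whose `Var2` equals one of the outlier values gets a 1, including duplicates.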

Source (Stackoverflow)