Gustavo - 1 year ago 99
R Question

# R: replace a column's values with random extreme values, below the 12.5% and above the 87.5% quantiles

I have a data set with 10 rows (values). For example:

```r
value <- c(40.557669, 44.436873, 18.541628, 16.524613, 19.34,
           10.07, 17.33, 20.155240, 15.31, 101.23)

data <- data.frame(value)
```

Using `quantile()` I can select values relative to the 25%, 50% and 75% quantiles.

For example:

```r
# Keep the values at or above the 75% quantile (4th element of quantile())
newvalue <- data$value[data$value >= quantile(data$value)[4]]
# Overwrite the column with a random sample (with replacement) of those values
data$value <- sample(newvalue, nrow(data), replace = TRUE)
```

I would like to replace the current values with random extreme values: values below the 12.5% quantile or above the 87.5% quantile.

What is the best way to do that?

Thank you!

I was having issues with your provided dataset, so let's make this reproducible. Start with a `data.frame` with one column, `value`, of 50 random integers:

```r
set.seed(4)
df <- data.frame(value = sample(1:100, 50))
```

Get the 12.5% and 87.5% quantiles:

```r
ntiles <- quantile(df$value, probs = c(0.125, 0.875))
# ntiles
#  12.5%  87.5%
# 19.625 85.500
```

Now subset the `data.frame` into the lower extremes and upper extremes:

```r
lowers <- subset(df, value < ntiles[1])
uppers <- subset(df, value > ntiles[2])
```
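To see the same two steps on the ten values from the question, here is a small sketch. It uses logical indexing on a plain vector instead of `subset()`, and the names `cuts`, `low_vals` and `high_vals` are illustrative only, not part of the answer above.

```r
# The ten values from the question (no trailing comma, so c() parses)
value <- c(40.557669, 44.436873, 18.541628, 16.524613, 19.34,
           10.07, 17.33, 20.155240, 15.31, 101.23)

# 12.5% and 87.5% quantiles, using quantile()'s default interpolation (type 7)
cuts <- quantile(value, probs = c(0.125, 0.875))

# Keep only the values strictly outside the two cut points
low_vals  <- value[value < cuts[1]]   # 10.07 15.31
high_vals <- value[value > cuts[2]]   # 44.436873 101.23
```

With only ten values, each tail holds just two observations, so any resampling from them will repeat the same four numbers.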

Finally, sample from the combined group of `lowers$value` and `uppers$value`:

```r
sample(c(lowers$value, uppers$value), NROW(df), replace = TRUE)
```

I used `NROW(df)` (50 here) as the sample size so that the result has one value per row of the original dataset.
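Since the question asks to replace the column itself, the steps can be wrapped up and assigned back. The following is only a sketch: `replace_with_extremes()` is a hypothetical helper name, and its `lower`/`upper` arguments and the handling of `NA` values are assumptions rather than anything stated in the question.

```r
# Hypothetical helper: replace a numeric vector with random draws
# (with replacement) from its own lower and upper tails.
replace_with_extremes <- function(x, lower = 0.125, upper = 0.875) {
  cuts <- quantile(x, probs = c(lower, upper), na.rm = TRUE)
  # Values strictly below the lower cut or strictly above the upper cut
  extremes <- x[!is.na(x) & (x < cuts[1] | x > cuts[2])]
  if (length(extremes) == 0) {
    stop("no values fall outside the requested quantiles")
  }
  # Sample positions rather than values: sample(x, ...) treats a single
  # number x as the range 1:x, which would give wrong results here
  extremes[sample.int(length(extremes), length(x), replace = TRUE)]
}

set.seed(4)
df <- data.frame(value = sample(1:100, 50))
original <- df$value                      # kept only for comparison
df$value <- replace_with_extremes(df$value)
```

After this, every entry of `df$value` is one of the original values from outside the 12.5% to 87.5% range, and the column keeps its original length.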
