DeltaIV - 1 year ago 54
R Question

Associate each element of a numeric vector with the "most similar" level of a factor vector

I have a numeric vector:

```r
x <- c(-18.695, -18.695, 19.477, 0.000, 55.000, 19.477, -18.695, 48.476, 55.000, 37.798, -18.695, 19.477, 37.798, 0.000, -18.695)
```

and a factor vector whose levels, as returned by the `levels` function, are:

```r
y <- c("IV-18_7", "IV00", "IV00orig", "IV19_5", "IV37_8", "IV37_8_yp", "IV48_5", "IV48_5_yp", "IV55")
```

I need to build a new factor vector `z`, of the same length as `x`, but having the levels listed in `y`, and such that the i-th element of `z`, `z[i]`, is the "most similar" element of `y` to the corresponding element of `x`, `x[i]`. In other words:

```r
z <- factor(c("IV-18_7", "IV-18_7", "IV19_5", "IV00", "IV55", "IV19_5", "IV-18_7", "IV48_5", "IV55", "IV37_8", "IV-18_7", "IV19_5", "IV37_8", "IV00", "IV-18_7"), levels = y)
```

The example should make the meaning of "most similar" fairly obvious. The idea is to take an element `x[i]` and look for the element of `y` that consists of the prefix "IV" followed by a string "similar" to the rounded value of `x[i]` (though unfortunately not exactly equal to it), with no suffix after the numeric part. I don't know how to code this efficiently in R. Can you help me?

```r
paste0("IV", sub(".", "_", sub("\\.0$", "", sprintf("%04.1f", round(x, 1))), fixed=TRUE))
```

It works as follows. The original vector `x` is rounded to one decimal place. Then `sprintf` with the format "%04.1f" pads the result with leading zeros if it has fewer than 4 characters. This result is fed to the inner `sub`, which drops a trailing ".0" (so "55.0" becomes "55" and "00.0" becomes "00"). The outer `sub` then replaces the dot with an underscore, and finally `paste0` prepends the "IV" prefix.
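
To make the individual steps easier to follow, here is a minimal sketch that unpacks the one-liner into intermediate variables and then converts the result to a factor with the levels in `y`, as the question asks. The names `padded`, `trimmed`, `underscored` and `labels` are introduced here for illustration only, and the `stopifnot` check is an added assumption: it guards against labels that are not among the levels, which `factor` would otherwise silently turn into `NA`.

```r
# Input values and target levels, copied from the question
x <- c(-18.695, -18.695, 19.477, 0.000, 55.000, 19.477, -18.695, 48.476,
       55.000, 37.798, -18.695, 19.477, 37.798, 0.000, -18.695)
y <- c("IV-18_7", "IV00", "IV00orig", "IV19_5", "IV37_8", "IV37_8_yp",
       "IV48_5", "IV48_5_yp", "IV55")

# Step 1: round to one decimal place and zero-pad to a minimum width of 4
padded <- sprintf("%04.1f", round(x, 1))

# Step 2: drop a trailing ".0" so that whole numbers lose their decimal part
trimmed <- sub("\\.0$", "", padded)

# Step 3: replace the remaining decimal point (if any) with an underscore
underscored <- sub(".", "_", trimmed, fixed = TRUE)

# Step 4: prepend the "IV" prefix
labels <- paste0("IV", underscored)

# Assumption: every generated label must already be one of the levels in y;
# labels not found in y would become NA in the factor below
stopifnot(all(labels %in% y))

# Build the factor with the full set of levels from y
z <- factor(labels, levels = y)
```

Note that the formatting pattern is inferred from the example levels only. If the real data contain values whose level names follow a different convention, the check above should fail rather than produce `NA` entries.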