John Godlee - 18 days ago
R Question

Find value closest to zero in each column of a data frame - R

I have a data frame with a few hundred columns, each with numerical data.

For each column I want to identify the value of the cell with the value closest to zero, without being a positive number.

e.g.

X = c(-1,-2,-3,-4,-5,-6,-7,-8,-9,-10)
Y = c(5,4,3,2,1,0,-1,-2,-3,-4)
Z = c(-11,-12,-13,-14,-15,-16,-17,-18,-19,-20)

df <- data.frame(X, Y, Z)


I would like some function (fun) to return this vector:

fun(df)

[1] -1 0 -11


I thought I could use apply functions, or maybe even a loop, or pipes?

Answer

The OP asked for

the cell with the value closest to zero, without being a positive number

(as pointed out by @Heroka), with a vector of values as the expected result.

This can be achieved using data.table:

library(data.table)
setDT(df)[, unlist(lapply(.SD, function(x) max(x[x<=0])))]

  X Y   Z
 -1 0 -11

Explanations

  • setDT(df) converts the data.frame df to a data.table by reference, i.e., without copying. Note that df itself is modified.
  • lapply(.SD, ...) returns, for each column, the maximum of the values that are not positive (x <= 0), i.e., the non-positive value closest to zero.
  • unlist() flattens the resulting list into a (named) vector.
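
For comparison, the same idea can be sketched in base R without data.table. This is only an illustrative variant, not part of the original answer: the helper name closest_nonpositive is hypothetical, and returning NA for a column with no non-positive values is an assumption about the desired behaviour (a bare max() on an empty vector returns -Inf with a warning).

```r
# Example data from the question
df <- data.frame(
  X = c(-1, -2, -3, -4, -5, -6, -7, -8, -9, -10),
  Y = c(5, 4, 3, 2, 1, 0, -1, -2, -3, -4),
  Z = c(-11, -12, -13, -14, -15, -16, -17, -18, -19, -20)
)

# Hypothetical helper (not from any package): largest value that is <= 0.
# Assumption: NA is returned when a column has no non-positive values.
closest_nonpositive <- function(x) {
  candidates <- x[!is.na(x) & x <= 0]   # drop NAs and positive values
  if (length(candidates) == 0) return(NA_real_)
  max(candidates)
}

# vapply() applies the helper to each column and returns a named numeric vector
result <- vapply(df, closest_nonpositive, numeric(1))
print(result)  # X = -1, Y = 0, Z = -11
```

vapply() is used instead of sapply() so that the result is guaranteed to be a numeric vector with one value per column, even for a data frame with zero columns.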