user919367 - 6 months ago
R Question

Replace value per row with value in first column

My question is very simple. I have a data frame with more than 100 columns of numbers. The first column is always a nonzero number. What I want to do is replace each nonzero number in each row (excluding the first column) with the first number in that row (the value in the first column).

I would think along the lines of an ifelse and a for loop that iterates through rows, but there must be a simpler vectorised way to do it...


Another approach is to use sapply over the columns, which avoids an explicit loop over rows and is typically faster because the work on each column is vectorized. Assuming your data is in a data frame df:

df[,-1] <- sapply(df[,-1], function(x) {ind <- which(x != 0); x[ind] <- df[ind,1]; return(x)})

Here, we apply the function to every column of df except the first. In the function, x is each of these columns in turn:

  1. Find the row indices of the nonzero entries in the column using which.
  2. Set these rows in x to the corresponding values in the rows of the first column of df.
  3. Return the column.
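
To see the effect on concrete data, here is a small sketch that applies the same sapply pattern to replace the nonzero entries, as the question asks. The toy data frame and its column names (id, a, b, c) are made up for illustration and are not from the question.

```r
# Hypothetical toy data: the first column (id) is always nonzero.
df <- data.frame(id = c(5, 7, 9),
                 a  = c(0, 2, 3),
                 b  = c(1, 0, 4),
                 c  = c(0, 0, 6))

# For each column except the first, overwrite the nonzero entries
# with the first-column value from the same row.
df[, -1] <- sapply(df[, -1], function(x) {
  ind <- which(x != 0)   # rows where this column is nonzero
  x[ind] <- df[ind, 1]   # take the matching first-column values
  x
})

# a becomes 0, 7, 9; b becomes 5, 0, 9; c becomes 0, 0, 9
print(df)
```

Zeros are left alone, and the first column itself is never modified.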

Note that the operations in the function are all "vectorized" over the column; that is, there is no looping over the rows of the column. The result from sapply is a matrix of the processed columns, which replaces every column of df except the first.
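
As an alternative sketch, the same replacement of nonzero entries can probably be written without any apply call, by multiplying a logical mask by the first column. This relies on R recycling the first column down each column of the mask, so row i is scaled by the first value of row i. It assumes all columns after the first are numeric; the toy data and the names id, a, b, c and nonzero are made up for illustration.

```r
# Hypothetical toy data: the first column (id) is always nonzero.
df <- data.frame(id = c(5, 7, 9),
                 a  = c(0, 2, 3),
                 b  = c(1, 0, 4),
                 c  = c(0, 0, 6))

# Logical matrix: TRUE where an entry outside the first column is nonzero.
nonzero <- df[, -1] != 0

# TRUE/FALSE act as 1/0, and the first column is recycled down each
# column of the mask, so nonzero entries become the row's first value
# and zeros stay zero.
df[, -1] <- nonzero * df[[1]]

print(df)
```

This does the whole job in one matrix operation, at the cost of being a little less obvious to read than the sapply version.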

See this for an excellent review of the *apply family of functions.

Hope this helps.