Ernie Ernie - 2 months ago 14
R Question

Select column name based on data frame content R

I want to build a matrix or data frame by choosing, for each row, the names of the columns whose element is not NA. For example, suppose I have:

zz <- data.frame(a = c(1, NA, 3, 5),
                 b = c(NA, 5, 4, NA),
                 c = c(5, 6, NA, 8))


which gives:

   a  b  c
1  1 NA  5
2 NA  5  6
3  3  4 NA
4  5 NA  8


I want to recognize each NA and build a new matrix or df that looks like:

a c
b c
a b
a c


There will be the same number of NAs in each row of the input matrix/df. I can't seem to get the right code to do this. Suggestions appreciated!

DMC DMC
Answer
library(dplyr)
library(tidyr)

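zz %>%
  mutate(k = row_number()) %>%            # tag each original row with an id
  gather(column, value, a, b, c) %>%      # reshape to long form: one row per (k, column)
  filter(!is.na(value)) %>%               # drop the cells that are NA
  group_by(k) %>%
  # paste the surviving column names of each original row into one string
  summarise(temp_var = paste(column, collapse = " ")) %>%
  # split that string back into one column per name
  separate(temp_var, into = c("var1", "var2"))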

# A tibble: 4 × 3
      k  var1  var2
* <int> <chr> <chr>
1     1     a     c
2     2     b     c
3     3     a     b
4     4     a     c
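
For comparison, the same result can probably be had in base R without reshaping. This is only a minimal sketch, not part of the answer above; it relies on the question's statement that every row has the same number of NAs (otherwise apply() would return a list rather than a matrix). The names `kept` and `result` are made up for illustration.

```r
zz <- data.frame(a = c(1, NA, 3, 5),
                 b = c(NA, 5, 4, NA),
                 c = c(5, 6, NA, 8))

# For each row, keep the names of the columns whose value is not NA.
# Assumes every row has the same number of non-NA cells.
kept <- apply(zz, 1, function(row) names(zz)[!is.na(row)])

# apply() puts each row's result in a column, so transpose to get
# one output row per input row.
result <- t(kept)
result
```

Here `result` is a character matrix with one row per row of zz; wrap it in as.data.frame() if a data frame is needed.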