drmariod - 9 days ago
R Question

can't prevent NAs for empty cells in factor columns using readr

I am trying to read a file with some empty cells and, as expected, I get NA for the empty cells. I have some special columns which can only have the values '' or '+', so I would like to read these columns as factors by using

read_tsv('file.txt',
col_types=list(
column_with_empty_cells=col_factor(c('','+'))))


But these columns still contain NAs. I could change the global behaviour of the read_tsv function by changing the na parameter, but this is not what I want. I want to change this only in specific columns.

Is there a way to convert these NAs directly to ''? I could certainly do this afterwards, but I am wondering whether I am using readr the wrong way.

EDIT
Here is a test file

How do I actually upload a file? I could only attach images...

Answer

You can make a new function to solve this issue using lapply and factor:

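library(readr)

# Read a TSV file, replace NA with na.char and turn each column into a factor.
# Columns that consist only of NA are returned unchanged.
read_tsv2 <- function(file, na.char = " ") {
  # Read the special column as character so empty cells arrive as NA
  test <- read_tsv(file = file,
                   col_types = list(column_with_empty_cells = col_character()))
  test <- as.data.frame(test)
  names_tsv <- names(test)
  test <- lapply(test, function(x) {
    if (sum(is.na(x)) != length(x)) {
      # At least one real value: fill the NAs and convert to a factor
      x[is.na(x)] <- na.char
      factor(x, levels = unique(x))
    } else {
      # Column is entirely NA: leave it as it is
      x
    }
  })
  # Reassemble the list of columns into a data frame with the original names
  test <- do.call(cbind.data.frame, test)
  names(test) <- names_tsv
  test
}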

file <- read_tsv2(file = "~/Downloads/file.txt", na.char = " ")

file

   test column_with_empty_cells
1  <NA>                        
2  <NA>                        
3  <NA>                        
4  <NA>                        
5  <NA>                        
6  <NA>                        
7  <NA>                        
8  <NA>                        
9  <NA>                        
10 <NA>                        
11 <NA>                        
12 <NA>                        
13 <NA>                        
14 <NA>                        
15 <NA>                        
16 <NA>                        
17 <NA>                        
18 <NA>                        
19 <NA>                        
20 <NA>                        
21 <NA>                        
22 <NA>                        
23 <NA>                        
24 <NA>                       +
25 <NA>                       +
26 <NA>                        
27 <NA>                        
28 <NA>                       +
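
If only a few named columns should be treated this way, the same idea can be applied column by column after reading. The following is a minimal sketch under some assumptions: the temporary file, the column names id and flag, and the vector flag_cols are made up for illustration and stand in for the real file and its special columns.

```r
library(readr)

# Hypothetical sample data standing in for the real file.txt
path <- tempfile(fileext = ".tsv")
writeLines(c("id\tflag", "1\t", "2\t+", "3\t"), path)

# Read the special column as character; empty cells arrive as NA by default
dat <- read_tsv(path,
                col_types = cols(id = col_integer(), flag = col_character()))

# Names of the columns that may only contain '' or '+' (assumed for this example)
flag_cols <- c("flag")

for (nm in flag_cols) {
  x <- dat[[nm]]
  x[is.na(x)] <- ""                             # replace NA only in these columns
  dat[[nm]] <- factor(x, levels = c("", "+"))   # fixed levels, '' included
}

str(dat)
```

All other columns keep readr's default NA handling, so the global na parameter does not need to change.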