saeed Salehi - 1 month ago
R Question

How to read a file in R where each line is a vector of variable length ('ragged')

I want to read a file that contains many vectors, one per line. For example:

8984
8813
8685
11629
c(8527, 11629)
c(8527, 7685, 7822, 11629)
c(8527, 7685, 7822, 7137, 7318, 11629)
c(8527, 7685, 7822, 7137, 7318, 7063, 7075, 11629)
c(8527, 7685, 7822, 7137, 7318, 7063, 7075, 6911, 6946, 11629)
c(8527, 7685, 7822, 7137, 7318, 7063, 7075, 6911, 6946, 6703, 6909, 11629)
c(8527, 7685, 7822, 7137, 7318, 7063, 7075, 6911, 6946, 6703, 6909, 5751, 6614, 11629)
c(8527, 7685, 7822, 7137, 7318, 7063, 7075, 6911, 6946, 6703, 6909, 5751, 6614, 5436, 5493, 11629)
c(8527, 7685, 7822, 7137, 7318, 7063, 7075, 6911, 6946, 6703, 6909, 5751, 6614, 5436, 5493, 4694, 4998, 11629)
c(8527, 7685, 7822, 7137, 7318, 7063, 7075, 6911, 6946, 6703, 6909, 5751, 6614, 5436, 5493, 4694, 4998, 4211, 4678, 11629)


How can I read this file in R so that each line becomes its own vector?

Answer

If that's really what your file looks like (which would be strange), give this a try.

It worked when I tried it on textConnection(yourtext), so it should work on your file. You didn't say how you want the output to look, so I made it a list, which seems most appropriate for vectors of different lengths.

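# Strip the literal characters c, ( and ) plus the commas, leaving space-separated numbers
txt <- gsub("[c(),]", "", readLines("filename.ext"))
# Parse each cleaned line into its own integer vector; the result is a list, one element per line
lapply(txt, function(x) scan(text = x, what = integer(), quiet = TRUE))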
# [[1]]
# [1] 8984
# 
# [[2]]
# [1] 8813
# 
# [[3]]
# [1] 8685
# 
# [[4]]
# [1] 11629
# 
# [[5]]
# [1]  8527 11629
# 
# [[6]]
# [1]  8527  7685  7822 11629
#
# ... truncated ...
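
The textConnection check mentioned in the answer can be reproduced without a file on disk. The following is a minimal sketch under assumptions: `sample_lines`, `raw_lines` and `vecs` are hypothetical names introduced here, and the four lines are copied from the question as a stand-in for the real file.

```r
# Hypothetical stand-in for the file: a few lines copied from the question.
sample_lines <- c(
  "8984",
  "11629",
  "c(8527, 11629)",
  "c(8527, 7685, 7822, 11629)"
)

# Read the lines through a text connection instead of a file path.
con <- textConnection(sample_lines)
raw_lines <- readLines(con)
close(con)

# Same two steps as in the answer: strip c, (, ) and commas, then scan each line.
txt <- gsub("[c(),]", "", raw_lines)
vecs <- lapply(txt, function(x) scan(text = x, what = integer(), quiet = TRUE))

# Number of values recovered from each line.
lengths(vecs)
# [1] 1 1 2 4
```

Because the result is a list, each line keeps its own length, and a single vector can be pulled out with `vecs[[3]]`.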