Mohit Mohit - 1 year ago 81
R Question

Passing Array in pdftools library of R

I am trying to convert multiple PDF files into Excel files so that I can manipulate the text through VBA and find some specific figures. The code that I have written is:

filenames <- list.files(pattern = "*.pdf", all.files = TRUE )
txt <- pdf_text(filenames[1])
write.table(txt, file = paste(filenames[1], ".xls", sep = ""), sep = " ")
txt <- pdf_text(filenames[2])
write.table(txt, file = paste(filenames[2], ".xls", sep = ""), sep = " ")
txt <- pdf_text(filenames[3])
write.table(txt, file = paste(filenames[3], ".xls", sep = ""), sep = " ")

Here I store all the PDF file names in an array named filenames and then pass them one by one to convert them into Excel. What I want is to get rid of the repetitive lines at the end. If I have 25 files in a folder, I need to write those lines 25 times. Is there any line of code that can pass all the names at once?

Answer Source


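library(pdftools)

# list.files() takes a regular expression, so escape the dot and anchor the end
filenames <- list.files(pattern = "\\.pdf$", all.files = TRUE)

for (fname in filenames) {
  txt <- pdf_text(fname)  # character vector, one element per page
  write.table(txt, file = paste(fname, ".xls", sep = ""), sep = " ")
}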

That said, help("for") in the console would have provided sufficient information on how to use a for loop.

The "problem" with using the *apply family of functions for this is that they are built to return a value: lapply() and sapply() collect the unneeded return value of every write.table() call and dump it back into the environment (even though only temporarily). Even purrr::walk() returns data back, but at least it does so invisibly (and returns the original data unmodified).
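
For comparison, here is a minimal sketch of the purrr::walk() variant mentioned above. The helper name pdf_to_table and its extract parameter are hypothetical, introduced only for illustration; the file pattern is written as a regular expression, which is what list.files() expects. It assumes the purrr and pdftools packages are installed.

```r
library(purrr)

# Hypothetical helper: extract the text of one PDF and write it next to the
# source file as "<name>.pdf.xls", the same naming the for loop produces.
# `extract` defaults to pdftools::pdf_text and is a parameter only so that the
# helper can be tried out without a real PDF.
pdf_to_table <- function(fname, extract = pdftools::pdf_text) {
  txt <- extract(fname)
  write.table(txt, file = paste(fname, ".xls", sep = ""), sep = " ")
}

filenames <- list.files(pattern = "\\.pdf$")

# walk() calls pdf_to_table() once per file purely for its side effect and
# invisibly returns `filenames` unchanged.
walk(filenames, pdf_to_table)
```

Whether this reads better than the for loop is largely a matter of taste; both do the same work, and neither leaves a list of results behind.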
