clarkk - 4 months ago
Linux Question

PDF to text (multiple pages)

How can I extract the text from a PDF with multiple pages? I need each page as a separate text string:

page 1 as one string, page 2 as another string, and so on.

Is it possible with pdftotext or any other tool?

I need a Linux command-line tool.

Answer

The easiest way is to use pdftotext, the tool you already mentioned. On Debian or Ubuntu it can be installed by running sudo apt-get install poppler-utils. After that, run pdftotext /link/to/input.pdf /link/to/output.txt to convert the whole document into a single text file.
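
To get each page as its own string, one option is to call pdftotext once per page using its -f (first page) and -l (last page) options, with the page count taken from pdfinfo, which ships in the same poppler-utils package. The following is only a rough sketch: the function name split_pages and the page-N.txt output naming are my own assumptions, not part of poppler-utils.

```shell
# split_pages PDF OUTDIR
# Hypothetical helper: writes one text file per page of PDF into OUTDIR.
split_pages() {
    pdf=$1
    outdir=$2
    mkdir -p "$outdir" || return 1

    # pdfinfo prints a line such as "Pages:          12"; take the number.
    pages=$(pdfinfo "$pdf" | awk '/^Pages:/ {print $2}')
    [ -n "$pages" ] || return 1

    i=1
    while [ "$i" -le "$pages" ]; do
        # -f and -l restrict the conversion to the single page $i.
        pdftotext -f "$i" -l "$i" "$pdf" "$outdir/page-$i.txt" || return 1
        i=$((i + 1))
    done
}
```

Calling split_pages /link/to/input.pdf pages should then leave pages/page-1.txt, pages/page-2.txt and so on, one file per page. Alternatively, pdftotext by default puts a form feed character between pages in its output (unless -nopgbrk is given), so a single output file can also be split on that character afterwards.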