Chris - 16 days ago
Java Question

Extracting text from a PDF file

I need to extract the text from a PDF file. The text will most likely be laid out as a table, and it will be used to transfer data automatically between an external party and our systems.

Can anyone suggest a command-line tool (e.g. a PDF-to-text converter) or a library that would be good for this?

Language options:

  • C# (preferred)
  • Java (if I must)

I found some ideas here, but I think that poster was describing a one-off task, whereas I need something that can run as a daily import:

http://stackoverflow.com/questions/488089/extracting-tables-from-pdf-files

Answer