Chris - 3 months ago
Java Question

Extracting text from a PDF file

I need to extract the text from a PDF file. This text will likely be in a table format, and it is going to be used for automatic transfer of data between an external party and our systems.

Can anyone suggest a command-line tool (e.g. pdf to txt) or a library that would be good for this?

Language options:

  • C# (preferred)

  • Java (if I must)

I found some ideas here, but I think the guy was talking about a one-off situation, whereas I'm talking about something more like a daily import: