kaychaks - 1 month ago
HTML Question

Convert Word doc to HTML programmatically in Java

I need to convert a Word document into HTML file(s) in Java. The function takes a Word document as input and produces one HTML file per page of the document, i.e. a 3-page Word document yields 3 HTML files, split at the page breaks.

I searched for open-source/non-commercial APIs that can convert doc to HTML, but found nothing. If anybody has done this kind of job before, please help.

Thanks

Answer

We use tm-extractors (http://mvnrepository.com/artifact/org.textmining/tm-extractors), and fall back to the commercial Aspose (http://www.aspose.com/). Both have native Java APIs.
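
For the one-file-per-page part of the question, here is a rough sketch of what could sit behind either library. It is not taken from tm-extractors or Aspose and uses only the JDK: it assumes you already have the document's plain text as a String from whichever extractor you use, and that the extractor keeps hard page breaks as form-feed characters (U+000C). That second point is an assumption to verify against your own documents; page breaks that Word computes at layout time are not stored in the text, so the file count may not match what Word shows. The class and method names (PageSplitter, writePages, toHtml, escapeHtml) are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits already-extracted text into one HTML file per page.
class PageSplitter {

    // Assumption: the extractor emits a form feed at each hard page break.
    private static final String PAGE_BREAK = "\f";

    /** Escapes the characters that are significant in HTML text content. */
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Wraps one page of plain text in a minimal HTML document, one <p> per non-blank line. */
    static String toHtml(String pageText, int pageNumber) {
        StringBuilder html = new StringBuilder();
        html.append("<!DOCTYPE html>\n<html>\n<head><meta charset=\"UTF-8\">")
            .append("<title>Page ").append(pageNumber).append("</title></head>\n<body>\n");
        for (String paragraph : pageText.split("\\R")) {
            String trimmed = paragraph.trim();
            if (!trimmed.isEmpty()) {
                html.append("<p>").append(escapeHtml(trimmed)).append("</p>\n");
            }
        }
        html.append("</body>\n</html>\n");
        return html.toString();
    }

    /** Writes page1.html, page2.html, ... into outputDir and returns the paths written. */
    static List<Path> writePages(String text, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        List<Path> written = new ArrayList<>();
        // String.split drops trailing empty strings, so a final page break adds no empty page.
        String[] pages = text.split(PAGE_BREAK);
        for (int i = 0; i < pages.length; i++) {
            Path file = outputDir.resolve("page" + (i + 1) + ".html");
            Files.write(file, toHtml(pages[i], i + 1).getBytes(StandardCharsets.UTF_8));
            written.add(file);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for text returned by an extractor.
        String extracted = "First page\fSecond page, with <markup> & symbols\fThird page";
        Path dir = Files.createTempDirectory("word-pages");
        for (Path p : writePages(extracted, dir)) {
            System.out.println(p);
        }
    }
}
```

Since this works on plain text, headings, tables and images are lost; it only addresses the page-splitting step. If you need the formatting preserved, the split would have to happen inside whatever HTML conversion the library itself provides.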
