C Question

How to use libxml2 to parse dirty HTML in C

The HTML may be dirty (malformed), for example producing errors
such as "premature end of data in tag".

How can I parse it anyway? Thanks.

Answer Source

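Use the libxml2 HTML parser rather than the XML parser. It is tolerant of "dirty" HTML and turns it into a normalized document tree, closing any tags that were left open. See htmlDocPtr htmlParseFile(const char *filename, const char *encoding), declared in <libxml/HTMLparser.h>.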