user1597438 - 10 days ago
PHP Question

Yii2: phpoffice/phpexcel identifies .xlsx file as HTML

I am trying to use the phpoffice/phpexcel plugin in my Yii2 project to read Excel files. For files with the .xls extension, the plugin works perfectly and I am able to read the contents, but when I try files with Japanese filenames and the .xlsx extension, it gives me this error:


DOMDocument::loadHTML(): Invalid char in CDATA 0x3 in Entity, line: 1.


I investigated PHPExcel_IOFactory::identify and traced it to createReaderForFile in the IOFactory class. The extension type set there is 'Excel2007', but for some reason, at the very end of the process, the file is still identified as HTML.

To illustrate the issue, my files have different names and extensions but essentially the same content:

col1 col2 col3
aaaa bbbb cccc


The files are as follows:


  1. あああ.xls (can be read)

  2. あああ.xlsx (can't be read)

  3. aaaa.xls (can be read)

  4. aaaa.xlsx (can be read)



Only あああ.xlsx can't be read; the rest are fine. Is this some sort of limitation of the phpoffice/phpexcel plugin? If it is, can you suggest other Yii2 extensions that will let me read both .xlsx and .xls files properly? Or is there some way to fix this so that it identifies the files correctly?

Answer

I've managed to fix this now. The issue seems to be related to file encoding in the zip handling. Adding \PHPExcel_Settings::setZipClass(\PHPExcel_Settings::PCLZIP); before the call to PHPExcel_IOFactory::identify fixed it.
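
The fix above can be sketched roughly as follows. This is a minimal example, not a definitive implementation: loadSpreadsheetRows is a hypothetical helper name, and it assumes phpoffice/phpexcel is installed via Composer with the autoloader already registered (as is normally the case in a Yii2 application). The only part taken from the fix itself is the setZipClass call placed before identify.

```php
<?php
// Assumption: the Composer autoloader is already registered, so the
// PHPExcel classes can be resolved without an explicit require.

/**
 * Hypothetical helper: read every row of the active sheet into an array.
 *
 * @param string $path Path to an .xls or .xlsx file.
 * @return array Rows of cell values.
 */
function loadSpreadsheetRows($path)
{
    // Switch from the default ZipArchive backend to the bundled PCLZip one.
    // This has to run before identify(), as described in the answer.
    \PHPExcel_Settings::setZipClass(\PHPExcel_Settings::PCLZIP);

    // Detect the reader type, e.g. 'Excel5' for .xls or 'Excel2007' for .xlsx.
    $type = \PHPExcel_IOFactory::identify($path);

    // Build the matching reader and load the workbook.
    $reader = \PHPExcel_IOFactory::createReader($type);
    $workbook = $reader->load($path);

    return $workbook->getActiveSheet()->toArray();
}
```

Calling loadSpreadsheetRows with the path to あああ.xlsx should then go through the Excel2007 reader, assuming the same fix applies in your environment. Logging the value returned by identify is a quick way to confirm which reader was chosen.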