AlbertoD - 9 months ago
Java Question

Weka - load UTF-8 encoded csv

Is there a way in Weka 3.7.13 to load UTF-8 encoded files without converting them to ANSI?

I am trying to load a CSV file containing a string attribute whose values can contain emoticons, and I need to preserve them.


Yes, this is possible. See this link, which describes how to do it from the command line or the GUI.

If using the command line, add the JVM parameter -Dfile.encoding=utf-8 when starting Weka.

If using the GUI, edit the RunWeka.ini file and change the value of the fileEncoding placeholder to utf-8.
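
To check whether the encoding setting actually took effect, it can help to test the JVM directly, outside Weka. The following is only a rough sketch using the standard Java library, not part of Weka: the class name EncodingCheck and the helper roundTrip are made up for illustration. It writes a sample CSV line containing an emoticon as UTF-8, reads it back using the JVM's default charset, and reports whether the text survived.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class for illustration; it does not use or depend on Weka.
public class EncodingCheck {

    // Writes the text to a temporary file as UTF-8, then decodes the
    // file's bytes with the given charset and returns the result.
    static String roundTrip(String text, Charset readCharset) {
        try {
            Path tmp = Files.createTempFile("encoding-check", ".csv");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                return new String(Files.readAllBytes(tmp), readCharset);
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A one-column CSV whose value ends with an emoticon (U+1F600).
        String sample = "text\n\"great \uD83D\uDE00\"\n";
        Charset defaultCharset = Charset.defaultCharset();
        System.out.println("Default charset: " + defaultCharset);
        System.out.println("Emoticon preserved: "
                + sample.equals(roundTrip(sample, defaultCharset)));
    }
}
```

Running it as java -Dfile.encoding=utf-8 EncodingCheck should report that the emoticon is preserved. If it reports otherwise when run without the flag, the default charset on that machine cannot represent the emoticon, which is the situation the settings above are meant to fix.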