Nilesh Nilesh - 7 months ago 25
Java Question

java encode String to UTF

I want to display characters (Chinese or another language) read from a properties file on a Windows box.

Let's say I read a property server.location=上海的位置 from a System property, which is set when the server is started.

I tried to do this

new String(locationStr.getBytes(System.getProperty("file.encoding")), "UTF-8");


This works on Linux, but I couldn't get it working on Windows.

The following is a summarized snippet (not the exact syntax) of how the System property is set:

URL fileURL = new URL("file:filePathAndName");
InputStream iStream = fileURL.openStream();
Properties prop = new Properties();
prop.load(iStream);
// Enumerate over prop and call System.setProperty(key, value) for each entry


The property is then read with System.getProperty("server.location").

This is done centrally for all property files, so changing how they are read or setting a specific encoding could affect the others and is not advisable.

I also tried encoding with URLEncoder.encode, but that didn't help.
I do not see any specific encoding set. Java uses UTF-16 internally, and on Windows the default encoding is 'Cp1252'. What am I missing here?

Any help to make this work, or to shed some light on it, is appreciated. I also went through existing questions, but the answers didn't apply directly, hence this new question.
Thanks

Edit:
I couldn't convert the obtained String to UTF-8. In the end I convinced people to read the properties the way Joop mentioned, which retrieves the String properly.

Answer

String, char, Reader, and Writer in Java hold Unicode text. Binary data (byte[], InputStream, OutputStream) must be associated with an encoding to be convertible to text (String).
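To illustrate why bytes need an encoding attached, here is a minimal sketch. The class name EncodingDemo and the helper decode are made up for this example; only the standard String and StandardCharsets APIs are used. The same six bytes yield the original text when decoded as UTF-8 and something else when decoded as ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Hypothetical helper: turns raw bytes into text using an explicit charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String original = "\u4e0a\u6d77"; // the two characters of "Shanghai"
        // Text -> bytes: the result depends on the chosen encoding.
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        System.out.println("UTF-8 byte count: " + utf8.length);

        // Bytes -> text with the matching encoding restores the original.
        System.out.println(decode(utf8, StandardCharsets.UTF_8).equals(original));

        // Bytes -> text with a different encoding gives one char per byte instead.
        System.out.println(decode(utf8, StandardCharsets.ISO_8859_1).equals(original));
    }
}
```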

It seems your properties file is in UTF-8. If so, specify a fixed encoding when loading the properties.

InputStream iStream = fileURL.openStream();
Reader reader = new BufferedReader(new InputStreamReader(iStream, StandardCharsets.UTF_8));
Properties prop = new Properties();
prop.load(reader);

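Here the InputStreamReader bridges the transition from binary data to (Unicode) text, converting with the encoding specified for the InputStream. By contrast, prop.load(InputStream) always decodes the bytes as ISO-8859-1, regardless of the platform's default encoding.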
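The difference can be checked without a real file. This is a sketch under stated assumptions: the class Utf8PropertiesDemo and its methods loadUtf8 and repair are hypothetical names, and a ByteArrayInputStream stands in for the UTF-8 properties file. The repair helper is not part of the answer above. It relies on Properties.load(InputStream) decoding as ISO-8859-1, which maps every byte to exactly one char, so the original bytes can be recovered and decoded again as UTF-8.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class Utf8PropertiesDemo {
    // Hypothetical helper: loads properties, decoding the stream as UTF-8.
    static Properties loadUtf8(InputStream in) throws IOException {
        try (Reader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            Properties prop = new Properties();
            prop.load(reader);
            return prop;
        }
    }

    // Hypothetical helper: undoes an ISO-8859-1 decode of UTF-8 bytes.
    static String repair(String misdecoded) {
        return new String(misdecoded.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        String expected = "\u4e0a\u6d77\u7684\u4f4d\u7f6e";
        // Stand-in for the content of a UTF-8 encoded properties file.
        byte[] fileContent = ("server.location=" + expected + "\n")
                .getBytes(StandardCharsets.UTF_8);

        Properties viaReader = loadUtf8(new ByteArrayInputStream(fileContent));

        // load(InputStream) decodes as ISO-8859-1, so the value is garbled.
        Properties viaStream = new Properties();
        viaStream.load(new ByteArrayInputStream(fileContent));

        System.out.println(expected.equals(viaReader.getProperty("server.location")));
        System.out.println(expected.equals(viaStream.getProperty("server.location")));
        System.out.println(expected.equals(repair(viaStream.getProperty("server.location"))));
    }
}
```

If the central loading code cannot be changed, something like repair could be applied only to the values known to come from UTF-8 files. It would corrupt values that contain non-ASCII characters and were decoded correctly in the first place, so loading through a Reader remains the cleaner option.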