mihsathe mihsathe - 1 year ago 114
Java Question

UTF-8 response with servlet

I am reading the HTTP response from a Perl page in a servlet like this:

public String getHTML(String urlToRead) {
    URL url;
    HttpURLConnection conn;
    BufferedReader rd;
    String line;
    String result = "";
    try {
        url = new URL(urlToRead);
        conn = (HttpURLConnection) url.openConnection();
        conn.setRequestProperty("Accept-Charset", "UTF-8");
        conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");

        rd = new BufferedReader(new InputStreamReader(conn.getInputStream(), "UTF-8"));
        while ((line = rd.readLine()) != null) {
            byte [] b = line.getBytes();
            result += new String(b, "UTF-8");
        }
    } catch (Exception e) {
        // exception handling not shown
    }
    return result;
}

I am displaying this result with this code:

response.setContentType("text/plain; charset=UTF-8");

PrintWriter out = new PrintWriter(new OutputStreamWriter(response.getOutputStream(), "UTF-8"), true);

try {
    String query = request.getParameter("query");
    String type = request.getParameter("type");

    String res = getHTML(url);
    // ...
} finally {
    // ...
}

But the response is still not encoded as UTF-8. What am I doing wrong?

Thanks in advance.

laz laz
Answer Source

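That call to line.getBytes() looks suspicious. With no argument, it encodes the string using the platform's default charset, so decoding those bytes as UTF-8 corrupts any non-ASCII characters whenever that default is not UTF-8. If you keep the conversion, make it line.getBytes("UTF-8"), provided you are certain that what is returned is UTF-8 encoded. That said, I'm not sure why the conversion is necessary at all: the InputStreamReader has already decoded the response as UTF-8, so each String returned by readLine() is usable as it stands. A typical approach to getting data out of a BufferedReader is to append each String retrieved from readLine() to a StringBuilder, with no conversion back and forth between String and byte[].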
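To see why the default-charset round trip is risky, here is a small sketch. The class name CharsetRoundTrip and the platformDefault parameter are made up for illustration: the parameter stands in for whatever charset the JVM would pick for a bare getBytes() call, so the effect can be reproduced on any machine.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {
    // Mimics new String(line.getBytes(), "UTF-8") on a machine whose default
    // charset is platformDefault (a stand-in parameter, not a real JVM setting).
    public static String roundTrip(String line, Charset platformDefault) {
        byte[] b = line.getBytes(platformDefault);
        return new String(b, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String line = "caf\u00e9"; // "café"

        // Default charset is UTF-8: the round trip is harmless (and pointless).
        System.out.println(roundTrip(line, StandardCharsets.UTF_8).equals(line));

        // Default charset is ISO-8859-1: the byte 0xE9 is not valid UTF-8 on
        // its own, so the last character is replaced during decoding.
        System.out.println(roundTrip(line, StandardCharsets.ISO_8859_1).equals(line));
    }
}
```

The first comparison succeeds and the second fails, which is the kind of corruption the question describes when the server's default charset is not UTF-8.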

Change result into a StringBuilder and do this:

while ((line = rd.readLine()) != null) {
    result.append(line);
}
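For a fuller picture, here is a minimal self-contained sketch of the StringBuilder approach. The class Utf8Reader and the method readAll are hypothetical names, and the sketch takes a plain InputStream so it can be tried without a network connection; it is one way to structure the loop, not the only one.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class Utf8Reader {
    // Hypothetical helper: decodes the stream as UTF-8 exactly once and
    // collects the lines in a StringBuilder. As in the original loop, line
    // terminators are dropped because readLine() strips them.
    public static String readAll(InputStream in) {
        StringBuilder result = new StringBuilder();
        try (BufferedReader rd = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = rd.readLine()) != null) {
                result.append(line); // no String -> byte[] -> String detour
            }
        } catch (IOException e) {
            // Rethrown unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        // Simulated UTF-8 response body containing non-ASCII characters.
        byte[] body = "na\u00efve\ncaf\u00e9\n".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(body));
        System.out.println(text.equals("na\u00efvecaf\u00e9"));
    }
}
```

In getHTML, the equivalent call would be readAll(conn.getInputStream()), assuming the Perl page really does send UTF-8.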