jabal jabal - 28 days ago 15
Java Question

Java servlet download filename special characters

I am writing a simple file download servlet and I can't get the filenames to come out correctly. I tried URL-encoding and MIME-encoding the filename, as suggested in existing answers, but neither worked.

The fileData object in the following snippet contains the MIME type, the byte[] content and the filename. The filename needs at least the ISO-8859-2 charset; ISO-8859-1 is not enough.

How can I get my browser to display the downloaded filename correctly?

Here is an example of the filename: árvíztűrőtükörfúrógép.xls and it results in: árvíztqrptükörfúrógép.xls

protected void renderMergedOutputModel(Map model, HttpServletRequest req, HttpServletResponse res) throws Exception {
    RateDocument fileData = (RateDocument) model.get("command.retval");
    OutputStream out = res.getOutputStream();
    if (fileData != null) {
        res.setContentType(fileData.getMime());
        String enc = "utf-8"; // also tried: ISO-8859-2

        String encodedFileName = fileData.getName();
        // also tried URL-encoding and MIME-encoding this filename, without success

        res.setCharacterEncoding(enc); // tried with and without this
        res.setHeader("Content-Disposition", "attachment; filename=" + encodedFileName);
        res.setContentLength(fileData.getBody().length);
        out.write(fileData.getBody());
    } else {
        res.setContentType("text/html");
        out.write("<html><head></head><body>Error downloading file</body></html>"
                .getBytes(res.getCharacterEncoding()));
    }
    out.flush();
}

Answer

I found a solution that works in all the browsers I have installed (IE8, FF16, Opera12, Chrome22).
It relies on the fact that, when no other encoding is specified, browsers expect the value of the filename parameter to be encoded in the browser's native encoding.

The native encoding is usually UTF-8 (Firefox, Opera, Chrome), but IE's native encoding is Win-1250.

So if the value we put into the filename parameter is encoded as UTF-8 or Win-1250, depending on the user's browser, it should work. At least, it works for me.

String fileName = "árvíztűrőtükörfúrógép.xls";

// The User-Agent header may be absent, so guard against null.
String userAgent = request.getHeader("user-agent");
boolean isInternetExplorer = (userAgent != null && userAgent.indexOf("MSIE") > -1);

try {
    // Encode the name in the charset this browser is assumed to expect.
    byte[] fileNameBytes = fileName.getBytes(isInternetExplorer ? "windows-1250" : "utf-8");

    // Map each byte to the char with the same value (0-255), so the raw bytes reach the header.
    StringBuilder dispositionFileName = new StringBuilder();
    for (byte b : fileNameBytes) dispositionFileName.append((char) (b & 0xff));

    String disposition = "attachment; filename=\"" + dispositionFileName + "\"";
    response.setHeader("Content-disposition", disposition);
} catch (UnsupportedEncodingException ence) {
    // ... handle exception ...
}

Of course, this has been tested only on the browsers mentioned above, and I cannot guarantee 100% that it will work in every browser all the time.
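
The byte-to-char step above can also be written as an ISO-8859-1 decode, since that charset maps every byte value 0-255 to the char with the same code. The following is only a sketch of that idea, not a tested replacement: the class name DispositionSketch and its two methods are made up for illustration, and the "MSIE" check is the same heuristic as in the snippet above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
class DispositionSketch {

    // Encodes the filename in the charset the browser is assumed to expect,
    // then turns each resulting byte into the char with the same value (0-255).
    static String rawHeaderChars(String fileName, String userAgent) {
        boolean isInternetExplorer = userAgent != null && userAgent.contains("MSIE");
        Charset charset = isInternetExplorer
                ? Charset.forName("windows-1250")
                : StandardCharsets.UTF_8;
        byte[] fileNameBytes = fileName.getBytes(charset);
        // ISO-8859-1 decoding is a one-to-one byte-to-char mapping.
        return new String(fileNameBytes, StandardCharsets.ISO_8859_1);
    }

    // Builds the full header value with the filename in double quotes.
    static String contentDisposition(String fileName, String userAgent) {
        return "attachment; filename=\"" + rawHeaderChars(fileName, userAgent) + "\"";
    }
}
```

In the servlet this would be used as response.setHeader("Content-Disposition", DispositionSketch.contentDisposition(fileName, request.getHeader("user-agent"))). Using Charset objects instead of charset names also avoids the checked UnsupportedEncodingException.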

Note #1 (@fallen): It is not correct to use the URLEncoder.encode() method. Despite its name, it does not produce URL-encoding but form-encoding (application/x-www-form-urlencoded). Form-encoding is quite similar to URL-encoding and in many cases produces the same result, but there are differences. For example, the space character ' ' is encoded differently: as '+' instead of '%20'.

For a correctly URL-encoded string, use the URI class:

URI uri = new URI(null, null, "árvíztűrőtükörfúrógép.xls", null);
System.out.println(uri.toASCIIString());
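
To make the difference from Note #1 concrete, here is a small comparison of the two encoders on the same input. It is a sketch only: the class EncodingComparison and its method names are invented for this example, and the checked exceptions are wrapped just to keep the calls short.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Hypothetical demo class, for illustration only.
class EncodingComparison {

    // Form-encoding (application/x-www-form-urlencoded): a space becomes '+'.
    static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Percent-encoding of a URI path: a space becomes "%20".
    static String uriEncode(String path) {
        try {
            return new URI(null, null, path, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(formEncode("my file.xls"));        // my+file.xls
        System.out.println(uriEncode("my file.xls"));         // my%20file.xls
        System.out.println(uriEncode("\u00e1rv\u00edz.xls")); // %C3%A1rv%C3%ADz.xls
    }
}
```

Both encoders percent-encode non-ASCII characters as UTF-8 bytes here; the visible difference in this example is only the treatment of the space.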