Allwe Allwe - 3 months ago 49
Markdown Question

Encoding of input file for XSLT 2.0 function unparsed-text()

Let's say I have this
encoded in
(.md means it's markdown format)

Hello world
This text is encoded in UTF-8.

Then I read it using the function unparsed-text('', 'UTF-8'). That works like a charm.

The problem shows up when (let's say) I use a character specific to my native language (Czech), for example this:

Hello world
This character "š" is read like "sh" in english.

Using same encoding parameter in
I get error:

XTDE1200: Failed to read input file file:/C:/
(java.nio.charset.MalformedInputException): Input length = 1
has same encoding UTF-8 as
, Czech characters are in this charset, yet the XSLT processor doesn't accept it. If I change the encoding parameter to
unparsed-text('', 'windows-1250')
it works nicely.

So the question is: why do I get this error? Is it related to the fact that the input file has the extension .md (.txt works)? Is there a way around it? I really want to be able to use the same encoding in my XSL stylesheet as the supplied input file has.

Thanks for answers.

Answer Source

As Martin says, the evidence you have provided suggests that the file is encoded in Windows-1252, and that unparsed-text('', 'utf-8') is therefore right to reject it.
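
One way to check this independently of the XSLT processor is to decode the bytes strictly, outside of XSLT. The following is only a minimal sketch in Java (chosen because the java.nio exception in the error message suggests a Java-based processor); the class name EncodingCheck and the method decodesAs are made up for illustration and are not part of any XSLT processor. It relies on the fact that "š" is the single byte 0x9A in windows-1250 and windows-1252, while in UTF-8 it is the two bytes 0xC5 0xA1.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    /** Returns true if the bytes decode under the given charset without any error. */
    static boolean decodesAs(byte[] bytes, Charset charset) {
        try {
            charset.newDecoder()
                   // REPORT makes the decoder throw instead of silently substituting U+FFFD
                   .onMalformedInput(CodingErrorAction.REPORT)
                   .onUnmappableCharacter(CodingErrorAction.REPORT)
                   .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // MalformedInputException is a subclass of CharacterCodingException
            return false;
        }
    }

    public static void main(String[] args) {
        // "š" as it is stored in a windows-1250 (or windows-1252) file: one byte
        byte[] singleByte = { (byte) 0x9A };
        // "š" (U+0161) as it is stored in a real UTF-8 file: two bytes
        byte[] utf8Bytes = "\u0161".getBytes(StandardCharsets.UTF_8);

        System.out.println(decodesAs(singleByte, StandardCharsets.UTF_8)); // false
        System.out.println(decodesAs(utf8Bytes, StandardCharsets.UTF_8));  // true
    }
}
```

To test a real file, the byte array could instead come from java.nio.file.Files.readAllBytes on the file's path. If decodesAs returns false for UTF-8, the file was not saved as UTF-8, whatever the editor claims, and re-saving it as UTF-8 should let the 'UTF-8' parameter work.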
