Allwe Allwe - 1 year ago 145
Markdown Question

Encoding of input file for XSLT 2.0 function unparsed-text()

Let's say I have this file.md, encoded in UTF-8 (.md means it's Markdown format):

Hello world
This text is encoded in UTF-8.


Then I read it using the function unparsed-text('file.md', 'UTF-8'). That works like a charm.

The problem shows up when I use a character specific to my native language (Czech), for example in this file2.md:

Hello world
This character "š" is read like "sh" in English.


Using the same encoding parameter in unparsed-text(), I get this error:


XTDE1200: Failed to read input file file:/C:/file2.md
(java.nio.charset.MalformedInputException): Input length = 1


file2.md has the same encoding, UTF-8, as file.md, and Czech characters are in this charset, yet the XSLT processor doesn't accept it. If I change the encoding parameter to windows-1250, i.e. unparsed-text('file2.md', 'windows-1250'), it works nicely.

So the question is: why do I get this error? Is it related to the fact that the input file has the extension .md (.txt works)? Is there a way around it? I really want to be able to use the same encoding in my XSL stylesheet as the supplied input file has.

Thanks for answers.

Answer Source

As Martin says, the evidence you have provided suggests that the file is encoded in Windows-1252, and that unparsed-text('file2.md', 'utf-8') is therefore right to reject it.
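
To see why a strict UTF-8 decoder would reject such a file, here is a minimal Python sketch of the byte-level situation. It is an illustration under assumptions, not a reproduction of your setup: legacy_bytes is a hypothetical stand-in for the contents of file2.md as a Windows editor might have saved it, is_valid_utf8 is a helper name invented for this example, and "cp1250" is Python's name for windows-1250. In that code page (and in Windows-1252) "š" is the single byte 0x9A, which is not a valid start byte in UTF-8, whereas in UTF-8 the same character is the two bytes 0xC5 0xA1.

```python
# Hypothetical stand-in for the bytes of file2.md when saved in a
# single-byte Windows code page. "\u0161" is the character "š".
SAMPLE = 'This character "\u0161" is read like "sh" in English.'
legacy_bytes = SAMPLE.encode("cp1250")  # "š" becomes the single byte 0x9A


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes cleanly as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# 0x9A has the bit pattern of a UTF-8 continuation byte, so it cannot
# begin a character and strict decoding fails.
print(is_valid_utf8(legacy_bytes))

# Decoding with the code page the file was really saved in recovers the
# text, which can then be re-encoded as genuine UTF-8.
text = legacy_bytes.decode("cp1250")
utf8_bytes = text.encode("utf-8")
print(is_valid_utf8(utf8_bytes))
print(b"\xc5\xa1" in utf8_bytes)  # "š" is now the two bytes 0xC5 0xA1
```

This would also explain why file.md works: a file containing only ASCII characters has identical bytes in UTF-8 and in the Windows code pages, so the mismatch only surfaces once a non-ASCII character appears. If that is what happened here, the file extension is not involved, and re-saving file2.md as UTF-8 in your editor should let unparsed-text('file2.md', 'UTF-8') read it.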
