solidcell solidcell - 4 months ago 10
Swift Question

Foundation's string encoding isn't what sites are expecting

Specifically, it's encoding characters with an umlaut as two characters.

let unencoded = "könnten"
let encoded = unencoded.stringByAddingPercentEncodingWithAllowedCharacters(NSCharacterSet.URLQueryAllowedCharacterSet())!


encoded is then equal to ko%CC%88nnten. So, it's converting the ö into o%CC%88: a plain o followed by a separate combining umlaut (¨, encoded as %CC%88), so the umlaut and the o are separate.

However, most sites seem to be expecting the encoding to be %C3%B6, which is ö, where the umlaut (¨) and o are one single character.
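To make the difference concrete, here is a minimal sketch in current Swift syntax (Swift 3 or later, so the method names differ from the Swift 2 snippet above). Both forms are built with explicit Unicode escapes, which is my own choice for illustration so that the source file's normalization cannot affect the result.

```swift
import Foundation

// Decomposed: "o" followed by U+0308 COMBINING DIAERESIS.
let decomposed = "ko\u{0308}nnten"
// Precomposed: the single scalar U+00F6 LATIN SMALL LETTER O WITH DIAERESIS.
let precomposed = "k\u{00F6}nnten"

// Swift compares strings by canonical equivalence, so these count as equal...
print(decomposed == precomposed)            // true

// ...even though they contain a different number of Unicode scalars.
print(decomposed.unicodeScalars.count)      // 8
print(precomposed.unicodeScalars.count)     // 7

// Percent encoding works on the UTF-8 bytes, so the two forms encode differently.
let decomposedEncoded = decomposed.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)
let precomposedEncoded = precomposed.addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)
print(decomposedEncoded ?? "nil")           // ko%CC%88nnten
print(precomposedEncoded ?? "nil")          // k%C3%B6nnten
```

The two strings look identical on screen and compare as equal in Swift, which is why the problem only shows up once the bytes are percent-encoded.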

You can see an example of the encoding not working here (how Foundation wants to encode it):

https://www.linguee.com/german-english/search?query=ko%CC%88nnten

And how it would ideally be:

https://www.linguee.com/german-english/search?query=k%C3%B6nnten

Is there a better way to encode this? Maybe different options or a different framework?

Answer

Ideally, the server should cope with both precomposed and decomposed strings. But if necessary, you can precompose the string on the client side:

let unencoded = "könnten"
let encoded = unencoded.precomposedStringWithCanonicalMapping
        .stringByAddingPercentEncodingWithAllowedCharacters(.URLQueryAllowedCharacterSet())!

print(encoded) // k%C3%B6nnten

See Technical Q&A QA1235 – Converting to Precomposed Unicode for more information.
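For anyone on Swift 3 or later, the same idea might look roughly like the sketch below. The helper name precomposedPercentEncoded is hypothetical (it is not part of Foundation or of the answer above); it only wraps the two Foundation calls so the optional result is handled without a force unwrap.

```swift
import Foundation

// Hypothetical helper: normalize to precomposed form (NFC) first,
// then percent-encode for use in a URL query.
func precomposedPercentEncoded(_ text: String) -> String? {
    return text.precomposedStringWithCanonicalMapping
        .addingPercentEncoding(withAllowedCharacters: .urlQueryAllowed)
}

// Decomposed input: "o" followed by U+0308 COMBINING DIAERESIS.
let unencoded = "ko\u{0308}nnten"

if let encoded = precomposedPercentEncoded(unencoded) {
    print(encoded) // k%C3%B6nnten
}
```

Normalizing before encoding matters here because percent encoding operates on the UTF-8 bytes of whatever form the string is already in.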
