Zmey - 4 months ago
Swift Question

Inserting ASCII symbols into a String (Swift)

I'm trying to insert a symbol with ASCII code 255 (Telnet IAC) into a String, but when converting the data back to utf8 I'm getting a different symbol:

var s = "\u{ff}"
print(s.utf8.count) // 2
try! s.write(toFile: "output.txt", atomically: true, encoding: .utf8)


The file contains C3 BF, not FF. I've also tried using

var s = "\(Character(UnicodeScalar(255)))"


but this produced the same result. How do I escape it properly?

Answer

ASCII defines 128 characters from 0x00 to 0x7F. 0xFF (255) is not included.

In Unicode, U+00FF (in Swift, "\u{ff}") is "ÿ" (LATIN SMALL LETTER Y WITH DIAERESIS), and its UTF-8 representation is 0xC3 0xBF. In UTF-8, every code point from U+0080 to U+07FF is encoded as a two-byte sequence. Note also that 0xFF is never a valid byte in a UTF-8 sequence, so a valid UTF-8 text file cannot contain any 0xFF bytes.
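
The byte values described above can be checked directly. The following is a minimal sketch, assuming Foundation is available; the variable names are only for illustration:

```swift
import Foundation

let text = "\u{ff}"

// The UTF-8 view exposes the encoded bytes of the string.
let utf8Bytes = Array(text.utf8)
let hex = utf8Bytes.map { String(format: "%02X", $0) }.joined(separator: " ")
print(hex) // C3 BF

// A lone 0xFF byte is not valid UTF-8, so strict decoding yields nil.
let decoded = String(data: Data([0xFF]), encoding: .utf8)
print(decoded == nil)
```

So the two bytes in the file are exactly what UTF-8 requires for this character; nothing was escaped incorrectly.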

If you want to output "\u{ff}" as the single byte 0xFF, use ISO-8859-1 (also known as ISO Latin 1) instead:

try! s.write(toFile: "output.txt", atomically: true, encoding: .isoLatin1)
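
If the goal is to send raw protocol bytes such as Telnet IAC rather than text, another option may be to skip String for those bytes and build a Data value directly. A rough sketch, assuming Foundation; the "hello" payload and the output location in the temporary directory are hypothetical and chosen only for illustration:

```swift
import Foundation

// Encoding U+00FF as ISO Latin 1 produces the single byte 0xFF.
let latin1 = "\u{ff}".data(using: .isoLatin1)

// Alternatively, assemble the bytes by hand: IAC (255) plus an ASCII payload.
let iac: UInt8 = 0xFF
var bytes = Data([iac])
bytes.append(contentsOf: Array("hello".utf8)) // illustrative payload

// Hypothetical output path for this sketch.
let url = URL(fileURLWithPath: NSTemporaryDirectory())
    .appendingPathComponent("output.bin")

var readBack = Data()
do {
    try bytes.write(to: url)
    readBack = try Data(contentsOf: url)
} catch {
    print("I/O failed: \(error)")
}
print(readBack.first == 0xFF)
```

Working with bytes keeps control codes out of the text encoding entirely, so the remaining text can still be encoded as UTF-8 if the other end expects that.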