dimid dimid - 4 months ago 10
Ruby Question

Zlib with non-breaking space

I ran into a weird problem when deflating and inflating strings that contain a non-breaking space in Ruby.

Strings with regular spaces behave as expected:

str = "hello world"; str_zipped = Zlib.deflate str; str == Zlib.inflate(str_zipped)
=> true


However,

str = "hello\xA0world"; str_zipped = Zlib.deflate str; str == Zlib.inflate(str_zipped)
=> false


Is this expected behavior or a bug?

Answer

Zlib doesn't preserve the string's encoding. Your string is probably tagged as UTF-8:

str = "hello\xA0world"
str.encoding
#=> <Encoding:UTF-8>

But Zlib.inflate returns a binary (ASCII-8BIT) string:

str_zipped = Zlib.deflate str
str = Zlib.inflate(str_zipped)
str.encoding
#=> <Encoding:ASCII-8BIT>
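
To see why the plain-ASCII example still compares equal, here is a small sketch. It assumes the non-breaking space is written as "\u00A0" (the valid UTF-8 form) rather than the raw byte "\xA0" used in the question; the variable names are only illustrative. Ruby's String#== returns false for two strings with different encodings as soon as both contain non-ASCII bytes, even if the bytes are identical.

```ruby
require 'zlib'

ascii = "hello world"        # only 7-bit characters
nbsp  = "hello\u00A0world"   # U+00A0, encoded as the bytes C2 A0 in UTF-8

# Round-trip both strings through Zlib
ascii_rt = Zlib.inflate(Zlib.deflate(ascii))
nbsp_rt  = Zlib.inflate(Zlib.deflate(nbsp))

# The inflated strings are tagged as binary, whatever the input encoding was
binary_result = ascii_rt.encoding == Encoding::BINARY

# Pure-ASCII content compares equal across encodings
ascii_equal = (ascii == ascii_rt)

# Non-ASCII content with different encodings does not compare equal...
nbsp_equal = (nbsp == nbsp_rt)

# ...even though the underlying bytes are unchanged
same_bytes = (nbsp.bytes == nbsp_rt.bytes)

puts binary_result, ascii_equal, nbsp_equal, same_bytes
```

So the data survives the round trip intact; only the encoding label is lost.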

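Ruby treats two strings containing non-ASCII bytes as unequal when their encodings differ, which is why the plain-ASCII example still compares equal. Once you restore the encoding, the comparison succeeds: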

str = "hello\xA0world"
str_zipped = Zlib.deflate str
str_utf8 = Zlib.inflate(str_zipped).force_encoding('UTF-8')
str == str_utf8
#=> true
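
If you do this in several places, one option is to wrap the round trip in a helper that reapplies the original encoding. This is only a sketch: inflate_as is a hypothetical name, not part of Zlib, and it assumes you know (or have stored) the encoding of the original string. Note also that a lone "\xA0" byte is not valid UTF-8; the example below uses "\u00A0", which is the UTF-8 non-breaking space.

```ruby
require 'zlib'

# Hypothetical helper (not part of Zlib): inflate the data and relabel the
# resulting bytes with the given encoding. force_encoding does not change
# any bytes, it only changes how Ruby interprets them.
def inflate_as(data, encoding)
  Zlib.inflate(data).force_encoding(encoding)
end

original = "hello\u00A0world"
zipped   = Zlib.deflate(original)
restored = inflate_as(zipped, original.encoding)

round_trip_ok = (restored == original)
encoding_ok   = (restored.encoding == Encoding::UTF_8)

# The raw byte from the question is tagged UTF-8 but is not valid UTF-8
raw_valid = "hello\xA0world".valid_encoding?

puts round_trip_ok, encoding_ok, raw_valid
```

Zlib has no way to record the encoding in the compressed stream, so the caller has to carry that information separately.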