Hayk Saakian - 7 months ago
Ruby Question

Force strings to UTF-8 from any encoding

In my Rails app I'm working with RSS feeds from all around the world, and some feeds have links that are not in UTF-8. The original feed links are out of my control, and in order to use them in other parts of the app, they need to be in UTF-8.

How can I detect encoding and convert to UTF-8?

Answer

Ruby 1.9

"Forcing" an encoding is easy. However, it won't convert the characters; it only changes the string's Encoding and leaves the bytes untouched:

str = str.force_encoding("UTF-8")

str.encoding.name # => 'UTF-8'
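To make the "no conversion" point concrete, here is a small sketch. The sample string is an assumption chosen for illustration: "café" as it would arrive from an ISO-8859-1 feed, where "é" is the single byte 0xE9.

```ruby
# Illustrative input (an assumption): "café" in ISO-8859-1,
# where "é" is the single byte 0xE9.
latin1 = "caf\xE9".force_encoding("ISO-8859-1")
bytes_before = latin1.bytes.to_a

# Relabel a copy as UTF-8. No bytes are rewritten.
relabeled = latin1.dup.force_encoding("UTF-8")

relabeled.encoding.name               # => "UTF-8"
relabeled.bytes.to_a == bytes_before  # => true, same bytes as before
relabeled.valid_encoding?             # => false, a lone 0xE9 is not valid UTF-8
```

So force_encoding is only appropriate when the bytes really are in the encoding you name; valid_encoding? is a cheap way to check afterwards.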

If you want to perform a conversion, use encode:

begin
  str = str.encode("UTF-8") # encode returns a new string, so keep the result
rescue Encoding::UndefinedConversionError
  # ...
end
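Since the question is about feeds whose encoding is not known in advance, one possible way to combine the two calls above is sketched here. The helper name to_utf8 and the ISO-8859-1 fallback are assumptions made for illustration, not part of Ruby or Rails. Ruby itself does not guess encodings, so the fallback is only a reasonable default for Western European feeds.

```ruby
# Hypothetical helper (not a Ruby or Rails API).
# Assumption: anything that is not already valid UTF-8 is in `fallback`.
def to_utf8(str, fallback = "ISO-8859-1")
  # If the bytes are already valid UTF-8, relabeling a copy is enough.
  utf8 = str.dup.force_encoding("UTF-8")
  return utf8 if utf8.valid_encoding?

  # Otherwise treat the bytes as the fallback encoding and transcode,
  # replacing anything unconvertible instead of raising.
  str.dup.force_encoding(fallback).encode(
    "UTF-8", invalid: :replace, undef: :replace, replace: "?"
  )
end

from_latin1 = to_utf8("caf\xE9")     # 0xE9 is invalid UTF-8, so it is read as ISO-8859-1
from_utf8   = to_utf8("caf\u00E9")   # already valid UTF-8, returned as is
```

The invalid: and undef: options trade the exception from the snippet above for a replacement character, which may or may not be acceptable for links. Note that some non-UTF-8 byte sequences happen to be valid UTF-8, so the validity check is a heuristic rather than true detection.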

I would definitely read the following post for more information:
http://graysoftinc.com/character-encodings/ruby-19s-string