sawa sawa - 5 months ago 18
Ruby Question

Why is a string key for a hash frozen?

According to the specification, strings that are used as a key to a hash are duplicated and frozen. Other mutable objects do not seem to have such special consideration. For example, with an array key, the following is possible.

a = [0]
h = {a => :a}
h.keys.first[0] = 1
h # => {[1] => :a}
h[[1]] # => nil
h[[1]] # => :a

On the other hand, a similar thing cannot be done with a string key.

s = "a"
h = {s => :s}
h.keys.first.upcase! # => RuntimeError: can't modify frozen String

Why is string designed to be different from other mutable objects when it comes to a hash key? Is there any use case where this specification becomes useful? What other consequences does this specification have?

I actually have a use case where absence of such special specification about strings may be useful. That is, I read with the
gem a manually written YAML file that describes a hash. the keys may be strings, and I would like to allow case insensitivity in the original YAML file. When I read a file, I might get a hash like this:

h = {"foo" => :foo, "Bar" => :bar, "BAZ" => :baz}

And I want to normalize the keys to lower case to get this:

h = {"foo" => :foo, "bar" => :bar, "baz" => :baz}

by doing something like this:


but that returns an error for the reason explained above.


In short it's just Ruby trying to be nice.

When a key is entered in a Hash, a special number is calculated, using the hash method of the key. The Hash object uses this number to retrieve the key. For instance, if you ask what the value of h['a'] is, the Hash calls the hash method of string 'a' and checks if it has a value stored for that number. The problem arises when someone (you) mutates the string object, so the string 'a' is now something else, let's say 'aa'. The Hash would not find a hash number for 'aa'.

The most common types of keys for hashes are strings, symbols and integers. Symbols and integers are immutable, but strings are not. Ruby tries to protect you from the confusing behaviour described above by dupping and freezing string keys. I guess it's not done for other types because there could be nasty performance side effects (think of large arrays).