Ka Mok - 2 months ago
Ruby Question

In Ruby (or Rails), is it faster to compare two large Strings as is, or to convert them into Fixnums before using uniq?

Given 2000 Story objects, each with a body attribute that equates to a string of about 500 characters.

What is the fastest way to compare them?


  1. Doing it like this:
    @stories.uniq { |story| story.body }



OR


  2. First converting each body into a Fixnum representation, then running uniq?



I have a vague feeling that computers can compare numbers faster than characters, but I also know that each character is really just represented as bytes.

Answer

It's easy to benchmark things like these. For example:

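require 'benchmark'

# 2000 random strings, each made of 500 decimal digits
string_array = Array.new(2000) { Array.new(500) { rand(10) }.join }

# The same values pre-converted to integers. At 500 digits these are
# Bignums, not Fixnums (both are plain Integer since Ruby 2.4).
bignum_array = string_array.map(&:to_i)

# 18 is the label column width, so the columns line up
Benchmark.bm(18) do |x|
  # uniq on integers that were converted ahead of time
  x.report("bignum:") { 1000.times { bignum_array.uniq } }
  # uniq on integers, with the conversion repeated in every run
  x.report("bignum_and_to_i:") { 1000.times { string_array.map(&:to_i).uniq } }
  # uniq directly on the strings
  x.report("string:") { 1000.times { string_array.uniq } }
end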

Outputs:

                        user     system      total        real
bignum:             1.710000   0.010000   1.720000 (  1.729463)
bignum_and_to_i:   28.500000   0.160000  28.660000 ( 28.738891)
string:             1.740000   0.000000   1.740000 (  1.754165)

Running uniq directly on 2000 strings of about 500 digits each is roughly 16 times faster than first converting the strings into numbers and then running uniq (about 1.75 s versus 28.7 s for 1000 runs).

When the numbers are already converted, running uniq on them (1.73 s) is only marginally faster than running it on the strings (1.75 s).
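
One caveat the timings do not show: String#to_i only keeps strings distinct when each one is a plain decimal number without leading zeros. A minimal sketch, using made-up sample strings rather than the benchmark data, of how the conversion can merge values that differ as strings:

```ruby
# Hypothetical sample data for illustration only
bodies = ["0123", "123", "once upon a time", "the end"]

# Deduplicate the strings as they are
as_strings = bodies.uniq

# Deduplicate after conversion: "0123" and "123" both become 123,
# and strings with no leading digits become 0
as_integers = bodies.map(&:to_i).uniq

puts as_strings.size  # prints 4
puts as_integers.size # prints 2
```

For free-text bodies such as story text, the conversion would therefore not just be slow but would also give the wrong result.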

Conclusion: Converting long strings into large numbers is so slow that it's faster to just compare the strings.
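
Applied to the scenario in the question, this is a minimal sketch of deduplicating by body without any conversion. Story here is a hypothetical Struct standing in for the real model, and the sample bodies are invented:

```ruby
# Hypothetical stand-in for the Story model; only `body` matters here
Story = Struct.new(:id, :body)

stories = [
  Story.new(1, "A long story body"),
  Story.new(2, "Another story body"),
  Story.new(3, "A long story body") # same body as story 1
]

# With a block, uniq compares the block's return values and keeps
# the first element seen for each distinct value
unique_stories = stories.uniq { |story| story.body }

puts unique_stories.map(&:id).inspect # prints [1, 2]
```

stories.uniq(&:body) is an equivalent shorthand for the block form.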