theartofbeing - 5 months ago
Ruby Question

Optimize map within a loop

I have a dataset that is an array of hashes:

[
  {:last_name=>"Smith", :first_name=>"John", :city=>"New York City", :birthdate=>"5/29/1986"},
  {:last_name=>"Bar", :first_name=>"Foo", :city=>"Chicago", :birthdate=>"5/29/1986"},
  ...
]


I want to print the values in a specific order. Currently I'm doing it this way:

def print(dataset, select_fields)
  output = ''
  dataset.each do |set|
    output += select_fields.map { |key| set[key] }.join(' ') + "\n"
  end
  puts output
end


Because I'm calling map within each, I believe this is pretty slow. Maybe O(n²) slow?

Is there any way to optimize this? I'm using Ruby 2.2.1.

Answer

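My version (my_print) is about 30% faster than yours (your_print) on my machine. I'm pretty sure others can make it faster still. In general, try to iterate over each array only once, and append to the output string in place with << rather than building a new string with += on every pass. By the way, when you benchmark code, avoid puts, because the I/O slows the measurement tremendously.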

require 'benchmark'

set = [
  {:last_name=>"Smith", :first_name=>"John", :city=>"New York City", :birthdate=>"5/29/1986"},
  {:last_name=>"Bar", :first_name=>"Foo", :city=>"Chicago", :birthdate=>"5/29/1986"},
]

def my_print(dataset, select_fields)  
  output = ''
  dataset.each do |set|
    select_fields.each do |sf|
      output << "#{set[sf]} "
    end
    output[-1] = "\n"
  end  
  output
end

def your_print(dataset, select_fields)
  output = ''
  dataset.each do |set|
    output += select_fields.map { |key| set[key] }.join(' ') + "\n"
  end
  output  
end

# Run both versions in one report so the timings appear side by side.
Benchmark.bm(11) do |bm|
  bm.report('my_print') do
    1_000_000.times { my_print(set, [:first_name, :last_name]) }
  end
  bm.report('your_print') do
    1_000_000.times { your_print(set, [:first_name, :last_name]) }
  end
end
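
On the complexity question: the nested loop does work proportional to rows × fields, which is not O(n²) in the number of rows. The more likely cost is output +=, which builds a new string on every pass. As a rough sketch of another way to write it (format_rows and rows are names I made up for illustration, and I have not benchmarked this), you could pull the fields with Hash#values_at and join everything once at the end:

```ruby
# Hypothetical helper, not from the original post: builds the whole output
# without growing a string with += inside the loop.
def format_rows(dataset, select_fields)
  dataset.map do |row|
    # values_at returns the values for the given keys, in the order given.
    row.values_at(*select_fields).join(' ') << "\n"
  end.join
end

# Sample data in the same shape as the question's dataset.
rows = [
  {:last_name=>"Smith", :first_name=>"John", :city=>"New York City", :birthdate=>"5/29/1986"},
  {:last_name=>"Bar", :first_name=>"Foo", :city=>"Chicago", :birthdate=>"5/29/1986"},
]

print format_rows(rows, [:first_name, :last_name])
# John Smith
# Foo Bar
```

Whether this beats the << version depends on the data and the Ruby version, so it is worth adding to the benchmark above before trusting it.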