SupremeA - 5 months ago
Ruby Question

Ruby: Select a grouped array with multiple conditions

I have an array of transactions. I need to group the transactions by name and then select the group with the highest total amount, considering only groups that contain more than one transaction.

For example, if I have 1 transaction named "car" with an amount of $3000, 3 transactions named "boat" totaling $1800, and 4 transactions named "house" totaling $500, the method should select boat, because it is the group with the highest total among those with multiple transactions.

@transactions =
[{"amount"=>-3000, "name"=>"CAR"},
{"amount"=>-600, "name"=>"BOAT"},
{"amount"=>-600, "name"=>"BOAT"},
{"amount"=>-600, "name"=>"BOAT"},
{"amount"=>-125, "name"=>"HOUSE" },
{"amount"=>-125, "name"=>"HOUSE" },
{"amount"=>-125, "name"=>"HOUSE" },
{"amount"=>-125, "name"=>"HOUSE" }]


Right now I have this, but it selects the group with the most transactions (v.length is the group size) rather than the one with the highest total:

@transactions.group_by {|h| h['name'] }.max_by {|k, v| v.length }.first


How can I group, then sum, then select the group with the highest total amount among those with multiple transactions?

Answer

There are a lot of good answers here. I'd like to add that you can eliminate a lot of iteration by combining operations.

For example, rather than calculating the sums for each group in a second step, you can do that inside your group_by block:

sums = Hash.new(0)

groups = transactions.group_by do |t|
  sums[t["name"]] += t["amount"]
  t["name"]
end

p groups
# => { "CAR" => [ { "amount" => -3000, "name" => "CAR" } ],
#      "BOAT" => [ ... ],
#      "HOUSE" => [ ... ] }

p sums
# => { "CAR" => -3000, "BOAT" => -1800, "HOUSE" => -500 }
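For comparison, here is a minimal sketch of the two-pass version that the block above replaces. It assumes Ruby 2.4 or later (for transform_values and sum); the name two_step_sums is introduced here purely for illustration.

```ruby
transactions = [
  { "amount" => -3000, "name" => "CAR" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -125,  "name" => "HOUSE" },
  { "amount" => -125,  "name" => "HOUSE" },
  { "amount" => -125,  "name" => "HOUSE" },
  { "amount" => -125,  "name" => "HOUSE" }
]

# First pass: group the transactions by name.
groups = transactions.group_by { |t| t["name"] }

# Second pass: walk every group again to total its amounts.
# (two_step_sums is a hypothetical name, not from the answer.)
two_step_sums = groups.transform_values { |g| g.sum { |t| t["amount"] } }

# two_step_sums now holds the same totals as `sums` above:
# CAR => -3000, BOAT => -1800, HOUSE => -500
```

The result is the same; the only difference is that every transaction is visited twice instead of once.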

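Next, instead of calling groups.select to drop groups with only one member and then min_by to get the final result, fold the former into the latter. (It is min_by rather than max_by because the amounts are stored as negative numbers, so the largest total is the most negative sum.)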

result = groups.min_by do |k,g|
  g.size > 1 ? sums[k] : Float::INFINITY
end

p result[1]
# => [ { "amount" => -600, "name" => "BOAT" },
#      { "amount" => -600, "name" => "BOAT" },
#      { "amount" => -600, "name" => "BOAT" } ]

Because everything is smaller than Float::INFINITY, those groups with only one member will never be selected (unless every group has only one member).
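The parenthetical case can be handled with a small guard. The following is only a sketch: the two-item data set and the choice to return nil when nothing qualifies are assumptions made for illustration, not part of the question.

```ruby
# Hypothetical input in which no name occurs more than once.
transactions = [
  { "amount" => -3000, "name" => "CAR" },
  { "amount" => -600,  "name" => "BOAT" }
]

sums = Hash.new(0)
groups = transactions.group_by do |t|
  sums[t["name"]] += t["amount"]
  t["name"]
end

# Every group scores Float::INFINITY here, so min_by still returns
# one of the single-member groups.
_name, group = groups.min_by { |k, g| g.size > 1 ? sums[k] : Float::INFINITY }

# Guard: treat a single-member winner (or an empty input) as "no match".
result = group && group.size > 1 ? group : nil
```

With this guard, result is nil whenever no group has more than one transaction, including when transactions is empty (min_by then returns nil).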

And so...

Solution 1

Putting it all together:

sums = Hash.new(0)

result =
  transactions.group_by {|t|
    sums[t["name"]] += t["amount"]
    t["name"]
  }.min_by {|k,g| g.size > 1 ? sums[k] : Float::INFINITY }[1]

p result
# => [ { "amount" => -600, "name" => "BOAT" },
#      { "amount" => -600, "name" => "BOAT" },
#      { "amount" => -600, "name" => "BOAT" } ]
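The question's data stores amounts as negative numbers, which is why min_by finds the "highest" group. If the amounts were stored as positive numbers instead (an assumption, not the question's data), the same shape would presumably work with max_by and -Float::INFINITY as the sentinel. A sketch:

```ruby
# Hypothetical variant of the data with positive amounts.
transactions = [
  { "amount" => 3000, "name" => "CAR" },
  { "amount" => 600,  "name" => "BOAT" },
  { "amount" => 600,  "name" => "BOAT" },
  { "amount" => 600,  "name" => "BOAT" },
  { "amount" => 125,  "name" => "HOUSE" },
  { "amount" => 125,  "name" => "HOUSE" },
  { "amount" => 125,  "name" => "HOUSE" },
  { "amount" => 125,  "name" => "HOUSE" }
]

sums = Hash.new(0)

result =
  transactions.group_by { |t|
    sums[t["name"]] += t["amount"]
    t["name"]
  }.max_by { |k, g|
    # Single-member groups get the lowest possible score so they lose.
    g.size > 1 ? sums[k] : -Float::INFINITY
  }[1]

# result is the array of the three BOAT transactions (total 1800).
```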

Solution 2

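You could also combine all of this into a single reduce and iterate over the data only once, but it's not very Rubyish. Note that it compares running sums as it goes, so it relies on every amount being negative, as in the example data: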

sums = Hash.new(0)
groups = Hash.new {|h,k| h[k] = [] }
min_sum = Float::INFINITY

result = transactions.reduce(nil) do |min_group, t|
  name = t["name"]
  sum = sums[name] += t["amount"]
  (group = groups[name]) << t

  if group.size > 1 && sum < min_sum
    min_sum, min_group = sum, group
  end
  min_group
end

Note that you could move all of those outer variable declarations into, say, an array passed to reduce (instead of nil), but it would hurt readability a lot.
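To make that trade-off concrete, here is a rough sketch of the array-as-state variant. The layout of the state array and the names initial and state are choices made here for illustration only.

```ruby
transactions = [
  { "amount" => -3000, "name" => "CAR" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -600,  "name" => "BOAT" },
  { "amount" => -125,  "name" => "HOUSE" },
  { "amount" => -125,  "name" => "HOUSE" },
  { "amount" => -125,  "name" => "HOUSE" },
  { "amount" => -125,  "name" => "HOUSE" }
]

# State carried through reduce: [sums, groups, min_sum, min_group].
initial = [Hash.new(0), Hash.new { |h, k| h[k] = [] }, Float::INFINITY, nil]

state = transactions.reduce(initial) do |(sums, groups, min_sum, min_group), t|
  name = t["name"]
  sum = sums[name] += t["amount"]
  (group = groups[name]) << t

  # Only groups with more than one member may become the current minimum.
  if group.size > 1 && sum < min_sum
    min_sum, min_group = sum, group
  end

  # Hand the whole state to the next iteration.
  [sums, groups, min_sum, min_group]
end

result = state.last
# result is the array of the three BOAT transactions.
```

Nothing leaks into the enclosing scope this way, but the block has to destructure and rebuild the state on every iteration, which is the readability cost mentioned above.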
