zqe zqe - 13 days ago 5
Ruby Question

Build an intersection matrix in Ruby

I have a set of students who have each selected a certain number of courses that they want to take next semester, represented as an array of hashes:

[
{"student"=>"1", "English"=>true, "Algebra"=>true, "History"=>false},
{"student"=>"2", "English"=>false, "Algebra"=>false, "History"=>true},
{"student"=>"3", "English"=>false, "Algebra"=>true, "History"=>false},
{"student"=>"4", "English"=>true, "Algebra"=>false, "History"=>true}
]


I want to build a matrix showing how many conflicts there are between each pair of courses. The end result would look something like this:

         English  Algebra  History
English     2        1        1
Algebra     1        2        -
History     1        -        2


The number at each intersection is the number of students who have selected both courses. For example, the number at (English, English) is 2, the total number of students who selected English. The entry at (History, Algebra) is "-" because no student has selected both of those courses.

I looked at the Ruby docs for the Matrix class, but it seems geared toward mathematical matrices. I'm not sure how to retool it for this purpose, or whether it's an appropriate class for this problem.

What kind of approach could I try researching/googling to efficiently construct a matrix like this?

Answer

The following works for any number of subjects. The subject names are taken from the keys of the first hash.

arr = [
  {"student"=>"1", "English"=>true,  "Algebra"=>true,  "History"=>false},
  {"student"=>"2", "English"=>false, "Algebra"=>false, "History"=>true},
  {"student"=>"3", "English"=>false, "Algebra"=>true,  "History"=>false},
  {"student"=>"4", "English"=>true,  "Algebra"=>false, "History"=>true}
]

pairs = (arr.first.keys - ["student"]).repeated_combination(2).to_a
  #=> [["English", "English"], ["English", "Algebra"], ["English", "History"],
  #    ["Algebra", "Algebra"], ["Algebra", "History"], ["History", "History"]] 

h = pairs.product([0]).to_h
  #=> {["English", "English"]=>0, ["English", "Algebra"]=>0, ["English", "History"]=>0,
  #    ["Algebra", "Algebra"]=>0, ["Algebra", "History"]=>0, ["History", "History"]=>0}

arr.each_with_object(h) { |g,h|
  pairs.each { |sub1, sub2| h[[sub1, sub2]] += 1 if (g[sub1] && g[sub2]) } }
  #=> {["English", "English"]=>2, ["English", "Algebra"]=>1, ["English", "History"]=>1,
  #    ["Algebra", "Algebra"]=>2, ["Algebra", "History"]=>0, ["History", "History"]=>2}

See Array#repeated_combination.
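
The hash of pair counts can be turned into the table layout shown in the question. The following is only a sketch under a few assumptions: the counts hash is written out literally from the result above, and the names cell, width and table are hypothetical helpers introduced here, not part of the original answer. Each pair is stored in one order only, so the lookup tries both orders and prints "-" where no student chose both courses.

```ruby
# Pair counts, copied from the result of the code above.
counts = {
  ["English", "English"] => 2, ["English", "Algebra"] => 1,
  ["English", "History"] => 1, ["Algebra", "Algebra"] => 2,
  ["Algebra", "History"] => 0, ["History", "History"] => 2
}

# Recover the subject names from the pair keys, in first-seen order.
subjects = counts.keys.flatten.uniq

# Hypothetical helper: look the pair up in either order; "-" means no overlap.
# fetch is used so that a Hash.new(0) default cannot hide the reversed key.
cell = lambda do |row, col|
  n = counts.fetch([row, col]) { counts.fetch([col, row], 0) }
  n.zero? ? "-" : n.to_s
end

# Pad every column to the longest subject name plus one space.
width  = subjects.map(&:length).max + 1
header = " " * width + subjects.map { |s| s.ljust(width) }.join
rows   = subjects.map do |row|
  row.ljust(width) + subjects.map { |col| cell.call(row, col).ljust(width) }.join
end

table = [header, *rows].map(&:rstrip)
puts table
```

Whether to print "-" or 0 for an empty intersection is purely a display choice; the counts themselves are unchanged.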

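The steps are as follows. Note that this walk-through passes Hash.new(0) to each_with_object instead of the pre-filled hash h used above, so a pair that no student selected is simply absent from the result rather than mapped to 0.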

a = arr.first.keys - ["student"]
  #=> ["English", "Algebra", "History"] 
b = a.repeated_combination(2)
  #=> #<Enumerator: ["English", "Algebra", "History"]:repeated_combination(2)> 
pairs = b.to_a
  #=> [["English", "English"], ["English", "Algebra"], ["English", "History"],
  #    ["Algebra", "Algebra"], ["Algebra", "History"], ["History", "History"]]
enum0 = arr.each_with_object(Hash.new(0))
  #=> #<Enumerator: [
  #     {"student"=>"1", "English"=>true, "Algebra"=>true, "History"=>false},
  #     {"student"=>"2", "English"=>false, "Algebra"=>false, "History"=>true},
  #     {"student"=>"3", "English"=>false, "Algebra"=>true, "History"=>false},
  #     {"student"=>"4", "English"=>true, "Algebra"=>false, "History"=>true}
  #   ]:each_with_object({})> 

The first element of enum0 is generated and passed to the block, and the block variables are assigned to that element using parallel assignment.

g,h = enum0.next
  #=> [{"student"=>"1", "English"=>true, "Algebra"=>true, "History"=>false}, {}]
g #=> {"student"=>"1", "English"=>true, "Algebra"=>true, "History"=>false} 
h #=> {} 
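
As a side illustration of why updates made inside the block accumulate: every element that an each_with_object enumerator generates is a two-element array holding the next item and the same accumulator object. A minimal sketch, using made-up data and hypothetical names (memo, enum) that are not part of the answer:

```ruby
memo = Hash.new(0)
enum = %w[a b].each_with_object(memo)

# First step: destructure [item, accumulator] with parallel assignment.
first_item, first_memo = enum.next
first_memo[first_item] += 1

# Second step: the accumulator handed back is the very same object,
# so it already contains the update made during the first step.
second_item, second_memo = enum.next

same_object = second_memo.equal?(memo)  # identity check, not just equality
puts second_item, same_object, second_memo.inspect
```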

We now encounter a second enumerator.

enum1 = pairs.each
  #=> #<Enumerator: [
  #     ["English", "English"], ["English", "Algebra"], ["English", "History"],
  #     ["Algebra", "Algebra"], ["Algebra", "History"], ["History", "History"]
  #   ]:each>

The first element of that enumerator is passed to the block and the values of the block variables are computed.

sub1, sub2 = enum1.next
  #=> ["English", "English"] 
sub1
  #=> "English" 
sub2
  #=> "English" 

We next perform the block calculation.

c = g[sub1] && g[sub2]
  #=> g["English"] && g["English"]
  #=> true && true
  #=> true

Because c is true, we execute

h[[sub1, sub2]] += 1
  #=> h[["English", "English"]] += 1
  #=> 1

so now

h #=> {["English", "English"]=>1}

The remaining calculations are similar.
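
To round off the walk-through, here is a sketch that runs the same loop to completion with Hash.new(0) as the accumulator, as in enum0 above. The variable name counts is introduced here for illustration. Because nothing is pre-filled, the pair ["Algebra", "History"] never becomes a key, although looking it up still returns the default of 0.

```ruby
arr = [
  {"student"=>"1", "English"=>true,  "Algebra"=>true,  "History"=>false},
  {"student"=>"2", "English"=>false, "Algebra"=>false, "History"=>true},
  {"student"=>"3", "English"=>false, "Algebra"=>true,  "History"=>false},
  {"student"=>"4", "English"=>true,  "Algebra"=>false, "History"=>true}
]

pairs = (arr.first.keys - ["student"]).repeated_combination(2).to_a

# Same block as in the answer, but starting from an empty hash whose
# default value is 0, so only pairs that actually occur become keys.
counts = arr.each_with_object(Hash.new(0)) do |g, h|
  pairs.each { |sub1, sub2| h[[sub1, sub2]] += 1 if g[sub1] && g[sub2] }
end

puts counts.inspect
puts counts.key?(["Algebra", "History"])  # false: never inserted
puts counts[["Algebra", "History"]]       # 0: supplied by the default
```

If the zero entries are wanted as real keys, for example when iterating over the hash to print every cell, the pre-filled hash from the first version is the simpler choice.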
