Nona Nona - 6 months ago 12
Ruby Question

Should I use a LIKE query for these ActiveRecord relationships?

Let's say I have a single web page form user interface with 2 sets of checkboxes. With set 1 checkboxes, I can check off what Trainers I would like ("Jason", "Alexandra, etc.) With set 2 checkboxes, I can check off what animals I would like to see ("Tigers", "Bears", etc.) Once I submit the form with these options, I get back a list of zoos that match the criteria (let's assume all the trainers work at all the zoos and all the animals are at all the zoos for discussion's sake)

We'll be running our database query by "name" (e.g., search using trainer names and animal names, NOT database ids)

Let's say we are using a Postgres database that has hundreds of thousands of rows (if not millions).


  1. Is it more efficient to search using an "ILIKE" query or is it better to do a standard join query (e.g.,
    Zoo.includes(:animals, :trainers).where("animals.name = ? and trainers.name = ?", animal_names, trainer_names)
    ?

  2. Is there a better way than what I just showed in #1 above?



model setup

class Zoo < ActiveRecord::Base
has_many :animals, through: zoo_animals
has_many :trainers, through: zoo_trainers
has_many :zoo_trainers
has_many :zoo_animals
end

class Animal < ActiveRecord::Base
has_many :zoos, through :zoo_animals
has_many :zoo_animals
end

class Trainer < ActiveRecord::Base
has_many :zoos, through :zoo_trainers
has_many :zoo_trainers
end

class ZooAnimal < ActiveRecord::Base
belongs_to :animal
belongs_to :zoo
end

class ZooTrainer < ActiveRecord::Base
belongs_to :zoo
belongs_to :trainer
end


EDIT: let's suppose I don't have access to the database ID's.

Answer

LIKE '%Jason%' is much less efficient than querying for the exact string 'Jason' (or querying for an ID), because while exact comparisons and some uses of LIKE can use an index on the column being queried, LIKE with a pattern beginning with a wildcard can't use an index.

However, performance doesn't sound like the most important consideration here. LIKE %Jason% will still probably be fast enough on a reasonably sized database under reasonable load. If the application really needs to search for things by substring (which implies that a search might have multiple results), that requirement can't be met by simple equality.

There are an endless number of higher-powered solutions to searching text, including Postgres built-in full-text search and external solutions like Elasticsearch. Without specific requirements for scaling I'd go with LIKE until it started to slow down and only then invest in something more complicated.