elliedori elliedori - 1 year ago 50
Ruby Question

Why does the output from my map/regex block not capitalize?

I'm working through the Test First Ruby Master problems. My code for 08/book_titles is this:

class Book
  attr_accessor :title

  def title
    if @title.include?(' ')
      correct = @title.split.each_with_index.map {|x, index| ((x =~ /^a|an|of|the|or|in|and$/) && index != 0) ? x : x.capitalize}
      correct.join(' ')
      # this is throwing a weird error, the code looks right but isn't capitalizing last word (returns 'To Kill a mockingbird')
    else
      @title.capitalize
    end
  end
end


I tested the map portion separately, and it works fine. But as part of the full solution, it does not capitalize as it should. RSpec reports this failure:

1) Book title should capitalize every word except... articles a
Failure/Error: expect(@book.title).to eq("To Kill a Mockingbird")

expected: "To Kill a Mockingbird"
got: "To Kill a mockingbird"

Anyone know why?

I originally didn't include the ^ and $ anchors in the regex. I got the same error with a different title, and adding those anchors fixed it for that case. But then the error showed up again with this title.

Answer Source

The regex is slightly incorrect. As written, it reads like this:

Match any string that

  • starts with 'a'
  • or contains 'an'
  • or contains 'of'
  • or contains 'the'
  • or contains 'or'
  • or contains 'in'
  • or ends in 'and'
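
A quick way to see this reading in action is to run the question's pattern against a few single words. This is only a sketch: the constant name LOOSE and the sample words are chosen here for illustration and are not from the original code.

```ruby
# The pattern from the question, unchanged.
LOOSE = /^a|an|of|the|or|in|and$/

# Illustrative words; none of them is an article, conjunction, or preposition.
%w[mockingbird another world kill].each do |word|
  verdict = word =~ LOOSE ? "matches" : "no match"
  puts "#{word}: #{verdict}"
end
```

Here "mockingbird" matches because it contains 'in', "another" because it starts with 'a', and "world" because it contains 'or'. Only "kill" fails to match, which is why the last word of the title is left uncapitalized.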

What you really seem to want is something that reads like this:

Match any string that

  • is exactly one of 'a', 'an', 'of', 'the', 'or', 'in', 'and'

To get this, you want your regex to be written like this:

/^(a|an|of|the|or|in|and)$/

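Note the parentheses around the alternation. (Alternation is the formal term for multiple choices in a regex, where choices are separated by '|'.) The parentheses make the ^ and $ anchors apply to every alternative, not just the first and the last.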

If you're comparing against the words of book or movie titles, this is much closer to the type of match you'd expect. It will correctly match the 'of' in "Chariots of Fire" and the 'and' in "Benny and Joon", but it will no longer falsely match the 'in' inside the last word of "To Kill a Mockingbird", which is a significant improvement.
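
As a rough check, the grouped pattern /^(a|an|of|the|or|in|and)$/ can be tried word by word. The names SMALL_WORD and small_word? below are hypothetical helpers introduced for this sketch only.

```ruby
# Grouped alternation: the anchors now apply to every alternative.
SMALL_WORD = /^(a|an|of|the|or|in|and)$/

# Hypothetical helper: true only when the whole word is one of the listed words.
def small_word?(word)
  !!(word =~ SMALL_WORD)
end

["chariots of fire", "benny and joon", "to kill a mockingbird"].each do |title|
  flags = title.split.map { |word| "#{word}=#{small_word?(word)}" }
  puts flags.join(' ')
end
```

With this pattern, "mockingbird" is no longer treated as a small word, while 'of', 'and', and 'a' still are.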

However, it still won't quite work yet on something like "Benny AND Joon", because 'AND' is uppercase in this title (assuming that incoming titles may be arbitrarily mixed case). One last change will do it:

/^(a|an|of|the|or|in|and)$/i

That last letter 'i' at the end of the regex says to 'ignore case', so that matches can occur regardless of whether the 'AND' is uppercase, lowercase, or mixed case.

This should get you close to what you're trying to achieve and handle a few bumpy use cases in the process.
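
Putting the pieces together, the title method might end up looking something like the sketch below. It is one possible version, not the only way to pass the exercise. The constant name SMALL_WORDS is hypothetical, and downcasing the matched small words is an assumption added here so that mixed-case input such as 'AND' comes out lowercase; the question's code returned the word unchanged.

```ruby
class Book
  attr_writer :title

  # Word list taken from the regex in the question; grouped, anchored,
  # and case-insensitive.
  SMALL_WORDS = /^(a|an|of|the|or|in|and)$/i

  def title
    @title.split.each_with_index.map do |word, index|
      if index != 0 && word =~ SMALL_WORDS
        word.downcase   # assumption: normalize small words to lowercase
      else
        word.capitalize # the first word is always capitalized
      end
    end.join(' ')
  end
end

book = Book.new
book.title = "to kill a mockingbird"
puts book.title
```

Because split on a single-word title returns a one-element array, the separate include?(' ') branch from the question is not needed in this version.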