bstockwell - 16 days ago
Ruby Question

Match a substring that might contain reserved characters

I'm having some issues with matching one string to another if the string I'm testing for contains regex characters.

Background: I'm working on a script that migrates news articles from 2 legacy systems into one. In some cases, these stories are duplicated within the systems, so I'm running a script to check stored data against an archive file (in HTML form) to see if the title of the current story matches anything in the archive.

#...(for each line)
if line.match(title) then
  return true
end


This generally works, except when I have a regex character in the title, for example:

<span class="title">$8.9 Million Grant for UC Center Focused on Occupational Safety and Health</span>


doesn't match

$8.9 Million Grant for UC Center Focused on Occupational Safety and Health


Here's some example output from irb to demonstrate

2.3.0 :012 > str = '<span class="title">$8.9 Million Grant for UC Center Focused on Occupational Safety and Health</span>'
2.3.0 :020 > str.match("$8.9 Million Grant for UC Center Focused on Occupational Safety and Health")
=> nil
2.3.0 :021 > str.match("\\$8.9 Million Grant for UC Center Focused on Occupational Safety and Health")
=> #<MatchData "$8.9 Million Grant for UC Center Focused on Occupational Safety and Health">
2.3.0 :022 > str.match("8.9 Million Grant for UC Center Focused on Occupational Safety and Health")
=> #<MatchData "8.9 Million Grant for UC Center Focused on Occupational Safety and Health">
2.3.0 :023 >


So I'm pretty sure the $ is the issue, and that the issue stems from it being a reserved regex character.

Ruby isn't my daily language, and I'm having some trouble figuring out where to look to see if there is a Ruby method to do the match without relying on regex, a way to treat the pattern literally, or a way to automatically escape potential regex special characters. Help is appreciated.

Answer
str.match(Regexp.new(Regexp.escape("$8.9 Million ...")))
=> #<MatchData "$8.9 Million Grant for UC Center Focused...
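
Some background on why this works: String#match converts a string argument into a Regexp, and in a regex an unescaped $ is an end-of-line anchor, so "$8.9 ..." can never match in the middle of the line. Regexp.escape backslash-escapes the special characters (including the . in 8.9), so the title is treated literally. If all you need is a yes/no substring test, String#include? avoids regex altogether. Below is a minimal sketch using the strings from the question; the helper name title_in_line? is made up for illustration and is not from the original script.

```ruby
# Sample archive line and title, copied from the question.
line  = '<span class="title">$8.9 Million Grant for UC Center Focused on Occupational Safety and Health</span>'
title = '$8.9 Million Grant for UC Center Focused on Occupational Safety and Health'

# Option 1: plain substring test, no regex involved.
literal_hit = line.include?(title)

# Option 2: escape the title first, then match it as a regex.
escaped   = Regexp.escape(title)
regex_hit = !line.match(Regexp.new(escaped)).nil?

# For comparison: unescaped, "$" anchors at end of line, so this finds nothing.
raw_hit = !line.match(title).nil?

# Hypothetical helper (name is an assumption) for use inside the per-line loop.
def title_in_line?(line, title)
  line.include?(title)
end

puts literal_hit  # true
puts regex_hit    # true
puts raw_hit      # false
```

For a simple duplicate check, include? is probably the simpler choice. Regexp.escape is more useful when the escaped title has to be embedded in a larger pattern, for example one that also matches the surrounding span tags.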