cman77 - 1 year ago
Ruby Question

Filtering by multiple values using XPath

I am trying to filter an XML document of Jobs by the Company name.

I am able to pull all items that match specific Company names using:

doc.xpath("/source/job[company[text() = 'BigCorp' or text() = 'MegaCorp']]")

I am unable to do the opposite and exclude jobs by Company name, using something like:

doc.xpath("/source/job[company[text() != 'Hodes' or text() != 'Scurri']]")

Where am I going wrong? Is there a way to provide a comma-separated list of values?

Answer Source

Try changing the or to and:

doc.xpath("/source/job[company[text() != 'Hodes' and text() != 'Scurri']]")

With or, the predicate is true for every job, because no company name can equal both 'Hodes' and 'Scurri' at once.

For example, it would return the job with the company Hodes because text() != 'Scurri' is true (and vice versa).
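The difference between the two operators can be checked without any XML at all. The sketch below mirrors the two predicates in plain Ruby; the method names `keep_with_or?` and `keep_with_and?` are made up for illustration and are not part of Nokogiri or XPath.

```ruby
# Mirrors the `or` predicate: true when the name differs from at least one value.
def keep_with_or?(name)
  name != 'Hodes' || name != 'Scurri'
end

# Mirrors the `and` predicate: true only when the name differs from both values.
def keep_with_and?(name)
  name != 'Hodes' && name != 'Scurri'
end

%w[Hodes Scurri BigCorp].each do |name|
  puts "#{name}: or=#{keep_with_or?(name)} and=#{keep_with_and?(name)}"
end
# Hodes: or=true and=false
# Scurri: or=true and=false
# BigCorp: or=true and=true
```

The or column is true in every row, which is why nothing gets filtered out; only the and version drops the two unwanted names.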

So normalize-space() did it!

doc.xpath("/source/job[company[normalize-space() != 'Hodes' and normalize-space() != 'Scurri']]")

Not sure why, though?

normalize-space() worked because text() returns the text node exactly as it appears in the document, including any leading or trailing whitespace.

For example, if you have an element like:

<company> Hodes </company>

the text() would equal "_Hodes_". (I replaced the spaces with _ to make them easier to see.)

Because of the whitespace, "_Hodes_" doesn't equal "Hodes".

normalize-space() strips the leading and trailing whitespace and collapses each internal run of whitespace into a single space, so the padded value compares equal to 'Hodes'.