WorkSmarter WorkSmarter - 10 days ago 5
SQL Question

Full Text Search of URL field sql server

Objective: Return all URLs beginning with

"https://mywebsite.domain.com/as/product/4/"


Given:


  • Applied full text search on URL field.

  • SQL Server edition: 2014.

  • 20+ Million rows



URL



https://mywebsite.domain.com/as/product/1/production
https://mywebsite.domain.com/as/product/2/items
https://mywebsite.domain.com/as/product/1/affordability
https://mywebsite.domain.com/as/product/3/summary
https://mywebsite.domain.com/as/product/4/schedule
https://mywebsite.domain.com/as/product/4/resources/summary


Query 1:

WHERE CONTAINS (URL, 'https://mywebsite.domain.com/as/product/4')


Result:

All records returned


Query 2 (Added "*" after reading MSDN article)

WHERE CONTAINS (URL, '"https://mywebsite.domain.com/as/product/4*"')


Result:

No records returned


Any assistance would be greatly appreciated.

Answer

You can use CONTAINS with a LIKE subquery for matching only a start:

SELECT * 
FROM (
SELECT * 
FROM myTable WHERE CONTAINS (URL, '"https://mywebsite.domain.com/as/product/4/"')
) AS S1 
WHERE S1.URL LIKE 'https://mywebsite.domain.com/as/product/4/%' 

This way, the SLOW LIKE operator query will be run against a smaller set of records

EDIT1: (if WHERE CONTAINS (URL, '"https://mywebsite.domain.com/as/product/4/"') is not filtering Values)

After a lot of searches. the problem is in / . The forward-slash isn't contained in the Noise Words file, but I guess it's classed as a delimiter or Word breaker and therefore isn't searchable.

Read these Topics:

EDIT2:

I found one suggested solution that is

/ is considered as an english wordbreaker You may change It from Registry

  • Navigate to Registry values HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\<InstanceRoot>\MSSearch\Language\eng and HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\<InstanceRoot>\MSSearch\Language\enu
  • clear value for WBreakerClass.

Sql server consider https://mywebsite.domain.com/as/product/4 as one word.

Note: above both path i have taken by assuming that you are using English language as word breaker.

Read more about Word Breaker in This MSDN Topic