Palmi Palmi - 5 months ago 12
SQL Question

Query with wildcard and dot not matching data with Oracle Text index

When using the wildcard character in combination with a dot in a text search, my query does not find the matching row.

For example:

CREATE TABLE MY_TABLE( ITEM_NUMBER VARCHAR2(50 BYTE) NOT NULL);
INSERT INTO MY_TABLE (ITEM_NUMBER) VALUES ('1234.1234');
create index TIX_ITEMNO on MY_TABLE(ITEM_NUMBER) indextype is ctxsys.context;


I want to find the row in MY_TABLE where the ITEM_NUMBER column is '1234.1234'.

This does find the row:

SELECT * FROM MY_TABLE
WHERE CONTAINS(ITEM_NUMBER, '%1234') > 0;


This does not find the row:

SELECT * FROM MY_TABLE
WHERE CONTAINS(ITEM_NUMBER, '%.1234') > 0;


I do not understand why, since according to the Oracle documentation the dot is not a special character that has to be escaped.

How should I handle this situation?

Answer

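This is because your default lexer is treating the period as a word separator, so the value is split at the dot when it is indexed and the dot itself is not part of any indexed token.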

Initial setup:

create table my_table(item_number varchar2(50 byte) not null);

insert into my_table values ('1234.1234');

create index my_index on my_table (item_number) 
indextype is ctxsys.context;

This gets the behaviour you see:

SELECT * FROM MY_TABLE
WHERE CONTAINS(ITEM_NUMBER, '%.1234') > 0;

no rows selected

SELECT * FROM MY_TABLE
WHERE CONTAINS(ITEM_NUMBER, '%1234') > 0;

ITEM_NUMBER
--------------------------------------------------
1234.1234
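
One way to check this yourself, offered as a sketch rather than as part of the original answer: Oracle Text stores the indexed tokens in a table named DR$<index name>$I, which for the index created above would be DR$MY_INDEX$I. Listing its TOKEN_TEXT column shows how '1234.1234' was actually broken up. The exact tokens depend on the lexer settings in your database, so no expected output is shown here.

```sql
-- Inspect the tokens stored by the CONTEXT index MY_INDEX.
-- Assumes the index was created in the current schema as shown above.
select token_text, token_count
from dr$my_index$i
order by token_text;
```

If the period acted as a separator, the list should contain 1234 but no token that includes a dot, which would be consistent with '%.1234' finding nothing.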

If you add a lexer that defines PRINTJOINS to include the period:

drop index my_index;

begin 
  ctx_ddl.create_preference('my_lexer', 'BASIC_LEXER'); 
  ctx_ddl.set_attribute('my_lexer', 'PRINTJOINS', '.');
end;
/

create index my_index on my_table (item_number) 
indextype is ctxsys.context
parameters ('lexer my_lexer');

then it behaves the way you want:

SELECT * FROM MY_TABLE
WHERE CONTAINS(ITEM_NUMBER, '%.1234') > 0;

ITEM_NUMBER
--------------------------------------------------
1234.1234

SELECT * FROM MY_TABLE
WHERE CONTAINS(ITEM_NUMBER, '%1234') > 0;

ITEM_NUMBER
--------------------------------------------------
1234.1234
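
To confirm that the custom lexer is really in effect, a possible check (again a sketch, assuming the preference and index names used above and that both live in the current schema) is to look at the preference attributes in the CTX_USER_PREFERENCE_VALUES view and then list the index tokens again.

```sql
-- Show the attributes set on the lexer preference created above.
-- upper() is used because the stored name's case is an assumption here.
select prv_preference, prv_attribute, prv_value
from ctx_user_preference_values
where upper(prv_preference) = 'MY_LEXER';

-- Re-inspect the tokens after the index was recreated with my_lexer.
select token_text, token_count
from dr$my_index$i
order by token_text;
```

With the period defined as a PRINTJOINS character, you would expect the token list to contain the full value with the dot in it, which is what lets '%.1234' match.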

Read more about text indexing elements.
