wingyip - 1 month ago
C# Question

Lucene.Net greater than/less than TermRangeQuery?

I have built a Lucene.Net index of books. All is working well, but I need to add another way to query the index and I can't figure out how to do it.

Each book has an age range that it is suitable for. This is expressed by two columns, minAge and maxAge. Both columns are integers.

I am indexing and storing these fields in the following loop:

foreach (var catalogueBook in books)
{
    var book = new Book(catalogueBook.CatalogueBookNo, catalogueBook.IssueId);

    var strTitle = book.FullTitle ?? "";
    var strAuthor = book.Author ?? "";

    // create a Lucene document for this book
    var doc = new Document();

    // add the ID as a stored field; NOT_ANALYZED_NO_NORMS still indexes it,
    // but as a single un-analyzed term. It is not used to query on.
    doc.Add(
        new Field(
            "BookId",
            book.CatalogueBookNo.ToString(System.Globalization.CultureInfo.InvariantCulture),
            Field.Store.YES,
            Field.Index.NOT_ANALYZED_NO_NORMS,
            Field.TermVector.NO));

    // add the title and author as stored and tokenized fields, the analyzer processes the content
    doc.Add(
        new Field("FullTitle",
            strTitle.Trim().ToLower(),
            Field.Store.YES,
            Field.Index.ANALYZED,
            Field.TermVector.NO));

    doc.Add(
        new Field("Author",
            strAuthor.Trim().ToLower(),
            Field.Store.YES,
            Field.Index.ANALYZED,
            Field.TermVector.NO));

    doc.Add(
        new Field("IssueId",
            book.IssueId,
            Field.Store.YES,
            Field.Index.NOT_ANALYZED_NO_NORMS,
            Field.TermVector.NO));

    doc.Add(
        new Field(
            "PublicationId",
            book.PublicationId.Trim().ToLower(),
            Field.Store.YES,
            Field.Index.NOT_ANALYZED_NO_NORMS,
            Field.TermVector.NO));

    // ages are zero-padded to four digits so that string order matches numeric order
    doc.Add(
        new Field(
            "MinAge",
            book.MinAge.ToString("0000"),
            Field.Store.YES,
            Field.Index.NOT_ANALYZED_NO_NORMS,
            Field.TermVector.NO));

    doc.Add(
        new Field(
            "MaxAge",
            book.MaxAge.ToString("0000"),
            Field.Store.YES,
            Field.Index.NOT_ANALYZED_NO_NORMS,
            Field.TermVector.NO));

    doc.Add(new NumericField("Price", Field.Store.YES, true).SetDoubleValue(Convert.ToDouble(book.Price)));

    // now we can loop through the categories; a book may have several
    foreach (var bc in book.GetBookCategories())
    {
        doc.Add(
            new Field("CategoryId",
                bc.CategoryId.Trim().ToLower(),
                Field.Store.YES,
                Field.Index.NOT_ANALYZED_NO_NORMS,
                Field.TermVector.NO));
    }

    // add the document to the index
    indexWriter.AddDocument(doc);
}

// make lucene fast
indexWriter.Optimize();
} // closes the enclosing block (its opening is not shown here)


As you can see, I am padding out the MinAge and MaxAge fields, as I thought that would make it easiest to run a TermRangeQuery against them.
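To illustrate why the padding matters: a term range query compares the indexed values as strings, so unpadded numbers would sort in the wrong order. The following is only a small standalone sketch (the class name PaddingDemo is made up for illustration, and it uses plain ordinal string comparison rather than Lucene itself):

```csharp
using System;

public static class PaddingDemo
{
    public static void Main()
    {
        int low = 9, high = 10;

        // Unpadded: "9" sorts after "10", because '9' > '1' as characters.
        Console.WriteLine(string.CompareOrdinal(low.ToString(), high.ToString()) > 0);

        // Padded: "0009" sorts before "0010", which matches numeric order.
        Console.WriteLine(string.CompareOrdinal(low.ToString("0000"), high.ToString("0000")) < 0);
    }
}
```

Both lines print True. The same reasoning means the age used in the query has to be padded with the same "0000" format as the indexed values.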

However, I need to query both the minAge and maxAge columns with a single age to see whether that age falls within the range defined by minAge and maxAge.

The SQL equivalent would be:

Select *
From books
where @age >= minAge and @age <= maxAge


Unfortunately I cannot see a way to do this. Is this even possible in Lucene.Net?

Answer

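If memory serves, you should be able to do this with range queries. It is effectively the inverse of a standard range query: instead of one field bounded on both sides, you put one open-ended range on each of the two fields. Here @age stands for the age padded to four digits in the same way as the indexed values (for example 0007), and the field names must match the indexed names exactly, since field names are case-sensitive. Something like:

+MinAge:[* TO @age] +MaxAge:[@age TO *]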

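Or, if you're constructing the query objects yourself, a TermRangeQuery (RangeQuery in older releases) with either the upper or lower bound set to null works as an open-ended range. Better yet, use a NumericRangeQuery, although that requires the fields to be indexed as NumericField rather than as padded strings.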
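As a rough sketch of the query-object route, assuming the Lucene.Net 3.0.3 API and the padded string fields from the question (the class and method names AgeQueryBuilder and BuildAgeQuery are made up for illustration):

```csharp
using Lucene.Net.Search;

public static class AgeQueryBuilder
{
    // Builds the equivalent of: MinAge <= age AND MaxAge >= age.
    public static Query BuildAgeQuery(int age)
    {
        // Pad the age exactly as the indexed values were padded,
        // because TermRangeQuery compares terms as strings.
        string paddedAge = age.ToString("0000");

        // A null bound leaves that end of the range open.
        var minAgeQuery = new TermRangeQuery("MinAge", null, paddedAge, true, true);
        var maxAgeQuery = new TermRangeQuery("MaxAge", paddedAge, null, true, true);

        // Both clauses are required, like the "+" prefix in the query syntax.
        var query = new BooleanQuery();
        query.Add(minAgeQuery, Occur.MUST);
        query.Add(maxAgeQuery, Occur.MUST);
        return query;
    }
}
```

The resulting query can be passed to an IndexSearcher like any other, or added as a MUST clause to a larger BooleanQuery alongside the title and author clauses. On Lucene.Net 2.9.x the enum is spelled BooleanClause.Occur.MUST instead of Occur.MUST.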

I've used the open-ended syntax above before, but support for it seems to be a bit shaky. If it doesn't work, you can always set an adequately low lower bound (0) and an adequately high upper bound (say, 1000), both written with the same four-digit padding as the indexed values:

+MinAge:[0000 TO @age] +MaxAge:[@age TO 1000]

Which should be safe enough, barring any Methuselahs.
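If re-indexing is an option, the NumericRangeQuery route mentioned earlier avoids the padding altogether. This is only a sketch, again assuming Lucene.Net 3.0.3; the class and method names (NumericAgeExample, AddAgeFields, BuildAgeQuery) are hypothetical, and AddAgeFields would replace the two padded MinAge and MaxAge fields in the indexing loop:

```csharp
using Lucene.Net.Documents;
using Lucene.Net.Search;

public static class NumericAgeExample
{
    // Indexing side: store the ages as numeric fields instead of padded strings.
    public static void AddAgeFields(Document doc, int minAge, int maxAge)
    {
        doc.Add(new NumericField("MinAge", Field.Store.YES, true).SetIntValue(minAge));
        doc.Add(new NumericField("MaxAge", Field.Store.YES, true).SetIntValue(maxAge));
    }

    // Query side: MinAge <= age AND MaxAge >= age, with null as the open end.
    public static Query BuildAgeQuery(int age)
    {
        var query = new BooleanQuery();
        query.Add(NumericRangeQuery.NewIntRange("MinAge", null, age, true, true), Occur.MUST);
        query.Add(NumericRangeQuery.NewIntRange("MaxAge", age, null, true, true), Occur.MUST);
        return query;
    }
}
```

Note that a NumericRangeQuery will not match fields that were indexed as padded strings, so the index has to be rebuilt before switching over.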