Gemini Keith - 22 days ago
Java Question

Multiple filters do not work together in HBase 1.2.6

My HBase schema looks like the following:

{
    "<trace-id>": {
        "span-timestamp": {
            "ts:span:<timestamp>": ""
        },
        "span-name": {
            "ts:span:<name>": ""
        },
        "span-duration": {
            "ts:span:<duration>": ""
        },
        "span-blob": {
            "ts:span:<span-id>": "<span>"
        },
        "endpoint": {
            "ts:endpoint:<service-name>": ""
        },
        "annotation": {
            "ts:annotation:<value>": ""
        },
        "binary-annotation": {
            "ts:binary-annotation:<key>": "<value>"
        }
    }
}


In my case, I need to query specific qualifiers, so I constructed the following filters:

final FilterList filters = new FilterList(Operator.MUST_PASS_ALL);
final Charset cs = HOperation.CHARSET;
filters.addFilter(Filters.qualifier(Schema.SCHEMA_TRACES_ENDPOINT, CompareOp.EQUAL, request.serviceName));
filters.addFilter(Filters.qualifier(Schema.SCHEMA_TRACES_SPAN_NAME, CompareOp.EQUAL, request.spanName));
filters.addFilter(Filters.qualifier(Schema.SCHEMA_TRACES_SPAN_TIMESTAMP,
        request.endTs * 1000 - request.lookback * 1000, request.endTs * 1000));
filters.addFilter(new PageFilter(request.limit));
scan.setFilter(filters);
scan.setLoadColumnFamiliesOnDemand(true);


As you can see, I've combined a column family filter with a qualifier filter, which means a row should be returned only if both the family filter and the qualifier filter evaluate to true.

static FilterList qualifier(final Schema schema, final CompareOp op, final byte[] value) {
    final FilterList list = new FilterList(Operator.MUST_PASS_ALL);
    list.addFilter(new FamilyFilter(CompareOp.EQUAL, new BinaryComparator(schema.cf().getBytes(HOperation.CHARSET))));
    list.addFilter(new QualifierFilter(op, new BinaryComparator(value)));
    return list;
}


After trying this code, I found that my find method, which is based on Table#getScanner(Scan), did not work properly.

What's more, I found that these two filters do not work together:

filters.addFilter(Filters.qualifier(Schema.SCHEMA_TRACES_ENDPOINT, CompareOp.EQUAL, request.serviceName));
filters.addFilter(Filters.qualifier(Schema.SCHEMA_TRACES_SPAN_NAME, CompareOp.EQUAL, request.spanName));


When I comment out either one of these two filters, the scan works. It still does not work perfectly, though: I need it to return limit rows, and it does not.

Any ideas would be appreciated. Thanks a lot!

Answer Source

After several days of research into HBase, I finally figured out the real reason why these filters do not work properly together.

In HBase, a filter selects what goes into the output; it does not just act as a condition on the row.

For example, take a row (with row key row) that has 3 columns: cf:A, cf1:1, and cf2:2 (the column families are cf, cf1, and cf2).


Situation 1:

Applying a FamilyFilter that requires the family to be cf drops every family that does not match (only the cf family is returned for the row). In this scenario, cf1:1 and cf2:2 are not included in the result.


Situation 2:

Applying a QualifierFilter that requires the qualifier to be 1 returns only cf1:1.


Of course, these situations should be easy to understand.
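
The two situations can be sketched with a small plain-Java model. This is not the HBase API: the class SingleFilterModel and the methods byFamily and byQualifier are hypothetical names made up for illustration, and cells are reduced to "family:qualifier" strings. It only mimics the idea that a filter decides, cell by cell, what ends up in the result.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SingleFilterModel {
    // The example row from the text, written as "family:qualifier" strings.
    static final List<String> ROW = Arrays.asList("cf:A", "cf1:1", "cf2:2");

    // Models a FamilyFilter with EQUAL: keep only cells whose family matches.
    static List<String> byFamily(List<String> cells, String family) {
        List<String> kept = new ArrayList<>();
        for (String cell : cells) {
            if (cell.split(":", 2)[0].equals(family)) {
                kept.add(cell);
            }
        }
        return kept;
    }

    // Models a QualifierFilter with EQUAL: keep only cells whose qualifier matches.
    static List<String> byQualifier(List<String> cells, String qualifier) {
        List<String> kept = new ArrayList<>();
        for (String cell : cells) {
            if (cell.split(":", 2)[1].equals(qualifier)) {
                kept.add(cell);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(byFamily(ROW, "cf"));    // Situation 1: [cf:A]
        System.out.println(byQualifier(ROW, "1"));  // Situation 2: [cf1:1]
    }
}
```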


However, what about applying these filters together? The result is interesting. Suppose you apply the following filters:

QualifierFilter(=,'binary:1') && QualifierFilter(=,'binary:2')

It may look as if this returns rows that contain both qualifier 1 and qualifier 2, no matter which family they belong to. In fact, you get no rows at all, because the filters are evaluated against each column individually, and no single column can match both filters at the same time.
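
The same point can be sketched in a simplified plain-Java model. Again, this is not the HBase API: FilterListModel, qualifierEquals, mustPassAll, and mustPassOne are hypothetical names, and the model assumes only that a filter list is applied to one cell at a time and that a row with no surviving cells is not returned.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class FilterListModel {
    // The example row from the text, written as "family:qualifier" strings.
    static final List<String> ROW = Arrays.asList("cf:A", "cf1:1", "cf2:2");

    // Models QualifierFilter(=, value) as a per-cell predicate.
    static Predicate<String> qualifierEquals(String value) {
        return cell -> cell.split(":", 2)[1].equals(value);
    }

    // Models MUST_PASS_ALL: a cell is kept only if every filter accepts that same cell.
    static List<String> mustPassAll(List<String> cells, List<Predicate<String>> filters) {
        List<String> kept = new ArrayList<>();
        for (String cell : cells) {
            if (filters.stream().allMatch(f -> f.test(cell))) {
                kept.add(cell);
            }
        }
        return kept;
    }

    // Models MUST_PASS_ONE: a cell is kept if at least one filter accepts it.
    static List<String> mustPassOne(List<String> cells, List<Predicate<String>> filters) {
        List<String> kept = new ArrayList<>();
        for (String cell : cells) {
            if (filters.stream().anyMatch(f -> f.test(cell))) {
                kept.add(cell);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Predicate<String>> filters =
                Arrays.asList(qualifierEquals("1"), qualifierEquals("2"));
        // No single cell has qualifier 1 and qualifier 2, so nothing survives.
        System.out.println(mustPassAll(ROW, filters));  // []
        // With "one of", each matching cell survives on its own.
        System.out.println(mustPassOne(ROW, filters));  // [cf1:1, cf2:2]
    }
}
```

In this model the "all" combination leaves the row empty, which matches the behaviour described above. The "one of" combination keeps both cells, but it would also keep a row that has only one of the two qualifiers, so it is not a row-level "has both" check either.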


In the end, this is not a best practice for using HBase. Chapter 34 (Table Schema Rules Of Thumb) of the official reference guide gives suggestions for building schemas. The schema in this question is not well designed; it treats HBase like an RDBMS.