Sother Sother - 3 months ago 17
SQL Question

how to grab an element in a string in hive?

I have a table called datatable like this:

columnA columnB
1 1^1^4~3^4^2~4^9^a
2 4^2^2~8^9^a~1^1^1


I want to be able to search the elements of an array and operate on it. In this case, I need to find the first number before the
a
in each set between the
~
. So, the output should look like this:

columnA columnB
1 4
2 8


How do I express this in
HIVEQL
? Right now, I know how to split on the
~
:

select columnA, split(columnB, '~') from datatable
but...I don't know how to program it to grab the element from group containing "
a
". How do I do this?

I'm trying to avoid regular expressions, but I realize they might be needed here....

Answer

The regular expressions seem to be pretty handy in your case. The one extracting first numbers in an a-featured group is:

(\d+)\^(?:\d+\^)*a

Demo: https://regex101.com/r/rL8gV3/1

And the query would be:

select columnA, regexp_extract(columnB, '(\\d+)\\^(?:\\d+\\^)*a', 1) from datatable