Jazzmine Jazzmine - 26 days ago 5
SQL Question

In Hive looking to select rows that have a special character in a column

I'm looking to SELECT rows in Hive that have a special character [a-zA-Z0-9] in a column.

I'm not quite sure how to construct the WHERE clause but based upon other threads I think it should look something like:

SELECT DISTINCT user_name
FROM user_info
WHERE user_name like regexp_extract('%[^a-zA-Z\d\s:]%')


but Hive provided this error:


Error while compiling statement: FAILED: SemanticException [Error
10014]: Line 3:21 Wrong arguments ''%[^a-zA-Z\d\s:]%'': No matching
method for class org.apache.hadoop.hive.ql.udf.UDFRegExpExtract with
(string). Possible choices: FUNC(string, string) FUNC(string,
string, int)


How should I set up the regexp_extract clause?

Thanks

Answer
SELECT DISTINCT user_name
FROM user_info
WHERE user_name rlike '[^a-zA-Z\\d\\s:]'

https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-StringOperators

hive> select 'Dudu Markovitz: 123' rlike '[^a-zA-Z\\d\\s:]';
OK
false
hive> select 'Dudu Markovitz: @123' rlike '[^a-zA-Z\\d\\s:]';
OK
true