Anoop Mamgain - 1 month ago
Java Question

Creating, adding and using a UDF in Hive

I have written a sample UDF to trim a string from a table in Hive:

package anoop;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class DataTrim extends UDF {

    String trimmed;

    public Text trim(Text incomingData) {
        trimmed = incomingData.toString().trim();
        return new Text(trimmed);
    }
}


I built a jar for this, "trim_string.jar", and saved it to the Hive lib folder.
Then I ran the following:

add jar '~/hive-1.2.1/lib/trim_string.jar'; (success)


Next I ran:

create temporary function trimmed1 as 'anoop.DataTrim';


But I am getting the following error:

FAILED: Class anoop.DataTrim does not implement UDF, GenericUDF, or UDAF
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask


Can someone please help? Thanks in advance!

Answer

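A class that extends Hive's UDF must define at least one method named evaluate, because that is the method Hive calls when the function is used in a query. Rename the trim method to evaluate: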

public Text evaluate(Text incomingData)

Note: String trimmed does not need to be a class member. You can make it a local variable inside the method.

package anoop;


import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class DataTrim  extends UDF{

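    // Hive calls the method named evaluate for each input row.
    public Text evaluate(Text incomingData){
        // A NULL column value arrives as null; return it as-is to avoid a NullPointerException.
        if (incomingData == null) {
            return null;
        }
        String trimmed = incomingData.toString().trim();
        return new Text(trimmed);
    }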


}
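
To see why the method name matters, here is a minimal plain-Java sketch of a name-based reflection lookup. It is only an illustration and not Hive's own resolver code: the classes TrimNamedUdf, EvaluateNamedUdf and EvaluateLookup are hypothetical stand-ins that use String instead of Text so the example runs without Hive on the classpath, and a real lookup would also have to match argument types.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for the class in the question: the logic lives in trim().
class TrimNamedUdf {
    public String trim(String s) {
        return s == null ? null : s.trim();
    }
}

// Hypothetical stand-in for the corrected class: the logic lives in evaluate().
class EvaluateNamedUdf {
    public String evaluate(String s) {
        return s == null ? null : s.trim();
    }
}

class EvaluateLookup {
    // Returns true if the class exposes at least one public method named "evaluate".
    // This checks the name only; argument types are ignored in this sketch.
    static boolean hasEvaluateMethod(Class<?> udfClass) {
        for (Method m : udfClass.getMethods()) { // getMethods() lists public methods only
            if (m.getName().equals("evaluate")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasEvaluateMethod(TrimNamedUdf.class));     // prints false
        System.out.println(hasEvaluateMethod(EvaluateNamedUdf.class)); // prints true
    }
}
```

A class whose only method is trim offers nothing to find under the name evaluate, which is consistent with the function failing until the method is renamed.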

For more information, refer to this tutorial.