Introduction to User-Defined Functions in Hive
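(1) Create the custom function in a class. A custom UDF must extend 'org.apache.hadoop.hive.ql.exec.UDF' and implement an evaluate method; evaluate may be overloaded.
(2) Export the package that contains the class as a jar and copy it to a directory on the Linux machine.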
(3) Open the Hive client and delete the old jar:
hive> delete jar /dir/.jar;
(4) Add the new jar:
hive> add jar /dir/.jar;
(5) Create a temporary function that points to the class inside the jar:
hive> create temporary function <function name> as '<fully qualified Java class name>';
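(6) Use the temporary function, then drop it once it is no longer needed:
hive> select <function name>(<arguments>);
hive> drop temporary function <function name>;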
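Step (1) says that evaluate may be overloaded. The sketch below illustrates what that means using only the Java standard library, so that it runs without Hive on the classpath. JoinStrings and EvaluateOverloadDemo are hypothetical names, and the reflection lookup in call is a simplified stand-in rather than Hive's actual method resolver. A real UDF would extend org.apache.hadoop.hive.ql.exec.UDF and would typically take Hadoop types such as Text.

```java
import java.lang.reflect.Method;

// Hypothetical stand-in for a UDF class. In Hive this class would extend
// org.apache.hadoop.hive.ql.exec.UDF; plain String is used here so the
// sketch compiles without any Hive or Hadoop jars.
class JoinStrings {
    // Two-argument form: joins the values with a fixed separator.
    public String evaluate(String a, String b) {
        return a + "*******" + b;
    }

    // Three-argument overload: the caller supplies the separator.
    public String evaluate(String a, String b, String sep) {
        return a + sep + b;
    }
}

class EvaluateOverloadDemo {
    // Simplified lookup (an assumption for illustration, not Hive's resolver):
    // pick the public evaluate method whose parameter types exactly match
    // the runtime classes of the supplied arguments, then invoke it.
    static Object call(Object udf, Object... args) {
        Class<?>[] types = new Class<?>[args.length];
        for (int i = 0; i < args.length; i++) {
            types[i] = args[i].getClass();
        }
        try {
            Method m = udf.getClass().getMethod("evaluate", types);
            return m.invoke(udf, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("no matching evaluate method", e);
        }
    }

    public static void main(String[] argv) {
        JoinStrings udf = new JoinStrings();
        System.out.println(call(udf, "HELLO", "world"));      // prints HELLO*******world
        System.out.println(call(udf, "HELLO", "world", "-")); // prints HELLO-world
    }
}
```

The point of the sketch is that one function name can serve several argument lists: the number and types of the arguments in the query decide which evaluate method runs.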
Example: a User-Defined Function in Hive

package demo.udf;

import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;

public class ConcatString extends UDF {
    // Strings are passed as Hadoop Text values rather than java.lang.String.
    public Text evaluate(Text a, Text b) {
        return new Text(a.toString() + "*******" + b.toString());
    }
}
hive> delete jar /root/pl62716/hive/contactString.jar;
Deleted [/root/pl62716/hive/contactString.jar] from class path
hive> add jar /root/pl62716/hive/contactString.jar;
Added [/root/pl62716/hive/contactString.jar] to class path
Added resources: [/root/pl62716/hive/contactString.jar]
hive> create temporary function myconcat as 'demo.udf.ConcatString';
OK
Time taken: 2.747 seconds
hive> select myconcat('HELLO','world');
OK
HELLO*******world
Time taken: 0.598 seconds, Fetched: 1 row(s)