Step 5: upload the built jar to the Hive server.
-bash-3.2$ scp dwapp@10.20.151.57:/export/home/dwapp/mxm/python/HelloWorldUDF.jar .
At this point the custom UDF is complete; next it needs to be registered in the Hive environment.
Step 6: start your own Hive session and run the command add jar HelloWorldUDF.jar
hive> add jar HelloWorldUDF.jar;
Added HelloWorldUDF.jar to class path
Added resource: HelloWorldUDF.jar
Step 7: create a temporary function with a name of your choosing (this is what you will call the custom UDF by); the string after "as" is the fully qualified name of the class stored in the jar.
hive> create temporary function helloworld as 'com.mxm.udf.HelloWorldUDF';
OK
Time taken: 0.0060 seconds
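For reference, the evaluate() logic inside such a jar would look roughly like the sketch below, inferred from the query output "hello world,mengxm" later in this post. In the real class, the code lives in package com.mxm.udf and extends org.apache.hadoop.hive.ql.exec.UDF (from hive-exec); both are omitted here so the sketch compiles standalone.

```java
// Hypothetical reconstruction of the UDF's core logic.
// The real class would declare:  package com.mxm.udf;
// and:  public class HelloWorldUDF extends org.apache.hadoop.hive.ql.exec.UDF
// Hive locates the public evaluate() method by reflection and
// calls it once per input row.
public class HelloWorldUDF {
    public String evaluate(String name) {
        if (name == null) {
            return null; // Hive passes NULL column values as null
        }
        return "hello world," + name;
    }
}
```

With this logic, select helloworld(name) on a row whose name column is "mengxm" produces "hello world,mengxm", matching the output shown below.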
Step 8: use the function.
hive> select helloworld(name) from mxm_test2012102401;
Automatically selecting local only mode for query
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
Execution log at: /tmp/dwdev/dwdev_20130108113535_2e4312fc-1571-4735-ac29-8fb5297bbd9e.log
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2013-01-08 11:35:58,069 null map = 0%, reduce = 0%
2013-01-08 11:36:00,074 null map = 100%, reduce = 0%
Ended Job = job_local_0001
2013-01-08 11:36:01 End of local task; Time Taken: 9.176 sec.
OK
hello world,mengxm
Time taken: 20.689 seconds
The table-creation script and data-load script for the mxm_test2012102401 table are as follows:
hive> create table mxm_test2012102401 (id int, name string) row format delimited fields terminated by ',' stored as textfile;
OK
Time taken: 0.143 seconds
hive> load data local inpath '/home/dwdev/mxm/a.txt' into table mxm_test2012102401;
Copying data from file:/home/dwdev/mxm/a.txt
The contents of a.txt:
-bash-3.2$ cat a.txt
1,mengxm
To drop the helloworld function:
hive> drop temporary function helloworld;
OK
Time taken: 0.0040 seconds