
Monday, November 4, 2013

UDFs Part 1: Custom simple eval UDFs in Pig and Hive (NVL2)


1.0. What's in this blog?


A demonstration of creating a custom simple eval UDF, in Pig and Hive, that mimics the NVL2 function from the DBMS world. It includes sample data, the Java code for the UDF, expected results, the commands to execute and the output.

About NVL2:
NVL2 takes three parameters, which we will refer to as expr1, expr2 and expr3.
NVL2 lets you choose the value returned by a query based on whether a specified expression is null. If expr1 is not null, NVL2 returns expr2. If expr1 is null, NVL2 returns expr3.
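
Before wiring this into Hive or Pig, the rule itself can be sketched in plain Java. This is only an illustration of the semantics described above: the class name, method name and sample values are made up for this sketch and are not part of the original post.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative only: the class and method names are hypothetical.
class Nvl2Sketch {

    // Returns expr2 when expr1 is not null, otherwise expr3.
    static <T> T nvl2(Object expr1, T expr2, T expr3) {
        return expr1 != null ? expr2 : expr3;
    }

    public static void main(String[] args) {
        // Made-up sample values; null stands for a missing column value.
        List<String> values = Arrays.asList("dept-10", null);
        for (String v : values) {
            System.out.println(v + " -> " + nvl2(v, "has value", "is null"));
        }
    }
}
```

Note that only expr1 is tested for null; expr2 and expr3 are returned as they are, even if they are null themselves.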

2.0. NVL2 UDF in Hive


1: Create the test data file for a Hive external table

2: Create the Hive table

3: Create the UDF in Java
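
The Java listing for this step is not reproduced here, so what follows is a minimal sketch rather than the original code. It assumes Hive's classic simple UDF API, where the class extends `org.apache.hadoop.hive.ql.exec.UDF` and Hive finds methods named `evaluate` by reflection. The `extends` clause is left as a comment so the sketch compiles without the Hive jar, and the class name and the second overload are hypothetical.

```java
// A sketch of the UDF body only. In a real Hive build this class would be
// public and declared as `extends org.apache.hadoop.hive.ql.exec.UDF`
// (with hive-exec on the classpath); that is left out so the file compiles alone.
class HiveNvl2Sketch {

    // Hive's simple UDF API looks up methods named evaluate by reflection
    // and picks the overload that matches the argument types of the call.
    public String evaluate(String expr1, String expr2, String expr3) {
        return expr1 != null ? expr2 : expr3;
    }

    // Hypothetical second overload for integer return values.
    public Integer evaluate(String expr1, Integer expr2, Integer expr3) {
        return expr1 != null ? expr2 : expr3;
    }
}
```

Once a class like this is packaged into a jar, it is typically registered in a Hive session with ADD JAR followed by CREATE TEMPORARY FUNCTION, pointing at the fully qualified class name.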

4: Expected results

5: Test the UDF

3.0. NVL2 UDF in Pig


We will reuse data from section 2.
1: Create the UDF in Java
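
The Java listing for this step is not reproduced here either, so this is a hedged sketch of the shape such a UDF usually takes. A Pig simple eval UDF extends `org.apache.pig.EvalFunc` and receives all of its arguments as a single `org.apache.pig.data.Tuple` in `exec`. To keep the sketch compilable without the Pig jar, a `List<Object>` stands in for the Tuple; the class name is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// In a real Pig build this class would be public, extend
// org.apache.pig.EvalFunc<String>, and exec would take an
// org.apache.pig.data.Tuple. A List<Object> stands in for the Tuple here
// so the sketch compiles without the Pig jar.
class PigNvl2Sketch {

    public String exec(List<Object> input) {
        // Pig hands over all arguments as one tuple, so check the arity first.
        if (input == null || input.size() != 3) {
            throw new IllegalArgumentException("NVL2 expects exactly three arguments");
        }
        Object expr1 = input.get(0);
        // Pick the second field when expr1 is not null, otherwise the third.
        Object chosen = (expr1 != null) ? input.get(1) : input.get(2);
        return chosen == null ? null : chosen.toString();
    }

    public static void main(String[] args) {
        PigNvl2Sketch udf = new PigNvl2Sketch();
        System.out.println(udf.exec(Arrays.<Object>asList("x", "present", "absent")));
        System.out.println(udf.exec(Arrays.<Object>asList(null, "present", "absent")));
    }
}
```

Unlike the Hive version, there is no overload per type: the arguments arrive positionally in the tuple, so the UDF validates the count and unpacks the fields itself.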

2: Create the Pig script

3: Test the UDF
[Switch the path of the data file in the Pig script between local and HDFS locations as needed; better still, make it a parameter.]

4: Results



Do share any additional insights/comments.
Cheers!
