Monday, June 17, 2013

Apache Oozie - Part 3: Workflow with sqoop action (hive to mysql)

What's covered in the blog?

I have covered this topic in my blog:
Apache Sqoop - Part 5: Scheduling Sqoop jobs in Oozie
[Versions: Oozie 3.3.0, Sqoop (1.4.2) with Mysql (5.1.69)]

It includes:
1. Documentation on the Oozie sqoop action
2. A sample workflow (against syslog-generated logs) that includes an Oozie sqoop action (export from hive to mysql). Instructions on loading sample data and running the workflow are provided, along with some notes based on what I learned.
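To give a feel for what such a workflow looks like, here is a minimal sketch of a workflow.xml with a sqoop export action. The application name, MySQL host, database, table, and HDFS export directory below are hypothetical placeholders, not the values from the full blog; see the linked post for the actual workflow and data.

```xml
<workflow-app name="WorkflowWithSqoopAction" xmlns="uri:oozie:workflow:0.1">
    <start to="sqoopAction"/>
    <action name="sqoopAction">
        <sqoop xmlns="uri:oozie:sqoop-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- Export a Hive table's warehouse directory to a MySQL table.
                 Hive's default field delimiter is \001 (Ctrl-A), hence the
                 --input-fields-terminated-by argument. -->
            <command>export --connect jdbc:mysql://mysqlHost/testDb --username myUser --password myPwd --table log_summary --export-dir /user/hive/warehouse/log_summary --input-fields-terminated-by '\001'</command>
        </sqoop>
        <ok to="end"/>
        <error to="killAction"/>
    </action>
    <kill name="killAction">
        <message>Sqoop action failed: [${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

Because the export reads the Hive table's files directly off HDFS, the table should be stored as delimited text for this to work as written.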

Related blogs


Blog 1: Oozie workflow - hdfs and email actions
Blog 2: Oozie workflow - hdfs, email and hive actions
Blog 3: Oozie workflow - sqoop action (Hive-mysql; sqoop export)
Blog 4: Oozie workflow - java map-reduce (new API) action
Blog 5: Oozie workflow - streaming map-reduce (python) action 
Blog 6: Oozie workflow - java main action
Blog 7: Oozie workflow - Pig action
Blog 8: Oozie sub-workflow
Blog 9a: Oozie coordinator job - time-triggered sub-workflow, fork-join control and decision control
Blog 9b: Oozie coordinator jobs - file triggered 
Blog 9c: Oozie coordinator jobs - dataset availability triggered
Blog 10: Oozie bundle jobs
Blog 11: Oozie Java API for interfacing with oozie workflows
Blog 12: Oozie workflow - shell action +passing output from one action to another


Your thoughts/updates:
If you want to share your thoughts/updates, email me at


  1. How do I run the import command from a Java program?

  2. Job job = Import.createSubmittableJob(Configuration, String[] arguments);
    job.submit(); — this will run the import. You can also use job.waitForCompletion(). To track your job, get its ID with job.getJobID() and run: hadoop job -status <job id>

  3. Thank you, it is a very nice blog for beginners