1.0. What's covered in the blog?
1. Documentation on the Oozie SSH action
2. A sample Oozie workflow application that demonstrates the SSH action: it connects over SSH to a specific node as a specified user and executes a shell script there that loads a local file to HDFS.
Getting this action to work was tricky, and the solution is not covered in the Apache documentation. The issues and their resolution are documented below.
Version:
Oozie 3.3.0
Related blogs:
Blog 1: Oozie workflow - hdfs and email actions
Blog 2: Oozie workflow - hdfs, email and hive actions
Blog 3: Oozie workflow - sqoop action (Hive-mysql; sqoop export)
Blog 4: Oozie workflow - java map-reduce (new API) action
Blog 5: Oozie workflow - streaming map-reduce (python) action
Blog 6: Oozie workflow - java main action
Blog 7: Oozie workflow - Pig action
Blog 8: Oozie sub-workflow
Blog 9a: Oozie coordinator job - time-triggered sub-workflow, fork-join control and decision control
Blog 9b: Oozie coordinator jobs - file triggered
Blog 9c: Oozie coordinator jobs - dataset availability triggered
Blog 10: Oozie bundle jobs
Blog 11: Oozie Java API for interfacing with oozie workflows
Blog 12: Oozie workflow - shell action +passing output from one action to another
Blog 13: Oozie workflow - SSH action
2.0. Documentation on the Oozie SSH Action
Apache documentation is available at - http://oozie.apache.org/docs/3.3.0/WorkflowFunctionalSpec.html#a3.2.5_Ssh_Action
Note: The SSH action was at one point slated for removal, but it was later decided that it would remain.
So, disregard any mention of deprecation.
3.0. Sample workflow application
3.0.1. Highlights:
Oozie server is running on node cdh-dev01 in my environment.
With the sample workflow application, I am going to submit an Oozie job while logged in as myself (akhanolk), on this machine (Oozie server - cdh-dev01) from the CLI.
The workflow executes a shell script on cdh-dn01 as user akhanolk. The shell script loads a local file to HDFS. If the file load completes successfully, the workflow sends an email to me.
3.0.2. Pictorial overview:
3.0.3. SSH setup:
1. Passphrase-less SSH for akhanolk from cdh-dev01 (Oozie server) to cdh-dn01 (remote node) and vice versa
2. Passphrase-less SSH for oozie user ID (oozie in my case) on cdh-dev01 to cdh-dn01 as akhanolk
[Running ps -ef | grep oozie on Oozie server will give you the configured Oozie user ID]
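The bracketed tip above can also be scripted. The sketch below is one possible way to do it and is not part of the sample application: the function name oozie_user_from_ps is hypothetical, and it assumes the Oozie server process has "oozie" somewhere in its command line.

```shell
#!/bin/bash
# Hypothetical helper: reads "USER COMMAND..." lines (as printed by
# `ps -eo user=,args=`) on stdin and prints the distinct users that own
# a process mentioning "oozie", skipping the grep/awk lookups themselves.
oozie_user_from_ps() {
  awk '/oozie/ && $2 != "grep" && $2 != "awk" { print $1 }' | sort -u
}

# Usage on the Oozie server (not executed here):
#   ps -eo user=,args= | oozie_user_from_ps
```

Whatever user this reports is the one that needs passphrase-less SSH access to the remote node in item 2.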
3.0.4. Workflow application components:
workflow definition (workflow.xml - in HDFS)
job properties file (job.properties from node submitting job)
Shell script (uploadFile.sh) on remote node (cdh-dn01; At /home/akhanolk/scripts)
Data file (employees_data) on remote node (cdh-dn01; At /home/akhanolk/data)
3.0.5. Desired result:
Upon execution of the workflow, the employees_data file on cdh-dn01 should be loaded into the specified directory in HDFS.
3.0.6. Subsequent sections cover-
- Data and script download
- Oozie job properties file
- Oozie workflow file
- Shell script - uploadFile.sh
- Data load commands
- Oozie SMTP configuration
- SSH setup
- Oozie commands
- Output in HDFS
- Output email
- Oozie web console - screenshots
- Issues encountered and resolution
3.0.7. Data and script download:
************************************
*Data and code/application download
************************************
Data and code:
--------------
Github:
https://github.com/airawat/OozieSamples
Email me at airawat.blog@gmail.com if you encounter any issues
Directory structure of application download
--------------------------------------------
oozieProject
workflowSshAction
job.properties
workflow.xml
scripts
uploadFile.sh
data
employees_data
3.0.8. Oozie job.properties file:
#*************************************************
# job.properties
#*************************************************
nameNode=hdfs://cdh-nn01.chuntikhadoop.com:8020
jobTracker=cdh-jt01:8021
queueName=default
oozie.libpath=${nameNode}/user/oozie/share/lib
oozie.use.system.libpath=true
oozie.wf.rerun.failnodes=true
oozieProjectRoot=${nameNode}/user/${user.name}/oozieProject
appPath=${oozieProjectRoot}/workflowSshAction
oozie.wf.application.path=${appPath}
inputDir=${oozieProjectRoot}/data
focusNodeLogin=akhanolk@cdh-dn01
shellScriptPath=~/scripts/uploadFile.sh
emailToAddress=akhanolk@cdh-dev01
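To see what the chained properties resolve to, the substitutions can be mimicked with plain shell variables. This is only an illustration: Oozie resolves the ${...} references itself, and user_name below is a stand-in for ${user.name}, assumed here to be akhanolk (the user submitting the job).

```shell
#!/bin/bash
# Shell stand-ins for the properties above; user_name is an assumption
# standing in for Oozie's ${user.name}.
nameNode="hdfs://cdh-nn01.chuntikhadoop.com:8020"
user_name="akhanolk"

# Same chain of substitutions as in job.properties
oozieProjectRoot="${nameNode}/user/${user_name}/oozieProject"
appPath="${oozieProjectRoot}/workflowSshAction"
inputDir="${oozieProjectRoot}/data"

echo "oozie.wf.application.path=${appPath}"
echo "inputDir=${inputDir}"
```

The resolved application path should point at the HDFS directory that holds workflow.xml.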
3.0.9. Oozie workflow.xml:
<!--******************************************-->
<!--workflow.xml -->
<!--******************************************-->
<workflow-app name="WorkFlowForSshAction" xmlns="uri:oozie:workflow:0.1">
    <start to="sshAction"/>
    <action name="sshAction">
        <ssh xmlns="uri:oozie:ssh-action:0.1">
            <host>${focusNodeLogin}</host>
            <command>${shellScriptPath}</command>
            <capture-output/>
        </ssh>
        <ok to="sendEmail"/>
        <error to="killAction"/>
    </action>
    <action name="sendEmail">
        <email xmlns="uri:oozie:email-action:0.1">
            <to>${emailToAddress}</to>
            <subject>Output of workflow ${wf:id()}</subject>
            <body>Status of the file move: ${wf:actionData('sshAction')['STATUS']}</body>
        </email>
        <ok to="end"/>
        <error to="end"/>
    </action>
    <kill name="killAction">
        <message>"Killed job due to error"</message>
    </kill>
    <end name="end"/>
</workflow-app>
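The email body relies on <capture-output/>: Oozie captures the script's stdout, which is expected to be in Java properties format (key=value lines), and wf:actionData('sshAction')['STATUS'] looks up the STATUS key. The sketch below imitates that lookup in shell so the script's output can be checked by hand. It is not how Oozie implements it, and the function name action_data is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: given key=value lines on stdin, print the value
# of the first line whose key matches the requested one.
action_data() {
  local key="$1"
  # Split on "=", compare the key, then strip "key=" and print the rest
  awk -F= -v k="$key" '$1 == k { sub(/^[^=]*=/, ""); print; exit }'
}

# Usage on the remote node (not executed here):
#   ~/scripts/uploadFile.sh | action_data STATUS
```

If this prints nothing for STATUS, the email body would have nothing to show either.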
3.0.10. Shell script (uploadFile.sh):
#!/bin/bash
#################################
# Name: uploadFile.sh
# Location: remote node where we
# want to run an
# operation
#################################
# Clear out any previous results in the HDFS target directory
hadoop fs -rm -R oozieProject/results-sshAction/*
# Load the local data file(s) into HDFS and record the exit status
hadoop fs -put ~/data/* oozieProject/results-sshAction/
status=$?
# Report the outcome as key=value on stdout for <capture-output/>
if [ "$status" -eq 0 ]; then
echo "STATUS=SUCCESS"
else
echo "STATUS=FAIL"
fi
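Since the captured stdout is expected to be in properties format (the Oozie documentation also limits its size), it may help to keep everything except the STATUS line off stdout. The following is a sketch of one way to do that, not the script used in this post; quiet and emit_status are hypothetical helper names.

```shell
#!/bin/bash
# Hypothetical wrapper: run a command with its stdout redirected to
# stderr, so its messages stay out of the output Oozie captures.
quiet() {
  "$@" 1>&2
}

# Hypothetical helper: translate an exit status into the STATUS line.
emit_status() {
  if [ "$1" -eq 0 ]; then
    echo "STATUS=SUCCESS"
  else
    echo "STATUS=FAIL"
  fi
}

# Usage inside uploadFile.sh (not executed here):
#   quiet hadoop fs -rm -R oozieProject/results-sshAction/*
#   quiet hadoop fs -put ~/data/* oozieProject/results-sshAction/
#   emit_status $?
```

The exit status of quiet is that of the wrapped command, so emit_status $? still reflects the result of the put.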
3.0.11. HDFS load commands:
*****************************************
Location of files/scripts & commands
*****************************************
I have pasted information specific to my environment; modify as required.
1) Node (cdh-dev01) where the Oozie CLI will be used to submit/run the Oozie workflow:
Structure/Path:
~/oozieProject/workflowSshAction/job.properties
2) HDFS:
Workflow directory structure:
/user/akhanolk/oozieProject/workflowSshAction/workflow.xml
Commands to load:
hadoop fs -mkdir oozieProject
hadoop fs -mkdir oozieProject/workflowSshAction
hadoop fs -put ~/oozieProject/workflowSshAction/workflow.xml oozieProject/workflowSshAction
Output directory structure:
/user/akhanolk/oozieProject/results-sshAction
Command:
hadoop fs -mkdir oozieProject/results-sshAction
3) Remote node (cdh-dn01) where we want to run a shell script:
Directory structure/Path:
~/scripts/uploadFile.sh
~/data/employees_data
3.0.12. Oozie SMTP configuration:
Oozie SMTP configuration
------------------------
Add the following to oozie-site.xml, and restart Oozie.
Replace the values with those specific to your environment.
<!-- SMTP params-->
<property>
    <name>oozie.email.smtp.host</name>
    <value>cdh-dev01</value>
</property>
<property>
    <name>oozie.email.smtp.port</name>
    <value>25</value>
</property>
<property>
    <name>oozie.email.from.address</name>
    <value>oozie@cdh-dev01</value>
</property>
<property>
    <name>oozie.email.smtp.auth</name>
    <value>false</value>
</property>
<property>
    <name>oozie.email.smtp.username</name>
    <value></value>
</property>
<property>
    <name>oozie.email.smtp.password</name>
    <value></value>
</property>
3.0.13. Oozie SSH setup:
************************
SSH setup
************************
Issues:
Review my section on issues encountered to see all the issues and fixes I had to make
to get the workflow application to work.
------------------------------------------------------------------------------------------------------
Oozie documentation:
To run SSH Testcases and for easier Hadoop start/stop configure SSH to localhost to be passphrase-less.
Create your SSH keys without a passphrase and add the public key to the authorized file:
$ ssh-keygen -t dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys2
Test that you can ssh without password:
$ ssh localhost
------------------------------------------------------------------------------------------------------
SSH tutorial:
Setup ssh - https://www.digitalocean.com/community/articles/how-to-set-up-ssh-keys--2
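After appending a public key, it can be worth confirming that the key really is present in the target authorized keys file before testing the login. A minimal sketch, assuming the public key file holds a single line; the function name key_is_authorized is hypothetical, and the paths in the usage comment are simply the ones from the Oozie documentation excerpt above.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the line in the public key file
# appears verbatim as a whole line in the authorized keys file.
key_is_authorized() {
  local pub_file="$1" auth_file="$2"
  # -F fixed strings, -x whole-line match, -q quiet, -f patterns from file
  grep -qxF -f "$pub_file" "$auth_file"
}

# Usage (not executed here):
#   key_is_authorized ~/.ssh/id_dsa.pub ~/.ssh/authorized_keys2 && echo "key present"
```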
3.0.14. Oozie commands:
Oozie commands
---------------
Note: Replace the Oozie server and port with your cluster-specific values, and the job ID with the one returned by the submit command.
1) Submit job:
$ oozie job -oozie http://cdh-dev01:11000/oozie -config oozieProject/workflowSshAction/job.properties -submit
job: 0000012-130712212133144-oozie-oozi-W
2) Run job:
$ oozie job -oozie http://cdh-dev01:11000/oozie -start 0000014-130712212133144-oozie-oozi-W
3) Check the status:
$ oozie job -oozie http://cdh-dev01:11000/oozie -info 0000014-130712212133144-oozie-oozi-W
4) Suspend workflow:
$ oozie job -oozie http://cdh-dev01:11000/oozie -suspend 0000014-130712212133144-oozie-oozi-W
5) Resume workflow:
$ oozie job -oozie http://cdh-dev01:11000/oozie -resume 0000014-130712212133144-oozie-oozi-W
6) Re-run workflow:
$ oozie job -oozie http://cdh-dev01:11000/oozie -config oozieProject/workflowSshAction/job.properties -rerun 0000014-130712212133144-oozie-oozi-W
7) Should you need to kill the job:
$ oozie job -oozie http://cdh-dev01:11000/oozie -kill 0000014-130712212133144-oozie-oozi-W
8) View server logs:
$ oozie job -oozie http://cdh-dev01:11000/oozie -logs 0000014-130712212133144-oozie-oozi-W
Logs are available at:
/var/log/oozie on the Oozie server.
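Copying the job ID by hand between commands is easy to get wrong, so here is a small sketch that picks it out of the submit output. The function name job_id_from_submit is hypothetical; it assumes the submit command prints a line of the form "job: <id>", as in step 1 above. The usage comments also rely on the OOZIE_URL environment variable, which the Oozie CLI uses as the default for the -oozie option.

```shell
#!/bin/bash
# Hypothetical helper: print the ID from a "job: <id>" line on stdin.
job_id_from_submit() {
  sed -n 's/^job: *//p'
}

# Usage (not executed here):
#   export OOZIE_URL=http://cdh-dev01:11000/oozie
#   job_id="$(oozie job -config oozieProject/workflowSshAction/job.properties -submit | job_id_from_submit)"
#   oozie job -start "$job_id"
#   oozie job -info "$job_id"
```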
3.0.15. Output in HDFS:
************************
Output
************************
[akhanolk@cdh-dev01 ~]$ hadoop fs -ls oozieProject/res*
Found 1 items
-rw-r--r-- 3 akhanolk akhanolk 13821993 2013-10-30 20:59 oozieProject/results-sshAction/employees_data
3.0.16. Output email:
********************
Output email
********************
From akhanolk@cdh-dev01.localdomain Wed Oct 30 22:59:16 2013
Return-Path: <akhanolk@cdh-dev01.localdomain>
X-Original-To: akhanolk@cdh-dev01
Delivered-To: akhanolk@cdh-dev01.localdomain
From: akhanolk@cdh-dev01.localdomain
To: akhanolk@cdh-dev01.localdomain
Subject: Output of workflow 0000003-131029234028597-oozie-oozi-W
Content-Type: text/plain; charset=us-ascii
Date: Wed, 30 Oct 2013 22:59:16 -0500 (CDT)
Status: R
Status of the file move: SUCCESS
3.0.17. Issues encountered:
*************************
Issues encountered
*************************
Permission denied error:
-------------------------
....
2013-10-29 16:13:25,949 WARN org.apache.oozie.command.wf.ActionStartXCommand:
USER[akhanolk] GROUP[-] TOKEN[] APP[WorkFlowForSshAction] JOB[0000002-
131029144918199-oozie-oozi-W] ACTION[0000002-131029144918199-oozie-oozi-
W@sshAction] Error starting action [sshAction]. ErrorType [NON_TRANSIENT],
ErrorCode [AUTH_FAILED], Message [AUTH_FAILED: Not able to perform operation
[ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o
StrictHostKeyChecking=no -o ConnectTimeout=20 akhanolk@cdh-dn01
mkdir -p oozie-oozi/0000002-131029144918199-oozie-oozi-W/sshAction--ssh/ ]
| ErrorStream: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Steps taken to resolve:
-----------------------
a)
Ran the command in square brackets, above, manually from cdh-dev01 (Oozie server)
while logged in as akhanolk. It worked! But the workflow in Oozie didn't.
b)
Tried running it as the oozie user:
sudo -u oozie ssh -o PasswordAuthentication=no -o KbdInteractiveDevices=no -o
StrictHostKeyChecking=no -o ConnectTimeout=20 akhanolk@cdh-dn01 mkdir
-p oozie-oozi/0000001-1310081859355-oozie-oozi-W/action1--ssh/
Got the error:
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
c)
Googled, and chanced upon this:
http://stackoverflow.com/questions/19272430/oozie-ssh-action
So, performed the actions detailed below to allow the oozie user to ssh to cdh-dn01 as akhanolk:
On cdh-dev01 (my Oozie server), located the oozie home directory and ran ssh-keygen
Appended the public key to the authorized_keys file /home/akhanolk/.ssh/authorized_keys on cdh-dev01
Appended the same public key to the authorized_keys file on cdh-dn01 (remote node) at
/home/akhanolk/.ssh/authorized_keys
Issue resolved!!
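The check from step b) can be repeated after the fix to confirm that the oozie user can now log in. The sketch below builds the command with the same SSH options that appear in the Oozie log message above; the function name oozie_ssh_probe is hypothetical, and running "true" instead of the mkdir is a substitution made here so that nothing is created on the remote node.

```shell
#!/bin/bash
# SSH options copied from the Oozie log message above.
OOZIE_SSH_OPTS="-o PasswordAuthentication=no -o KbdInteractiveDevices=no -o StrictHostKeyChecking=no -o ConnectTimeout=20"

# Hypothetical helper: print the command that replays Oozie's login
# as the given local user against the given remote login.
oozie_ssh_probe() {
  local local_user="$1" remote_login="$2"
  echo "sudo -u $local_user ssh $OOZIE_SSH_OPTS $remote_login true"
}

# Usage on the Oozie server (not executed here):
#   eval "$(oozie_ssh_probe oozie akhanolk@cdh-dn01)" && echo "login OK"
```

If the probe still ends in "Permission denied", the key for the oozie user is most likely not in the remote authorized_keys file yet.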
3.0.18. Oozie web console - screenshots:
Any additional insights are greatly appreciated.
Cheers!