What's covered in the blog?
1. Documentation on the Oozie map reduce action
2. A sample workflow that includes an Oozie map-reduce action to process syslog-generated log files. Instructions on loading sample data and running the workflow are provided, along with some notes based on my learnings.
Versions covered:
Oozie 3.3.0; Map reduce new API
Related blogs:
Blog 1: Oozie workflow - hdfs and email actions
Blog 2: Oozie workflow - hdfs, email and hive actions
Blog 3: Oozie workflow - sqoop action (Hive-mysql; sqoop export)
Blog 4: Oozie workflow - java map-reduce (new API) action
Blog 5: Oozie workflow - streaming map-reduce (python) action
Blog 6: Oozie workflow - java main action
Blog 7: Oozie workflow - Pig action
Blog 8: Oozie sub-workflow
Blog 9a: Oozie coordinator job - time-triggered sub-workflow, fork-join control and decision control
Blog 9b: Oozie coordinator jobs - file triggered
Blog 9c: Oozie coordinator jobs - dataset availability triggered
Blog 10: Oozie bundle jobs
Blog 11: Oozie Java API for interfacing with oozie workflows
Blog 12: Oozie workflow - shell action + passing output from one action to another
Blog 13: Oozie workflow - SSH action
Your thoughts/updates:
If you want to share your thoughts/updates, email me at airawat.blog@gmail.com.
About the Oozie MapReduce action
Excerpt from Apache Oozie documentation...
The map-reduce action starts a Hadoop map/reduce job from a workflow. Hadoop jobs can be Java Map/Reduce jobs or streaming jobs.
A map-reduce action can be configured to perform file system cleanup and directory creation before starting the map reduce job. This capability enables Oozie to retry a Hadoop job in the situation of a transient failure (Hadoop checks the non-existence of the job output directory and then creates it when the Hadoop job is starting, thus a retry without cleanup of the job output directory would fail).
The workflow job will wait until the Hadoop map/reduce job completes before continuing to the next action in the workflow execution path.
The counters of the Hadoop job and job exit status (FAILED, KILLED or SUCCEEDED) must be available to the workflow job after the Hadoop job ends. This information can be used from within decision nodes and other action configurations.
The map-reduce action has to be configured with all the necessary Hadoop JobConf properties to run the Hadoop map/reduce job.
Hadoop JobConf properties can be specified in a JobConf XML file bundled with the workflow application or they can be indicated inline in the map-reduce action configuration.
The configuration properties are loaded in the following order: streaming, job-xml and configuration. Later values override earlier values.
Streaming and inline property values can be parameterized (templatized) using EL expressions.
The Hadoop mapred.job.tracker and fs.default.name properties must not be present in the job-xml and inline configuration.
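To make the load order and the restriction on mapred.job.tracker and fs.default.name concrete, here is a minimal Python sketch. It is only an illustration of the rules quoted above, not Oozie's implementation: the function name merge_action_conf and all property values are made up for the example.

```python
# Illustrative sketch only: this is not how Oozie is implemented.
# The function name and every property value below are made-up examples.

FORBIDDEN = {"mapred.job.tracker", "fs.default.name"}


def merge_action_conf(streaming, job_xml, inline):
    """Layer the three sources in order; later sources override earlier ones."""
    # The two cluster-address properties must not appear in job-xml or inline config.
    for source_name, source in (("job-xml", job_xml), ("configuration", inline)):
        clash = FORBIDDEN.intersection(source)
        if clash:
            raise ValueError(f"{sorted(clash)} must not be set in {source_name}")
    merged = {}
    for source in (streaming, job_xml, inline):
        merged.update(source)  # a later source wins on duplicate keys
    return merged


streaming = {"mapred.reduce.tasks": "1"}
job_xml = {"mapred.reduce.tasks": "2", "mapred.input.dir": "/data/in"}
inline = {"mapred.reduce.tasks": "4"}

conf = merge_action_conf(streaming, job_xml, inline)
print(conf["mapred.reduce.tasks"])  # 4
```

The inline configuration value wins because it is loaded last, while mapred.input.dir survives from the job-xml layer because nothing later overrides it.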
Apache Oozie documentation:
http://oozie.apache.org/docs/3.3.0/WorkflowFunctionalSpec.html#a3.2.2_Map-Reduce_Action
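The point in the excerpt about cleanup before a retry can be simulated locally. The Python sketch below is an assumption-laden stand-in, not Hadoop or Oozie code: run_job, prepare and run_with_retry are hypothetical helpers, and a temporary local directory plays the role of the HDFS output directory.

```python
import os
import shutil
import tempfile


def run_job(output_dir, fail=False):
    """Stand-in for a Hadoop job: refuses to start if the output dir exists."""
    if os.path.exists(output_dir):
        raise FileExistsError(f"output directory {output_dir} already exists")
    os.makedirs(output_dir)
    if fail:
        raise RuntimeError("simulated transient failure")
    return "SUCCEEDED"


def prepare(output_dir):
    """Stand-in for the action's cleanup step: delete leftover output."""
    shutil.rmtree(output_dir, ignore_errors=True)


def run_with_retry(output_dir, attempts, cleanup):
    """Try once per entry in attempts; each entry says whether that try fails."""
    last_error = None
    for fail in attempts:
        if cleanup:
            prepare(output_dir)
        try:
            return run_job(output_dir, fail=fail)
        except (RuntimeError, FileExistsError) as err:
            last_error = type(err).__name__
    return f"FAILED ({last_error})"


base = tempfile.mkdtemp()
# The first attempt fails after creating the output directory.
with_cleanup = run_with_retry(os.path.join(base, "out1"), [True, False], cleanup=True)
without_cleanup = run_with_retry(os.path.join(base, "out2"), [True, False], cleanup=False)
print(with_cleanup)     # SUCCEEDED
print(without_cleanup)  # FAILED (FileExistsError)
shutil.rmtree(base)
```

Without the cleanup step, the second attempt trips over the directory left behind by the first one, which is the failure mode the documentation describes. In a workflow definition, this cleanup is what the action's prepare element is for.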
Components of a workflow with java map reduce action:
Sample workflow
Highlights
The sample workflow application runs a Java map-reduce program that parses syslog-generated log files in HDFS and generates a report from them. The following is a pictorial representation of the workflow.
Workflow application details
Oozie web GUI - screenshots
http://YourOozieServer:TypicallyPort11000/oozie/
Do share if you have any additional insights that can be added to the blog.
Oozie map-reduce cookbook
https://cwiki.apache.org/OOZIE/map-reduce-cookbook.html
How to use a sharelib in Oozie
http://blog.cloudera.com/blog/2012/12/how-to-use-the-sharelib-in-apache-oozie/
Everything you wanted to know but were afraid to ask about Oozie
http://www.slideshare.net/ChicagoHUG/everything-you-wanted-to-know-but-were-afraid-to-ask-about-oozie
Oozie workflow use cases
https://github.com/yahoo/oozie/wiki/Oozie-WF-use-cases
Hi,
I am getting the following error after running Oozie in the Cloudera VM.
Error: AUTHENTICATION : Could not authenticate, Authentication failed, status: -1, message: null
Please help me out in solving this issue. Thanks in advance.
When I try to run this command, I get this error:
COMMAND: oozie job -oozie http://localhost:11000/oozie -config /home/hduser/oozie/oozie-4.1.0/oozie-bin/examples/apps/map-reduce/job.properties -run
ERROR: Error: E0501 : E0501: Could not perform authorization operation, Call From ubuntu/127.0.1.1 to localhost:8020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused