Wednesday, July 3, 2013

Running native mapreduce jobs inside Pig

There may be situations where you need to reuse Java MapReduce programs within a Pig program. This blog includes a sample Pig script, with associated jars and sample data. The input is Syslog-generated log files, and the output is a count of how many times each process was logged, from inception to date.

Apache Pig documentation:

My earlier blog on log parsing in Hadoop (link) covers the Java code. This blog uses the jar from that blog in a Pig script.

Details on running native mapreduce job in Pig scripts:
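As a rough illustration, a native job is invoked from Pig Latin with the MAPREDUCE operator: Pig stores a relation where the job expects its input, runs the jar, and loads the job's output back as a relation. The sketch below is not the script from this blog. The jar name, driver class, HDFS paths, and output schema are all hypothetical placeholders, and it assumes the job writes tab-separated process and count pairs.

```pig
-- Sketch only: jar name, driver class, paths, and schema are hypothetical.

-- Load raw syslog lines, one line per record.
rawLogs = LOAD '/user/hypothetical/syslog/input' USING TextLoader() AS (line:chararray);

-- Run a native MapReduce job packaged in a jar.
-- STORE writes rawLogs to the location the job reads as its input;
-- LOAD reads the job's output directory back into Pig;
-- the backquoted string holds the driver class and its arguments.
processCounts = MAPREDUCE 'logparser.jar'
    STORE rawLogs INTO '/user/hypothetical/mr_input' USING PigStorage()
    LOAD '/user/hypothetical/mr_output' USING PigStorage('\t')
        AS (process:chararray, occurrences:long)
    `com.example.LogParserDriver /user/hypothetical/mr_input /user/hypothetical/mr_output`;

-- Continue processing in Pig: sort processes by count, highest first.
sorted = ORDER processCounts BY occurrences DESC;
DUMP sorted;
```

The input and output paths appear twice on purpose: once so Pig knows where to store and load, and once as arguments so the Java driver reads and writes the same locations.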

