Comments on Hooked on Hadoop: Map-side join of large datasets using CompositeInputFormat
Feed author: Anagha Khanolkar (http://www.blogger.com/profile/14724095760127064262)
Updated: 2024-03-26T08:11:32.687-07:00
8 comments

Anonymous, 2018-11-20T01:54:21.799-08:00
Nice blog! I really loved reading through this article. Thanks for sharing such an amazing post with us, and keep blogging.
pmp training Chennai: http://www.pmptraininginchennai.in/

Anonymous (https://www.blogger.com/profile/05859351157828374408), 2018-08-21T22:35:04.817-07:00
Good post! Thank you so much for sharing it. It was good to read and useful for bringing my knowledge up to date. Keep blogging.
Big Data Hadoop training in electronic city: https://www.emexotechnologies.com/online-courses/big-data-analystic-training/big-data-hadoop-training-in-electronic-city/

Anonymous (https://www.blogger.com/profile/13757141712513722548), 2018-07-08T05:13:02.276-07:00
Thank you, it is a very nice blog for beginners.
https://www.emexotechnologies.com/courses/big-data-analytics-training/big-data-hadoop-training/

swapna (https://www.blogger.com/profile/06105735575540327890), 2015-09-03T10:48:54.599-07:00
Very useful. Do we need to have the same number of records in both tables?
I tried this with both inner and outer joins. The inner join gives me the expected results, but the outer join does not. Can you try an outer join and let me know? My code is here:
https://github.com/swapnaraja/hadoopCode/tree/master/mapReduce/joins/compositeinput

Anonymous (https://www.blogger.com/profile/15242357584395379616), 2015-08-05T14:22:36.603-07:00
Very useful. Thanks.

Anonymous (https://www.blogger.com/profile/15120359038488214776), 2014-10-01T20:20:04.385-07:00
Hi,

I would like to know how to deal with the case where you have unsorted data. For example, suppose the input data was as follows:

Employee data [joinProject/data/employees_sorted/part-e]
--------------------------------------------------------
[EmpNo,DOB,FName,LName,Gender,HireDate,DeptNo]
10001,1953-09-02,Georgi,Facello,M,1986-06-26,d005
10002,1964-06-02,Bezalel,Simmel,F,1985-11-21,d007
10003,1959-12-03,Parto,Bamford,M,1986-08-28,d004
10004,1954-05-01,Chirstian,Koblick,M,1986-12-01,d004
10005,1955-01-21,Kyoichi,Maliniak,M,1989-09-12,d003
10006,1953-04-20,Anneke,Preusig,F,1989-06-02,d005
.....

Salary data [joinProject/data/salaries_sorted/part-s]
------------------------------------------------------
[EmpNo,Salary,FromDate,ToDate]
10001,88958,2002-06-22,9999-01-01
10002,72527,2001-08-02,9999-01-01
10004,43311,2001-12-01,9999-01-01
10003,74057,2001-11-27,9999-01-01
10005,94692,2001-09-09,9999-01-01
..........

and the desired output was:

************************************
Expected Results - tab separated
************************************
[EmpNo FName LName Salary]
10001 Georgi Facello 88958
10002 Bezalel Simmel 72527
10003 Parto Bamford 43311
10004 Chirstian Koblick 74057
10005 Kyoichi Maliniak 94692
10006 Anneke Preusig 59755
10009 Sumant Peac 94409
10010 Duangkaew Piveteau 80324

vishnu (https://www.blogger.com/profile/09100566879842212357), 2014-05-01T01:00:18.883-07:00
Really good blog :) Thanks for sharing.

I have one question about CompositeInputFormat. How does it work internally? It looks like it reads all the paths, does an inner join on the key, and passes the result on to the mapper. If this is true, is there any operation other than "inner" that can be used as the op?

Anonymous (https://www.blogger.com/profile/01892413637328542105), 2014-01-09T04:44:23.744-08:00
Thank you, very helpful.
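
Two of the questions above, what happens with unsorted input and whether anything other than "inner" is possible, can be illustrated with a small merge-join sketch. This is plain Python, not Hadoop code and not how CompositeInputFormat is actually implemented; the function name `merge_join` and the `(key, value)` row layout are assumptions made for the illustration. It only mirrors the general idea that a map-side join walks key-sorted inputs in a single pass, and that Hadoop's join expressions also accept "outer" and "override" besides "inner". The sample rows are taken from the 2014-10-01 comment.

```python
from typing import Iterator, List, Optional, Tuple

Row = Tuple[str, str]  # (key, value); hypothetical layout for this sketch


def merge_join(left: List[Row], right: List[Row],
               op: str = "inner") -> Iterator[Tuple[str, Optional[str], Optional[str]]]:
    """Single-pass merge join. Assumes both inputs are sorted by a unique key."""
    i = j = 0
    while i < len(left) and j < len(right):
        lk, lv = left[i]
        rk, rv = right[j]
        if lk == rk:                      # keys match: emit the joined row
            yield (lk, lv, rv)
            i += 1
            j += 1
        elif lk < rk:                     # left key has no partner on the right
            if op == "outer":
                yield (lk, lv, None)
            i += 1
        else:                             # right key has no partner on the left
            if op == "outer":
                yield (rk, None, rv)
            j += 1
    if op == "outer":                     # flush whatever remains on either side
        for k, v in left[i:]:
            yield (k, v, None)
        for k, v in right[j:]:
            yield (k, None, v)


employees = [
    ("10001", "Georgi Facello"), ("10002", "Bezalel Simmel"),
    ("10003", "Parto Bamford"), ("10004", "Chirstian Koblick"),
    ("10005", "Kyoichi Maliniak"), ("10006", "Anneke Preusig"),
]
# Salary rows in the order given in the comment: 10004 appears before 10003.
salaries_unsorted = [
    ("10001", "88958"), ("10002", "72527"), ("10004", "43311"),
    ("10003", "74057"), ("10005", "94692"),
]

# Unsorted input: the merge skips past 10003 and silently drops it.
broken = [key for key, _, _ in merge_join(employees, salaries_unsorted)]

# Sorting by key first restores the precondition (keys here are equal-length
# digit strings, so plain string ordering matches numeric ordering).
salaries_sorted = sorted(salaries_unsorted)
inner = list(merge_join(employees, salaries_sorted, "inner"))
outer = list(merge_join(employees, salaries_sorted, "outer"))
```

With the unsorted salary rows, `broken` contains only 10001, 10002, 10004 and 10005. After sorting, the inner join yields five rows, and the outer join additionally keeps 10006 with a missing salary. In an actual Hadoop job, the equivalent fix is most likely a preliminary job that sorts and partitions both datasets identically by the join key before the map-side join runs.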