Tuesday, February 25, 2014

Cascading extensions for Accumulo

I recently had the opportunity to work on extending Cascading to read from and write to Accumulo.
Versions used: Cascading 2.5.2 and Accumulo 1.5.0.

The source code is at:
https://github.com/airawat/cascading.accumulo

Examples of using the AccumuloTap are at:
https://github.com/airawat/cascading.accumulo.examples

The examples cover the following functionality:
1.  Querying Accumulo from Cascading.
2.  Performing Accumulo table operations from Cascading: create table, create table with splits, check if table exists, delete table, and flush.
3.  Dumping data in Accumulo to HDFS from Cascading.
4.  Exporting data in Accumulo to HDFS, after transposing it to a flat, delimited format with column headers.
5.  Importing data in HDFS, in a flat, delimited format, into Accumulo.
6.  Reading data from Accumulo and writing it (back) to Accumulo.
7.  Exporting data in Accumulo into MySQL.
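
To give a feel for the transposition in item 4, here is a rough sketch in plain Java. It is not the AccumuloTap code and uses neither the Cascading nor the Accumulo API. The class `FlattenSketch`, the `Cell` type, and the `rowID` header name are all made up for illustration, and column family, visibility and timestamp are left out for brevity. The idea is simply that each Accumulo entry is one (row, column, value) cell, and the export pivots the cells of a row into a single delimited line under a shared header.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class FlattenSketch {

    /** Hypothetical stand-in for one Accumulo entry: row ID, column qualifier, value. */
    static final class Cell {
        final String row;
        final String qualifier;
        final String value;

        Cell(String row, String qualifier, String value) {
            this.row = row;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /** Pivots cells into a header line plus one delimited line per row ID. */
    static List<String> toDelimited(List<Cell> cells, String delimiter) {
        // Sorted set of every column qualifier seen; this becomes the header.
        TreeSet<String> columns = new TreeSet<>();
        // Row ID -> (qualifier -> value), keeping rows in the order they arrive.
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();

        for (Cell c : cells) {
            columns.add(c.qualifier);
            rows.computeIfAbsent(c.row, k -> new HashMap<>()).put(c.qualifier, c.value);
        }

        List<String> header = new ArrayList<>();
        header.add("rowID");
        header.addAll(columns);

        List<String> lines = new ArrayList<>();
        lines.add(String.join(delimiter, header));

        for (Map.Entry<String, Map<String, String>> r : rows.entrySet()) {
            List<String> fields = new ArrayList<>();
            fields.add(r.getKey());
            for (String column : columns) {
                // A row that lacks a column gets an empty field.
                fields.add(r.getValue().getOrDefault(column, ""));
            }
            lines.add(String.join(delimiter, fields));
        }
        return lines;
    }

    public static void main(String[] args) {
        List<Cell> cells = new ArrayList<>();
        cells.add(new Cell("r1", "age", "30"));
        cells.add(new Cell("r1", "name", "Ann"));
        cells.add(new Cell("r2", "name", "Bob"));

        for (String line : toDelimited(cells, ",")) {
            System.out.println(line);
        }
        // Prints:
        // rowID,age,name
        // r1,30,Ann
        // r2,,Bob
    }
}
```

Note that the sketch collects every row in memory before writing anything, which is fine for a toy example but would not suit a large table. For the real thing, see the examples repository linked above.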

