Checkpoint and Replication Oriented Fault Tolerant Mechanism for MapReduce Framework

Yang Liu, Wei Wei, Yuhong Zhang

Abstract


MapReduce is an emerging programming paradigm, with an associated implementation, for processing and generating big data, and it has been widely applied in data-intensive systems. In cloud environments, node and task failures are no longer accidental but a common feature of large-scale systems. In the MapReduce framework, the rescheduling-based fault-tolerant method is simple to implement, but it fails to fully consider the location of distributed data and the computation and storage overhead. As a result, a single node failure dramatically increases the completion time. In this paper, a Checkpoint and Replication Oriented Fault Tolerant scheduling algorithm (CROFT) is proposed, which takes both task and node failures into consideration. Preliminary experiments show that, with less storage and network overhead, CROFT significantly reduces the completion time when failures occur, and that it improves the overall performance of MapReduce by more than 30% compared with the original mechanism in Hadoop.

 

DOI: http://dx.doi.org/10.11591/telkomnika.v12i2.4324

Keywords


MapReduce; Fault Tolerant; Cloud Computing; Checkpoint


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License