Abstract
Fault tolerance is a critical issue in distributed and parallel processing systems. A distributed system consists of a collection of interconnected stand-alone computers working together as a single system to produce a complete result. As the number of computing nodes in a distributed computing environment grows concurrently and dynamically, so does the chance of crash failures. In this paper, we propose an application-level, checkpoint-based fault tolerance approach for distributed computing. T