International Journal of Applied Science and Technology

ISSN 2221-0997 (Print), 2221-1004 (Online), DOI: 10.30845/ijast

A Heuristic Checkpoint Placement Algorithm for Adaptive Application-Level Checkpointing
Yanqing Ji, Hai Jiang, Vipin Chaudhary

Checkpoint/rollback is an effective fault-tolerance scheme that is widely used to reduce the overall execution time of long-running applications when faults occur. The locations of checkpoints in an application program are critical, since the distance between two consecutive checkpoints determines the checkpointing scheme’s sensitivity and overhead. If checkpoints are too far apart, the application becomes insensitive to job failure: the computation lost between the end of the previous checkpoint and the point of failure can be very large. If they are too close together, the checkpointing overhead slows down normal computation. This paper proposes a heuristic checkpoint placement algorithm that improves the sensitivity and flexibility of checkpointing schemes. The algorithm enables automatic and transparent insertion of checkpoints into the user’s source code. Experiments on benchmark programs and real applications demonstrate the algorithm’s efficiency and sufficiency.
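
The trade-off between checkpoints that are too far apart and checkpoints that are too close together can be illustrated with a small sketch. It is not the heuristic placement algorithm proposed in the paper. It uses an assumed first-order cost model and the classical Young approximation for the checkpoint interval, and the names `expected_overhead`, `young_interval`, `checkpoint_cost`, `mtbf`, and `work` are hypothetical, chosen only for this illustration.

```python
import math


def expected_overhead(interval, checkpoint_cost, mtbf, work):
    """Rough expected extra time for a job needing `work` time units.

    Assumed model (not from the paper): one checkpoint costing
    `checkpoint_cost` is taken after every `interval` units of work,
    failures occur on average once per `mtbf` units, and each failure
    loses half an interval of computation on average.
    """
    checkpoint_time = (work / interval) * checkpoint_cost
    lost_work = (work / mtbf) * (interval / 2.0)
    return checkpoint_time + lost_work


def young_interval(checkpoint_cost, mtbf):
    """Young's first-order approximation: sqrt(2 * C * MTBF).

    This is the interval that minimizes expected_overhead above.
    """
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


if __name__ == "__main__":
    cost, mtbf, work = 10.0, 2000.0, 10000.0
    for interval in (20.0, young_interval(cost, mtbf), 2000.0):
        extra = expected_overhead(interval, cost, mtbf, work)
        print(f"interval={interval:7.1f}  expected extra time={extra:7.1f}")
```

Under these assumed numbers, both the very short interval and the very long one cost about five times more extra time than the balanced interval, which is the sensitivity-versus-overhead tension the abstract describes.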
