Fail-stutter fault tolerance
- 25 August 2005
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
Abstract
Traditional fault models present system designers with two extremes: the Byzantine fault model, which is general and therefore difficult to apply, and the fail-stop fault model, which is easier to employ but does not accurately capture modern device behavior. To address this gap, we introduce the concept of fail-stutter fault tolerance, a realistic and yet tractable fault model that accounts for both absolute failure and a new range of performance failures common in modern components. Systems built under the fail-stutter model will likely perform well, be highly reliable and available, and be easier to manage when deployed.