Title:
Simplifying Fault-Tolerance: Providing the Abstraction of Crash Failures
Author(s)
Bazzi, Rida Adnan
Neiger, Gil
Abstract
The difficulty of designing fault-tolerant distributed algorithms
increases with the severity of failures that an algorithm must
tolerate. This paper considers methods that automatically translate
algorithms tolerant of simple crash failures into ones tolerant of more
severe failures. These translations simplify the design task by
allowing algorithm designers to assume that processors fail only by
stopping. Such translations can be quantified by two measures:
fault-tolerance, which is a measure of how many processors must remain
nonfaulty for the translation to be correct, and round-complexity,
which is a measure of how the translation increases the running time of
an algorithm. Understanding these translations and their limitations
with respect to these measures can provide insight into the relative
impact of different models of faulty behavior on the ability to provide
fault-tolerant applications.
This paper considers two classes of translations from crash failures to
each of the following types of more severe failures: omission to send
messages; omission to send and receive messages; and totally arbitrary
behavior. It shows that previously developed translations to
send-omission failures are optimal with respect to both fault-tolerance
and round-complexity. It exhibits a hierarchy of translations to
general (send/receive) omissions that improves upon the fault-tolerance
of previously developed translations. It also gives a series of
translations to arbitrary failures that improves upon the
round-complexity of previously developed translations. All
translations developed in this paper are shown to be optimal in that
they cannot be improved with respect to one measure without negatively
affecting the other; that is, both hierarchies of translations are
matched by corresponding hierarchies of impossibility results.
Date Issued
1993
Extent
482170 bytes
Resource Type
Text
Resource Subtype
Technical Report