Title: Simplifying Fault-Tolerance: Providing the Abstraction of Crash Failures

dc.contributor.author Bazzi, Rida Adnan en_US
dc.contributor.author Neiger, Gil
dc.date.accessioned 2005-06-17T18:02:12Z
dc.date.available 2005-06-17T18:02:12Z
dc.date.issued 1993 en_US
dc.description.abstract The difficulty of designing fault-tolerant distributed algorithms increases with the severity of failures that an algorithm must tolerate. This paper considers methods that automatically translate algorithms tolerant of simple crash failures into ones tolerant of more severe failures. These translations simplify the design task by allowing algorithm designers to assume that processors fail only by stopping. Such translations can be quantified by two measures: fault-tolerance, which is a measure of how many processors must remain nonfaulty for the translation to be correct, and round-complexity, which is a measure of how the translation increases the running time of an algorithm. Understanding these translations and their limitations with respect to these measures can provide insight into the relative impact of different models of faulty behavior on the ability to provide fault-tolerant applications. This paper considers two classes of translations from crash failures to each of the following types of more severe failures: omission to send messages; omission to send and receive messages; and totally arbitrary behavior. It shows that previously developed translations to send-omission failures are optimal with respect to both fault-tolerance and round-complexity. It exhibits a hierarchy of translations to general (send/receive) omissions that improves upon the fault-tolerance of previously developed translations. It also gives a series of translations to arbitrary failures that improves upon the round-complexity of previously developed translations. All translations developed in this paper are shown to be optimal in that they cannot be improved with respect to one measure without negatively affecting the other; that is, both hierarchies of translations are matched by corresponding hierarchies of impossibility results. en_US
dc.format.extent 482170 bytes
dc.format.mimetype application/pdf
dc.identifier.uri http://hdl.handle.net/1853/6757
dc.language.iso en_US
dc.publisher Georgia Institute of Technology en_US
dc.relation.ispartofseries CC Technical Report; GIT-CC-93-12 en_US
dc.subject Algorithms
dc.subject Crash failures
dc.subject Fault tolerance
dc.subject Fault tolerant distributed algorithms
dc.subject Translations
dc.title Simplifying Fault-Tolerance: Providing the Abstraction of Crash Failures en_US
dc.type Text
dc.type.genre Technical Report
dspace.entity.type Publication
local.contributor.corporatename College of Computing
local.relation.ispartofseries College of Computing Technical Report Series
relation.isOrgUnitOfPublication c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isSeriesOfPublication 35c9e8fc-dd67-4201-b1d5-016381ef65b8
Files
Original bundle
Name: GIT-CC-93-12.pdf
Size: 470.87 KB
Format: Adobe Portable Document Format