Efficient failure detection and consensus at extreme-scale systems

Soma Sekhar Kolisetty; Battula Srinivasa Rao

doi:10.11591/ijece.v12i5.pp5339-5347

Abstract

Distributed systems and extreme-scale systems have become ubiquitous in recent years and are found throughout academic organizations, business, home, and government sectors. Peer-to-peer (P2P) technology is a typical distributed system model that is gaining popularity for delivering computing resources and services. Distributed systems try to increase their availability in the event of frequent component failures, and keeping the system functioning in such scenarios is notoriously difficult. To identify component failures in the system and achieve global agreement (consensus) on the failed components, this paper implements an efficient failure detection and consensus algorithm based on fail-stop process failures. The proposed algorithm tolerates process failures that occur before and during its execution. It works with the epidemic gossip protocol, a randomized paradigm of computation and communication that is both fault-tolerant and scalable. A simulation of an extreme-scale information dissemination process shows that global agreement can be achieved. A P2P simulator, PeerSim, is used to implement and test the proposed algorithm. The results show that the algorithm is highly scalable while detecting all process failures. The status of all the processes is maintained in a Boolean matrix.
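
The idea of gossip-based failure detection with a Boolean status matrix can be illustrated with a minimal sketch. This is not the authors' algorithm or their PeerSim implementation; it is a simplified push-gossip model under stated assumptions: failures are fail-stop and happen before the run starts, contacting a failed process is treated as an immediate detection (standing in for a timeout), and the function name `gossip_failure_detection` and its parameters are hypothetical. The global stopping check is a simulation convenience, not something a real process could evaluate locally.

```python
import random


def gossip_failure_detection(n, failed, seed=0, max_cycles=1000):
    """Simulate push gossip until every live process knows the failed set.

    Assumes n >= 2. Returns (suspects, cycles), where suspects[i][j] is True
    when process i believes process j has failed.
    """
    rng = random.Random(seed)
    failed = set(failed)
    alive = [p for p in range(n) if p not in failed]
    # Boolean status matrix: one row of suspicions per process.
    suspects = [[False] * n for _ in range(n)]
    # Ground truth, used only by the simulator to decide when to stop.
    target = [p in failed for p in range(n)]

    for cycle in range(1, max_cycles + 1):
        for i in alive:
            # Each live process gossips with one uniformly random other process.
            j = rng.choice([p for p in range(n) if p != i])
            if j in failed:
                # A fail-stop process never replies; model the timeout
                # as a direct detection by the sender.
                suspects[i][j] = True
            else:
                # Push: the receiver merges the sender's row with logical OR.
                suspects[j] = [a or b for a, b in zip(suspects[j], suspects[i])]
        # Agreement: every live process holds the same, complete row.
        if all(suspects[i] == target for i in alive):
            return suspects, cycle
    raise RuntimeError("no agreement within max_cycles")


if __name__ == "__main__":
    matrix, cycles = gossip_failure_detection(16, {3, 7}, seed=1)
    print("agreement reached after", cycles, "gossip cycles")
```

Because rows are only ever merged with logical OR and a suspicion is only raised on direct contact with a failed process, this sketch never produces a false positive; a real detector based on timeouts would need to handle that case.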

Keywords

Consensus; epidemic protocol; failure detection; fault tolerance; gossip-based protocol; message passing interface process; scalability

DOI: http://doi.org/10.11591/ijece.v12i5.pp5339-5347

Copyright (c) 2022 Institute of Advanced Engineering and Science

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

International Journal of Electrical and Computer Engineering (IJECE)
p-ISSN 2088-8708, e-ISSN 2722-2578

This journal is published by the Institute of Advanced Engineering and Science (IAES).
