6.824-Raft

Raft note

There are three roles in the Raft algorithm: Follower, Candidate, and Leader. Each node stores currentTerm, votedFor, and a log. The protocol has two RPCs: RequestVote and AppendEntries.

When a Raft cluster starts, each node performs a series of initializations: currentTerm is set to 0, votedFor is set to -1, and the initial state is Follower.
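As a sketch, the per-node state and initialization above might look like this in Go (the type and field names are illustrative, not taken from any particular 6.824 solution):

```go
package main

import "fmt"

type State int

const (
	Follower State = iota
	Candidate
	Leader
)

// LogEntry pairs a command with the term in which it was created.
type LogEntry struct {
	Term    int
	Command interface{}
}

// RaftNode holds the per-node state described in the note.
type RaftNode struct {
	state       State
	currentTerm int
	votedFor    int // -1 means "has not voted for anyone"
	log         []LogEntry
}

// newRaftNode performs the initialization described above:
// currentTerm = 0, votedFor = -1, initial state Follower.
func newRaftNode() *RaftNode {
	return &RaftNode{state: Follower, currentTerm: 0, votedFor: -1}
}

func main() {
	n := newRaftNode()
	fmt.Println(n.state == Follower, n.currentTerm, n.votedFor)
}
```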

If the election timeout elapses and the Follower has neither received an AppendEntries message from the current leader nor granted its vote to a candidate, it converts to the Candidate state.

When a node steps into the candidate state, it increments currentTerm by 1, sets votedFor to itself, resets its randomized timer, and sends RequestVote messages to all other nodes.

A node remains in the candidate state until one of the following three situations occurs. First, it receives votes from a majority of the nodes, is elected leader, and changes its state to Leader. Second, it receives an AppendEntries message from another node claiming to be the leader; if the candidate considers that leader legitimate (the term carried in the message is at least its own currentTerm), it switches itself to the Follower state. Third, if it does not receive a majority of votes before the timer expires, a new round of elections starts.
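The transition into the candidate state might be sketched as follows (becomeCandidate and the field names are illustrative):

```go
package main

import "fmt"

type State int

const (
	Follower State = iota
	Candidate
	Leader
)

type RaftNode struct {
	me          int // this node's id
	state       State
	currentTerm int
	votedFor    int
}

// becomeCandidate applies the transition described above: increment
// currentTerm, vote for itself, and enter the Candidate state. A full
// implementation would also reset the randomized timer here and then
// broadcast RequestVote to all other nodes, waiting for one of the
// three outcomes (majority win, a legitimate leader, or a timeout).
func (n *RaftNode) becomeCandidate() {
	n.state = Candidate
	n.currentTerm++
	n.votedFor = n.me
}

func main() {
	n := &RaftNode{me: 2, state: Follower, currentTerm: 4, votedFor: -1}
	n.becomeCandidate()
	fmt.Println(n.state == Candidate, n.currentTerm, n.votedFor)
}
```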

The details of RequestVote: when a node receives a RequestVote message, it makes the following judgments.

If the term carried by the RPC is smaller than the node's currentTerm, it returns (currentTerm, false), rejecting the voting request, and its state remains unchanged.

If the term carried by the RPC equals currentTerm, there are two situations. If the voting node's votedFor is not empty and is not equal to the candidateId carried in the request, it returns the current term and rejects the voting request; its state remains unchanged. If votedFor is empty or equal to candidateId (and the candidate's log passes the same up-to-date check described below), the node agrees to the voting request with (currentTerm, true), converts to Follower, resets its timer, and sets votedFor to candidateId.

If the term carried by the RPC is bigger than currentTerm, the voting node first sets its currentTerm to the carried term, sets votedFor to empty, switches to the Follower state, and resets its timer. It then makes the next judgment: if the term of its last log entry is bigger than the lastLogTerm carried by the RPC, it returns the updated term and rejects the voting request. If the term of its last log entry equals the lastLogTerm carried by the RPC, it compares the index of its last log entry with the lastLogIndex carried by the RPC: if its last index is bigger than lastLogIndex, it returns the updated term and refuses the voting request; otherwise it agrees to the voting request.

If the term of the voting node's last log entry is smaller than the lastLogTerm carried by the RPC, it sets votedFor to candidateId and agrees to the voting request.
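Putting the judgments above together, a sketch of a RequestVote handler in Go (names are illustrative; indexes are 0-based slice positions rather than the paper's 1-based log indexes, and the equal-term and higher-term cases share one up-to-date check, as in Figure 2 of the Raft paper):

```go
package main

import "fmt"

type RequestVoteArgs struct {
	Term         int
	CandidateId  int
	LastLogIndex int
	LastLogTerm  int
}

type RequestVoteReply struct {
	Term        int
	VoteGranted bool
}

type LogEntry struct{ Term int }

type RaftNode struct {
	currentTerm int
	votedFor    int // -1 means no vote cast this term
	log         []LogEntry
}

func (n *RaftNode) lastLogIndex() int { return len(n.log) - 1 }

func (n *RaftNode) lastLogTerm() int {
	if len(n.log) == 0 {
		return 0
	}
	return n.log[len(n.log)-1].Term
}

// HandleRequestVote follows the judgments in the note.
func (n *RaftNode) HandleRequestVote(args RequestVoteArgs) RequestVoteReply {
	// Stale candidate term: reject, keep state unchanged.
	if args.Term < n.currentTerm {
		return RequestVoteReply{Term: n.currentTerm, VoteGranted: false}
	}
	// Newer term: adopt it and clear the vote (a full implementation
	// also switches to Follower and resets the timer here).
	if args.Term > n.currentTerm {
		n.currentTerm = args.Term
		n.votedFor = -1
	}
	// Already voted for a different candidate this term: reject.
	if n.votedFor != -1 && n.votedFor != args.CandidateId {
		return RequestVoteReply{Term: n.currentTerm, VoteGranted: false}
	}
	// Election restriction: grant only if the candidate's log is at
	// least as up-to-date as ours (compare last terms, then indexes).
	upToDate := args.LastLogTerm > n.lastLogTerm() ||
		(args.LastLogTerm == n.lastLogTerm() && args.LastLogIndex >= n.lastLogIndex())
	if !upToDate {
		return RequestVoteReply{Term: n.currentTerm, VoteGranted: false}
	}
	n.votedFor = args.CandidateId
	return RequestVoteReply{Term: n.currentTerm, VoteGranted: true}
}

func main() {
	n := &RaftNode{currentTerm: 2, votedFor: -1, log: []LogEntry{{Term: 1}, {Term: 2}}}
	// A candidate with a stale term is rejected.
	r := n.HandleRequestVote(RequestVoteArgs{Term: 1, CandidateId: 5})
	fmt.Println(r.VoteGranted)
	// A candidate with a newer term and an up-to-date log is granted.
	r = n.HandleRequestVote(RequestVoteArgs{Term: 3, CandidateId: 5, LastLogIndex: 1, LastLogTerm: 2})
	fmt.Println(r.VoteGranted, n.currentTerm, n.votedFor)
}
```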

Through this RPC, a leader is elected from among the candidates.

When a node steps into the leader state, it initializes nextIndex and matchIndex for all other nodes, and it regularly sends AppendEntries messages (heartbeats) to each node to maintain its leadership.
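A sketch of the leader's initialization, assuming the convention from the Raft paper that nextIndex starts just past the leader's last log entry and matchIndex starts at 0 (names are illustrative):

```go
package main

import "fmt"

type RaftNode struct {
	me         int
	peers      []int // ids of all nodes in the cluster, including me
	logLen     int   // number of entries in this node's log
	nextIndex  map[int]int
	matchIndex map[int]int
}

// becomeLeader initializes nextIndex and matchIndex for every other
// node: nextIndex starts just past the leader's last log entry, and
// matchIndex starts at 0 (nothing known to be replicated yet).
func (n *RaftNode) becomeLeader() {
	n.nextIndex = make(map[int]int)
	n.matchIndex = make(map[int]int)
	for _, p := range n.peers {
		if p == n.me {
			continue
		}
		n.nextIndex[p] = n.logLen // next entry to send to this peer
		n.matchIndex[p] = 0       // highest entry known replicated on it
	}
	// A full implementation would now start a ticker that periodically
	// sends AppendEntries heartbeats to every peer.
}

func main() {
	n := &RaftNode{me: 0, peers: []int{0, 1, 2}, logLen: 5}
	n.becomeLeader()
	fmt.Println(n.nextIndex[1], n.nextIndex[2], n.matchIndex[1])
}
```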

The details of AppendEntries: when a node receives an AppendEntries message, it makes the following judgments.

If the term carried by the RPC is smaller than the node's currentTerm, it returns (currentTerm, false), rejecting the request, and its state remains unchanged.

Otherwise, further judgment is needed. If the node's log does not contain an entry at the prevLogIndex carried by the RPC, it rejects the request. If the log contains an entry at prevLogIndex but that entry's term is not equal to prevLogTerm, it also rejects the request. Otherwise, it stores the new entries: if a new entry conflicts with an existing one (same index but different term), the node deletes the existing entry and all entries after it, then appends the entries carried by the RPC to its log.
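The AppendEntries judgments above can be sketched as follows (illustrative names; indexes are 0-based slice positions, whereas the paper uses 1-based log indexes):

```go
package main

import "fmt"

type LogEntry struct{ Term int }

type AppendEntriesArgs struct {
	Term         int
	PrevLogIndex int
	PrevLogTerm  int
	Entries      []LogEntry
}

type RaftNode struct {
	currentTerm int
	log         []LogEntry
}

// HandleAppendEntries follows the judgments in the note and returns
// (term, success).
func (n *RaftNode) HandleAppendEntries(args AppendEntriesArgs) (int, bool) {
	// Stale leader term: reject.
	if args.Term < n.currentTerm {
		return n.currentTerm, false
	}
	// No entry at prevLogIndex: reject.
	if args.PrevLogIndex >= len(n.log) {
		return n.currentTerm, false
	}
	// An entry exists at prevLogIndex but its term disagrees: reject.
	if args.PrevLogIndex >= 0 && n.log[args.PrevLogIndex].Term != args.PrevLogTerm {
		return n.currentTerm, false
	}
	// Store the entries, truncating any conflicting suffix first.
	for i, e := range args.Entries {
		idx := args.PrevLogIndex + 1 + i
		if idx < len(n.log) && n.log[idx].Term != e.Term {
			n.log = n.log[:idx] // delete the conflict and everything after it
		}
		if idx >= len(n.log) {
			n.log = append(n.log, e)
		}
	}
	return n.currentTerm, true
}

func main() {
	n := &RaftNode{currentTerm: 2, log: []LogEntry{{Term: 1}, {Term: 1}, {Term: 2}}}
	// prevLogIndex 1 / prevLogTerm 1 matches, so the entries are accepted.
	term, ok := n.HandleAppendEntries(AppendEntriesArgs{
		Term: 2, PrevLogIndex: 1, PrevLogTerm: 1,
		Entries: []LogEntry{{Term: 2}, {Term: 2}},
	})
	fmt.Println(term, ok, len(n.log))
}
```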

When the leader receives the result of an AppendEntries request, it makes the following judgments:

If the returned term is bigger than its currentTerm, it sets currentTerm to the returned term, resets the randomized timer, and converts itself to the Follower state. If the returned result is false (due to log inconsistency), it decrements that node's value in nextIndex[] and resends the message. If the returned result is true, it updates the nextIndex and matchIndex values for that node.
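A sketch of how the leader might apply these three judgments to a reply (the `sent` parameter, the number of entries carried in the request, is an illustrative addition not named in the note):

```go
package main

import "fmt"

type RaftNode struct {
	currentTerm int
	isLeader    bool
	nextIndex   map[int]int
	matchIndex  map[int]int
}

// onAppendEntriesReply applies the three judgments above to the reply
// from peer; sent is how many log entries the request carried.
func (n *RaftNode) onAppendEntriesReply(peer, replyTerm int, success bool, sent int) {
	if replyTerm > n.currentTerm {
		// A newer term exists somewhere: adopt it and step down.
		// A full implementation also resets the randomized timer here.
		n.currentTerm = replyTerm
		n.isLeader = false
		return
	}
	if !success {
		// Log inconsistency: back nextIndex up by one and retry later.
		n.nextIndex[peer]--
		return
	}
	// Success: advance this follower's bookkeeping.
	n.nextIndex[peer] += sent
	n.matchIndex[peer] = n.nextIndex[peer] - 1
}

func main() {
	n := &RaftNode{currentTerm: 2, isLeader: true,
		nextIndex: map[int]int{1: 5}, matchIndex: map[int]int{1: 0}}
	n.onAppendEntriesReply(1, 2, false, 0) // inconsistency: back up
	n.onAppendEntriesReply(1, 2, true, 3)  // success: advance
	fmt.Println(n.nextIndex[1], n.matchIndex[1], n.isLeader)
}
```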

Owner
Southeast University graduate student
Similar Resources

A feature complete and high performance multi-group Raft library in Go.


Dragonboat - A Multi-Group Raft library in Go / Chinese version News 2021-01-20 Dragonboat v3.3 has been released, please check CHANGELOG for all changes. 2020-03

Dec 30, 2022

A linearizability distributed database by raft and wisckey.

AlfheimDB A linearizability distributed database by raft and wisckey, which supports redis client. Build This project build by mage, you will need ins

Jul 18, 2022

Distributed disk storage database based on Raft and Redis protocol.


IceFireDB Distributed disk storage system based on Raft and RESP protocol. High performance Distributed consistency Reliable LSM disk storage Cold and

Dec 31, 2022

Easy to use Raft library to make your app distributed, highly available and fault-tolerant


An easy to use customizable library to make your Go application Distributed, Highly available, Fault Tolerant etc... using Hashicorp's Raft library wh

Nov 16, 2022

Golang implementation of the Raft consensus protocol

raft raft is a Go library that manages a replicated log and can be used with an FSM to manage replicated state machines. It is a library for providing

Jan 9, 2023

A distributed MySQL binlog storage system built on Raft


What is kingbus? Chinese version Kingbus is a distributed MySQL binlog store based on raft. Kingbus can act as a slave to the real master and as a master to the sl

Dec 31, 2022

A distributed and coördination-free log management system


OK Log is archived I hoped to find the opportunity to continue developing OK Log after the spike of its creation. Unfortunately, despite effort, no su

Dec 26, 2022

Distributed reliable key-value store for the most critical data of a distributed system

etcd Note: The main branch may be in an unstable or even broken state during development. For stable versions, see releases. etcd is a distributed rel

Dec 30, 2022

Golang client library for adding support for interacting and monitoring Celery workers, tasks and events.

Celeriac Golang client library for adding support for interacting and monitoring Celery workers and tasks. It provides functionality to place tasks on

Oct 28, 2022
Raft: a consensus algorithm for managing a replicated log

Raft Consensus Algorithm Raft is a consensus algorithm for managing a replicated

Dec 20, 2021
An implementation of a distributed KV store backed by Raft tolerant of node failures and network partitions 🚣

barge A simple implementation of a consistent, distributed Key:Value store which uses the Raft Concensus Algorithm. This project launches a cluster of

Nov 24, 2021
Raft library Raft is a protocol with which a cluster of nodes can maintain a replicated state machine.

Raft library Raft is a protocol with which a cluster of nodes can maintain a replicated state machine. The state machine is kept in sync through the u

Oct 15, 2021
The TinyKV course builds a key-value storage system with the Raft consensus algorithm.

The TinyKV Course The TinyKV course builds a key-value storage system with the Raft consensus algorithm. It is inspired by MIT 6.824 and TiKV Project.

Nov 19, 2021
A naive implementation of Raft consensus algorithm.

This implementation is used to learn/understand the Raft consensus algorithm. The code implements the behaviors shown in Figure 2 of the Raft paper wi

Dec 3, 2021
This is my implementation of Raft consensus algorithm that I did for own learning.

This is my implementation of Raft consensus algorithm that I did for own learning. Please follow the link to learn more about raft consensus algorithm https://raft.github.io. And Soon, I will be developing same algorithm in Java as well

Jan 12, 2022
Labs, solutions and related materials from the MIT 6.824 Distributed Systems course.

MIT 6.824 Distributed Systems Labs, solutions and related materials from the MIT 6.824 Distributed Systems course. Overview From the official website:

Nov 5, 2022
MIT 6.824: Distributed Systems

MIT 6.824 is a core 12-unit graduate subject with lectures, readings, programming labs, an optional project, a mid-term exam, and a final exam.

Jul 6, 2022
MIT6.824 Distributed Systems

MIT6.824-Distributed-Systems My Solutions for MIT6.824

Jan 28, 2022