• Used to ensure linearizable reads
  • Before processing a read request, the leader confirms its leadership and ensures that every log entry an earlier read could observe (on this node or any other node) is still observable afterwards
  1. The leader checks whether a log entry from the current term has been committed (NoopIndex). If not, it aborts and waits.
    • This ensures that the leader's local commit index has caught up with the cluster's commit index
  2. The leader takes the current CommitIndex as the ReadIndex
    • Optimization: ReadIndex = max(NoopIndex, CommitIndex), so that step 4 waits for the no-op entry to be applied instead of step 1 aborting
  3. The leader sends heartbeats to a quorum to confirm that it is still the only leader (no node reports a higher term)
  4. The leader waits until AppliedIndex >= ReadIndex, then reads the state machine
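
The four steps above can be sketched in Python as follows. This is only a minimal, single-process illustration: the `Leader` class, the `NotReady` exception, and the heartbeat callables (which return the peer's term, or `None` when the peer does not answer) are hypothetical names invented for this example and do not come from any particular Raft library. A real implementation would block in step 4 rather than raise.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


class NotReady(Exception):
    """The read cannot be served right now; the caller should retry."""


@dataclass
class Leader:
    term: int
    noop_index: int      # index of the no-op entry appended on election
    commit_index: int
    applied_index: int
    # Hypothetical heartbeat RPCs: take the leader's term, return the
    # peer's current term, or None if the peer did not respond.
    peers: List[Callable[[int], Optional[int]]]
    state: Dict[str, str] = field(default_factory=dict)

    def confirm_read_index(self) -> int:
        # Step 1: an entry from the current term must already be committed.
        if self.commit_index < self.noop_index:
            raise NotReady("no entry from the current term is committed yet")
        # Step 2: the current commit index becomes the read index.
        read_index = self.commit_index
        # Step 3: heartbeat a quorum and look for a higher term.
        acks = 1  # the leader counts itself
        for heartbeat in self.peers:
            peer_term = heartbeat(self.term)
            if peer_term is None:
                continue
            if peer_term > self.term:
                raise NotReady("a higher term exists; this node must step down")
            acks += 1
        quorum = (len(self.peers) + 1) // 2 + 1
        if acks < quorum:
            raise NotReady("could not reach a quorum")
        return read_index

    def read(self, key: str) -> Optional[str]:
        read_index = self.confirm_read_index()
        # Step 4: the state machine must have applied up to the read index.
        if self.applied_index < read_index:
            raise NotReady("state machine has not applied up to ReadIndex")
        return self.state.get(key)
```

In a three-node cluster, one responsive peer plus the leader itself already forms a quorum, so a single unreachable follower does not block reads.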

Proof

  • The leader receives a read request read_1 at wall-clock time time_1, in term Term_1
  • For the read to be linearizable, read_1 must observe everything observed by any earlier read read_0 issued at time time_0, where time_0 < time_1
  • Suppose read_0 read the log range [(0,0), (Term_0, index_0)]
    • Term_0 > Term_1: impossible, because the leader sent heartbeats to a quorum and confirmed that there is no higher term
    • Term_0 < Term_1: Raft ensures that a newly elected leader contains all committed log entries, thus index_0 < NoopIndex
      • The state machine therefore needs to have applied the NoopIndex entry before serving the read
    • Term_0 = Term_1: both reads go through the current node (the same leader), whose commit index never decreases, so the case is trivial

Follower Read

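The ReadIndex mechanism also makes follower reads possible. When a follower receives a read-only request, it can send the leader a message asking for a read index. Once the leader has confirmed through a heartbeat broadcast that it is still the legitimate leader, it returns its recorded read index to the follower. The follower then waits until its own apply index is greater than or equal to the read index it received, at which point it can safely serve the read with linearizability.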
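
The follower side of this exchange can be sketched as below. It is an illustration under assumptions, not a reference implementation: the `Follower` class, the `request_read_index` callable (standing in for the RPC to the leader, returning `None` when the leader cannot confirm its leadership), and the `committed_log` field are all hypothetical names. The sketch raises when the follower is behind, where a real node would wait for replication to catch up.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Follower:
    # Hypothetical RPC to the leader: returns the leader's read index,
    # or None if the leader could not confirm its leadership.
    request_read_index: Callable[[], Optional[int]]
    # Entries this follower already knows to be committed, as (key, value)
    # pairs; the entry at list position i has log index i + 1.
    committed_log: List[Tuple[str, str]]
    applied_index: int = 0
    state: Dict[str, str] = field(default_factory=dict)

    def apply_up_to(self, index: int) -> None:
        # Apply committed entries in order, never past what is held locally.
        target = min(index, len(self.committed_log))
        while self.applied_index < target:
            key, value = self.committed_log[self.applied_index]
            self.state[key] = value
            self.applied_index += 1

    def read(self, key: str) -> Optional[str]:
        read_index = self.request_read_index()
        if read_index is None:
            raise RuntimeError("leader could not confirm its leadership")
        self.apply_up_to(read_index)
        # Serve the read only once apply index >= read index.
        if self.applied_index < read_index:
            raise RuntimeError("follower is behind the read index; retry later")
        return self.state.get(key)
```

Note that the follower applies only up to the read index, so an entry committed after the leader answered is not required for this particular read.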