Rink

Rink is a "distributed sticky ranked ring" using etcd. A rink provides role scheduling across distributed processes, with each role only assigned to a single instance. Stickiness allows roles to be transferred smoothly during controlled rolling deploys, which minimises role rebalancing.

  • A rink is a distributed self-managing group of processes; the members of the rink.
  • Members can join or leave the rink. Inactive members are automatically removed.
  • A rank is assigned to each member; the mth of n members.
  • Roles (leader, 3dbad20c, or BTC/ETH) are implicitly assigned to a member by consistent hashing.
  • When the size of the rink (n) grows or shrinks, ranks are rebalanced (ensuring unique ranks less than n).
  • Rebalancing results in all roles being reassigned.
  • Stickiness is provided by a configurable rebalance delay. A rebalance due to a joined member is only triggered after the rebalance delay. If another member leaves during this time, its rank is immediately transferred without affecting other members.
  • Kubernetes rolling deploy with MaxUnavailable=0, MaxSurge>0 will smoothly transfer roles from old members to new members.
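
The implicit role-to-rank assignment above can be sketched as follows. This is a minimal illustration only; the hash function (FNV-1a) and the modulo reduction are assumptions, not rink's actual implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// roleRank maps a role name to a rank in [0, n) by hashing the role
// and reducing modulo the rink size n. The member currently holding
// that rank implicitly holds the role.
func roleRank(role string, n int) int {
	h := fnv.New32a()
	h.Write([]byte(role))
	return int(h.Sum32() % uint32(n))
}

func main() {
	// With 3 members, each role lands on exactly one rank.
	for _, role := range []string{"leader", "BTC/ETH", "3dbad20c"} {
		fmt.Printf("role %q -> rank %d of 3\n", role, roleRank(role, 3))
	}
}
```

Note that when n changes, roleRank changes for most roles, which is why growing or shrinking the rink reassigns all roles.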

Usage

Example 1: Given multiple replicas of a service, rink will ensure that at most one replica is always the "leader" even as other replicas start, stop or fail.

Note error handling is omitted for brevity.

cl, err := etcd_clientv3.New(etcd_clientv3.Config{Endpoints: []string{"localhost:2379"}})
s, err := concurrency.NewSession(cl) // Session identifies the member
defer s.Close()                      // Leave the rink when done.
r, err := rink.New(s, "my_rink")

for {
  // Block until this replica assumes the "leader" role  
  ctx := r.AwaitRole("leader")

  // ctx is closed when the role is unassigned.

  // leadByExample runs until the first error is encountered (err != nil).
  err := leadByExample(ctx)
  if ctx.Err() != nil {
    log.Printf("leader role unassigned: %v", err)
  } else {
    log.Printf("lead error: %v", err)
  }
  
  // In both cases, just try again.
}

Example 2: Given multiple processor services that run jobs (jobs can be short or long lived), rink will ensure that jobs are evenly distributed among the processors.

cl, err := etcd_clientv3.New(etcd_clientv3.Config{Endpoints: []string{"localhost:2379"}})
s, err := concurrency.NewSession(cl) // Session identifies the member
defer s.Close()                      // Leave the rink when done.
r, err := rink.New(s, "my_rink")

for {
  pending, _ := jobs.ListPendingJobs()

  for _, job := range pending {
    ctx, ok := r.GetRole(job.ID)
    if !ok {
      // Another processor has the role for this job now.
      continue
    }

    // Only this processor will process this job now.
    // ctx is closed when the role is unassigned.
    err := processJob(ctx, job)
    if err == nil {
      // Job completed. ListPendingJobs should not return it anymore.
    } else if ctx.Err() != nil {
      log.Printf("job role unassigned: %v", err)
    } else {
      log.Printf("job process error: %v", err)
    }
  }

  if len(pending) == 0 {
    time.Sleep(time.Minute) // Backoff if no jobs.
  }
}

Concepts and building blocks

rink leverages etcd's concurrency package as its underlying building blocks. It also introduces some concepts of its own.

concurrency.Session: An etcd session identifies a member and acts as a liveness proxy. A member joins the rink with an active session and leaves the rink by closing the session. A session is implemented in etcd as an opaque key and lease (default TTL of 60s). The session is kept alive by an internal goroutine that periodically extends the lease. If the session is closed, the key is deleted. If the lease times out, the key is automatically deleted and the session is assumed cancelled. rink uses the Session.LeaseID to link member keys to their session.

concurrency.Election: An etcd election elects a leader that may proclaim values. These proclamations appear as an append-only log of values to all members of the election. All members of rink join an election, so one member will always be the leader. The leader maintains the rink; detecting member joins and leaves, promoting and rebalancing ranks, and "proclaiming" subsequent versions of the rink state. The election proclamations therefore form an immutable log of rink states.

member keys: When joining the rink, each member creates a key {rink_keyprefix}/members/{name} attached to its session lease. The value of the key is its RebalanceAt timestamp; implying the rebalance delay it is willing to wait. Since the member keys are linked to the session lease, they are automatically deleted when the session is closed (or times out).

immutable rink state log: One of the members is always the leader (of the etcd election). The leader maintains the rink state; a map of ranks by member names (map[string]int). Subsequent versions of the rink state are proclaimed on the etcd election. Members observe these proclamations (similar to stream consumers), updating their own state accordingly. Note this does mean that members are eventually consistent, with the possibility of two members temporarily assuming the same role.
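
A member's view of the proclaimed state log can be sketched as follows. The types and function here are illustrative assumptions, not rink's exported API.

```go
package main

import "fmt"

// State is one proclaimed version of the rink state: a map of ranks
// by member name, appended to the election log by the leader.
type State map[string]int

// rankChanged reports whether a member's rank differs between two
// consecutive proclamations. On a change the member must release any
// roles held under the old rank before acting on the new one.
func rankChanged(prev, next State, member string) bool {
	p, pok := prev[member]
	n, nok := next[member]
	return pok != nok || p != n
}

func main() {
	v1 := State{"a": 0, "b": 1}
	v2 := State{"a": 0, "b": 1, "c": 2} // c joins; existing ranks are sticky
	fmt.Println(rankChanged(v1, v2, "b")) // false: b keeps rank 1
	v3 := State{"a": 0, "c": 1} // b leaves; c is promoted to its rank
	fmt.Println(rankChanged(v2, v3, "c")) // true: c moved from rank 2 to 1
}
```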

leader: The leader does the following to maintain the rink state:

  • It watches all member keys and reacts to members joining and leaving.
  • It maintains a timer of a future rebalance due to waiting members.
  • If a member leaves, it promotes any waiting member to its rank.
  • If required it rebalances ranks.
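
The promote-and-rebalance step above can be sketched as follows. This is a simplification under stated assumptions: the real leader also honours each member's RebalanceAt delay before ranking a waiting member, which this sketch omits.

```go
package main

import (
	"fmt"
	"sort"
)

// rebalance assigns unique ranks in [0, n) to the current members,
// keeping any existing rank that is still valid (stickiness) and
// promoting unranked (waiting) members into the freed ranks.
func rebalance(current map[string]int, members []string) map[string]int {
	n := len(members)
	next := make(map[string]int, n)
	used := make(map[int]bool, n)

	// Keep valid existing ranks so those members' roles are untouched.
	for _, m := range members {
		if r, ok := current[m]; ok && r < n && !used[r] {
			next[m] = r
			used[r] = true
		}
	}

	// Promote remaining members, in name order, into the lowest free ranks.
	var waiting []string
	for _, m := range members {
		if _, ok := next[m]; !ok {
			waiting = append(waiting, m)
		}
	}
	sort.Strings(waiting)
	r := 0
	for _, m := range waiting {
		for used[r] {
			r++
		}
		next[m] = r
		used[r] = true
	}
	return next
}

func main() {
	// b leaves a 3-member rink {a:0 b:1 c:2} while waiting member d joins:
	// a and c keep their ranks, d takes over b's rank 1.
	fmt.Println(rebalance(map[string]int{"a": 0, "b": 1, "c": 2}, []string{"a", "c", "d"}))
}
```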

Comparison to guv and scheduler

  1. Rink and guv have sticky roles while scheduler redistributes roles very often:
  • Scheduler redistributes roles on each join and each leave.
  • Given a rolling deploy of 5 instances, scheduler redistributes roles 10 times.
  • It is very difficult to determine what is happening during the 10 consecutive redistributions.
  • Delays or bugs resulting from this would be difficult to identify and fix.
  • Depending on the application's role duration requirements, it might not get any work done during the deploy.
  • Error logging will probably be high during this time.
  • Note rink does not have sticky roles on leaves.
  2. Rink and guv support stable and immediate role succession:
  • Given Kubernetes rolling deploys with maxUnavailable=0 and maxSurge>0,
  • Roles are immediately assumed by new waiting pods as old pods are stopped.
  • Each role is only moved once.
  • This provides very stable and strong role HA guarantees.
  3. Rink, guv and scheduler2 support arbitrary roles:
  • None of them manage roles directly; they only manage member ranks.
  4. Rink is simpler to understand, debug and implement (imho):
  • It has far fewer features than Guv.
  • It leverages etcd's concurrency package.
  • Rink has a single leader maintaining state, compared to Guv and Scheduler where all members update state, requiring more complex atomic guarded operations.
  5. Rink has far fewer features than Guv:
  • Guv also has sticky leaves, while rink only has sticky joins.
  • Guv therefore supports staggered rebalances when the cluster shrinks.
  • Guv therefore supports sticky roles if a pod restarts.
  • Rink and guv support staggered rebalances when the cluster grows.
  • Rink only supports the smooth role transfers on rolling deploys described above.
  • Guv has explicit state generations/versions; members sync to each version.
  • When rebalancing, guv can therefore first unassign all ranks to ensure no roles overlap.
  • Rink members are eventually consistent; they have no mechanism to sync to a specific version.
  • Rink therefore has a probability of role overlap; multiple members assuming the same role.
  • This is considered "good enough": it should only occur during failures, it should only be temporary (a few seconds at most) and reflex consumers should be able to handle it (idempotent).