A distributed locking library built on top of Cloud Spanner and TrueTime.


spindle

A distributed locking library built on top of Cloud Spanner. It uses Spanner's TrueTime and transactions support to achieve its locking mechanism.

This library is also one of the main building blocks of hedge, another one of our production staples. For context, spindle has been running in our production environment for quite some time and is used in several critical services. The largest deployment running spindle has 100+ pods with unpredictable scaling schedules. Between k8s scaling triggers and multiple service deployments a day, spindle has so far held its own, running like clockwork.

Usage

At the moment, the table needs to be created beforehand using the following DDL (locktable is just an example):

CREATE TABLE locktable (
    name STRING(MAX) NOT NULL,
    heartbeat TIMESTAMP OPTIONS (allow_commit_timestamp=true),
    token TIMESTAMP OPTIONS (allow_commit_timestamp=true),
    writer STRING(MAX),
) PRIMARY KEY (name)

This library doesn't use the usual synchronous "lock", "do protected work", "unlock" sequence. For that, you can check out the included lock package. Instead, after instantiating the lock object, you will call the Run(...) function which will attempt to acquire a named lock at a regular interval (lease duration) until cancelled. A HasLock() function is provided that returns true (along with the lock token) if the lock is successfully acquired. Something like:

db, err := spanner.NewClient(context.Background(), "your/database")
if err != nil {
    log.Fatal(err)
}
defer db.Close()

// Notify me when done.
done := make(chan error, 1)

// For cancellation.
quit, cancel := context.WithCancel(context.Background())

// Instantiate the lock object with a 5s lease duration (5000 ms), using the locktable above.
lock := spindle.New(db, "locktable", "mylock", spindle.WithDuration(5000))

// Start the main loop, async.
lock.Run(quit, done)

time.Sleep(time.Second * 20)

locked, token := lock.HasLock()
log.Println("HasLock:", locked, token)

time.Sleep(time.Second * 20)
cancel()
<-done
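Once Run(...) is active, a common pattern is to check HasLock() right before each unit of protected work and pass the token along. The following is a minimal sketch of that pattern, not part of the library: the holder interface, the fakeLock type, the runIfLeader helper, and the uint64 token type are all assumptions made so the example runs without a Spanner database.

```go
package main

import "fmt"

// holder is the only behaviour this sketch needs from a lock object.
// The uint64 token type is an assumption made for illustration.
type holder interface {
	HasLock() (bool, uint64)
}

// fakeLock is a hypothetical stand-in for a running lock object.
type fakeLock struct {
	held  bool
	token uint64
}

func (f *fakeLock) HasLock() (bool, uint64) { return f.held, f.token }

// runIfLeader calls work only when h currently reports holding the lock,
// and returns whether work was called. The token is handed to work so it
// can be used as a fencing token by whatever work writes to.
func runIfLeader(h holder, work func(token uint64)) bool {
	locked, token := h.HasLock()
	if !locked {
		return false
	}
	work(token)
	return true
}

func main() {
	l := &fakeLock{token: 42}
	work := func(token uint64) {
		fmt.Println("doing protected work with token", token)
	}

	fmt.Println("ran:", runIfLeader(l, work)) // lock not held yet, work is skipped
	l.held = true
	fmt.Println("ran:", runIfLeader(l, work)) // lock held, work runs
}
```

The lock can still be lost between the check and the end of the work, which is why the token is passed through instead of being discarded.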

How it works

The initial lock (the lock record doesn't exist in the table yet) is acquired by a process using an SQL INSERT. Once the record is created (by one process), all other INSERT attempts will fail. In this phase, the commit timestamp of the locking process' transaction will be equal to the timestamp stored in the token column. This will serve as our fencing token in situations where multiple processes are somehow able to acquire a lock. Using this token, the real lock holder will start sending heartbeats by updating the heartbeat column.

When a lock is active, all participating processes detect whether the lease has expired by checking the heartbeat against Spanner's current timestamp. If it has (say, the active locker has crashed or been cancelled), another round of SQL INSERT is attempted, this time using a different name format. The process that gets the lock this time then updates the token column with its commit timestamp, thereby updating the fencing token. If the original locker process recovers (after a crash) or continues after a stop-the-world GC pause, the latest token should invalidate its locking claim, since its own token is already outdated.

A simple example is provided to demonstrate the mechanism through logs. You can try running multiple binaries in multiple terminals, or in a single terminal, like:

$ cd examples/simple/
$ go build -v
$ for num in 1 2 3; do ./simple & done

TODO

  • Add tests using Spanner Emulator