Sync distributed sets using bloom filters

goSetReconciliation

An implementation to sync distributed sets using bloom filters. Based on the paper "Low complexity set reconciliation using Bloom filters" by Magnus Skjegstad and Torleiv Maseng

Introduction

Syncing multiple distributed sets over a distributed system and hard and tedious. To tackle this the paper aims at syncing multiple lists by sending across bloom filters to each node to then caluclate the elements missing between the sets. This idea makes syncing sets much easier and of lower complexity.

Steps

To provision the cluster:

$ git clone https://github.com/el10savio/goSetReconciliation
$ cd goSetReconciliation
$ make provision

This creates a 2 node Set cluster established in their own docker network.

To view the status of the cluster

$ make info

This provides information on the cluster and its associated ports to access each node. An example of the output seen in make info would be like below:

d3fd26dd4df3  set  "/go/bin/set"  2 hours ago  Up 2 hours  0.0.0.0:8004->8080/tcp  peer-1
8830feb6cd68  set  "/go/bin/set"  2 hours ago  Up 2 hours  0.0.0.0:8003->8080/tcp  peer-0

Now we can also send requests to add, list, and sync values to any peer node using its port allocated.

$ curl -i -X POST localhost:<peer-port>/set/add -d {"values": <values>}
$ curl -i -X GET localhost:<peer-port>/set/list
$ curl -i -X GET localhost:<peer-port>/set/sync

In the logs for each peer docker container, we can see the logs of the peer nodes getting in sync after each local operation.

To tear down the cluster and remove the built docker images:

$ make clean

This is not certain to clean up all the locally created docker images at times. You can do a docker rmi to delete them.

To provision the cluster and run automated end to end tests:

$ make e2e

This uses BATS bash testing to run curl requests to each node and asserts the output received.

References

Similar Resources

A Go implementation of an in-memory bloom filter, with support for boltdb and badgerdb as optional data persistent storage.

Sprout A bloom filter is a probabilistic data structure that is used to determine if an element is present in a set. Bloom filters are fast and space

Jul 4, 2022

Type-safe, zero-allocation sets for Go

Set Package set is a type-safe, zero-allocation port of the excellent package fatih/set. It contains sets for most of the basic types and you can gene

Jan 5, 2023

Generic types that are missing from Go, including sets, trees, sorted lists, etc.

go-typ Generic types that are missing from Go, including sets, trees, sorted lists, etc. All code is implemented with 0 dependencies and in pure Go co

Dec 4, 2022

A typed implementation of the Go sync.Map using code generation

syncmap A typed implementation of the Go sync.Map using code generation. Install go get -u github.com/a8m/syncmap@master Examples: Using CLI $ syncma

Dec 26, 2022

skipmap is a high-performance concurrent sorted map based on skip list. Up to 3x ~ 10x faster than sync.Map in the typical pattern.

skipmap is a high-performance concurrent sorted map based on skip list. Up to 3x ~ 10x faster than sync.Map in the typical pattern.

Introduction skipmap is a high-performance concurrent map based on skip list. In typical pattern(one million operations, 90%LOAD 9%STORE 1%DELETE), th

Jan 8, 2023

sync.WaitGroup for concurrent use

concwg Description This package provides a version of sync.WaitGroup that allows calling Add and Wait in different goroutines. Motivation sync.WaitGro

May 20, 2022

Multi-String Pattern Matching Algorithm Using TrieHashNode

Multi-String Pattern Matching algorithm. This implementation is inspired from Aho-Corasick algorithm Getting Started modelA = mspm.NewModel("mspm_mode

Dec 9, 2022

Pipelines using goroutines

pipeline This package provides a simplistic implementation of Go pipelines as outlined in Go Concurrency Patterns: Pipelines and cancellation. Docs Go

Oct 5, 2022

A simple Set data structure implementation in Go (Golang) using LinkedHashMap.

Set Set is a simple Set data structure implementation in Go (Golang) using LinkedHashMap. This library allow you to get a set of int64 or string witho

Sep 26, 2022
Go package implementing Bloom filters

Bloom filters A Bloom filter is a concise/compressed representation of a set, where the main requirement is to make membership queries; i.e., whether

Dec 28, 2022
Distributed merge sort for large sets across nodes in a network

distMergeSort An implementation of mergesort distributed across nodes used to sort large sets. Introduction Merge sort partitions sets so that they ca

Jul 8, 2022
Go library implementing xor filters
Go library implementing xor filters

xorfilter: Go library implementing xor filters Bloom filters are used to quickly check whether an element is part of a set. Xor filters are a faster a

Dec 30, 2022
Cuckoo Filter: Practically Better Than Bloom

Cuckoo Filter Cuckoo filter is a Bloom filter replacement for approximated set-membership queries. While Bloom filters are well-known space-efficient

Jan 4, 2023
Package ring provides a high performance and thread safe Go implementation of a bloom filter.

ring - high performance bloom filter Package ring provides a high performance and thread safe Go implementation of a bloom filter. Usage Please see th

Nov 20, 2022
A simple Bloom Filter implementation in Go

This is a simple Bloom filter implementation written in Go. For the theory behind Bloom filters, read http://en.wikipedia.org/wiki/Bloom_filter This

Apr 26, 2018
go/golang: fast bit set Bloom filter

package implements a fast bloom filter with real 'bitset' and JSONMarshal/JSONUnmarshal to store/reload the Bloom filter.

Nov 3, 2022
Leaked password check library with bloom filter

Leaked password check With this library you can check the password is probably leaked or not. Pre generated bitset DB includes 6 Million leaked passwo

Aug 24, 2022
A bloom filter is a probabilistic data structure that is used to determine if an element is present in a set

A Go implementation of a bloom filter, with support for boltdb and badgerdb as optional in-memory persistent storage.

Jul 4, 2022