A distributed key value store in under 1000 lines. Used in production at comma.ai

minikeyvalue

Fed up with the complexity of distributed filesystems?

minikeyvalue is a ~1000 line distributed key value store, with support for replication, multiple machines, and multiple drives per machine. Optimized for values between 1MB and 1GB. Inspired by SeaweedFS, but simple. Should scale to billions of files and petabytes of data. Used in production at comma.ai.

A key part of minikeyvalue's simplicity is using stock nginx as the volume server.

Even if this code is crap, the on disk format is super simple! We rely on a filesystem for blob storage and a LevelDB for indexing. The index can be reconstructed with rebuild. Volumes can be added or removed with rebalance.
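The on-disk layout can be sketched in a few lines of Go. This is an assumption pieced together from two things quoted further down this page: the write-failure log line ending in `/60/08/L3dlaGF2ZQ==` and the `fmt.Sprintf("/%02x/%02x/%s", mkey[0], mkey[1], key)` snippet. The name `keyToPath` is hypothetical, MD5 is assumed as the hash because of the MD5 discussion below, and the `sv07` subvolume prefix seen in the log is not modeled.

```go
package main

import (
	"crypto/md5"
	"encoding/base64"
	"fmt"
)

// keyToPath is a hypothetical helper: it maps a key to a two-level
// directory path taken from the first two bytes of the key's MD5 digest,
// followed by the base64-encoded key as the file name.
func keyToPath(key []byte) string {
	sum := md5.Sum(key)
	name := base64.StdEncoding.EncodeToString(key)
	return fmt.Sprintf("/%02x/%02x/%s", sum[0], sum[1], name)
}

func main() {
	// The key is assumed to include the leading slash of the request path.
	fmt.Println(keyToPath([]byte("/wehave")))
}
```

The file name part for `/wehave` comes out as `L3dlaGF2ZQ==`, which matches the log line quoted in the comments.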

API

  • GET /key
    • 302 redirect to nginx volume server.
  • PUT /key
    • Blocks. 201 = written, anything else = probably not written.
  • DELETE /key
    • Blocks. 204 = deleted, anything else = probably not deleted.

Start Volume Servers (default port 3001)

# this is just nginx under the hood
PORT=3001 ./volume /tmp/volume1/ &
PORT=3002 ./volume /tmp/volume2/ &
PORT=3003 ./volume /tmp/volume3/ &

Start Master Server (default port 3000)

GO111MODULE=auto ./mkv -volumes localhost:3001,localhost:3002,localhost:3003 -db /tmp/indexdb/ server

Usage

# put "bigswag" in key "wehave" (will 403 if it already exists)
curl -v -L -X PUT -d bigswag localhost:3000/wehave

# get key "wehave" (should be "bigswag")
curl -v -L localhost:3000/wehave

# delete key "wehave"
curl -v -L -X DELETE localhost:3000/wehave

# unlink key "wehave", this is a virtual delete
curl -v -L -X UNLINK localhost:3000/wehave

# list keys starting with "we"
curl -v -L 'localhost:3000/we?list'

# list unlinked keys ripe for DELETE
curl -v -L 'localhost:3000/?unlinked'

# put file in key "file.txt"
curl -v -L -X PUT -T /path/to/local/file.txt localhost:3000/file.txt

# get file in key "file.txt"
curl -v -L -o /path/to/local/file.txt localhost:3000/file.txt

./mkv Usage

Usage: ./mkv <server, rebuild, rebalance>

  -db string
        Path to leveldb
  -fallback string
        Fallback server for missing keys
  -port int
        Port for the server to listen on (default 3000)
  -protect
        Force UNLINK before DELETE
  -replicas int
        Amount of replicas to make of the data (default 3)
  -subvolumes int
        Amount of subvolumes, disks per machine (default 10)
  -volumes string
        Volumes to use for storage, comma separated

Rebalancing (to change the number of volume servers)

# must shut down master first, since LevelDB can only be accessed by one process
./mkv -volumes localhost:3001,localhost:3002,localhost:3003 -db /tmp/indexdb/ rebalance

Rebuilding (to regenerate the LevelDB)

./mkv -volumes localhost:3001,localhost:3002,localhost:3003 -db /tmp/indexdbalt/ rebuild

Performance

# Fetching non-existent key: 116338 req/sec
wrk -t2 -c100 -d10s http://localhost:3000/key

# go run thrasher.go
starting thrasher
10000 write/read/delete in 2.620922675s
thats 3815.40/sec
Owner
George Hotz
We will win self driving cars.
Comments
  • Feature/basic auth

    I have added basic auth.

    Keep in mind this is my first attempt at writing Go. I'm sure there is an easier way to add this functionality. I tried to make it as clean as possible with zero Go experience. That being said, I don't expect the pull request to be accepted right away.

    I will accept every suggestion.

    For reading and validating basic auth files I used https://github.com/tg123/go-htpasswd

    I think https://github.com/abriosi/minikeyvalue/blob/feature/basicAuth/volume#L9-L123 can be improved but my bash scripting sucks.

    It passes the tests. I think this is the right time to ask for feedback

  • Simplify error creation with `fmt.Errorf`

    Description

    Hi :wave: I ran the DeepSource static analyzer on the forked copy of this repo and found some interesting code quality issues. This PR fixes a few of them.

    Summary of fixes

    • Simplify error creation with fmt.Errorf
    • added .deepsource.toml file
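For readers unfamiliar with this fix, the usual shape is sketched below. The "before" form is an assumption about what the analyzer flagged (the PR text does not show it), and the function names and message are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// Before (assumed): format the message with Sprintf, then wrap it
// with errors.New.
func writeErrorOld(replica int, url string) error {
	return errors.New(fmt.Sprintf("replica %d write failed: %s", replica, url))
}

// After: fmt.Errorf formats and creates the error in one call.
func writeErrorNew(replica int, url string) error {
	return fmt.Errorf("replica %d write failed: %s", replica, url)
}

func main() {
	fmt.Println(writeErrorOld(0, "http://localhost:3001/"))
	fmt.Println(writeErrorNew(0, "http://localhost:3001/"))
}
```

Both functions produce the same message; the second is shorter and also allows wrapping another error with `%w`.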
  • refactor: move from io/ioutil to io and os packages

    This PR introduces two small changes:

    1. Use actions/setup-go instead of installing the golang package from apt. This speeds up the workflow and allows us to update the Go version easily (packages provided by Ubuntu are often several releases behind the latest version).

    2. The io/ioutil package has been deprecated in Go 1.16 (See https://golang.org/doc/go1.16#ioutil). This PR replaces the existing io/ioutil functions with their new definitions in io and os packages.
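The replacements named in the Go 1.16 release notes map one to one. The sketch below exercises four of them; the helper name `roundTrip` and the temp directory prefix are hypothetical and not taken from the PR.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// roundTrip writes value to a temp file and reads it back twice, using
// only the Go 1.16+ replacements for the old io/ioutil helpers.
func roundTrip(value string) (string, string, error) {
	// ioutil.TempDir -> os.MkdirTemp
	dir, err := os.MkdirTemp("", "mkv-example")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	// ioutil.WriteFile -> os.WriteFile
	path := filepath.Join(dir, "value")
	if err := os.WriteFile(path, []byte(value), 0o644); err != nil {
		return "", "", err
	}

	// ioutil.ReadFile -> os.ReadFile
	fromFile, err := os.ReadFile(path)
	if err != nil {
		return "", "", err
	}

	// ioutil.ReadAll -> io.ReadAll
	fromReader, err := io.ReadAll(strings.NewReader(value))
	if err != nil {
		return "", "", err
	}
	return string(fromFile), string(fromReader), nil
}

func main() {
	a, b, err := roundTrip("bigswag")
	fmt.Println(a, b, err)
}
```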

  • MD5

    There is no requirement for the hash to be cryptographic, right? Then let us use a non-cryptographic hash such as xxhash or murmur3, as they are much faster.

    I am happy to send a PR in case we agree :-)

  • Updated readme with docker-compose Instructions

    In order to test out minikeyvalue for my use case, I decided to try it out using docker-compose. I thought that the following addition could assist those in a similar position to get up and running quickly.

    Instead of adding a new docker compose file, I thought just editing the readme would be sufficient for such a change.

  • Just say Hi from SeaweedFS

    Hi, George,

    You are one of the guys that I respect. I was watching the youtube video https://www.youtube.com/watch?v=iwcYp-XT7UI where Lex Fridman interviewed you, and you mentioned SeaweedFS for 0.5 seconds. :)

    I work on SeaweedFS. And I wanted to learn your approach to file storage. In another coding session, you mentioned SeaweedFS has some bugs. If you still remember the exact bugs, please let me know.

    Thanks and keep up the nice work!

    Chris

  • Thank you. A quick question on adding a security layer

    Thank you very much for putting this repository together. Reading these lines of code has taught me a lot. Simple, scalable and structured.

    How do you add a security layer to this filesystem in case you need to access it from other services which are not on the same network:

    1. Do you create a VPN? (Wouldn't this bottleneck the distributed nature of PUTs and GETs, since all traffic would have to be routed through the VPN server?)
    2. A reverse proxy (same problem as 1.)
    3. Do you add authentication, such as, Basic Authentication together with https?
    4. Is there a simpler solution I'm missing?
  • Question about stored file name

    I'm not clear why the stored file name is not set to the requested file name like this:

    fmt.Sprintf("/%02x/%02x/%s", mkey[0], mkey[1], key)

    So that we can simply download the files as their default file name.

  • Please do Gofmt to *.go files in the project

    Please run gofmt on the project's *.go files. Some contributors use GoLand (with File Watchers and auto gofmt) or VS Code (with Go plugins that run gofmt on save), so unformatted code can cause merge conflicts when their editors reformat it.

  • Fallback HEAD/GET requests faster when volume server is down

    When a node goes down, HEAD requests are used to randomly find a volume server that is up, and it seems to take about 5 seconds to time out by default. This change makes the timeout configurable.

    This was a lot easier on newer versions of go, so I went ahead and updated things to ubuntu 20.04

  • allow 404 response from volume server when deleting

    when a PUT request fails due to a volume server being down a record is left in the database (in soft delete state) and therefore a DELETE request subsequently fails because it issues a request to the volume server that was down (which returns a 404).

    Seems like if there is no file to delete (note that the node does have to be up to get a 404 response) the delete should succeed (and subsequently clean up the record in the database).
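The proposed rule is small enough to state as a predicate. This is a sketch of the idea only; `deleteOK` is a hypothetical name and not a function from the project.

```go
package main

import (
	"fmt"
	"net/http"
)

// deleteOK reports whether a volume server's answer to DELETE can be
// treated as success: 204 means the file was removed, 404 means there
// was no file to remove in the first place.
func deleteOK(status int) bool {
	return status == http.StatusNoContent || status == http.StatusNotFound
}

func main() {
	for _, s := range []int{204, 404, 500} {
		fmt.Println(s, deleteOK(s))
	}
}
```

A connection error (node down) produces no status code at all, so it would still count as a failed delete under this rule.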

  • optimize WriteToReplicas for better performance

    Creates a buffered reader to read the value, calculates the hash of the value before writing to the replicas if needed, and writes to the replicas in parallel using goroutines, with channels used to wait for all of them to complete.

  • Data integrity feature?

    Just a suggestion for implementing (tell me if it doesn't make sense for the project):

    • File integrity: Append the hash of the previous index value of data (SHA-256) to the newest index value. This prevents tampering with file contents, because every machine can check to make sure that the appended block hash of some index is equal to the block hash of the (index - 1) contents.

    Further clarification with this image: image

    Keep in mind that h_0 represents hash of the newest data || h_1, and h_1 is the hash of data || h_2, and so on. This nested check ensures file integrity.

    Mutability requires a re-computation of the hashes, but can only be done with a key.

    Feedback? Will this fit in the 1000-line requirement?

  • Parallelize writes to replicas

    • Writes are done sequentially; I don't see a reason why they should be;
    • Simplify sync.WaitGroup usage: no need to bump the counter on each task, only per goroutine;
    • Panic on failure to bind to a port instead of silently exiting;
  • do not write multi-part upload data to disk

    Writing to /tmp will wear out your OS drive pretty fast pumping hundreds of terabytes into minikeyvalue. Since RAM is good enough for non-multipart uploads, it should be fine for multi-part uploads, too. Maybe we want to suggest using a RAM disk and add expiring partial uploads where the final PUT never happens within some time period?

  • replica 0 write failed: http://localhost:3001/sv07/60/08/L3dlaGF2ZQ==

    Hi, I ran this: curl -v -L -X PUT -d bigswag lo:3000/wehave, and the server printed: replica 0 write failed: http://localhost:3001/sv07/60/08/L3dlaGF2ZQ==

    Then I ran: curl -v -L localhost:3000/bigswag or curl -v -L localhost:3000/wehave. The result is:

    * About to connect() to localhost port 3000 (#0)
    *   Trying 127.0.0.1...
    * Connected to localhost (127.0.0.1) port 3000 (#0)
    > GET /wehave HTTP/1.1
    > User-Agent: curl/7.29.0
    > Host: localhost:3000
    > Accept: */*
    >
    < HTTP/1.1 404 Not Found
    < Content-Length: 0
    < Date: Sun, 31 Jan 2021 08:24:09 GMT
    <
    * Connection #0 to host localhost left intact
    

    I can't get the right value.

    Could you tell me what is wrong, please?

A distributed key-value store. On Disk. Able to grow or shrink without service interruption.

Vasto A distributed high-performance key-value store. On Disk. Eventual consistent. HA. Able to grow or shrink without service interruption. Vasto sca

Jan 6, 2023
Distributed cache and in-memory key/value data store.

Distributed cache and in-memory key/value data store. It can be used both as an embedded Go library and as a language-independent service.

Dec 30, 2022
🔑A high performance Key/Value store written in Go with a predictable read/write performance and high throughput. Uses a Bitcask on-disk layout (LSM+WAL) similar to Riak.

bitcask A high performance Key/Value store written in Go with a predictable read/write performance and high throughput. Uses a Bitcask on-disk layout

Sep 26, 2022
A disk-backed key-value store.

What is diskv? Diskv (disk-vee) is a simple, persistent key-value store written in the Go language. It starts with an incredibly simple API for storin

Jan 7, 2023
An in-memory key:value store/cache (similar to Memcached) library for Go, suitable for single-machine applications.

go-cache go-cache is an in-memory key:value store/cache similar to memcached that is suitable for applications running on a single machine. Its major

Dec 29, 2022
A simple, fast, embeddable, persistent key/value store written in pure Go. It supports fully serializable transactions and many data structures such as list, set, sorted set.

NutsDB English | 简体中文 NutsDB is a simple, fast, embeddable and persistent key/value store written in pure Go. It supports fully serializable transacti

Jan 1, 2023
Embedded key-value store for read-heavy workloads written in Go

Pogreb Pogreb is an embedded key-value store for read-heavy workloads written in Go. Key characteristics 100% Go. Optimized for fast random lookups an

Jan 3, 2023
Fast and simple key/value store written using Go's standard library

Table of Contents Description Usage Cookbook Disadvantages Motivation Benchmarks Test 1 Test 4 Description Package pudge is a fast and simple key/valu

Nov 17, 2022
Low-level key/value store in pure Go.

Description Package slowpoke is a simple key/value store written using Go's standard library only. Keys are stored in memory (with persistence), value

Jan 2, 2023
Key-value store for temporary items :memo:

Tempdb TempDB is Redis-backed temporary key-value store for Go. Useful for storing temporary data such as login codes, authentication tokens, and temp

Sep 26, 2022
a key-value store with multiple backends including leveldb, badgerdb, postgresql

Overview goukv is an abstraction layer for golang based key-value stores, it is easy to add any backend provider. Available Providers badgerdb: Badger

Jan 5, 2023
A minimalistic in-memory key value store.

A minimalistic in-memory key value store. Overview You can think of Kiwi as thread safe global variables. This kind of library comes in helpful when y

Dec 6, 2021
Membin is an in-memory database that can be stored on disk. Data model similar to key-value, but values are stored as a JSON byte array.

Membin Docs | Contributing | License What is Membin? The Membin database system is in-memory database similar to key-value databases, target to effici

Jun 3, 2021
A simple Git Notes Key Value store

Gino Keva - Git Notes Key Values Gino Keva works as a simple Key Value store built on top of Git Notes, using an event sourcing architecture. Events a

Aug 14, 2022
Simple key value database that use json files to store the database

KValDB Simple key value database that use json files to store the database, the key and the respective value. This simple database have two gRPC metho

Nov 13, 2021
A rest-api that works with golang as an in-memory key value store

Rest API Service in GOLANG A rest-api that works with golang as an in-memory key value store Usage Run command below in terminal in project directory.

Dec 6, 2021
Eagle - Eagle is a fast and strongly encrypted key-value store written in pure Golang.

EagleDB EagleDB is a fast and simple key-value store written in Golang. It has been designed for handling an exaggerated read/write workload, which su

Dec 10, 2022
A SQLite-based hierarchical key-value store written in Go

camellia A lightweight hierarchical key-value store camellia is a Go library that implements a simple, hierarchical, persistent key-value store, ba

Nov 9, 2022
Nipo is a powerful, fast, multi-thread, clustered and in-memory key-value database, with ability to configure token and acl on commands and key-regexes written by GO

Welcome to NIPO Nipo is a powerful, fast, multi-thread, clustered and in-memory key-value database, with ability to configure token and acl on command

Dec 28, 2022