Embedded key-value store for read-heavy workloads written in Go


Pogreb is an embedded key-value store for read-heavy workloads written in Go.

Key characteristics

  • 100% Go.
  • Optimized for fast random lookups and infrequent bulk inserts.
  • Can store larger-than-memory data sets.
  • Low memory usage.
  • All DB methods are safe for concurrent use by multiple goroutines.


$ go get -u github.com/akrylysov/pogreb


Opening a database

To open or create a new database, use the pogreb.Open() function:

package main

import (
    "log"

    "github.com/akrylysov/pogreb"
)

func main() {
    db, err := pogreb.Open("pogreb.test", nil)
    if err != nil {
        log.Fatal(err)
        return
    }
    defer db.Close()
}

Writing to a database

Use the DB.Put() function to insert a new key-value pair:

err := db.Put([]byte("testKey"), []byte("testValue"))
if err != nil {
    log.Fatal(err)
}

Reading from a database

To retrieve the inserted value, use the DB.Get() function:

val, err := db.Get([]byte("testKey"))
if err != nil {
    log.Fatal(err)
}
log.Printf("%s", val)

Iterating over items

To iterate over items, use ItemIterator returned by DB.Items():

it := db.Items()
for {
    key, val, err := it.Next()
    if err == pogreb.ErrIterationDone {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
    log.Printf("%s %s", key, val)
}


The benchmarking code can be found in the pogreb-bench repository.

Results of read performance benchmark of pogreb, goleveldb, bolt and badgerdb on DigitalOcean 8 CPUs / 16 GB RAM / 160 GB SSD + Ubuntu 16.04.3 (higher is better):


Design document.

  • High disk space utilization

    Details https://github.com/ethereum/go-ethereum/pull/20029.

    When storing small keys/values Pogreb wastes too much space by making all writes 512-byte aligned.
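
To make the overhead concrete, here is a rough sketch of the arithmetic, assuming each record is padded individually up to the next 512-byte boundary. The record sizes and the helper names `alignedSize` and `wastedFraction` are illustrative and do not come from Pogreb's code.

```go
package main

import "fmt"

// blockSize is the write alignment mentioned in the issue (512 bytes).
const blockSize = 512

// alignedSize rounds n up to the next multiple of blockSize.
func alignedSize(n int) int {
	return (n + blockSize - 1) / blockSize * blockSize
}

// wastedFraction reports the share of the aligned size that is padding.
func wastedFraction(n int) float64 {
	a := alignedSize(n)
	if a == 0 {
		return 0
	}
	return float64(a-n) / float64(a)
}

func main() {
	// Hypothetical record sizes in bytes (key plus value plus any header).
	for _, n := range []int{16, 64, 500, 513} {
		fmt.Printf("record=%4d aligned=%4d padding=%.1f%%\n",
			n, alignedSize(n), 100*wastedFraction(n))
	}
}
```

Under that assumption, a 16-byte record occupies a full 512-byte block, so almost all of the space written for it is padding.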

  • Some explanation of the internals?

    Hello, I am trying to understand the internals of pogreb, but unfortunately I cannot seem to understand the semantics of certain aspects of the database, namely the data storage aspects and how they provide for ACID semantics (if and to the extent supported by the database) and, of course, the very impressive performance :) Could you please write a few words on the internals of pogreb? I am sure that such information would be well received. Thank you.

  • Slice out of bounds

    I wanted to test this db but I got this error:

    panic: runtime error: slice bounds out of range [:1073742336] with length 1073741824
    goroutine 1 [running]:
    github.com/akrylysov/pogreb/fs.mmap(0xc00008c038, 0x40000200, 0x80000000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
            .../github.com/akrylysov/pogreb/fs/os_windows.go:32 +0x259
    github.com/akrylysov/pogreb/fs.(*osfile).Mmap(0xc000068c90, 0x40000200, 0x200, 0x200)
            .../github.com/akrylysov/pogreb/fs/os.go:100 +0x6e
    github.com/akrylysov/pogreb.(*file).append(0xc00004f140, 0xc0001b6800, 0x200, 0x200, 0x0, 0x0, 0x0)
            .../github.com/akrylysov/pogreb/file.go:45 +0xc7
    github.com/akrylysov/pogreb.(*dataFile).writeKeyValue(0xc00004f140, 0xc000089eb0, 0x8, 0x8, 0xc000089eb0, 0x8, 0x8, 0x3ffffe00, 0x0, 0x0)
            .../github.com/akrylysov/pogreb/datafile.go:44 +0x1a7
    github.com/akrylysov/pogreb.(*DB).put(0xc00004f110, 0xc95a802f, 0xc000089eb0, 0x8, 0x8, 0xc000089eb0, 0x8, 0x8, 0x0, 0x0)
            .../github.com/akrylysov/pogreb/db.go:432 +0x260
    github.com/akrylysov/pogreb.(*DB).Put(0xc00004f110, 0xc000089eb0, 0x8, 0x8, 0xc000089eb0, 0x8, 0x8, 0x0, 0x0)
            .../github.com/akrylysov/pogreb/db.go:366 +0x171
            .../main.go:27 +0x1b3
    exit status 2


    package main
    import (
    	"encoding/binary"
    	"log"
    	"time"
    	"github.com/akrylysov/pogreb"
    )
    func main() {
    	db, err := pogreb.Open("pogreb.test", nil)
    	if err != nil {
    		log.Fatal(err)
    	}
    	defer db.Close()
    	start := time.Now()
    	var pk [8]byte
    	for i := uint64(1); i <= 10000000; i++ {
    		binary.BigEndian.PutUint64(pk[:], i)
    		if err := db.Put(pk[:], pk[:]); err != nil {
    			log.Fatal(err)
    		}
    	}
    	log.Println("put 10M: ", time.Now().Sub(start).String())
    }

    I think the db needs to do automatic fsync when it reaches 1gb file?

  • panic after restart

    After restart

    panic: runtime error: slice bounds out of range [:8511984455920089209] with capacity 1073741824

    goroutine 1 [running]:
    github.com/akrylysov/pogreb/fs.(*osfile).Slice(0xc0002ea3f0, 0x7620a4c3a4c37679, 0x7620a4c3a4c37879, 0xc0000b7b58, 0xc0000b7af8, 0xc0000b7b48, 0xc0009a9340, 0xc0000b7b50)
            /exwindoz/home/juno/gowork/pkg/mod/github.com/akrylysov/[email protected]/fs/os.go:68 +0xa8
    github.com/akrylysov/pogreb.(*bucketHandle).read(0xc0000b77d8, 0x20616c6c, 0x20616c6c61766174)
            /exwindoz/home/juno/gowork/pkg/mod/github.com/akrylysov/[email protected]/bucket.go:76 +0x56
    github.com/akrylysov/pogreb.(*DB).forEachBucket(0xc0002f01a0, 0xc000000009, 0xc0000b7b58, 0x8928a1, 0x419b36)
            /exwindoz/home/juno/gowork/pkg/mod/github.com/akrylysov/[email protected]/db.go:178 +0xc4
    github.com/akrylysov/pogreb.(*DB).put(0xc0002f01a0, 0x9d3cc9e9, 0xc00039c4b0, 0x10, 0x10, 0xc00068f000, 0x2927, 0x4b09, 0x0, 0x0)
            /exwindoz/home/juno/gowork/pkg/mod/github.com/akrylysov/[email protected]/db.go:384 +0x161
    github.com/akrylysov/pogreb.(*DB).Put(0xc0002f01a0, 0xc00039c4b0, 0x10, 0x10, 0xc00068f000, 0x2927, 0x4b09, 0x0, 0x0)
            /exwindoz/home/juno/gowork/pkg/mod/github.com/akrylysov/[email protected]/db.go:366 +0x16a
    gitlab.com/remotejob/mlfactory-feederv4/pkg/pogrebhandler.InsertAllQue(0xc0001481c0, 0xc000586000, 0x63, 0x80, 0xc000aae000, 0x9c4)
            /exwindoz/home/juno/gowork/src/gitlab.com/remotejob/mlfactory-feederv4/pkg/pogrebhandler/pogrebhandler.go:25 +0x14e
    main.main()
            /exwindoz/home/juno/gowork/src/gitlab.com/remotejob/mlfactory-feederv4/cmd/rpcfeeder/main.go:274 +0x456
    exit status 2

  • Make db.sync public

    Use case

    When using multiple databases at once, enabling the background sync feature in all of them would be redundant for most filesystems.

    The current workaround, deciding which of them to enable background sync on, can get needlessly complicated.


    • add a helper for the multi-db use case
  • Fix data corruption (issue #20)

    Fixes a race condition that could lead to data corruption. See https://github.com/akrylysov/pogreb/issues/20 for more details.

    Adding an extra heap allocation and a copy made the read performance worse. I'll consider adding a new option ReadOnly which eliminates the copy for read-only use cases.


  • Memory mapping all segment files causes memory exhaustion

    We are storing billions of records using Pogreb. It creates many 4 GB segment files (.PSG). It is my understanding that those files represent the write-ahead log (WAL), which is only used in case of recovery?

    If that is indeed the case, then only the last WAL file needs to be open (for writing)? Currently those files are literally exhausting our memory and use about 80 GB of RAM.


    Using RamMap we found the culprit: memory-mapped PSG files.

  • Open/read does not fail on invalid file

    Recently I realized I was opening the wrong database and it took me an hour to figure it out because (*DB).FileSize() was returning non-zero and (*DB).Count() was returning zero, and there were no errors reported by (*DB).Open(). We have no standard way to figure out if the DB is invalid?

    As a bonus, doing this will also change the target file even if it wasn't a correct/working database file to begin with.

  • murmur hash functions fail on non-Windows machines due to unsafe pointers on go 1.14

    The Sum32WithSeed function in /hash/murmur32.go fails with "checkptr: unsafe pointer arithmetic" from Go 1.14 onwards, because the checkptr instrumentation is now enabled automatically when building with the -race flag (except on Windows).

    This prevents pogreb from working under -race on any non-Windows platform running Go 1.14.

    An example of more correct code can be found here
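
The target of that link is not preserved here. As an illustration only, the following sketch shows the standard 32-bit MurmurHash3 (x86 variant) reading each 4-byte block with `encoding/binary` rather than through pointer casts, which is the kind of change that keeps checkptr quiet. The name `sum32WithSeed` is a stand-in, and the output has not been compared against pogreb's own hash implementation.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

const (
	c1 = 0xcc9e2d51
	c2 = 0x1b873593
)

// sum32WithSeed computes 32-bit MurmurHash3 (x86 variant) without using
// the unsafe package. Hypothetical name; not pogreb's actual function.
func sum32WithSeed(data []byte, seed uint32) uint32 {
	h := seed
	n := len(data)

	// Body: consume 4-byte little-endian blocks through a bounds-checked read.
	for len(data) >= 4 {
		k := binary.LittleEndian.Uint32(data)
		k *= c1
		k = bits.RotateLeft32(k, 15)
		k *= c2
		h ^= k
		h = bits.RotateLeft32(h, 13)
		h = h*5 + 0xe6546b64
		data = data[4:]
	}

	// Tail: mix in the remaining 1 to 3 bytes, if any.
	var k uint32
	switch len(data) {
	case 3:
		k ^= uint32(data[2]) << 16
		fallthrough
	case 2:
		k ^= uint32(data[1]) << 8
		fallthrough
	case 1:
		k ^= uint32(data[0])
		k *= c1
		k = bits.RotateLeft32(k, 15)
		k *= c2
		h ^= k
	}

	// Finalization: fold in the length and avalanche the bits.
	h ^= uint32(n)
	h ^= h >> 16
	h *= 0x85ebca6b
	h ^= h >> 13
	h *= 0xc2b2ae35
	h ^= h >> 16
	return h
}

func main() {
	fmt.Printf("%#08x\n", sum32WithSeed([]byte("testKey"), 0))
}
```

The bounds-checked reads cost a little compared with raw pointer access, but the compiler typically turns `binary.LittleEndian.Uint32` into a single load on little-endian targets.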

  • Documentation Clarification: Rebuilding Daily

    In the documentation, you say:

    I needed to rebuild the mapping once a day and then access it in read-only mode.

    From this it makes me wonder whether pogreb is intended to be used that way, or if it was intended to solve the problem of having to do that.

  • Data corruption due to slice internals exposed

    Hi, I tested pogreb out with a very simple fuzzer that I initially wrote for bigCache, with very small adaptations (which explains why the test is a bit wonky, calling it "cache", for example). Here's the program:

    package main

    import (
    	"bytes"
    	"context"
    	"fmt"
    	"math"
    	"math/rand"
    	"os"
    	"os/signal"
    	"sync"
    	"syscall"

    	"github.com/akrylysov/pogreb"
    )

    const (
    	slotsPerBucket = 28
    	loadFactor     = 0.7
    	indexPostfix   = ".index"
    	lockPostfix    = ".lock"
    	version        = 1 // file format version
    	// MaxKeyLength is the maximum size of a key in bytes.
    	MaxKeyLength = 1 << 16
    	// MaxValueLength is the maximum size of a value in bytes.
    	MaxValueLength = 1 << 30
    	// MaxKeys is the maximum numbers of keys in the DB.
    	MaxKeys = math.MaxUint32
    )

    // removeAndOpen deletes leftover index and lock files, then opens the DB.
    func removeAndOpen(path string, opts *pogreb.Options) (*pogreb.DB, error) {
    	os.Remove(path + indexPostfix)
    	os.Remove(path + lockPostfix)
    	return pogreb.Open(path, opts)
    }

    func fuzzDeletePutGet(ctx context.Context) {
    	cache, err := removeAndOpen("test.db", nil)
    	if err != nil {
    		panic(err)
    	}
    	var wg sync.WaitGroup
    	wg.Add(3)
    	// Deleter
    	go func() {
    		defer wg.Done()
    		for {
    			select {
    			case <-ctx.Done():
    				return
    			default:
    				r := uint8(rand.Int())
    				key := fmt.Sprintf("thekey%d", r)
    				cache.Delete([]byte(key))
    			}
    		}
    	}()
    	// Setter
    	go func() {
    		defer wg.Done()
    		val := make([]byte, 1024)
    		for {
    			select {
    			case <-ctx.Done():
    				return
    			default:
    				r := byte(rand.Int())
    				key := fmt.Sprintf("thekey%d", r)
    				// The value is fully determined by the key.
    				for j := 0; j < len(val); j++ {
    					val[j] = r
    				}
    				cache.Put([]byte(key), []byte(val))
    			}
    		}
    	}()
    	// Getter
    	go func() {
    		defer wg.Done()
    		var (
    			val    = make([]byte, 1024)
    			hits   = uint64(0)
    			misses = uint64(0)
    		)
    		for {
    			select {
    			case <-ctx.Done():
    				return
    			default:
    				r := byte(rand.Int())
    				key := fmt.Sprintf("thekey%d", r)
    				for j := 0; j < len(val); j++ {
    					val[j] = r
    				}
    				// A present value must match the one derived from the key.
    				if got, err := cache.Get([]byte(key)); got != nil && !bytes.Equal(got, val) {
    					errStr := fmt.Sprintf("got %s ->\n %x\n expected:\n %x\n ", key, got, val)
    					panic(errStr)
    				} else {
    					if err == nil {
    						hits++
    					} else {
    						misses++
    					}
    				}
    				if total := hits + misses; total%1000000 == 0 {
    					percentage := float64(100) * float64(hits) / float64(total)
    					fmt.Printf("Hits %d (%.2f%%) misses %d \n", hits, percentage, misses)
    				}
    			}
    		}
    	}()
    	wg.Wait()
    }

    func main() {
    	sigs := make(chan os.Signal, 1)
    	ctx, cancel := context.WithCancel(context.Background())
    	signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
    	fmt.Println("Press ctrl-c to exit")
    	go fuzzDeletePutGet(ctx)
    	<-sigs
    	cancel()
    }

    The program has three workers:

    • One that randomly deletes a key
    • One that randomly writes a key, where there's a well defined correlation between key and value.
    • One that randomly checks if a key/value mapping is consistent.

    When I ran it, it errored out after about 4M or 5M tests:

    GOROOT=/rw/usrlocal/go #gosetup
    GOPATH=/home/user/go #gosetup
    /rw/usrlocal/go/bin/go build -o /tmp/___go_build_fuzzer_go /home/user/go/src/github.com/akrylysov/pogreb/fuzz/fuzzer.go #gosetup
    /tmp/___go_build_fuzzer_go #gosetup
    Press ctrl-c to exit
    Hits 1000000 (100.00%) misses 0 
    Hits 2000000 (100.00%) misses 0 
    Hits 3000000 (100.00%) misses 0 
    Hits 4000000 (100.00%) misses 0 
    Hits 5000000 (100.00%) misses 0 
    panic: got thekey112 ->
    goroutine 10 [running]:
    main.fuzzDeletePutGet.func3(0xc00001a650, 0x6ee480, 0xc0000601c0, 0xc00008b110)
    	/home/user/go/src/github.com/akrylysov/pogreb/fuzz/fuzzer.go:108 +0x656
    created by main.fuzzDeletePutGet
    	/home/user/go/src/github.com/akrylysov/pogreb/fuzz/fuzzer.go:88 +0x17a

    Looking into it a bit, I found that although the Get method is properly mutex:ed, the value is in fact a pointer to a slice, and not copied out into a new buffer.

    I hacked on a little fix:

    diff --git a/db.go b/db.go
    index 967bbf0..961add9 100644
    --- a/db.go
    +++ b/db.go
    @@ -288,7 +288,12 @@ func (db *DB) Get(key []byte) ([]byte, error) {
            if err != nil {
                    return nil, err
            }
    -       return retValue, nil
    +       var safeRetValue []byte
    +       if retValue != nil{
    +               safeRetValue = make([]byte, len(retValue))
    +               copy(safeRetValue, retValue)
    +       }
    +       return safeRetValue, nil
     }
     
     // Has returns true if the DB contains the given key.

    And with the attached fix, I couldn't reproduce it any longer (at least not for 10M+ tests).

    The benchmarks without and with the hacky fix are:

    BenchmarkGet-6   	10000000	       166 ns/op
    BenchmarkGet-6   	10000000	       182 ns/op

    Now, I'm not totally sure if the testcase is fair, as I'm not 100% sure what concurrency-guarantees pogreb has. My test has both a setter and a deleter, so basically two writers and one reader, which might not be a supported setup? (on the other hand, I'm guessing this flaw should be reproducible even with only one writer)
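
The aliasing problem described in this issue can be reproduced without pogreb. The sketch below uses a toy `store` type (hypothetical, standing in for a shared data buffer) to contrast returning a sub-slice of internal storage with returning a copy, which is the same idea as the `safeRetValue` fix in the diff.

```go
package main

import "fmt"

// store is a toy stand-in for a shared data buffer; it is not pogreb's type.
type store struct {
	buf []byte
}

// getAliased returns a sub-slice that shares memory with the internal buffer.
func (s *store) getAliased(off, n int) []byte {
	return s.buf[off : off+n]
}

// getCopied returns a freshly allocated copy of the requested range.
func (s *store) getCopied(off, n int) []byte {
	out := make([]byte, n)
	copy(out, s.buf[off:off+n])
	return out
}

// overwrite simulates a later write that reuses the same region.
func (s *store) overwrite(off int, val []byte) {
	copy(s.buf[off:], val)
}

func main() {
	s := &store{buf: []byte("aaaa")}

	aliased := s.getAliased(0, 4)
	copied := s.getCopied(0, 4)

	// A writer changes the underlying bytes after both reads have returned.
	s.overwrite(0, []byte("bbbb"))

	fmt.Printf("aliased=%s copied=%s\n", aliased, copied)
	// prints "aliased=bbbb copied=aaaa"
}
```

The copy costs one allocation per read, which is consistent with the small slowdown visible in the benchmark numbers above.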

  • Extremely slow read speed while put speed is fine on Debian Machine

    Hi there, I am currently testing whether pogreb fits my needs and am very impressed by its speed. However, I recently ran some benchmarks (pogreb-benchmark) on a Debian server with ./pogreb-bench -n 10_000_000 -p ./pogreb_test/ and am experiencing extremely slow read speed.

    put: 503.882s 19845 ops/s

    I don't have a full duration for the read speed since it would take too long to finish, but it read about 630000 in 1500s.

    I also ran the same test on a MacBook, where everything works great. Any idea how this is possible? What can I do to pinpoint the issue?

    Edit: I tried it without mmap:

    put: 65.852s 151855 ops/s
    get: 25.389s 393876 ops/s

    However, the issue persists at n=100_000_000.

    Any idea why this is faster?

    Thanks a lot!

  • add ReadOnly config option for read-only filesystems

    This PR adds a ReadOnly config option to be able to put the database on a read-only filesystem. Enabling this config options disables the Lockfile mechanism and sets all file access flags to O_RDONLY.

  • Is it safe for multiple Go instances to write?

    Hi, from the documentation it's clear that the storage can work with multiple goroutines inside a single application. But can it work in scaled applications?

    For example, I have N instances of a Go application, each with X goroutines. N * X functions will write data to the db file in parallel. Is that safe?

  • 4 billion records max?

    I just realized that index.numKeys is a 32-bit uint, and there's MaxKeys = math.MaxUint32 😲

    I think it would make sense to change it to 64-bit (any reason why we wouldn't support max 64-bit number of records)? I assume it would break existing dbs (but is still necessary)?

    At least it should be clearly stated as limitation in the readme I would suggest.

    Our use case is to store billions of records. We've reached already 2 billion records with Pogreb - which means in a matter of weeks we'll hit the current upper limit 😢
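
For a sense of scale, this small sketch works through the numbers in the issue. It only assumes what the issue states, namely that the key count is a 32-bit unsigned integer capped at `math.MaxUint32`; the `headroom` helper is made up for the example.

```go
package main

import (
	"fmt"
	"math"
)

// maxKeys mirrors the MaxKeys = math.MaxUint32 constant quoted in the issue.
const maxKeys uint64 = math.MaxUint32

// headroom returns how many more keys fit below the 32-bit limit.
func headroom(count uint64) uint64 {
	if count >= maxKeys {
		return 0
	}
	return maxKeys - count
}

func main() {
	fmt.Println("limit:", maxKeys)
	fmt.Println("headroom at 2 billion keys:", headroom(2_000_000_000))

	// A uint32 counter wraps around silently once it passes the limit.
	n := uint32(math.MaxUint32)
	n++
	fmt.Println("counter after overflow:", n)
}
```

This prints a limit of 4294967295, a headroom of 2294967295 keys, and a counter value of 0 after the overflow.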

An embedded key/value database for Go.

bbolt bbolt is a fork of Ben Johnson's Bolt key/value store. The purpose of this fork is to provide the Go community with an active maintenance and de

Jan 1, 2023
An embedded, hardened key/value database for Go.

Bolt Bolt is a pure Go key/value store inspired by Howard Chu's LMDB project. The goal of the project is to provide a simple, fast, and reliable datab

Nov 4, 2021
A simple, fast, embeddable, persistent key/value store written in pure Go. It supports fully serializable transactions and many data structures such as list, set, sorted set.

NutsDB English | 简体中文 NutsDB is a simple, fast, embeddable and persistent key/value store written in pure Go. It supports fully serializable transacti

Jan 1, 2023
Fast and simple key/value store written using Go's standard library

Table of Contents Description Usage Cookbook Disadvantages Motivation Benchmarks Test 1 Test 4 Description Package pudge is a fast and simple key/valu

Nov 17, 2022
Eagle - Eagle is a fast and strongly encrypted key-value store written in pure Golang.

EagleDB EagleDB is a fast and simple key-value store written in Golang. It has been designed for handling an exaggerated read/write workload, which su

Dec 10, 2022
A SQLite-based hierarchical key-value store written in Go

camellia A lightweight hierarchical key-value store camellia is a Go library that implements a simple, hierarchical, persistent key-value store, ba

Nov 9, 2022
Nipo is a powerful, fast, multi-thread, clustered and in-memory key-value database, with ability to configure token and acl on commands and key-regexes written by GO

Welcome to NIPO Nipo is a powerful, fast, multi-thread, clustered and in-memory key-value database, with ability to configure token and acl on command

Dec 28, 2022
A disk-backed key-value store.

What is diskv? Diskv (disk-vee) is a simple, persistent key-value store written in the Go language. It starts with an incredibly simple API for storin

Jan 7, 2023
An in-memory key:value store/cache (similar to Memcached) library for Go, suitable for single-machine applications.

go-cache go-cache is an in-memory key:value store/cache similar to memcached that is suitable for applications running on a single machine. Its major

Dec 29, 2022
Low-level key/value store in pure Go.

Description Package slowpoke is a simple key/value store written using Go's standard library only. Keys are stored in memory (with persistence), value

Jan 2, 2023
Key-value store for temporary items :memo:

Tempdb TempDB is Redis-backed temporary key-value store for Go. Useful for storing temporary data such as login codes, authentication tokens, and temp

Sep 26, 2022
A distributed key-value store. On Disk. Able to grow or shrink without service interruption.

Vasto A distributed high-performance key-value store. On Disk. Eventual consistent. HA. Able to grow or shrink without service interruption. Vasto sca

Jan 6, 2023
Distributed reliable key-value store for the most critical data of a distributed system

etcd Note: The master branch may be in an unstable or even broken state during development. Please use releases instead of the master branch in order

Jan 9, 2023
a key-value store with multiple backends including leveldb, badgerdb, postgresql

Overview goukv is an abstraction layer for golang based key-value stores, it is easy to add any backend provider. Available Providers badgerdb: Badger

Jan 5, 2023
A minimalistic in-memory key value store.

A minimalistic in-memory key value store. Overview You can think of Kiwi as thread safe global variables. This kind of library comes in helpful when y

Dec 6, 2021
Membin is an in-memory database that can be stored on disk. Data model similar to key-value but values store as JSON byte array.

Membin Docs | Contributing | License What is Membin? The Membin database system is in-memory database similar to key-value databases, target to effici

Jun 3, 2021
A simple Git Notes Key Value store

Gino Keva - Git Notes Key Values Gino Keva works as a simple Key Value store built on top of Git Notes, using an event sourcing architecture. Events a

Aug 14, 2022
A distributed key value store in under 1000 lines. Used in production at comma.ai

minikeyvalue Fed up with the complexity of distributed filesystems? minikeyvalue is a ~1000 line distributed key value store, with support for replica

Jan 9, 2023
Distributed cache and in-memory key/value data store.

Distributed cache and in-memory key/value data store. It can be used both as an embedded Go library and as a language-independent service.

Dec 30, 2022