Efficient cache for gigabytes of data written in Go.

BigCache

Fast, concurrent, evicting in-memory cache written to keep a large number of entries without impacting performance. BigCache keeps entries on the heap but omits GC for them. To achieve that, it operates on byte slices, so entries usually need to be (de)serialized in front of the cache.

Requires Go 1.12 or newer.

Usage

Simple initialization

import "github.com/allegro/bigcache"

cache, _ := bigcache.NewBigCache(bigcache.DefaultConfig(10 * time.Minute))

cache.Set("my-unique-key", []byte("value"))

entry, _ := cache.Get("my-unique-key")
fmt.Println(string(entry))
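
Since BigCache stores only byte slices, structured values need (de)serialization in front of the cache. A minimal sketch using encoding/json (add "encoding/json" to the imports; the user type here is illustrative, and any serializer works):

type user struct {
	Name string `json:"name"`
}

buf, _ := json.Marshal(user{Name: "Alice"})
cache.Set("user-1", buf)

if raw, err := cache.Get("user-1"); err == nil {
	var u user
	json.Unmarshal(raw, &u)
	fmt.Println(u.Name)
}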

Custom initialization

When the cache load can be predicted in advance, custom initialization is preferable because it avoids additional memory allocations.

import (
	"fmt"
	"log"
	"time"

	"github.com/allegro/bigcache"
)

config := bigcache.Config {
		// number of shards (must be a power of 2)
		Shards: 1024,

		// time after which entry can be evicted
		LifeWindow: 10 * time.Minute,

		// Interval between removing expired entries (clean up).
		// If set to <= 0 then no action is performed.
		// Setting to < 1 second is counterproductive — bigcache has a one second resolution.
		CleanWindow: 5 * time.Minute,

		// rps * lifeWindow, used only in initial memory allocation
		MaxEntriesInWindow: 1000 * 10 * 60,

		// max entry size in bytes, used only in initial memory allocation
		MaxEntrySize: 500,

		// prints information about additional memory allocation
		Verbose: true,

		// cache will not allocate more memory than this limit, value in MB
		// if value is reached then the oldest entries can be overridden for the new ones
		// 0 value means no size limit
		HardMaxCacheSize: 8192,

		// callback fired when the oldest entry is removed because of its expiration time, because there is no space left
		// for the new entry, or because delete was called. A bitmask representing the reason will be returned.
		// Default value is nil, which means no callback; this also avoids unwrapping the oldest entry.
		OnRemove: nil,

		// OnRemoveWithReason is a callback fired when the oldest entry is removed because of its expiration time, because there is no space left
		// for the new entry, or because delete was called. A constant representing the reason will be passed through.
		// Default value is nil, which means no callback; this also avoids unwrapping the oldest entry.
		// Ignored if OnRemove is specified.
		OnRemoveWithReason: nil,
	}

cache, initErr := bigcache.NewBigCache(config)
if initErr != nil {
	log.Fatal(initErr)
}

cache.Set("my-unique-key", []byte("value"))

if entry, err := cache.Get("my-unique-key"); err == nil {
	fmt.Println(string(entry))
}

LifeWindow & CleanWindow

  1. LifeWindow is a duration. After it elapses, an entry is considered dead but is not yet deleted.

  2. CleanWindow is a duration. Every CleanWindow, all dead entries are deleted; entries still within their LifeWindow are kept.
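
A minimal sketch of the interplay (timings chosen only for illustration):

config := bigcache.DefaultConfig(2 * time.Second) // LifeWindow: entries are dead after 2s
config.CleanWindow = 1 * time.Second              // dead entries are swept every second

cache, _ := bigcache.NewBigCache(config)
cache.Set("key", []byte("value"))

time.Sleep(3 * time.Second) // past LifeWindow, and at least one clean-up has run

if _, err := cache.Get("key"); err != nil {
	fmt.Println(err) // the entry expired and has been cleaned up
}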

Benchmarks

Three caches were compared: bigcache, freecache and map. Benchmark tests were made using an i7-6700K CPU @ 4.00GHz with 32GB of RAM on Ubuntu 18.04 LTS (5.2.12-050212-generic).

The benchmark source code can be found in the caches_bench package.

Writes and reads

go version
go version go1.13 linux/amd64

go test -bench=. -benchmem -benchtime=4s ./... -timeout 30m
goos: linux
goarch: amd64
pkg: github.com/allegro/bigcache/v2/caches_bench
BenchmarkMapSet-8                     	12999889	       376 ns/op	     199 B/op	       3 allocs/op
BenchmarkConcurrentMapSet-8           	 4355726	      1275 ns/op	     337 B/op	       8 allocs/op
BenchmarkFreeCacheSet-8               	11068976	       703 ns/op	     328 B/op	       2 allocs/op
BenchmarkBigCacheSet-8                	10183717	       478 ns/op	     304 B/op	       2 allocs/op
BenchmarkMapGet-8                     	16536015	       324 ns/op	      23 B/op	       1 allocs/op
BenchmarkConcurrentMapGet-8           	13165708	       401 ns/op	      24 B/op	       2 allocs/op
BenchmarkFreeCacheGet-8               	10137682	       690 ns/op	     136 B/op	       2 allocs/op
BenchmarkBigCacheGet-8                	11423854	       450 ns/op	     152 B/op	       4 allocs/op
BenchmarkBigCacheSetParallel-8        	34233472	       148 ns/op	     317 B/op	       3 allocs/op
BenchmarkFreeCacheSetParallel-8       	34222654	       268 ns/op	     350 B/op	       3 allocs/op
BenchmarkConcurrentMapSetParallel-8   	19635688	       240 ns/op	     200 B/op	       6 allocs/op
BenchmarkBigCacheGetParallel-8        	60547064	        86.1 ns/op	     152 B/op	       4 allocs/op
BenchmarkFreeCacheGetParallel-8       	50701280	       147 ns/op	     136 B/op	       3 allocs/op
BenchmarkConcurrentMapGetParallel-8   	27353288	       175 ns/op	      24 B/op	       2 allocs/op
PASS
ok  	github.com/allegro/bigcache/v2/caches_bench	256.257s

Writes and reads in bigcache are faster than in freecache, and the gap widens in the parallel benchmarks. Writes to the synchronized concurrent map are the slowest.

GC pause time

go version
go version go1.13 linux/amd64

go run caches_gc_overhead_comparison.go

Number of entries:  20000000
GC pause for bigcache:  1.506077ms
GC pause for freecache:  5.594416ms
GC pause for map:  9.347015ms

go version
go version go1.13 linux/arm64

go run caches_gc_overhead_comparison.go
Number of entries:  20000000
GC pause for bigcache:  22.382827ms
GC pause for freecache:  41.264651ms
GC pause for map:  72.236853ms

The test shows how long the GC pauses are for caches filled with 20 million entries. Bigcache and freecache have very similar GC pause times.
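
A minimal sketch of one way such a measurement can be taken (the actual benchmark code may differ):

runtime.GC()

var stats runtime.MemStats
runtime.ReadMemStats(&stats)
// PauseNs is a circular buffer; this indexes the most recent pause.
pause := stats.PauseNs[(stats.NumGC+255)%256]
fmt.Printf("GC pause: %v\n", time.Duration(pause))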

Memory usage

You may encounter system memory reporting what appears to be an exponential increase; this is expected behaviour. The Go runtime allocates memory in chunks, or 'spans', and informs the OS when they are no longer required by marking them as idle. The spans remain part of the process's resource usage until the OS needs to repurpose the address space. Further reading is available here.

How it works

BigCache relies on an optimization introduced in Go 1.5 (issue-9477): if a map's keys and values contain no pointers, the GC does not scan its contents. BigCache therefore uses map[uint64]uint32, where keys are hashed and values are offsets of entries.

Entries are kept in byte slices, again to avoid GC scanning. A byte slice can grow to gigabytes without impacting performance because the GC only sees a single pointer to it.
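
A simplified sketch of this layout (illustrative only; it omits the sharding, eviction, and ring-buffer wraparound of the real implementation):

package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

// miniCache shows the GC-friendly layout: a pointer-free index map plus one
// flat byte slice holding all entries. The GC skips the map contents (no
// pointers in keys or values) and sees only a single slice header.
type miniCache struct {
	index map[uint64]uint32 // hashed key -> offset into data
	data  []byte            // records appended as [4-byte length][payload]
}

func hashKey(key string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	return h.Sum64()
}

func (c *miniCache) Set(key string, value []byte) {
	offset := uint32(len(c.data))
	var lenBuf [4]byte
	binary.LittleEndian.PutUint32(lenBuf[:], uint32(len(value)))
	c.data = append(c.data, lenBuf[:]...)
	c.data = append(c.data, value...)
	c.index[hashKey(key)] = offset
}

func (c *miniCache) Get(key string) ([]byte, bool) {
	offset, ok := c.index[hashKey(key)]
	if !ok {
		return nil, false
	}
	n := binary.LittleEndian.Uint32(c.data[offset : offset+4])
	return c.data[offset+4 : offset+4+n], true
}

func main() {
	c := &miniCache{index: make(map[uint64]uint32)}
	c.Set("my-unique-key", []byte("value"))
	if v, ok := c.Get("my-unique-key"); ok {
		fmt.Println(string(v)) // value
	}
}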

Collisions

BigCache does not handle collisions. When a new item is inserted and its hash collides with a previously stored item, the new item overwrites the previously stored value.
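
To make the consequence concrete, here is a toy illustration with a deliberately weak hash (bigcache actually uses 64-bit FNV-1a, where collisions are rare but possible):

// weakHash collides often on purpose: it just sums the bytes.
func weakHash(key string) uint64 {
	var sum uint64
	for i := 0; i < len(key); i++ {
		sum += uint64(key[i])
	}
	return sum
}

index := map[uint64]string{}
index[weakHash("ab")] = "first"  // "ab" and "ba" hash identically here
index[weakHash("ba")] = "second" // silently overwrites the entry for "ab"
fmt.Println(index[weakHash("ab")]) // prints "second"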

Bigcache vs Freecache

Both caches provide the same core features, but they reduce GC overhead in different ways: bigcache relies on map[uint64]uint32, while freecache implements its own mapping built on slices to reduce the number of pointers.

Results from the benchmark tests are presented above. One advantage of bigcache over freecache is that you don't need to know the size of the cache in advance: when bigcache is full, it can allocate additional memory for new entries instead of overwriting existing ones, as freecache currently does. However, a hard maximum size can also be set in bigcache; see HardMaxCacheSize.

HTTP Server

This package also includes an easily deployable HTTP implementation of BigCache, which can be found in the server package.

More

The genesis of bigcache is described in an allegro.tech blog post: writing a very fast cache service in Go

License

BigCache is released under the Apache 2.0 license (see LICENSE).

Comments
  • panic out of range in bytes queue

    https://github.com/allegro/bigcache/blob/bbf64ae21fc5555f4e9752825ee9ffe13b1e5fa0/queue/bytes_queue.go#L222

    It appears there is a bounds check before reading the blockSize, but then an assumption that adding the block size stays within bounds.

  • Added basic HTTP server implementation.

    Features: Basic HTTP server. Basic middleware implementation.

    I wanted to provide some thoughts on this implementation.

    API Surface:

    GET /api/v1/cache/{key}
    PUT /api/v1/cache/{key}
    

    The API path denotes it's a versioned API, so making changes/implementing new features allows it to be versioned easily. I felt PUT was more relevant than POST because bigcache.BigCache.Set() will overwrite an existing key of the same name; PUT more closely aligns with the semantic aspect of the feature as opposed to the spirit. When looking at the /cache/ hierarchy of the path, there was a PR I saw recently which looked to implement stats about the cache. If it gets approved, you can add /api/v1/stats to the API surface as an extra API. Overall, this API surface seemed easily sustainable to me.
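
    A hedged client-side sketch against this surface (the host and port are assumptions, as the PR does not pin them down; imports: net/http, strings, io, log, fmt):

    base := "http://localhost:9090/api/v1/cache/my-unique-key"

    req, _ := http.NewRequest(http.MethodPut, base, strings.NewReader("value"))
    if _, err := http.DefaultClient.Do(req); err != nil {
    	log.Fatal(err)
    }

    resp, err := http.Get(base)
    if err != nil {
    	log.Fatal(err)
    }
    defer resp.Body.Close()
    body, _ := io.ReadAll(resp.Body)
    fmt.Println(string(body)) // "value"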

    HTTP Implementation:

    What I love about this package is that the focus is on performance and the standard library. My proposal suggested using the standard library, so I felt that was important. When looking at implementation options, I wanted something that could be easily extensible without pulling in dependencies. This post gave me a great idea, and I liked the adapter idea because it's highly extensible using only the standard library. If you want to implement request tracing, it's easy to do by adding a quick service adapter; if you want to implement some form of authentication or geo-limitation, just add another service adapter. The requestMetrics() feature, to me, is both a way to log basic metrics about the HTTP server and an example of a basic service adapter.

    Overall, my goal with this implementation was a focus on the standard library with no external dependencies in a highly extensible way.

  • Use uint64 instead of uint32

    There is a possibility that we run into an int32 overflow problem. To prevent this, let's use uint64 everywhere.

    https://github.com/allegro/bigcache/blob/21e5ca5c3d539f94e8dc563350acd97c5400154f/shard.go#L138

    Fixes: https://github.com/allegro/bigcache/issues/148

  • Add remove reason signal to OnRemove callback

    I'm still pretty new to golang, so this PR probably has a C-like odor to it (especially with regard to the bitmasks). The general purpose is to indicate in the OnRemove callback why a key is being removed. There are 3 reasons, represented as bitmasks by left-shifting iota in a const expression. I'm not sure if there's a more go-like approach, but I'd like to see support for this feature, so I thought I'd open a PR and get a discussion going.

    As for the API: I've made a small breaking change. Users of the library will have to add a parameter to their OnRemove callbacks, which can be ignored. If it's preferable to add new functions instead of changing the signature of existing ones, I'd be happy to adapt the code.
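
    A sketch of the bitmask approach described above (names are illustrative, not necessarily what was merged):

    type RemoveReason uint32

    const (
    	Expired RemoveReason = 1 << iota // entry outlived its LifeWindow
    	NoSpace                          // evicted to make room for a new entry
    	Deleted                          // Delete was called explicitly
    )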

  • fix: panic when update and iterate simultaneously

    1. Fix a panic when updating and iterating simultaneously. Related to https://github.com/allegro/bigcache/issues/222.
    2. Add a panic recover in cleanUp so the program does not exit (sketched below). Related to https://github.com/allegro/bigcache/issues/226 and https://github.com/allegro/bigcache/issues/148; this only protects the main program from exiting.
    3. Set the byte queue's full flag to false after allocating additional memory. Also added a test case reproducing this problem.
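
    A hypothetical shape of the guard in point 2 (the actual patch may differ):

    func (c *BigCache) cleanUp(currentTimestamp uint64) {
    	defer func() {
    		if r := recover(); r != nil {
    			// swallow the panic so the clean-up goroutine
    			// does not take the whole program down
    		}
    	}()
    	// ... evict entries whose LifeWindow has passed ...
    }
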
  • Memory usage grows indefinitely

    Hello,

    I've been playing around with bigcache and I've noticed that calling Set() with the same keys causes the memory usage to grow indefinitely. Here's an example:

    cache, _ := bigcache.NewBigCache(bigcache.DefaultConfig(10 * time.Minute))
    data := []byte("TESTDATATESTDATATESTDATATESTDATATESTDATATESTDATATESTDATA")
    
    for {
    	for i := 0; i < 10000; i++ {
    		cache.Set(strconv.Itoa(i), data)
    	}
    	time.Sleep(time.Second)
    }
    

    Running that causes the memory usage of the application to grow indefinitely until I run out of memory. Is this expected behaviour? I'm just using bigcache as if it were a concurrent map, and I would have expected the elements to get replaced, so memory usage shouldn't grow beyond what is necessary for 10,000 elements.

  • Improve encapsulation of shards

    • Get, Set methods are moved from bigcache.go to shard.go
    • initNewShard has additional param clock
    • all locks logic exists only inside shard
    • get rid of copyCurrentShardMap
    • getIndex and oldest methods added to shard
    • introduced index variable in getShard

    Next PR will introduce Shard interface, newShard method and additional changes.

  • HardMaxCacheSize exceeded

    Sorry if this is a complete duplicate of https://github.com/allegro/bigcache/issues/18, but that issue is 3 years old, and I see the same problem.

    I even took the code from that issue. Here is my output:

    $ go run .
    2019/06/17 18:41:17 profile: memory profiling enabled (rate 4096), mem.pprof
    Number of entries: 200000
    Alloc:      0 MB
    Alloc:     46 MB
    Alloc:     55 MB
    Alloc:     78 MB
    Alloc:     61 MB
    Alloc:     87 MB
    Alloc:     70 MB
    Alloc:     50 MB
    Alloc:     74 MB
    Alloc:     57 MB
    Alloc:     83 MB
    2019/06/17 18:41:20 profile: memory profiling disabled, mem.pprof
    
    $ go tool pprof mem.pprof
    Type: inuse_space
    Time: Jun 17, 2019 at 6:41pm (MSK)
    Entering interactive mode (type "help" for commands, "o" for options)
    (pprof) top
    Showing nodes accounting for 34.85MB, 99.33% of 35.09MB total
    Dropped 13 nodes (cum <= 0.18MB)
          flat  flat%   sum%        cum   cum%
       31.56MB 89.96% 89.96%    31.56MB 89.96%  github.com/allegro/bigcache/queue.NewBytesQueue
        3.29MB  9.37% 99.33%    34.86MB 99.36%  github.com/allegro/bigcache.initNewShard
             0     0% 99.33%    34.86MB 99.36%  github.com/allegro/bigcache.NewBigCache
             0     0% 99.33%    34.86MB 99.36%  github.com/allegro/bigcache.newBigCache
             0     0% 99.33%    35.08MB   100%  main.main
             0     0% 99.33%    35.08MB   100%  runtime.main
    

    Quite unexpected for HardMaxCacheSize: 1.

  • Data race when running test with 1000 goroutines on Travis CI

    I'm developing a key-value store abstraction and implementation/wrapper package for Go, and one of the implementations is for BigCache. I have a test that launches 1000 goroutines to concurrently interact with the underlying store. On my local machine it works fine every time, but on Travis CI I sometimes get this warning and then a subsequent error:

    WARNING: DATA RACE
    Write at 0x00c43a932018 by goroutine 163:
      runtime.slicecopy()
          /home/travis/.gimme/versions/go1.10.linux.amd64/src/runtime/slice.go:192 +0x0
      github.com/allegro/bigcache/queue.(*BytesQueue).push()
          /home/travis/gopath/src/github.com/allegro/bigcache/queue/bytes_queue.go:129 +0x2ca
      github.com/allegro/bigcache/queue.(*BytesQueue).Push()
          /home/travis/gopath/src/github.com/allegro/bigcache/queue/bytes_queue.go:81 +0xf0
      github.com/allegro/bigcache.(*cacheShard).set()
          /home/travis/gopath/src/github.com/allegro/bigcache/shard.go:75 +0x209
      github.com/allegro/bigcache.(*BigCache).Set()
          /home/travis/gopath/src/github.com/allegro/bigcache/bigcache.go:117 +0x153
      github.com/philippgille/gokv/bigcache.Store.Set()
          /home/travis/gopath/src/github.com/philippgille/gokv/bigcache/bigcache.go:42 +0x1e3
      github.com/philippgille/gokv/bigcache.(*Store).Set()
          <autogenerated>:1 +0xa0
      github.com/philippgille/gokv/test.InteractWithStore()
          /home/travis/gopath/src/github.com/philippgille/gokv/test/test.go:306 +0x1d0
    

    For the full test output see: https://travis-ci.org/philippgille/gokv/builds/468489707#L1206

    The test itself is: https://github.com/philippgille/gokv/blob/e48dea7fdf56ca55fecd32be28d8fd895682ae3a/bigcache/bigcache_test.go#L42
    The implementation is: https://github.com/philippgille/gokv/blob/e48dea7fdf56ca55fecd32be28d8fd895682ae3a/bigcache/bigcache.go

    Is this an error in the way I use BigCache? Or is this a bug in BigCache itself?

  • Remove unsafe code for appengine

    This function should be replaced with a safe version and protected with a build tag:

    func bytesToString(b []byte) string {
    	return string(b)
    }
    

    See: https://github.com/allegro/bigcache/issues/96
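
    A hedged sketch of the split (file names and the unsafe variant are illustrative; the unsafe file would import "unsafe"):

    // bytes_safe.go
    // +build appengine

    func bytesToString(b []byte) string {
    	return string(b)
    }

    // bytes_unsafe.go
    // +build !appengine

    func bytesToString(b []byte) string {
    	return *(*string)(unsafe.Pointer(&b))
    }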

  • Implemented new Append() method

    Implemented a new Append() method with proper locking. Without this method you would need to wrap a Get()/Set() pair with your own locks, hurting performance.

    Fixes #158
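
    A sketch of the read-modify-write this replaces (assumes the bigcache.ErrEntryNotFound sentinel; details may vary by version):

    var mu sync.Mutex

    func appendValue(cache *bigcache.BigCache, key string, data []byte) error {
    	mu.Lock()
    	defer mu.Unlock()
    	existing, err := cache.Get(key)
    	if err != nil && err != bigcache.ErrEntryNotFound {
    		return err
    	}
    	return cache.Set(key, append(existing, data...))
    }

    // With the new method this collapses to a single, internally locked call:
    // err := cache.Append(key, data)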

  • Error logs about bytes allocation?

    What is the issue you are having?

    I'm seeing this get printed out with a severity of error. I'm wondering: is this something I need to worry about? If not, is there a reason why it reports as an error vs. an info or warn?

    2022/11/11 18:52:43 bytes_queue.go:117: Allocated new queue in 604.417µs; Capacity: 2522348 
    

    I'm basically using the default configuration, but with:

    	LifeWindow = 30 * time.Minute
    	Shards = 2048
    

    I'm wondering, are there guidelines on how to properly configure the shards? Also, sorry if this isn't the right forum to ask the question; I can move it somewhere else if you know a better forum for this question.

    Environment: github.com/allegro/bigcache v1.2.1 go 1.18

  • v3.0.2 bytesqueue peak panic

    found a panic:

    runtime error: slice bounds out of range [:31874] with capacity 24992
    goroutine 1708235262 [running]:
    panic({0x1618300, 0xc030365740})
            /data/golang/1.18/src/runtime/panic.go:838 +0x207
    github.com/allegro/bigcache/v3/queue.(*BytesQueue).peek(0x140e960?, 0xc001730630?)
            /root/go/pkg/mod/github.com/allegro/bigcache/v3@v3.0.2/queue/bytes_queue.go:242 +0x177
    github.com/allegro/bigcache/v3/queue.(*BytesQueue).Get(...)
            /root/go/pkg/mod/github.com/allegro/bigcache/v3@v3.0.2/queue/bytes_queue.go:193
    github.com/allegro/bigcache/v3.(*cacheShard).getWrappedEntry(0xc0004f5200, 0xc06ab11d60?)
            /root/go/pkg/mod/github.com/allegro/bigcache/v3@v3.0.2/shard.go:92 +0x46
    github.com/allegro/bigcache/v3.(*cacheShard).get(0xc0004f5200, {0x18a8321, 0x1}, 0x1000000000001?)
            /root/go/pkg/mod/github.com/allegro/bigcache/v3@v3.0.2/shard.go:64 +0x7e
    github.com/allegro/bigcache/v3.(*BigCache).Get(0xc002c324e0, {0x18a8321, 0x1})
            /root/go/pkg/mod/github.com/allegro/bigcache/v3@v3.0.2/bigcache.go:123 +0x69
    

    Environment:

    • Version (git sha or release): v3.0.2
    • OS (e.g. from /etc/os-release or winver.exe): centos
    • go version: 1.18
  • how about higher quality hash algorithm?

    Although the hash algorithm is not the bottleneck of bigcache, smhasher (a project for hash function quality and speed tests) shows that FNV-1a has poor quality, while other hash functions score higher.

    Should we evaluate the default hash function of bigcache again? Since then, lots of new hash functions have come out, such as xxhash, wyhash, and memhash, which have higher quality.

    Because some hash functions use hardware instructions provided by the CPU (AVX2, SSE) to accelerate, we could use a different hash function per platform, such as:

    1. FNV-1a by default
    2. xxhash on amd64

    By the way, why does bigcache have no dependencies? Can only dependency-free PRs be approved?

    Look forward to your reply!
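
    A sketch of how an alternative could be plugged in via the Hasher interface in Config (assuming github.com/cespare/xxhash/v2):

    type xxHasher struct{}

    func (xxHasher) Sum64(key string) uint64 {
    	return xxhash.Sum64String(key)
    }

    config := bigcache.DefaultConfig(10 * time.Minute)
    config.Hasher = xxHasher{}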

  • Reset() doesn't work

    What is the issue you are having? Resetting the shards no longer works.

    What is BigCache doing that it shouldn't? It is not emptying the shards.

    Using the Stats method I can see how many items are in the cache:

    cache, initErr := bigcache.NewBigCache(config)
    cache.Stats()
    
    Output: {"hits":1222,"misses":2,"delete_hits":0,"delete_misses":0,"collisions":0}
    

    but when I run the Reset method to flush the shards, they are not flushed/emptied; in previous versions this method worked correctly.

    cache, initErr := bigcache.NewBigCache(config)
    cache.Reset()
    

    After I run Reset(), the stats are the same. Output: {"hits":1222,"misses":2,"delete_hits":0,"delete_misses":0,"collisions":0}

    Environment:

    • Version (git sha or release): [email protected]
    • OS (e.g. from /etc/os-release or winver.exe): Debian GNU/Linux 10 (buster)
    • go version: 1.18.3