groupcache

Summary

groupcache is a distributed caching and cache-filling library, intended as a replacement for a pool of memcached nodes in many cases.

For API docs and examples, see http://godoc.org/github.com/golang/groupcache

Comparison to memcached

Like memcached, groupcache:

  • shards by key to select which peer is responsible for that key

Unlike memcached, groupcache:

  • does not require running a separate set of servers, thus massively reducing deployment/configuration pain. groupcache is a client library as well as a server. It connects to its own peers, forming a distributed cache.

  • comes with a cache filling mechanism. Whereas memcached just says "Sorry, cache miss", often resulting in a thundering herd of database (or whatever) loads from an unbounded number of clients (which has resulted in several fun outages), groupcache coordinates cache fills such that only one load in one process of an entire replicated set of processes populates the cache, then multiplexes the loaded value to all callers.

  • does not support versioned values. If key "foo" is value "bar", key "foo" must always be "bar". There are neither cache expiration times, nor explicit cache evictions. Thus there is also no CAS, nor Increment/Decrement. This also means that groupcache....

  • ... supports automatic mirroring of super-hot items to multiple processes. This prevents memcached hot spotting where a machine's CPU and/or NIC are overloaded by very popular keys/values.

  • is currently only available for Go. It's very unlikely that I (bradfitz@) will port the code to any other language.

Loading process

In a nutshell, a groupcache lookup of Get("foo") looks like:

(On machine #5 of a set of N machines running the same code)

  1. Is the value of "foo" in local memory because it's super hot? If so, use it.

  2. Is the value of "foo" in local memory because peer #5 (the current peer) is the owner of it? If so, use it.

  3. Amongst all the peers in my set of N, am I the owner of the key "foo"? (e.g. does it consistent hash to 5?) If so, load it. If other callers come in, via the same process or via RPC requests from peers, they block waiting for the load to finish and get the same answer. If not, RPC to the peer that's the owner and get the answer. If the RPC fails, just load it locally (still with local dup suppression).

Users

groupcache is in production use by dl.google.com (its original user), parts of Blogger, parts of Google Code, parts of Google Fiber, parts of Google production monitoring systems, etc.

Presentations

See http://talks.golang.org/2013/oscon-dl.slide

Help

Use the golang-nuts mailing list for any discussion or questions.

Comments
  • defaultReplicas = 3 causes badly distributed hash rings

    Currently the hash ring has 3 replicas per node, which can cause statistical imbalances in key distribution, especially with a low number of machines.

    The number of replicas should be configurable. For example, a similar Python hash ring library uses 40 replicas per node. From tests I made, this is indeed around the number of replicas where the distribution stabilizes, regardless of the number of nodes.

    I made a few benchmarks directly on the consistent hash to illustrate this (I can post the code used to generate them if you want; it uses random "ip addresses" and keys). The bars show the number of random keys, out of 10k, that mapped to each random node. All tests with the same number of nodes used the same node "ips".

    2 Nodes X  3 Replicas:
        36.75% | ####################################
        63.25% | ###############################################################
    
    2 Nodes X  33 Replicas:
        49.77% | #################################################
        50.23% | ##################################################
    
    3 Nodes X  3 Replicas:
        23.05% | #######################
        33.45% | #################################
        43.50% | ###########################################
    
    3 Nodes X  33 Replicas:
        31.50% | ###############################
        31.14% | ###############################
        37.36% | #####################################
    
    7 Nodes X  3 Replicas:
        30.08% | ##############################
        11.24% | ###########
        6.44% | ######
        7.29% | #######
        27.42% | ###########################
        9.26% | #########
        8.27% | ########
    
    7 Nodes X  43 Replicas:
        13.53% | #############
        16.00% | ################
        16.92% | ################
        10.31% | ##########
        14.78% | ##############
        12.74% | ############
        15.72% | ###############
    
  • A bug in getting from peers

    Reproduce steps

    pool := groupcache.NewHTTPPool("http://127.0.0.1:"+os.Getenv("PORT"))
    cache := groupcache.NewGroup("cacher", 64<<20, someGetterFunc)
    pool.Set("http://127.0.0.1:50000", "http://127.0.0.1:50001")
    

    Run it on two ports, :50000 and :50001. Now, if you use /xxx as a key, groupcache acts as if there were no peers, i.e. it only loads data locally instead of getting it from peers.

    The problem is the slash at the head of the key (if the key is xxx, everything works), and I still can't figure out why.

  • Added prefix matching for consistenthash lookups

    Changes consistenthash's get() function from O(log n) to O(1), with a memory overhead of 6*n (where n is the number of virtual nodes).

    On an AWS t3.2xlarge dedicated instance, this reduces the consistenthash benchmark times by 56% (8 nodes) to 72% (512 nodes). Data available at https://docs.google.com/spreadsheets/d/1K_kmk0_Lqk6iaSDUytjkT8RNGPTptBwBEWO8q4uAn3w/edit?usp=sharing

  • allow all options of HTTPPool to be specified by users

    The default constructor NewHTTPPool makes too many assumptions about how people will use the library. For example, for security purposes I'd want groupcache to listen on a different port, by calling Handle on a ServeMux other than http.DefaultServeMux. Others have also requested the ability to specify the basePath, perhaps to shorten paths for performance reasons: https://github.com/golang/groupcache/issues/22

  • fix HTTPPool can't find peers bug

    Description

    A groupcache node cannot download content from a peer, even when the peer has the right content, when using groupcache.HTTPPool.

    Details

    A URL like http://10.246.14.51:5100/_groupcache/thumbnail/%2Fapi%2Fapks returns HTTP status 301,

    and the client has to request http://10.246.14.51:5100/_groupcache/thumbnail/api/apks again.

    But tr.RoundTrip cannot follow HTTP redirects, so using http.Client instead fixes the problem.

  • Best practice for updating a cache entry frequently

    My question is a bit similar to issue https://github.com/golang/groupcache/issues/3.

    I have a map that is currently managed in the RAM of the go application on a single instance. I want to share this map between multiple instances for scaling. I am already using consul for discovery of peer instances and I am currently solving this with redis, however I am not happy with the fact that I am not leveraging each machine's RAM (so in that sense I feel that redis is more a DB than a cache). This is one reason why I love groupcache.

    I have a constraint though: my map changes all the time (I'm getting requests to update it via http). So for a key K1 in the map, it is likely that m[K1] will be updated very frequently (possibly every one second or less).

    So my questions are:

    1. Am I choosing the wrong architecture? Should I use something like Redis or memcached instead?
    2. If groupcache is a good solution for my use case, do I have to constantly remove and add (say in an LRU cache) or is there a smarter way?

    Thanks!

  • HTTPPool: keys beginning with "/" cannot be exchanged between peers

    I've found that when using groupcache with HTTPPool, keys beginning with "/" cannot be exchanged between peers. Could this bug be looked into? Thanks!

  • consistenthash: replace linear search with binary search

    The binary search quickly outpaces the linear search, even for a small number of shards and replicas.

    benchmark         old ns/op  new ns/op    delta
    BenchmarkGet8           122        122   +0.00%
    BenchmarkGet32          471        137  -70.91%
    BenchmarkGet128        5619        254  -95.48%
    BenchmarkGet512       90302        406  -99.55%

  • question - anyone tried this on app engine ?

    Would be very handy. Based on the docs, it looks like it could run in the same application on App Engine too.

    Anyway, has anyone gotten this far with it?

  • implement peerPicker with grpc

    This PR implements GRPCPool, which follows the PeerPicker interface but is built on Google's gRPC. There are some issues that haven't been addressed in this PR; we could handle them in a separate PR.

    ==1. Interface: gRPC uses protobuf version 3, and the generated stub differs from the one generated before. v2:

    type GetRequest struct {
            Group            *string `protobuf:"bytes,1,req,name=group" json:"group,omitempty"`
            Key              *string `protobuf:"bytes,2,req,name=key" json:"key,omitempty"`
            XXX_unrecognized []byte  `json:"-"`
    }
    

    v3:

    type GetRequest struct {
           Group string `protobuf:"bytes,1,opt,name=group" json:"group,omitempty"`
           Key   string `protobuf:"bytes,2,opt,name=key" json:"key,omitempty"`
    }
    

    If we switched to proto3 directly, we might break something, so we just work around it for now. The interface is untouched.

    ==2. Context: right now the context used in groupcache is a self-defined Context:

    // ProtoGetter is the interface that must be implemented by a peer.
    type ProtoGetter interface {
            Get(context Context, in *pb.GetRequest, out *pb.GetResponse) error
    }
    

    while gRPC uses "golang.org/x/net/context" as its default context, so we need to pick one.

    ==3. Performance comparison: we will do that once the implementation looks good.

  • Spread peers updates

    Hello! First of all, thanks a lot for this project.

    My question is: is it possible to automatically propagate an update of the peers list, initiated on one node, to the other nodes? You have this implementation:

    func (p *HTTPPool) Set(peers ...string) {
        p.mu.Lock()
        defer p.mu.Unlock()
        p.peers = consistenthash.New(defaultReplicas, nil)
        p.peers.Add(peers...)
        p.httpGetters = make(map[string]*httpGetter, len(peers))
        for _, peer := range peers {
            p.httpGetters[peer] = &httpGetter{transport: p.Transport, baseURL: peer + p.basePath}
        }
    }
    

    I don't see anything that makes this happen in the code above. Or is it inconsistent with the goals of the groupcache project? If so, why? Thanks!

  • may you add default replicas ?

    1. Add a default replicas value. In groupcache/consistenthash.go, line 34, replicas must have a default: if replicas <= 0, then func (m *Map) Add(keys ...string) does nothing when called.

    I suggest a default replicas value of 23.

    2. Add a sync.Mutex: when calling Add and Get, we can lock the keys and hashMap.

  • fix function TestConsistency

    The Add function does not add the hash value of the string key itself to the keys slice, so what does 'Direct match' mean in the sentence "Direct matches should always return the same entry"? I think this is a bug in the test code. Does 'Direct match' mean that the queried string corresponds directly to a replicated node? I made some changes to the code accordingly.

Related tags
build your own groupcache

Yocache: Your Own groupCache Borrowed code from groupcache and geecache, with modifications: Removed the protobuf message format and replaced with a C

Dec 9, 2021
Concurrency-safe Go caching library with expiration capabilities and access counters

cache2go Concurrency-safe golang caching library with expiration capabilities. Installation Make sure you have a working Go environment (Go 1.2 or hig

Dec 31, 2022
A RESTful caching micro-service in Go backed by Couchbase

Couchcache A caching service developed in Go. It provides REST APIs to access key-value pairs stored in Couchbase. You may also consider using couchca

Sep 26, 2022
Freebase - Proof of concept microservice for A2S INFO message caching

Freebase A sensible albeit primitive A2S_INFO cache service written in Go. Proof

Feb 23, 2022
A simple, fast, embeddable, persistent key/value store written in pure Go. It supports fully serializable transactions and many data structures such as list, set, sorted set.

NutsDB English | 简体中文 NutsDB is a simple, fast, embeddable and persistent key/value store written in pure Go. It supports fully serializable transacti

Jan 1, 2023
Redwood is a highly-configurable, distributed, realtime database that manages a state tree shared among many peers

Redwood is a highly-configurable, distributed, realtime database that manages a state tree shared among many peers. Imagine something like a Redux store, but distributed across all users of an application, that offers offline editing and is resilient to poor connectivity.

Jan 8, 2023
Eventually consistent distributed in-memory cache Go library

bcache A Go Library to create distributed in-memory cache inside your app. Features LRU cache with configurable maximum keys Eventual Consistency sync

Dec 2, 2022
Distributed cache and in-memory key/value data store.

Distributed cache and in-memory key/value data store. It can be used both as an embedded Go library and as a language-independent service.

Dec 30, 2022
Efficient cache for gigabytes of data written in Go.

BigCache Fast, concurrent, evicting in-memory cache written to keep big number of entries without impact on performance. BigCache keeps entries on hea

Jan 4, 2023
:handbag: Cache arbitrary data with an expiration time.

cache Cache arbitrary data with an expiration time. Features 0 dependencies About 100 lines of code 100% test coverage Usage // New cache c := cache.N

Jan 5, 2023
Fast thread-safe inmemory cache for big number of entries in Go. Minimizes GC overhead

fastcache - fast thread-safe inmemory cache for big number of entries in Go Features Fast. Performance scales on multi-core CPUs. See benchmark result

Dec 30, 2022
Distributed cache with gossip peer membership enrollment.

Autocache Groupcache enhanced with memberlist for distributed peer discovery. TL;DR See /_example/ for usage. Run docker-compose -f _example/docker-co

Dec 8, 2022
MyCache - A distributed cache based on GeeCache

MyCache, implemented with reference to GeeCache (https://geektutu.com/post/geecache.html). Main features: 1. implements fifo and lr

Feb 18, 2022
Fast and simple key/value store written using Go's standard library

Table of Contents Description Usage Cookbook Disadvantages Motivation Benchmarks Test 1 Test 4 Description Package pudge is a fast and simple key/valu

Nov 17, 2022
OcppManager-go - A library for dynamically managing OCPP configuration (variables). It can read, update, and validate OCPP variables.

?? ocppManager-go A library for dynamically managing OCPP configuration (variables). It can read, update, and validate OCPP variables. Currently, only

Jan 3, 2022
golang bigcache with clustering as a library.

clusteredBigCache This is a library based on bigcache with some modifications to support clustering and individual item expiration Bigcache is an exce

Sep 26, 2022
moss - a simple, fast, ordered, persistable, key-val storage library for golang

moss moss provides a simple, fast, persistable, ordered key-val collection implementation as a 100% golang library. moss stands for "memory-oriented s

Dec 18, 2022
Pure Go implementation of D. J. Bernstein's cdb constant database library.

Pure Go implementation of D. J. Bernstein's cdb constant database library.

Oct 19, 2022
A go library for testing Amazon DynamoDB.

minidyn Amazon DynamoDB testing library written in Go. Goals Make local testing for DynamoDB as accurate as possible. Run DynamoDB tests in a CI witho

Nov 9, 2022