A Go implementation of the 64-bit xxHash algorithm (XXH64)


GoDoc Build Status

xxhash is a Go implementation of the 64-bit xxHash algorithm, XXH64. This is a high-quality hashing algorithm that is much faster than anything in the Go standard library.

This package provides a straightforward API:

func Sum64(b []byte) uint64
func Sum64String(s string) uint64
type Digest struct{ ... }
    func New() *Digest

The Digest type implements hash.Hash64. Its key methods are:

func (*Digest) Write([]byte) (int, error)
func (*Digest) WriteString(string) (int, error)
func (*Digest) Sum64() uint64

This implementation provides a fast pure-Go implementation and an even faster assembly implementation for amd64.


This package is in a module and the latest code is in version 2 of the module. You need a version of Go with at least "minimal module compatibility" to use github.com/cespare/xxhash/v2:

  • 1.9.7+ for Go 1.9
  • 1.10.3+ for Go 1.10
  • Go 1.11 or later

I recommend using the latest release of Go.


Here are some quick benchmarks comparing the pure-Go and assembly implementations of Sum64.

input size purego asm
5 B 979.66 MB/s 1291.17 MB/s
100 B 7475.26 MB/s 7973.40 MB/s
4 KB 17573.46 MB/s 17602.65 MB/s
10 MB 17131.46 MB/s 17142.16 MB/s

These numbers were generated on Ubuntu 18.04 with an Intel i7-8700K CPU using the following commands under Go 1.11.2:

$ go test -tags purego -benchtime 10s -bench '/xxhash,direct,bytes'
$ go test -benchtime 10s -bench '/xxhash,direct,bytes'

Projects using this package

Caleb Spare
I make Go servers fast at @liftoffio.
Caleb Spare
  • missing v2 visibility

    missing v2 visibility

    Trying to build a govendor package and the github.com/prometheus/client_golang references a xxhash/v2, specically github.com/cespare/xxhash/v2 v2.1.0. Is that version available for build purposes?

  • reduce allocations when using New()

    reduce allocations when using New()

    Hey Caleb,

    Thanks for the library, and we should catch up soon. Anyway, we were benchmarking the following sequence:

    h := cespare.New()

    and found that this causes 1 memory allocation. We then changed New() from returning a hash.Hash64 to returning a *xxh. This removes the memory allocation and saves about 25ns per op on my MacBook Pro. Would you be open to that? It probably makes sense to make the xxh struct public at that point.

    Second and pending the above, one of our key use cases involves hashing a uint64 followed by a string. It'd be useful to have a WriteString on *xxh that would avoid a memory allocation (analogous to Sum64String).

    Thanks, Diego

  • Modify unsafe string to byte slice conversion in Sum64String

    Modify unsafe string to byte slice conversion in Sum64String

    Hello, I think there might be a race condition in Sum64String which could be mitigated with a minor change. The gist of the issue is that Go's garbage collector can collect objects that are still in scope but no longer referenced (see, for example, this issue and this post on reddit). Consequently, Sum64String may suffer from the following race condition:

    func Sum64String(s string) uint64 {
    	var b []byte
    	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
    Go's garbage collector runs here and interrupts this function after the previous line.
    It sees that `s` is not used after this point and determines that it is eligible for collection.
    `sh` contains a `uintptr`, so the garbage collector does not consider this pointer to be a
    live reference to an object on the heap. Consequently, the garbage collector determines
    that there are no live references to the bytes that `s` pointed to and frees them.
    	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
    	bh.Data = uintptr(unsafe.Pointer(sh.Data))
    	bh.Len = sh.Len
    	bh.Cap = sh.Len
    	return Sum64(b)

    I believe we can get around this problem by using s after we set b to point to the same bytes with the line:

    l := len(s)

    In this way the garbage collector will see that s is used after b's Data field is updated to point to the underlying bytes and the bytes must be live until them.

  • cannot find package

    cannot find package "github.com/cespare/xxhash/v2"


    first. sorry for my english :)

    I am relatively new to go programming. but I wanted to deal with it. I wanted to get the following code to work because I deal with go and go-ethereum.

    But I always get the error below

    Despite intensive google search and my meager english knowledge I could not find a solution. Unfortunately, I do not know anybody who knows about go.

    Maybe you can help me

    Many thanks

    --Error- src\github.com\VictoriaMetrics\fastcache\bigcache.go:7:2: cannot find package "github.com/cespare/xxhash/v2" in any of: C:\Go\src\github.com\cespare\xxhash\v2 (from $GOROOT) C:\Users\go\src\github.com\cespare\xxhash\v2 (from $GOPATH) --Code-- `package main

    import ( "context" "fmt" "log" "math" "math/big"



    func main() { client, err := ethclient.Dial("https://mainnet.infura.io") if err != nil { log.Fatal(err) }

    account := common.HexToAddress("0x71c7656ec7ab88b098defb751b7401b5f6d8976f")
    balance, err := client.BalanceAt(context.Background(), account, nil)
    if err != nil {
    fmt.Println(balance) // 25893180161173005034
    blockNumber := big.NewInt(5532993)
    balanceAt, err := client.BalanceAt(context.Background(), account, blockNumber)
    if err != nil {
    fmt.Println(balanceAt) // 25729324269165216042
    fbalance := new(big.Float)
    ethValue := new(big.Float).Quo(fbalance, big.NewFloat(math.Pow10(18)))
    fmt.Println(ethValue) // 25.729324269165216041
    pendingBalance, err := client.PendingBalanceAt(context.Background(), account)
    fmt.Println(pendingBalance) // 25729324269165216042


  • Allow Sum64String and (*Digest).WriteString to be inlined

    Allow Sum64String and (*Digest).WriteString to be inlined

    This is an alternative approach to #42.

    Ideally the compiler would do mid-stack inlining for Sum64String since it's a minimal unsafe wrapper around Sum64:

    func Sum64String(s string) uint64 {
    	var b []byte
    	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
    	bh.Data = (*reflect.StringHeader)(unsafe.Pointer(&s)).Data
    	bh.Len = len(s)
    	bh.Cap = len(s)
    	return Sum64(b)

    Unfortunately, the weight the inliner computes is too high. I filed https://github.com/golang/go/issues/42739.

    In the meantime, I found some tricks (with help from @josharian) to generate a lower cost that gets us below the threshold value.

    Additionally, add tests to confirm that Sum64String and (*Digest).WriteString are inlined.


    name                  old time/op    new time/op    delta
    Sum64String/4B-12       4.78ns ± 1%    3.57ns ± 4%  -25.27%  (p=0.000 n=8+10)
    Sum64String/100B-12     14.5ns ± 1%    12.9ns ± 0%  -10.76%  (p=0.000 n=9+10)
    Sum64String/4KB-12       229ns ± 0%     229ns ± 1%     ~     (p=0.395 n=7+10)
    Sum64String/10MB-12      628µs ± 1%     630µs ± 2%     ~     (p=1.000 n=9+10)
    DigestString/4B-12      11.4ns ± 1%     9.7ns ± 1%  -14.95%  (p=0.000 n=10+10)
    DigestString/100B-12    23.6ns ± 1%    21.3ns ± 2%   -9.65%  (p=0.000 n=10+10)
    DigestString/4KB-12      241ns ± 1%     239ns ± 0%   -0.67%  (p=0.001 n=10+7)
    DigestString/10MB-12     627µs ± 1%     628µs ± 1%     ~     (p=0.631 n=10+10)
    name                  old speed      new speed      delta
    Sum64String/4B-12      837MB/s ± 1%  1124MB/s ± 2%  +34.42%  (p=0.000 n=10+9)
    Sum64String/100B-12   6.88GB/s ± 2%  7.72GB/s ± 1%  +12.16%  (p=0.000 n=10+10)
    Sum64String/4KB-12    17.5GB/s ± 0%  17.5GB/s ± 1%     ~     (p=0.408 n=8+10)
    Sum64String/10MB-12   15.9GB/s ± 1%  15.9GB/s ± 2%     ~     (p=1.000 n=9+10)
    DigestString/4B-12     350MB/s ± 1%   411MB/s ± 1%  +17.55%  (p=0.000 n=10+10)
    DigestString/100B-12  4.23GB/s ± 1%  4.69GB/s ± 1%  +10.84%  (p=0.000 n=10+9)
    DigestString/4KB-12   16.6GB/s ± 1%  16.7GB/s ± 0%   +0.67%  (p=0.001 n=10+8)
    DigestString/10MB-12  16.0GB/s ± 1%  15.9GB/s ± 1%     ~     (p=0.631 n=10+10)

    And with -tags purego:

    name                  old time/op    new time/op    delta
    Sum64String/4B-12       5.57ns ± 1%    4.22ns ± 1%  -24.14%  (p=0.000 n=10+9)
    Sum64String/100B-12     16.0ns ± 1%    14.8ns ± 0%   -7.27%  (p=0.000 n=10+6)
    Sum64String/4KB-12       327ns ± 2%     325ns ± 1%     ~     (p=0.050 n=10+10)
    Sum64String/10MB-12      866µs ± 3%     856µs ± 0%   -1.05%  (p=0.002 n=9+8)
    DigestString/4B-12      11.2ns ± 1%    10.0ns ± 1%  -10.90%  (p=0.000 n=10+9)
    DigestString/100B-12    25.5ns ± 1%    22.8ns ± 0%  -10.62%  (p=0.000 n=10+9)
    DigestString/4KB-12      342ns ± 1%     340ns ± 1%   -0.56%  (p=0.018 n=9+10)
    DigestString/10MB-12     877µs ± 1%     878µs ± 2%     ~     (p=0.400 n=10+9)
    name                  old speed      new speed      delta
    Sum64String/4B-12      718MB/s ± 1%   947MB/s ± 1%  +31.82%  (p=0.000 n=10+9)
    Sum64String/100B-12   6.26GB/s ± 1%  6.75GB/s ± 1%   +7.81%  (p=0.000 n=10+10)
    Sum64String/4KB-12    12.2GB/s ± 2%  12.3GB/s ± 1%   +0.70%  (p=0.022 n=10+9)
    Sum64String/10MB-12   11.6GB/s ± 3%  11.7GB/s ± 0%   +1.05%  (p=0.002 n=9+8)
    DigestString/4B-12     357MB/s ± 1%   401MB/s ± 1%  +12.32%  (p=0.000 n=10+9)
    DigestString/100B-12  3.93GB/s ± 1%  4.40GB/s ± 0%  +11.95%  (p=0.000 n=10+9)
    DigestString/4KB-12   11.7GB/s ± 1%  11.8GB/s ± 1%   +0.68%  (p=0.011 n=10+10)
    DigestString/10MB-12  11.4GB/s ± 1%  11.4GB/s ± 2%     ~     (p=0.400 n=10+9)

    /cc @greatroar

  • cespare/xxhash/v2 was deleted

    cespare/xxhash/v2 was deleted

    @cespare we faced such error when building our project using glide [ERROR] Error scanning github.com/cespare/xxhash/v2: cannot find package "." in: /home/chamith/.glide/cache/src/https-github.com-cespare-xxhash/v2

  • go sum mismatch

    go sum mismatch

    while attempting to use the Prometheus client that depends on this package: github.com/prometheus/client_golang v1.11.0 doing a go get returns the following error.

    verifying github.com/cespare/xxhash/[email protected]: checksum mismatch downloaded: h1:47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= go.sum: h1:6MnRN8NT7+YBpUIWxHtefFZOKTAPgGjpQSxqLNn0+qY=

    SECURITY ERROR This download does NOT match an earlier download recorded in go.sum. The bits may have been replaced on the origin server, or an attacker may have intercepted the download attempt.

    This was working last week, and now appears as though the checksum has changed, breaking my builds with this package.

    my go sum ENV variable is pointing to: GOSUMDB="sum.golang.org"

  • Replace ADDQ $c, r by LEAQ c(r), r in amd64 assembly

    Replace ADDQ $c, r by LEAQ c(r), r in amd64 assembly

    On older CPUs such as i7-3770K, this gives a small speedup:

    name                                  old speed      new speed      delta
    Hashes/xxhash,direct,bytes,n=5B-8      854MB/s ± 0%   851MB/s ± 1%    ~     (p=0.720 n=9+10)
    Hashes/xxhash,direct,string,n=5B-8     569MB/s ± 1%   571MB/s ± 1%    ~     (p=0.123 n=10+10)
    Hashes/xxhash,digest,bytes,n=5B-8      234MB/s ± 1%   235MB/s ± 1%  +0.43%  (p=0.029 n=10+10)
    Hashes/xxhash,digest,string,n=5B-8     210MB/s ± 0%   211MB/s ± 1%  +0.64%  (p=0.000 n=9+10)
    Hashes/xxhash,direct,bytes,n=100B-8   5.57GB/s ± 0%  5.68GB/s ± 1%  +1.99%  (p=0.000 n=7+10)
    Hashes/xxhash,direct,string,n=100B-8  4.91GB/s ± 0%  5.05GB/s ± 1%  +2.96%  (p=0.000 n=10+10)
    Hashes/xxhash,digest,bytes,n=100B-8   3.09GB/s ± 0%  3.15GB/s ± 0%  +1.82%  (p=0.000 n=10+10)
    Hashes/xxhash,digest,string,n=100B-8  2.58GB/s ± 0%  2.61GB/s ± 0%  +0.94%  (p=0.000 n=10+10)
    Hashes/xxhash,direct,bytes,n=4KB-8    14.5GB/s ± 0%  14.5GB/s ± 1%  +0.30%  (p=0.017 n=10+9)
    Hashes/xxhash,direct,string,n=4KB-8   14.3GB/s ± 0%  14.4GB/s ± 1%  +1.08%  (p=0.000 n=7+10)
    Hashes/xxhash,digest,bytes,n=4KB-8    13.4GB/s ± 1%  13.6GB/s ± 1%  +1.24%  (p=0.000 n=10+10)
    Hashes/xxhash,digest,string,n=4KB-8   13.2GB/s ± 0%  13.5GB/s ± 1%  +2.48%  (p=0.000 n=9+10)
    Hashes/xxhash,direct,bytes,n=10MB-8   13.1GB/s ± 1%  13.2GB/s ± 1%  +1.12%  (p=0.000 n=10+10)
    Hashes/xxhash,direct,string,n=10MB-8  13.1GB/s ± 0%  13.2GB/s ± 0%  +0.50%  (p=0.000 n=10+10)
    Hashes/xxhash,digest,bytes,n=10MB-8   13.1GB/s ± 1%  13.1GB/s ± 1%    ~     (p=0.053 n=9+10)
    Hashes/xxhash,digest,string,n=10MB-8  13.1GB/s ± 1%  13.2GB/s ± 0%  +0.61%  (p=0.008 n=10+9)
  • v2 module indirectly imports v1 module

    v2 module indirectly imports v1 module

    github.com/cespare/xxhash/v2 currently requires the github.com/cespare/xxhash (v1) module:

    $ go mod why github.com/cespare/xxhash
    # github.com/cespare/xxhash

    A later version of github.com/OneOfOne/xxhash (v1.2.5) has moved their benchmarks to a separate sub-module which prevents the indirect import of github.com/cespare/xxhash.

    Please update your github.com/OneOfOne/xxhash dependency -- this will remove the unnecessary github.com/cespare/xxhash module reference.

    Also, please consider moving your benchmarks to a sub-module since most users of your module won't be interested in the alternate modules used for benchmarking.

  • Separate unsafe code for App Engine support.

    Separate unsafe code for App Engine support.

    Unfortunately App Engine doesn't allow unsafe. This PR makes the library compatible with App Engine by falling back to a safe implementation with the appengine build tag.

  • checksum

    checksum "drift"

    Hi for the same file (XML) I get 2 different xx-hashes using v1 and v2

    I am using os.Open to open the file and this is how I get and return the checksum:

    h := xxhash.New() if _, err := io.Copy(h, file); err != nil {log.Fatal(err)} r = fmt.Sprintf("%X", h.Sum(nil)) // Convert into Base16-Uppercase (QuickHash Default)

    Is this wrong?

  • Make it impossible to accidentally use an uninitialized Digest

    Make it impossible to accidentally use an uninitialized Digest

    I noticed a sharp edge of the API. It's tempting (especially if xxhash.Digest is embedded in a larger structure) to do this:

    var d xxhash.Digest
    ... d.Write(...) ...
    h := d.Sum64()

    But this is broken: the zero value of Digest is not usable. You must call Reset first.

    We should fix this, either by making the incorrect usage crash or by automatically calling Reset when an uninitialized Digest is used. Hopefully the branch is very predictable and doesn't add much cost.

  • Assembler implementation for arm64

    Assembler implementation for arm64

    Fixes #45, but I haven't yet benchmarked this on actual ARM hardware. On amd64 with qemu, I get the following results after cherry-picking #50 (I have a version that includes a variant of #42 as well):

    name                 old speed      new speed      delta
    Sum64/4B-8            104MB/s ± 1%   142MB/s ± 1%  +36.00%  (p=0.000 n=10+10)
    Sum64/100B-8         1.51GB/s ± 2%  1.97GB/s ± 1%  +30.42%  (p=0.000 n=10+10)
    Sum64/4KB-8          4.66GB/s ± 1%  5.48GB/s ± 1%  +17.56%  (p=0.000 n=9+9)
    Sum64/10MB-8         4.79GB/s ± 0%  5.45GB/s ± 0%  +13.78%  (p=0.000 n=10+9)
    Sum64String/4B-8      108MB/s ± 1%   149MB/s ± 1%  +38.39%  (p=0.000 n=10+8)
    Sum64String/100B-8   1.59GB/s ± 2%  2.01GB/s ± 1%  +25.84%  (p=0.000 n=9+9)
    Sum64String/4KB-8    4.67GB/s ± 1%  5.54GB/s ± 1%  +18.76%  (p=0.000 n=9+10)
    Sum64String/10MB-8   4.78GB/s ± 1%  5.46GB/s ± 0%  +14.08%  (p=0.000 n=9+10)
    DigestBytes/4B-8     37.4MB/s ± 1%  37.7MB/s ± 1%     ~     (p=0.093 n=10+10)
    DigestBytes/100B-8    634MB/s ± 0%   657MB/s ± 0%   +3.57%  (p=0.000 n=10+9)
    DigestBytes/4KB-8    4.33GB/s ± 0%  4.87GB/s ± 1%  +12.50%  (p=0.000 n=10+9)
    DigestBytes/10MB-8   4.90GB/s ± 0%  5.47GB/s ± 1%  +11.54%  (p=0.000 n=10+10)
    DigestString/4B-8    31.5MB/s ± 1%  31.6MB/s ± 1%     ~     (p=0.447 n=10+10)
    DigestString/100B-8   561MB/s ± 1%   576MB/s ± 1%   +2.67%  (p=0.000 n=10+9)
    DigestString/4KB-8   4.22GB/s ± 1%  4.72GB/s ± 0%  +11.77%  (p=0.000 n=9+9)
    DigestString/10MB-8  4.90GB/s ± 0%  5.47GB/s ± 0%  +11.56%  (p=0.000 n=9+9)

    I've yet to try if NEON instructions provide any further speedup. The code assumes that unaligned access is safe; there's a system control register flag that can forbid unaligned access, but the Go compiler also assumes it's turned off.

  • Add Copy method

    Add Copy method

    The standard way to copy a hash is via (Un)MarshalBinary, which xxhash supports. But that's both inconvenient and inefficient for a small, fast hash like xxhash. A Copy method would be simple and fast:

    func (d *Digest) Copy() *Digest
  • Rename Digest to Hash

    Rename Digest to Hash

    Digest is not a good name. Normally "digest" refers to the output of the hash function.

    In v3 we could consider renaming it; perhaps to Hash.

    Typical code probably doesn't refer to Digest by name and doesn't have to change:

    d := xxhash.New()
    ... call d.Write() ...
    x := d.Sum64()
  • Remove special appengine/safe code support

    Remove special appengine/safe code support

    Appengine supports unsafe since they rolled out their new runtime a couple of years ago.

    I don't think we should need any special mention of appengine now, or any build tag configuration which avoids unsafe.

IntSet - Integer based Set based on a bit-vector

IntSet - Integer based Set based on a bit-vector Every integer that is stored will be converted to a bit in a word in which its located. The words are

Feb 2, 2022
Sroar - 64-bit Roaring Bitmaps in Go

sroar: Serialized Roaring Bitmaps This is a fork of dgraph-io/sroar, being maint

Jun 6, 2022
Implementation of Boyer-Moore fast string search algorithm in Go

boyermoore Implementation of Boyer-Moore fast string search algorithm in Go

Jun 25, 2022
A Go implementation of the core algorithm in paper

Boolean Expression Indexer Go library A Go implementation of the core algorithm in paper <Indexing Boolean Expression>, which already supports the fol

May 22, 2022
Data structure and algorithm library for go, designed to provide functions similar to C++ STL

GoSTL English | 简体中文 Introduction GoSTL is a data structure and algorithm library for go, designed to provide functions similar to C++ STL, but more p

Jun 23, 2022
Multi-String Pattern Matching Algorithm Using TrieHashNode

Multi-String Pattern Matching algorithm. This implementation is inspired from Aho-Corasick algorithm Getting Started modelA = mspm.NewModel("mspm_mode

Jan 23, 2022
golang sorting algorithm and data construction.

data-structures-questions 算法和程序结构是我们学习编程的基础,但是很多的时候,我们很多都是只是在应用,而没有深入的去研究这些,所以自己也在不断的思考和探索,然后分析,学习,总结自己学习的过程,希望可以和大家一起学习和交流下算法! 目录 网络协议 数据结构 算法 数据库 Go

Jul 2, 2022
记录算法学习和LeetCode、LintCode、codewars的学习路程。A record of algorithm learning.

Problem List Leetcode、LintCode、Codewars Algorithm problem solution written by golang. LeetCode id Name(Github) Name(Gitee) 00001 TwoSum TwoSum 00003 L

Nov 3, 2021
The simplest sorting algorithm that sorts in quadratic time
The simplest sorting algorithm that sorts in quadratic time

Simplest sort Showcases the simplest sorting algorithm that works in quadratic time. From here. The pseudocode for this algo can be seen below (sorts

May 8, 2022
May 15, 2022
Smartsort - A smart sorting algorithm for Go to sort filename containing digits that is not zero padded

smartsort A smart sorting algorithm for Go to sort filename containing digits th

Mar 26, 2022
Randomly generated tile maps using Oskar Stålberg's wave function collapse algorithm
Randomly generated tile maps using Oskar Stålberg's wave function collapse algorithm

go-wfc Procedurally-generated tile maps using wave function collapse. Demos Live demo (wasm): https://zfedoran.github.io/go-wfc-example/ Live algorith

Jun 29, 2022
Go implementation of Count-Min-Log

Count-Min-Log Count-Min-Log sketch: Approximately counting with approximate counters - Guillaume Pitel & Geoffroy Fouquier TL;DR: Count-Min-Log Sketch

Apr 6, 2022
A Go implementation of the Elias-Fano encoding

go-ef A Go implementation of the Elias-Fano encoding Example package main import ( "fmt" "github.com/amallia/go-ef" "os" ) func main() {

May 22, 2022
Set is a useful collection but there is no built-in implementation in Go lang.

goset Set is a useful collection but there is no built-in implementation in Go lang. Why? The only one pkg which provides set operations now is golang

Apr 4, 2022
A skip list implementation in Go

About This is a library implementing skip lists for the Go programming language (http://golang.org/). Skip lists are a data structure that can be used

Jun 1, 2022
Go implementation of C++ STL iterators and algorithms.

iter Go implementation of C++ STL iterators and algorithms. Less hand-written loops, more expressive code. README translations: 简体中文 Motivation Althou

Jun 15, 2022
Go implementation to calculate Levenshtein Distance.

levenshtein Go package to calculate the Levenshtein Distance The library is fully capable of working with non-ascii strings. But the strings are not n

Jun 29, 2022
A Merkle Tree implementation written in Go.
A Merkle Tree implementation written in Go.

Merkle Tree in Golang An implementation of a Merkle Tree written in Go. A Merkle Tree is a hash tree that provides an efficient way to verify the cont

Jun 22, 2022