A Go implementation of the 64-bit xxHash algorithm (XXH64)

Last update: Dec 22, 2022

xxhash

xxhash is a Go implementation of the 64-bit xxHash algorithm, XXH64. This is a high-quality hashing algorithm that is much faster than anything in the Go standard library.

This package provides a straightforward API:

func Sum64(b []byte) uint64
func Sum64String(s string) uint64
type Digest struct{ ... }
    func New() *Digest

The Digest type implements hash.Hash64. Its key methods are:

func (*Digest) Write([]byte) (int, error)
func (*Digest) WriteString(string) (int, error)
func (*Digest) Sum64() uint64
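Because Digest implements hash.Hash64, code written against that interface works with any 64-bit hasher. The sketch below illustrates the one-shot and streaming styles side by side. To keep it runnable with the standard library alone, hash/fnv stands in for xxhash; that substitution is an assumption of the sketch, and the helper names (newHasher, sumOneShot, sumStreamed) are hypothetical. With the package installed, xxhash.New() could be returned from newHasher instead.

```go
package main

import (
	"fmt"
	"hash"
	"hash/fnv"
	"io"
)

// newHasher returns some hash.Hash64. fnv.New64a is a standard-library
// stand-in (an assumption of this sketch); xxhash.New() satisfies the
// same interface.
func newHasher() hash.Hash64 {
	return fnv.New64a()
}

// sumOneShot hashes b in a single call, in the style of Sum64.
func sumOneShot(b []byte) uint64 {
	h := newHasher()
	h.Write(b) // Write on a hash.Hash never returns an error
	return h.Sum64()
}

// sumStreamed feeds the input piece by piece, in the style of
// repeated Write/WriteString calls on a Digest.
func sumStreamed(parts ...string) uint64 {
	h := newHasher()
	for _, p := range parts {
		io.WriteString(h, p)
	}
	return h.Sum64()
}

func main() {
	a := sumOneShot([]byte("hello, world"))
	b := sumStreamed("hello", ", ", "world")
	fmt.Println(a == b) // prints true
}
```

The hash depends only on the concatenated bytes, not on how they were split across writes, which is why the streaming form suits inputs that arrive in chunks.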

This package provides a fast pure-Go implementation and an even faster assembly implementation for amd64.

Compatibility

This package is in a module and the latest code is in version 2 of the module. You need a version of Go with at least "minimal module compatibility" to use github.com/cespare/xxhash/v2:

1.9.7+ for Go 1.9
1.10.3+ for Go 1.10
Go 1.11 or later

I recommend using the latest release of Go.

Benchmarks

Here are some quick benchmarks comparing the pure-Go and assembly implementations of Sum64.

input size	purego	asm
5 B	979.66 MB/s	1291.17 MB/s
100 B	7475.26 MB/s	7973.40 MB/s
4 KB	17573.46 MB/s	17602.65 MB/s
10 MB	17131.46 MB/s	17142.16 MB/s

These numbers were generated on Ubuntu 18.04 with an Intel i7-8700K CPU using the following commands under Go 1.11.2:

$ go test -tags purego -benchtime 10s -bench '/xxhash,direct,bytes'
$ go test -benchtime 10s -bench '/xxhash,direct,bytes'

Projects using this package

Owner

Caleb Spare

I make Go servers fast at @liftoffio.

https://github.com/cespare/xxhash

Comments

Assembler implementation for arm64

Fixes #45, but I haven't yet benchmarked this on actual ARM hardware. On amd64 with qemu, I get the following results after cherry-picking #50 (I have a version that includes a variant of #42 as well):

name                 old speed      new speed      delta
Sum64/4B-8            104MB/s ± 1%   142MB/s ± 1%  +36.00%  (p=0.000 n=10+10)
Sum64/100B-8         1.51GB/s ± 2%  1.97GB/s ± 1%  +30.42%  (p=0.000 n=10+10)
Sum64/4KB-8          4.66GB/s ± 1%  5.48GB/s ± 1%  +17.56%  (p=0.000 n=9+9)
Sum64/10MB-8         4.79GB/s ± 0%  5.45GB/s ± 0%  +13.78%  (p=0.000 n=10+9)
Sum64String/4B-8      108MB/s ± 1%   149MB/s ± 1%  +38.39%  (p=0.000 n=10+8)
Sum64String/100B-8   1.59GB/s ± 2%  2.01GB/s ± 1%  +25.84%  (p=0.000 n=9+9)
Sum64String/4KB-8    4.67GB/s ± 1%  5.54GB/s ± 1%  +18.76%  (p=0.000 n=9+10)
Sum64String/10MB-8   4.78GB/s ± 1%  5.46GB/s ± 0%  +14.08%  (p=0.000 n=9+10)
DigestBytes/4B-8     37.4MB/s ± 1%  37.7MB/s ± 1%     ~     (p=0.093 n=10+10)
DigestBytes/100B-8    634MB/s ± 0%   657MB/s ± 0%   +3.57%  (p=0.000 n=10+9)
DigestBytes/4KB-8    4.33GB/s ± 0%  4.87GB/s ± 1%  +12.50%  (p=0.000 n=10+9)
DigestBytes/10MB-8   4.90GB/s ± 0%  5.47GB/s ± 1%  +11.54%  (p=0.000 n=10+10)
DigestString/4B-8    31.5MB/s ± 1%  31.6MB/s ± 1%     ~     (p=0.447 n=10+10)
DigestString/100B-8   561MB/s ± 1%   576MB/s ± 1%   +2.67%  (p=0.000 n=10+9)
DigestString/4KB-8   4.22GB/s ± 1%  4.72GB/s ± 0%  +11.77%  (p=0.000 n=9+9)
DigestString/10MB-8  4.90GB/s ± 0%  5.47GB/s ± 0%  +11.56%  (p=0.000 n=9+9)

I have yet to test whether NEON instructions provide any further speedup. The code assumes that unaligned access is safe; a system control register flag can forbid unaligned access, but the Go compiler also assumes that flag is turned off.

missing v2 visibility

I am trying to build a govendor package, and github.com/prometheus/client_golang references xxhash/v2, specifically github.com/cespare/xxhash/v2 v2.1.0. Is that version available for build purposes?

reduce allocations when using New()

Hey Caleb,

Thanks for the library, and we should catch up soon. Anyway, we were benchmarking the following sequence:

h := cespare.New()
h.Write(p)
h.Write(o)
h.Sum64()

and found that this causes 1 memory allocation. We then changed New() from returning a hash.Hash64 to returning a *xxh. This removes the memory allocation and saves about 25ns per op on my MacBook Pro. Would you be open to that? It probably makes sense to make the xxh struct public at that point.

Second and pending the above, one of our key use cases involves hashing a uint64 followed by a string. It'd be useful to have a WriteString on *xxh that would avoid a memory allocation (analogous to Sum64String).

Thanks, Diego
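The trade-off described in this request can be sketched roughly as follows. The digest type here is a hypothetical toy (it computes FNV-1a, not XXH64) standing in for the package's hasher, and newIface/newConcrete are illustrative names only. The program counts heap allocations for both constructor styles; the actual numbers depend on the Go version and on inlining and escape-analysis decisions, so none are asserted here.

```go
package main

import (
	"fmt"
	"hash"
	"runtime"
)

// FNV-1a parameters; the toy digest below uses FNV-1a, not XXH64.
const (
	offset64 = 14695981039346656037
	prime64  = 1099511628211
)

// digest is a hypothetical stand-in for the package's hasher type.
type digest struct{ state uint64 }

func (d *digest) Write(p []byte) (int, error) {
	for _, c := range p {
		d.state ^= uint64(c)
		d.state *= prime64
	}
	return len(p), nil
}

func (d *digest) Sum64() uint64  { return d.state }
func (d *digest) Reset()         { d.state = offset64 }
func (d *digest) Size() int      { return 8 }
func (d *digest) BlockSize() int { return 1 }

// Sum appends the current hash to b in big-endian byte order.
func (d *digest) Sum(b []byte) []byte {
	s := d.state
	return append(b, byte(s>>56), byte(s>>48), byte(s>>40), byte(s>>32),
		byte(s>>24), byte(s>>16), byte(s>>8), byte(s))
}

// newIface hides the concrete type behind an interface.
func newIface() hash.Hash64 { return &digest{state: offset64} }

// newConcrete exposes the concrete pointer type.
func newConcrete() *digest { return &digest{state: offset64} }

var sink uint64 // keeps the compiler from discarding the results

// mallocs reports how many heap allocations happen while f runs.
func mallocs(f func()) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	f()
	runtime.ReadMemStats(&after)
	return after.Mallocs - before.Mallocs
}

func main() {
	p, o := []byte("prefix"), []byte("object")
	const runs = 1000

	viaIface := mallocs(func() {
		for i := 0; i < runs; i++ {
			h := newIface()
			h.Write(p)
			h.Write(o)
			sink = h.Sum64()
		}
	})
	viaPtr := mallocs(func() {
		for i := 0; i < runs; i++ {
			h := newConcrete()
			h.Write(p)
			h.Write(o)
			sink = h.Sum64()
		}
	})

	fmt.Printf("interface return: %d allocations in %d runs\n", viaIface, runs)
	fmt.Printf("concrete return:  %d allocations in %d runs\n", viaPtr, runs)
}
```

Returning the concrete pointer gives the compiler a better chance of keeping the value on the stack, since calls through an interface generally force it to assume the value escapes.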

Modify unsafe string to byte slice conversion in Sum64String

Hello, I think there might be a race condition in Sum64String which could be mitigated with a minor change. The gist of the issue is that Go's garbage collector can collect objects that are still in scope but no longer referenced (see, for example, this issue and this post on reddit). Consequently, Sum64String may suffer from the following race condition:

func Sum64String(s string) uint64 {
	var b []byte
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	// The garbage collector can run here, interrupting this function after the
	// previous line. It sees that `s` is not used after this point and determines
	// that it is eligible for collection. `sh` contains a `uintptr`, so the
	// garbage collector does not consider this pointer to be a live reference to
	// an object on the heap. Consequently, it determines that there are no live
	// references to the bytes that `s` pointed to and frees them.
	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	bh.Data = uintptr(unsafe.Pointer(sh.Data))
	bh.Len = sh.Len
	bh.Cap = sh.Len
	return Sum64(b)
}

I believe we can get around this problem by using s after we set b to point to the same bytes with the line:

l := len(s)

In this way, the garbage collector sees that s is still used after b's Data field has been updated to point to the underlying bytes, so the bytes must stay live until then.
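A related way to express the same intent is sketched below, under stated assumptions. Instead of the `l := len(s)` suggestion, it uses runtime.KeepAlive, a different mechanism that explicitly marks s as reachable until the hash has been computed. The sum64 helper is a standard-library stand-in (FNV-1a) for xxhash.Sum64, and both function names are hypothetical. Note that reflect.StringHeader and reflect.SliceHeader are deprecated in recent Go releases, so this is an illustration of the liveness argument rather than a recommended conversion.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"reflect"
	"runtime"
	"unsafe"
)

// sum64 is a stand-in for xxhash.Sum64 so that the sketch needs only
// the standard library.
func sum64(b []byte) uint64 {
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

// sum64String views s as a byte slice without copying, then hashes it.
func sum64String(s string) uint64 {
	var b []byte
	sh := (*reflect.StringHeader)(unsafe.Pointer(&s))
	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	bh.Data = sh.Data
	bh.Len = sh.Len
	bh.Cap = sh.Len
	sum := sum64(b)
	// Explicitly keep s reachable until hashing is done, so the bytes
	// it points to cannot be collected earlier.
	runtime.KeepAlive(s)
	return sum
}

func main() {
	fmt.Println(sum64String("hello") == sum64([]byte("hello"))) // prints true
}
```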

cannot find package "github.com/cespare/xxhash/v2"

Hi,

First, sorry for my English. :)

I am relatively new to Go programming, but I wanted to get the following code to work because I am working with Go and go-ethereum.

However, I always get the error below.

Despite an intensive Google search, I could not find a solution. Unfortunately, I do not know anybody who knows Go.

Maybe you can help me.

Many thanks

--Error--
src\github.com\VictoriaMetrics\fastcache\bigcache.go:7:2: cannot find package "github.com/cespare/xxhash/v2" in any of:
	C:\Go\src\github.com\cespare\xxhash\v2 (from $GOROOT)
	C:\Users\go\src\github.com\cespare\xxhash\v2 (from $GOPATH)

--Code--
package main

import (
	"context"
	"fmt"
	"log"
	"math"
	"math/big"

	"github.com/ethereum/go-ethereum/common"
	"github.com/ethereum/go-ethereum/ethclient"
)

func main() {
	client, err := ethclient.Dial("https://mainnet.infura.io")
	if err != nil {
		log.Fatal(err)
	}

	account := common.HexToAddress("0x71c7656ec7ab88b098defb751b7401b5f6d8976f")
	balance, err := client.BalanceAt(context.Background(), account, nil)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(balance) // 25893180161173005034

	blockNumber := big.NewInt(5532993)
	balanceAt, err := client.BalanceAt(context.Background(), account, blockNumber)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(balanceAt) // 25729324269165216042

	fbalance := new(big.Float)
	fbalance.SetString(balanceAt.String())
	ethValue := new(big.Float).Quo(fbalance, big.NewFloat(math.Pow10(18)))
	fmt.Println(ethValue) // 25.729324269165216041

	pendingBalance, err := client.PendingBalanceAt(context.Background(), account)
	fmt.Println(pendingBalance) // 25729324269165216042
}

Allow Sum64String and (*Digest).WriteString to be inlined

This is an alternative approach to #42.

Ideally the compiler would do mid-stack inlining for Sum64String since it's a minimal unsafe wrapper around Sum64:

func Sum64String(s string) uint64 {
	var b []byte
	bh := (*reflect.SliceHeader)(unsafe.Pointer(&b))
	bh.Data = (*reflect.StringHeader)(unsafe.Pointer(&s)).Data
	bh.Len = len(s)
	bh.Cap = len(s)
	return Sum64(b)
}

Unfortunately, the weight the inliner computes is too high. I filed https://github.com/golang/go/issues/42739.

In the meantime, I found some tricks (with help from @josharian) to generate a lower cost that gets us below the threshold value.

Additionally, add tests to confirm that Sum64String and (*Digest).WriteString are inlined.

Benchmarks:

name                  old time/op    new time/op    delta
Sum64String/4B-12       4.78ns ± 1%    3.57ns ± 4%  -25.27%  (p=0.000 n=8+10)
Sum64String/100B-12     14.5ns ± 1%    12.9ns ± 0%  -10.76%  (p=0.000 n=9+10)
Sum64String/4KB-12       229ns ± 0%     229ns ± 1%     ~     (p=0.395 n=7+10)
Sum64String/10MB-12      628µs ± 1%     630µs ± 2%     ~     (p=1.000 n=9+10)
DigestString/4B-12      11.4ns ± 1%     9.7ns ± 1%  -14.95%  (p=0.000 n=10+10)
DigestString/100B-12    23.6ns ± 1%    21.3ns ± 2%   -9.65%  (p=0.000 n=10+10)
DigestString/4KB-12      241ns ± 1%     239ns ± 0%   -0.67%  (p=0.001 n=10+7)
DigestString/10MB-12     627µs ± 1%     628µs ± 1%     ~     (p=0.631 n=10+10)

name                  old speed      new speed      delta
Sum64String/4B-12      837MB/s ± 1%  1124MB/s ± 2%  +34.42%  (p=0.000 n=10+9)
Sum64String/100B-12   6.88GB/s ± 2%  7.72GB/s ± 1%  +12.16%  (p=0.000 n=10+10)
Sum64String/4KB-12    17.5GB/s ± 0%  17.5GB/s ± 1%     ~     (p=0.408 n=8+10)
Sum64String/10MB-12   15.9GB/s ± 1%  15.9GB/s ± 2%     ~     (p=1.000 n=9+10)
DigestString/4B-12     350MB/s ± 1%   411MB/s ± 1%  +17.55%  (p=0.000 n=10+10)
DigestString/100B-12  4.23GB/s ± 1%  4.69GB/s ± 1%  +10.84%  (p=0.000 n=10+9)
DigestString/4KB-12   16.6GB/s ± 1%  16.7GB/s ± 0%   +0.67%  (p=0.001 n=10+8)
DigestString/10MB-12  16.0GB/s ± 1%  15.9GB/s ± 1%     ~     (p=0.631 n=10+10)

And with -tags purego:

name                  old time/op    new time/op    delta
Sum64String/4B-12       5.57ns ± 1%    4.22ns ± 1%  -24.14%  (p=0.000 n=10+9)
Sum64String/100B-12     16.0ns ± 1%    14.8ns ± 0%   -7.27%  (p=0.000 n=10+6)
Sum64String/4KB-12       327ns ± 2%     325ns ± 1%     ~     (p=0.050 n=10+10)
Sum64String/10MB-12      866µs ± 3%     856µs ± 0%   -1.05%  (p=0.002 n=9+8)
DigestString/4B-12      11.2ns ± 1%    10.0ns ± 1%  -10.90%  (p=0.000 n=10+9)
DigestString/100B-12    25.5ns ± 1%    22.8ns ± 0%  -10.62%  (p=0.000 n=10+9)
DigestString/4KB-12      342ns ± 1%     340ns ± 1%   -0.56%  (p=0.018 n=9+10)
DigestString/10MB-12     877µs ± 1%     878µs ± 2%     ~     (p=0.400 n=10+9)

name                  old speed      new speed      delta
Sum64String/4B-12      718MB/s ± 1%   947MB/s ± 1%  +31.82%  (p=0.000 n=10+9)
Sum64String/100B-12   6.26GB/s ± 1%  6.75GB/s ± 1%   +7.81%  (p=0.000 n=10+10)
Sum64String/4KB-12    12.2GB/s ± 2%  12.3GB/s ± 1%   +0.70%  (p=0.022 n=10+9)
Sum64String/10MB-12   11.6GB/s ± 3%  11.7GB/s ± 0%   +1.05%  (p=0.002 n=9+8)
DigestString/4B-12     357MB/s ± 1%   401MB/s ± 1%  +12.32%  (p=0.000 n=10+9)
DigestString/100B-12  3.93GB/s ± 1%  4.40GB/s ± 0%  +11.95%  (p=0.000 n=10+9)
DigestString/4KB-12   11.7GB/s ± 1%  11.8GB/s ± 1%   +0.68%  (p=0.011 n=10+10)
DigestString/10MB-12  11.4GB/s ± 1%  11.4GB/s ± 2%     ~     (p=0.400 n=10+9)

/cc @greatroar

cespare/xxhash/v2 was deleted

@cespare we hit the following error when building our project with glide:

[ERROR] Error scanning github.com/cespare/xxhash/v2: cannot find package "." in: /home/chamith/.glide/cache/src/https-github.com-cespare-xxhash/v2

go sum mismatch

While attempting to use the Prometheus client, which depends on this package (github.com/prometheus/client_golang v1.11.0), running go get returns the following error:

verifying github.com/cespare/xxhash/[email protected]: checksum mismatch
	downloaded: h1:47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
	go.sum:     h1:6MnRN8NT7+YBpUIWxHtefFZOKTAPgGjpQSxqLNn0+qY=

SECURITY ERROR
This download does NOT match an earlier download recorded in go.sum.
The bits may have been replaced on the origin server, or an attacker may
have intercepted the download attempt.

This was working last week, and now it appears that the checksum has changed, breaking my builds that use this package.

My GOSUMDB environment variable is set to: GOSUMDB="sum.golang.org"

Replace ADDQ $c, r by LEAQ c(r), r in amd64 assembly

On older CPUs such as i7-3770K, this gives a small speedup:

name                                  old speed      new speed      delta
Hashes/xxhash,direct,bytes,n=5B-8      854MB/s ± 0%   851MB/s ± 1%    ~     (p=0.720 n=9+10)
Hashes/xxhash,direct,string,n=5B-8     569MB/s ± 1%   571MB/s ± 1%    ~     (p=0.123 n=10+10)
Hashes/xxhash,digest,bytes,n=5B-8      234MB/s ± 1%   235MB/s ± 1%  +0.43%  (p=0.029 n=10+10)
Hashes/xxhash,digest,string,n=5B-8     210MB/s ± 0%   211MB/s ± 1%  +0.64%  (p=0.000 n=9+10)
Hashes/xxhash,direct,bytes,n=100B-8   5.57GB/s ± 0%  5.68GB/s ± 1%  +1.99%  (p=0.000 n=7+10)
Hashes/xxhash,direct,string,n=100B-8  4.91GB/s ± 0%  5.05GB/s ± 1%  +2.96%  (p=0.000 n=10+10)
Hashes/xxhash,digest,bytes,n=100B-8   3.09GB/s ± 0%  3.15GB/s ± 0%  +1.82%  (p=0.000 n=10+10)
Hashes/xxhash,digest,string,n=100B-8  2.58GB/s ± 0%  2.61GB/s ± 0%  +0.94%  (p=0.000 n=10+10)
Hashes/xxhash,direct,bytes,n=4KB-8    14.5GB/s ± 0%  14.5GB/s ± 1%  +0.30%  (p=0.017 n=10+9)
Hashes/xxhash,direct,string,n=4KB-8   14.3GB/s ± 0%  14.4GB/s ± 1%  +1.08%  (p=0.000 n=7+10)
Hashes/xxhash,digest,bytes,n=4KB-8    13.4GB/s ± 1%  13.6GB/s ± 1%  +1.24%  (p=0.000 n=10+10)
Hashes/xxhash,digest,string,n=4KB-8   13.2GB/s ± 0%  13.5GB/s ± 1%  +2.48%  (p=0.000 n=9+10)
Hashes/xxhash,direct,bytes,n=10MB-8   13.1GB/s ± 1%  13.2GB/s ± 1%  +1.12%  (p=0.000 n=10+10)
Hashes/xxhash,direct,string,n=10MB-8  13.1GB/s ± 0%  13.2GB/s ± 0%  +0.50%  (p=0.000 n=10+10)
Hashes/xxhash,digest,bytes,n=10MB-8   13.1GB/s ± 1%  13.1GB/s ± 1%    ~     (p=0.053 n=9+10)
Hashes/xxhash,digest,string,n=10MB-8  13.1GB/s ± 1%  13.2GB/s ± 0%  +0.61%  (p=0.008 n=10+9)

v2 module indirectly imports v1 module

github.com/cespare/xxhash/v2 currently requires the github.com/cespare/xxhash (v1) module:

$ go mod why github.com/cespare/xxhash
# github.com/cespare/xxhash
example/import
github.com/cespare/xxhash/v2
github.com/cespare/xxhash/v2.test
github.com/OneOfOne/xxhash
github.com/OneOfOne/xxhash.test
github.com/cespare/xxhash

A later version of github.com/OneOfOne/xxhash (v1.2.5) has moved their benchmarks to a separate sub-module which prevents the indirect import of github.com/cespare/xxhash.

Please update your github.com/OneOfOne/xxhash dependency -- this will remove the unnecessary github.com/cespare/xxhash module reference.

Also, please consider moving your benchmarks to a sub-module, since most users of your module won't be interested in the alternate modules used for benchmarking.

Separate unsafe code for App Engine support.

Unfortunately, App Engine doesn't allow unsafe. This PR makes the library compatible with App Engine by falling back to a safe implementation with the appengine build tag.

Make it impossible to accidentally use an uninitialized Digest

I noticed a sharp edge of the API. It's tempting (especially if xxhash.Digest is embedded in a larger structure) to do this:

var d xxhash.Digest
...
d.Write(...)
...
h := d.Sum64()

But this is broken: the zero value of Digest is not usable. You must call Reset first.

We should fix this, either by making the incorrect usage crash or by automatically calling Reset when an uninitialized Digest is used. Hopefully the branch is very predictable and doesn't add much cost.
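The second option, resetting automatically on first use, might look roughly like the sketch below. The lazyDigest type is a hypothetical toy that computes FNV-1a rather than XXH64; it was chosen because FNV-1a also needs a non-zero initial state, so a zero-valued struct has the same problem. The ready flag and the helper names are assumptions made for illustration, not the package's design.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// FNV-1a parameters. The non-zero offset basis is why the zero value
// of the struct is not a valid starting state by itself.
const (
	offset64 = 14695981039346656037
	prime64  = 1099511628211
)

// lazyDigest is a hypothetical toy hasher whose zero value is usable.
type lazyDigest struct {
	state uint64
	ready bool // false until Reset has run once
}

// Reset puts the digest into its initial state.
func (d *lazyDigest) Reset() {
	d.state = offset64
	d.ready = true
}

// Write initializes the digest on first use, then absorbs p.
func (d *lazyDigest) Write(p []byte) (int, error) {
	if !d.ready {
		d.Reset()
	}
	for _, c := range p {
		d.state ^= uint64(c)
		d.state *= prime64
	}
	return len(p), nil
}

// Sum64 returns the current hash, treating an untouched digest as
// having hashed the empty input.
func (d *lazyDigest) Sum64() uint64 {
	if !d.ready {
		d.Reset()
	}
	return d.state
}

// zeroValueSum hashes b with a digest that was never explicitly reset.
func zeroValueSum(b []byte) uint64 {
	var d lazyDigest
	d.Write(b)
	return d.Sum64()
}

// referenceSum hashes b with the standard library's FNV-1a.
func referenceSum(b []byte) uint64 {
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

func main() {
	msg := []byte("hello")
	fmt.Println(zeroValueSum(msg) == referenceSum(msg)) // prints true
}
```

The cost is one extra, highly predictable branch per call, plus a bool in the struct.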

Add Copy method

The standard way to copy a hash is via (Un)MarshalBinary, which xxhash supports. But that's both inconvenient and inefficient for a small, fast hash like xxhash. A Copy method would be simple and fast:

func (d *Digest) Copy() *Digest
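To make the comparison concrete, here is a sketch of both routes. The marshal-based clone uses the standard library's hash/fnv, whose hashers implement encoding.BinaryMarshaler and encoding.BinaryUnmarshaler; it stands in for xxhash so the block runs without external modules. The toyDigest type and its Copy method are hypothetical and only illustrate what a plain struct copy could look like.

```go
package main

import (
	"encoding"
	"errors"
	"fmt"
	"hash/fnv"
)

// FNV-1a parameters used by the toy digest below.
const (
	offset64 = 14695981039346656037
	prime64  = 1099511628211
)

// oneShot hashes s in a single pass with the standard library's FNV-1a.
func oneShot(s string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(s))
	return h.Sum64()
}

// forkViaMarshal hashes prefix, clones the hasher through its binary
// encoding, and finishes the clone with suffix.
func forkViaMarshal(prefix, suffix string) (uint64, error) {
	base := fnv.New64a()
	base.Write([]byte(prefix))

	m, ok := base.(encoding.BinaryMarshaler)
	if !ok {
		return 0, errors.New("hasher cannot be marshaled")
	}
	state, err := m.MarshalBinary()
	if err != nil {
		return 0, err
	}

	clone := fnv.New64a()
	u, ok := clone.(encoding.BinaryUnmarshaler)
	if !ok {
		return 0, errors.New("hasher cannot be unmarshaled")
	}
	if err := u.UnmarshalBinary(state); err != nil {
		return 0, err
	}
	clone.Write([]byte(suffix))
	return clone.Sum64(), nil
}

// toyDigest is a hypothetical FNV-1a hasher with a Copy method.
type toyDigest struct{ state uint64 }

// WriteString absorbs the bytes of s.
func (d *toyDigest) WriteString(s string) {
	for i := 0; i < len(s); i++ {
		d.state ^= uint64(s[i])
		d.state *= prime64
	}
}

// Copy returns an independent digest with the same state.
func (d *toyDigest) Copy() *toyDigest {
	c := *d
	return &c
}

// forkViaCopy does the same fork with a plain struct copy.
func forkViaCopy(prefix, suffix string) uint64 {
	base := &toyDigest{state: offset64}
	base.WriteString(prefix)
	clone := base.Copy()
	clone.WriteString(suffix)
	return clone.state
}

func main() {
	viaMarshal, err := forkViaMarshal("tenant-42/", "a")
	if err != nil {
		fmt.Println("clone failed:", err)
		return
	}
	fmt.Println(viaMarshal == oneShot("tenant-42/a"))                     // prints true
	fmt.Println(forkViaCopy("tenant-42/", "b") == oneShot("tenant-42/b")) // prints true
}
```

The marshal route needs two type assertions, an intermediate byte slice, and error handling, while the struct copy is a single assignment.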

Rename Digest to Hash

Digest is not a good name. Normally "digest" refers to the output of the hash function.

In v3 we could consider renaming it; perhaps to Hash.

Typical code probably doesn't refer to Digest by name and doesn't have to change:

d := xxhash.New()
... call d.Write() ...
x := d.Sum64()

Remove special appengine/safe code support

Appengine supports unsafe since they rolled out their new runtime a couple of years ago.

I don't think we should need any special mention of appengine now, or any build tag configuration which avoids unsafe.

Remove amd64 assembly if it isn't faster than generated Go code

Last I looked at the benchmarks (several Go releases ago), the compiler had closed a lot of ground and the amd64 assembly version was only a little faster. We should check again, and more comprehensively (i.e. against a wider array of amd64 CPUs). If the asm version has no comprehensive advantage on the latest Go version, we should delete the assembly code and stick to pure Go.

If we did that, it would be in a new major version of the package.

If the hand-rolled assembly still has an advantage, consider filing compiler bugs for any obvious deficiencies in the generated code.

Remove duplicated function definitions

After Go 1.13 comes out, we can freely use "forwarding" functions and avoid duplicated definitions since these trivial mid-stack functions should be inlined now: https://github.com/golang/go/issues/8421

Related tags

Data Structures xxhash

IntSet - Integer based Set based on a bit-vector

Sroar - 64-bit Roaring Bitmaps in Go

Implementation of Boyer-Moore fast string search algorithm in Go

A Go implementation of the core algorithm in paper

Data structure and algorithm library for go, designed to provide functions similar to C++ STL

Multi-String Pattern Matching Algorithm Using TrieHashNode

golang sorting algorithm and data construction.

A record of algorithm learning and of progress through LeetCode, LintCode, and Codewars.

The simplest sorting algorithm that sorts in quadratic time

Is this the simplest (and most surprising) sorting algorithm ever?

Smartsort - A smart sorting algorithm for Go to sort filename containing digits that is not zero padded

Randomly generated tile maps using Oskar Stålberg's wave function collapse algorithm

Go implementation of Count-Min-Log

A Go implementation of the Elias-Fano encoding

Set is a useful collection but there is no built-in implementation in Go lang.

A skip list implementation in Go

Go implementation of C++ STL iterators and algorithms.

Go implementation to calculate Levenshtein Distance.

A Merkle Tree implementation written in Go.