Reed-Solomon Erasure Coding in Go

Reed-Solomon

Reed-Solomon Erasure Coding in Go, with speeds exceeding 1GB/s/cpu core implemented in pure Go.

This is a Go port of the JavaReedSolomon library released by Backblaze, with some additional optimizations.

For an introduction on erasure coding, see the post on the Backblaze blog.

Package home: https://github.com/klauspost/reedsolomon

Godoc: https://pkg.go.dev/github.com/klauspost/reedsolomon?tab=doc

Installation

To get the package use the standard:

go get -u github.com/klauspost/reedsolomon

Using Go modules is recommended.

Changes

May 2020

  • ARM64 optimizations, up to 2.5x faster.
  • Added WithFastOneParityMatrix for faster operation with 1 parity shard.
  • Much better performance when using a limited number of goroutines.
  • AVX512 is now using multiple cores.
  • Stream processing overhaul, big speedups in most cases.
  • AVX512 optimizations

March 6, 2019

The pure Go implementation is about 30% faster. Minor tweaks to assembler implementations.

February 8, 2019

AVX512 accelerated version added for Intel Skylake CPUs. This can give up to a 4x speed improvement as compared to AVX2. See the "Performance on AVX512" section for more details.

December 18, 2018

Assembly code for ppc64le has been contributed; this boosts performance by about 10x on this platform.

November 18, 2017

Added WithAutoGoroutines which will attempt to calculate the optimal number of goroutines to use based on your expected shard size and detected CPU.

October 1, 2017

  • Cauchy Matrix is now an option. Thanks to templexxx for the basis of this.

  • Default maximum number of goroutines has been increased for better multi-core scaling.

  • After several requests, Reconstruct and ReconstructData now allow slices of zero length but sufficient capacity to be used instead of allocating new memory.

August 26, 2017

  • The Encoder interface now contains an Update function contributed by chenzhongtao.

  • Frank Wessels kindly contributed ARM 64 bit assembly, which gives a huge performance boost on this platform.

July 20, 2017

ReconstructData added to Encoder interface. This can cause compatibility issues if you implement your own Encoder. A simple workaround can be added:

func (e *YourEnc) ReconstructData(shards [][]byte) error {
	// Reconstructing all shards also restores the missing data shards.
	return e.Reconstruct(shards)
}

You can of course also do your own implementation. The StreamEncoder handles this without modifying the interface. This is a good lesson on why returning interfaces is not a good design.

Usage

This section assumes you know the basics of Reed-Solomon encoding. A good start is this Backblaze blog post.

This package performs the calculation of the parity sets. The usage is therefore relatively simple.

First of all, you need to choose your distribution of data and parity shards. A 'good' distribution is very subjective, and will depend a lot on your usage scenario. A good starting point is above 5 and below 257 data shards (the maximum supported number), and the number of parity shards to be 2 or above, and below the number of data shards.

To create an encoder with 10 data shards (where your data goes) and 3 parity shards (calculated):

    enc, err := reedsolomon.New(10, 3)

This encoder will work for all parity sets with this distribution of data and parity shards. The error will only be set if you specify 0 or negative values in any of the parameters, or if you specify more than 256 data shards.

If you will primarily be using it with one shard size it is recommended to use WithAutoGoroutines(shardSize) as an additional parameter. This will attempt to calculate the optimal number of goroutines to use for the best speed. It is not required that all shards are this size.

The data you send and receive is a simple slice of byte slices; [][]byte. In the example above, the top slice must have a length of 13.

    data := make([][]byte, 13)

You should then fill the 10 first slices with equally sized data, and create parity shards that will be populated with parity data. In this case we create the data in memory, but you could for instance also use mmap to map files.

    // Create all shards, size them at 50000 each
    for i := range data {
      data[i] = make([]byte, 50000)
    }

    // Fill some data into the data shards
    for i, in := range data[:10] {
      for j := range in {
        in[j] = byte((i+j)&0xff)
      }
    }

To populate the parity shards, you simply call Encode() with your data.

    err = enc.Encode(data)

The only case where you should get an error is if the data shards aren't of equal size. The last 3 shards now contain parity data. You can verify this by calling Verify():

    ok, err := enc.Verify(data)

The final (and important) part is to be able to reconstruct missing shards. For this to work, you need to know which parts of your data are missing. The encoder does not know which parts are invalid, so if data corruption is a likely scenario, you need to implement a hash check for each shard.

If a byte has changed in your set, and you don't know which it is, there is no way to reconstruct the data set.
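
One possible shape for such a hash check, sketched with only the standard library: keep a SHA-256 digest per shard at encode time, and set every shard that no longer matches to nil before reconstruction. The helpers `hashShards` and `dropCorrupt` are hypothetical names for this illustration and are not part of this package.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hashShards returns one SHA-256 digest per shard (hypothetical helper).
func hashShards(shards [][]byte) [][sha256.Size]byte {
	sums := make([][sha256.Size]byte, len(shards))
	for i, s := range shards {
		sums[i] = sha256.Sum256(s)
	}
	return sums
}

// dropCorrupt sets every shard whose digest no longer matches to nil and
// returns the number of shards that are now missing (hypothetical helper).
func dropCorrupt(shards [][]byte, sums [][sha256.Size]byte) int {
	missing := 0
	for i, s := range shards {
		if s == nil || sha256.Sum256(s) != sums[i] {
			shards[i] = nil
			missing++
		}
	}
	return missing
}

func main() {
	shards := [][]byte{[]byte("aaaa"), []byte("bbbb"), []byte("cccc")}
	sums := hashShards(shards)

	shards[1][0] ^= 0xff // simulate silent corruption of one shard

	n := dropCorrupt(shards, sums)
	fmt.Println("missing:", n, "shard 1 is nil:", shards[1] == nil)
}
```

The digests have to be stored somewhere the corruption cannot reach them together with the shard, for example in separate metadata.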

To indicate missing data, you set the shard to nil before calling Reconstruct():

    // Delete two data shards
    data[3] = nil
    data[7] = nil
    
    // Reconstruct the missing shards
    err = enc.Reconstruct(data)

The missing data and parity shards will be recreated. If more than 3 shards are missing, the reconstruction will fail.

If you are only interested in the data shards (for reading purposes) you can call ReconstructData():

    // Delete two data shards
    data[3] = nil
    data[7] = nil
    
    // Reconstruct just the missing data shards
    err = enc.ReconstructData(data)

So to sum up reconstruction:

  • The number of data/parity shards must match the numbers used for encoding.
  • The order of shards must be the same as used when encoding.
  • You may only supply data you know is valid.
  • Invalid shards should be set to nil.

For complete examples of an encoder and decoder see the examples folder.

Splitting/Joining Data

You might have a large slice of data. To help you split this, there are some helper functions that can split and join a single byte slice.

   bigfile, _ := ioutil.ReadFile("myfile.data")
   
   // Split the file
   split, err := enc.Split(bigfile)

This will split the file into the number of data shards set when creating the encoder and create empty parity shards.

An important thing to note is that you have to keep track of the exact input size. If the size of the input isn't divisible by the number of data shards, extra zeros will be inserted in the last shard.
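
The padding arithmetic can be sketched as follows. This is a standalone illustration of the rule described above (ceiling division, with the remainder filled by zeros), not the package's own code; `shardLayout` is a hypothetical helper.

```go
package main

import "fmt"

// shardLayout returns the per-shard size and the number of zero bytes of
// padding needed when inputSize bytes are spread over dataShards shards
// of equal size.
func shardLayout(inputSize, dataShards int) (perShard, padding int) {
	perShard = (inputSize + dataShards - 1) / dataShards // ceiling division
	padding = perShard*dataShards - inputSize
	return perShard, padding
}

func main() {
	perShard, padding := shardLayout(1_000_003, 10)
	fmt.Println(perShard, padding) // prints "100001 7"
}
```

Because the padding cannot be told apart from real trailing zeros, the original length (1000003 in this example) has to be stored alongside the shards.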

To join a data set, use the Join() function, which will join the shards and write it to the io.Writer you supply:

   // Join a data set and write it to io.Discard.
   err = enc.Join(io.Discard, data, len(bigfile))

Streaming/Merging

It might seem like a limitation that all data should be in memory, but an important property is that as long as the number of data/parity shards are the same, you can merge/split data sets, and they will remain valid as a separate set.

    // Split the data set of 50000 elements into two of 25000
    splitA := make([][]byte, 13)
    splitB := make([][]byte, 13)
    
    // Merge into a 100000 element set
    merged := make([][]byte, 13)
    
    for i := range data {
      splitA[i] = data[i][:25000]
      splitB[i] = data[i][25000:]
      
      // Concatenate it to itself
      merged[i] = append(make([]byte, 0, len(data[i])*2), data[i]...)
      merged[i] = append(merged[i], data[i]...)
    }
    
    // Each part should still verify as ok.
    ok, err := enc.Verify(splitA)
    if ok && err == nil {
        log.Println("splitA ok")
    }
    
    ok, err = enc.Verify(splitB)
    if ok && err == nil {
        log.Println("splitB ok")
    }
    
    ok, err = enc.Verify(merged)
    if ok && err == nil {
        log.Println("merge ok")
    }

This means that if you have a data set that may not fit into memory, you can split processing into smaller blocks. For the best throughput, don't use too small blocks.

This also means that you can divide big input up into smaller blocks, and do reconstruction on parts of your data. This doesn't give the same flexibility of a higher number of data shards, but it will be much more performant.
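
Why slicing keeps a set valid can be illustrated with a much simpler code. The sketch below uses a single XOR parity shard as a stand-in for Reed-Solomon (a deliberate simplification to keep the example free of dependencies). It shares the relevant property: each parity byte depends only on the data bytes at the same offset, so any byte range of a valid set is itself a valid set. `xorParity`, `verify` and `slice` are hypothetical helpers.

```go
package main

import (
	"bytes"
	"fmt"
)

// xorParity returns a parity shard in which byte j is the XOR of byte j
// of every data shard. All data shards must have the same length.
func xorParity(data [][]byte) []byte {
	parity := make([]byte, len(data[0]))
	for _, shard := range data {
		for j, b := range shard {
			parity[j] ^= b
		}
	}
	return parity
}

// verify reports whether the last shard is the XOR parity of the others.
func verify(shards [][]byte) bool {
	n := len(shards) - 1
	return bytes.Equal(xorParity(shards[:n]), shards[n])
}

// slice returns the byte range [from, to) of every shard.
func slice(shards [][]byte, from, to int) [][]byte {
	out := make([][]byte, len(shards))
	for i, s := range shards {
		out[i] = s[from:to]
	}
	return out
}

func main() {
	data := [][]byte{[]byte("abcdefgh"), []byte("ijklmnop"), []byte("qrstuvwx")}
	set := append(data, xorParity(data))

	// The full set and both halves verify independently.
	fmt.Println(verify(set), verify(slice(set, 0, 4)), verify(slice(set, 4, 8))) // prints "true true true"
}
```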

Streaming API

A streaming API is also available, which lets you perform the same operations on streams instead of in-memory slices. To use the stream API, use the NewStream function to create the encoding/decoding interfaces.

You can use WithConcurrentStreams to ready an interface that reads/writes concurrently from the streams.

You can specify the size of each operation using WithStreamBlockSize. This will set the size of each read/write operation.

Input is delivered as []io.Reader, output as []io.Writer, and functionality corresponds to the in-memory API. Each stream must supply the same amount of data, similar to how each slice must be the same size with the in-memory API. If an error occurs in relation to a stream, a StreamReadError or StreamWriteError will help you determine which stream was the offender.

There is no buffering or timeouts/retry specified. If you want to add that, you need to add it to the Reader/Writer.

For complete examples of a streaming encoder and decoder see the examples folder.

Advanced Options

You can modify internal options which affect how jobs are split between and processed by goroutines.

To create options, use the WithXXX functions. You can supply options to New and NewStream. If no Options are supplied, default options are used.

Example of how to supply options:

    enc, err := reedsolomon.New(10, 3, reedsolomon.WithMaxGoroutines(25))

Performance

Performance depends mainly on the number of parity shards. In rough terms, doubling the number of parity shards will double the encoding time.

Here are the throughput numbers with some different selections of data and parity shards. For reference each shard is 1MB random data, and 16 CPU cores are used for encoding.

Data    Parity    Go MB/s    SSSE3 MB/s    AVX2 MB/s
5       2         14287      66355         108755
8       8         5569       34298         70516
10      4         6766       48237         93875
50      20        1540       12130         22090

The throughput numbers here count the combined size of the encoded data and parity shards.

If runtime.GOMAXPROCS() is set to a value higher than 1, the encoder will use multiple goroutines to perform the calculations in Verify, Encode and Reconstruct.

Example of performance scaling on AMD Ryzen 3950X - 16 physical cores, 32 logical cores, AVX 2. The example uses 10 blocks with 1MB data each and 4 parity blocks.

Threads    Speed
1          9979 MB/s
2          18870 MB/s
4          33697 MB/s
8          51531 MB/s
16         59204 MB/s

Benchmarking Reconstruct() followed by a Verify() (=all) versus just calling ReconstructData() (=data) gives the following result:

benchmark                            all MB/s     data MB/s    speedup
BenchmarkReconstruct10x2x10000-8     2011.67      10530.10     5.23x
BenchmarkReconstruct50x5x50000-8     4585.41      14301.60     3.12x
BenchmarkReconstruct10x2x1M-8        8081.15      28216.41     3.49x
BenchmarkReconstruct5x2x1M-8         5780.07      28015.37     4.85x
BenchmarkReconstruct10x4x1M-8        4352.56      14367.61     3.30x
BenchmarkReconstruct50x20x1M-8       1364.35      4189.79      3.07x
BenchmarkReconstruct10x4x16M-8       1484.35      5779.53      3.89x

Performance on AVX512

The performance on AVX512 has been accelerated for Intel CPUs. This gives speedups on a per-core basis typically up to 2x compared to AVX2 as can be seen in the following table:

[...]

This speedup has been achieved by computing multiple parity blocks in parallel as opposed to one after the other. In doing so it is possible to minimize the memory bandwidth required for loading all data shards. At the same time the calculations are performed in the 512-bit wide ZMM registers and the surplus of ZMM registers (32 in total) is used to keep more data around (most notably the matrix coefficients).

Performance on ARM64 NEON

By exploiting NEON instructions the performance for ARM has been accelerated. Below are the performance numbers for a single core on an EC2 m6g.16xlarge (Graviton2) instance (Amazon Linux 2):

BenchmarkGalois128K-64        119562     10028 ns/op        13070.78 MB/s
BenchmarkGalois1M-64           14380     83424 ns/op        12569.22 MB/s
BenchmarkGaloisXor128K-64      96508     12432 ns/op        10543.29 MB/s
BenchmarkGaloisXor1M-64        10000    100322 ns/op        10452.13 MB/s

Performance on ppc64le

The performance for ppc64le has been accelerated. This gives roughly a 10x performance improvement on this architecture as can be seen below:

benchmark                      old MB/s     new MB/s     speedup
BenchmarkGalois128K-160        948.87       8878.85      9.36x
BenchmarkGalois1M-160          968.85       9041.92      9.33x
BenchmarkGaloisXor128K-160     862.02       7905.00      9.17x
BenchmarkGaloisXor1M-160       784.60       6296.65      8.03x

asm2plan9s

asm2plan9s is used for assembling the AVX2 instructions into their BYTE/WORD/LONG equivalents.

License

This code, like the original JavaReedSolomon, is published under an MIT license. See the LICENSE file for more information.

Comments
  • Inversion cache leaking memory?

    Hey! I'll start off and say that everything said here is nothing but an assumption/not verified yet supported by facts we've concluded during our attempt to solve our issue. I'll make this as short as I possibly can; we're using the package inside our project which is meant to support high rates of data transmission. What we're doing is as follows: we transfer files from one end to another, each file eventually to be reconstructed by reedsolomon using this package. The way we do is we assign a suitable amount of shards and parity per each file and on the receiving end we grab these numbers and use a map to map a pair of shards & parity to an encoder. I.e. reuse encoders. The problem is, when we reuse encoders, our memory constantly goes up until we reach out of memory. When we do not reuse and create a new encoder per each file we receive, our memory doesn't go up but stays still. We're talking massive amounts of RAM caught up by this. Reaching up to 360GB RAM in less than an hour. I would love to know if there's something we're missing, perhaps is it possible that our encoders hold the data of our matrices? Are the encoders stateless that they allow reuse or do they keep data stored after they're used? Any help is appreciated

  • not scalable well with multiple go routines

    I restrict each coder to use one goroutine and I launch 8 goroutines for decoding different stuff, but the performance does not scale well. Here are my test cases, 3 data chunk, 1 parity chunk, chunk size 128B - 16K.

    Using aws c5.2xlarge (AVX512 support), only test decoding.

    If chunk size is 128 Byte, single thread throughput is 635 MB/s, while 8 goroutines give around 921 MB/s (aggregated throughput).

    If chunk size is 1024 Byte, single thread throughput is around 3704 MB/s, while 8 goroutines give around 5730 MB/s (aggregated throughput).

    If chunk size is 16384 Byte, single thread throughput is around 9370 MB/s, while 8 goroutines give around 17466 MB/s (aggregated throughput).

  • matrix is singular error

    Hi, I am using the example's files to test the code. When I take a 100MB file with 250 data shards and 125 parity shards and then delete shards 0-49 and 270-279, reconstruction fails with a matrix is singular error. Is this supposed to happen? I thought I could lose up to 125 shards and still recover the data. What am I missing? Thanks, Amit

  • Slice bounds out of range at ReconstructData

    Hey. I’m getting a slice bounds out of range with capacity 0 at galMulAVX512Parallel84 (galoisAvx512_amd64.go:194)

    It happens only very rarely: everything runs smoothly and many matrices are processed, but after a while it panics on one that hits this case, and I can’t work out what is happening. I’ll provide some code shortly, but there isn’t much to see, since it’s a 116-shard matrix: 106 data shards and 10 parity shards.

    Would love some help, thanks!

  • There's a performance improved fork of this project

    I'm debian pkg maintainer of your reedsolomon project. It's got my attention that there's a fork of your project that claims big performance boost:

    • https://github.com/templexxx/reedsolomon

    I asked the fork author to send the improvement patch to you upstream, but he/she seems kinda not interested in it.

    • https://github.com/templexxx/reedsolomon/issues/9

    I looked at two project and find there's much differences that beyond my ability to make such patches. So maybe you're interested in those improvements and can take a look at the project? Thanks!

  • Benchmarks are wrong as you're testing very small chunks

    Hi. Your benchmark data is wrong because when you're testing 10kb chunks everything resides in cache and you get very high performance. For example BenchmarkEncode5x2x1M = 14 GB/s on my laptop with GOMAXPROCS=1, but if I raise chunk size to 1G it becomes 5.5 GB/s. And if I set it to 10000 the result becomes 23.5 GB/s which is over the head :)) 5.5 GB/s is still good of course, ISA-L gives something like 5.6 GB/s with the same test on 1 core. But your numbers are slightly misleading :)

  • how to deal with partially overwriting

    I've checked the examples but had no luck. For example, I have 4Mb of data and want to use this package to split it into 1Mb data blocks and add parity to it. After that I need to overwrite the range from 2Mb to 3Mb (1Mb of data). I know that I can read back the 4Mb of data, modify it and write it again, but this is significant overhead in the case of massive reads/writes. How can I handle this case without re-reading the whole data block?

  • adjust minSplitSize to 64K

    I'm using reedsolomon for UDP packet(<1500Bytes for each packet), I don't want multiple goroutines started for each packet I received, it's too expensive. I think setting the minSplitSize to 64K will be appropriate.

  • PAR2 Creation

    I've tried to find one but is there a project which uses this repo to create PAR2 files? I've been looking for a faster alternative to https://github.com/animetosho/ParPar.

  • Export the `reedSolomon` struct

    I believe a more idiomatic Go approach would be to make the reedSolomon struct exported, e.g. by renaming it to ReedSolomon. Then the New() function should also be changed to return *ReedSolomon, making the already capitalized fields of this struct visible to its users.

    As it is now, these fields (DataShards, ParityShards, Shards) are not visible because New returns an Encoder interface, and this causes me to have to store these same values also in my own struct, because I can't access the fields of the reedSolomon struct.

    I've implemented this change locally, and I'd be happy to provide a pull request (pending a response to this issue). It requires three minor additional changes (essentially avoiding a few type assertions), but I believe this change is non-breaking, since the returned *ReedSolomon object will implement the Encoder interface. The tests still pass after my change.

  • Allow 0 parity shards

    Although it doesn't allow a way to reconstruct the data, I think it should be allowed to pass 0 parity shards to the code. First, it doesn't "harm" the code, it would still work, but will require that every dataShard is still intact - which in my opinion is ok. Second, right now if a program use the library, and for some "input type" it should use parity and for some it shouldn't, it required from the programmer to write entire section dealing with 0 parity which would do the same as the library.

    Am I missing something basic about the library or is it simple as removing the "if" in file reedsolomon.go(line 107)

  • Align allocations

    For AMD64 aligned inputs can make a big speed difference.

    This is an example of the speed difference when inputs are unaligned/aligned:

    BenchmarkEncode100x20x10000-32    	    7058	    172648 ns/op	6950.57 MB/s
    BenchmarkEncode100x20x10000-32    	    8406	    137911 ns/op	8701.24 MB/s
    

    This is mostly the case when dealing with odd-sized shards.

    To facilitate this the package provides an AllocAligned(shards, each int) [][]byte. This will allocate a number of shards, each with the size each. Each shard will then be aligned to a 64 byte boundary.

    Each encoder also has an AllocAligned(each int) [][]byte as an extended interface which will return the same, but with the shard count configured in the encoder.

    It is not possible to re-align already allocated slices, for example when using Split. When it is not possible to write to aligned shards, you should not copy to them.

    Full (but rather noisy) benchmark:

    benchmark                                                old ns/op      new ns/op      delta
    BenchmarkGalois128K-32                                   2284           2254           -1.31%
    BenchmarkGalois1M-32                                     21925          19042          -13.15%
    BenchmarkGaloisXor128K-32                                2810           2782           -1.00%
    BenchmarkGaloisXor1M-32                                  24223          22716          -6.22%
    BenchmarkEncode2x1x1M-32                                 38969          33115          -15.02%
    BenchmarkEncode800x200/64-32                             29007          28090          -3.16%
    BenchmarkEncode800x200/256-32                            65858          64747          -1.69%
    BenchmarkEncode800x200/1024-32                           207661         203905         -1.81%
    BenchmarkEncode800x200/4096-32                           806579         789913         -2.07%
    BenchmarkEncode800x200/16384-32                          4088967        3688426        -9.80%
    BenchmarkEncode800x200/65536-32                          27241951       24104804       -11.52%
    BenchmarkEncode800x200/262144-32                         120608789      113648633      -5.77%
    BenchmarkEncode800x200/1048576-32                        451364367      420720500      -6.79%
    BenchmarkEncode1K/4+4/cauchy-32                          335            345            +2.96%
    BenchmarkEncode1K/4+4/leopard-gf8-32                     640            632            -1.33%
    BenchmarkEncode1K/4+4/leopard-gf16-32                    455            436            -4.16%
    BenchmarkEncode1K/8+8/cauchy-32                          1099           1081           -1.64%
    BenchmarkEncode1K/8+8/leopard-gf8-32                     1831           1792           -2.13%
    BenchmarkEncode1K/8+8/leopard-gf16-32                    1608           1586           -1.37%
    BenchmarkEncode1K/16+16/cauchy-32                        4340           4372           +0.74%
    BenchmarkEncode1K/16+16/leopard-gf8-32                   3330           3280           -1.50%
    BenchmarkEncode1K/16+16/leopard-gf16-32                  2637           2614           -0.87%
    BenchmarkEncode1K/32+32/cauchy-32                        17257          17397          +0.81%
    BenchmarkEncode1K/32+32/leopard-gf8-32                   9849           9623           -2.29%
    BenchmarkEncode1K/32+32/leopard-gf16-32                  8903           8806           -1.09%
    BenchmarkEncode1K/64+64/cauchy-32                        68672          68374          -0.43%
    BenchmarkEncode1K/64+64/leopard-gf8-32                   18283          17992          -1.59%
    BenchmarkEncode1K/64+64/leopard-gf16-32                  15558          15541          -0.11%
    BenchmarkEncode1K/128+128/cauchy-32                      270881         270547         -0.12%
    BenchmarkEncode1K/128+128/leopard-gf8-32                 49601          48871          -1.47%
    BenchmarkEncode1K/128+128/leopard-gf16-32                46158          45735          -0.92%
    BenchmarkEncode1K/256+256/leopard-gf16-32                84268          83318          -1.13%
    BenchmarkEncode1K/512+512/leopard-gf16-32                235278         231775         -1.49%
    BenchmarkEncode1K/1024+1024/leopard-gf16-32              436245         430979         -1.21%
    BenchmarkEncode1K/2048+2048/leopard-gf16-32              1227665        1108337        -9.72%
    BenchmarkEncode1K/4096+4096/leopard-gf16-32              2573166        2273580        -11.64%
    BenchmarkEncode1K/8192+8192/leopard-gf16-32              7377235        6443287        -12.66%
    BenchmarkEncode1K/16384+16384/leopard-gf16-32            20045895       17426286       -13.07%
    BenchmarkEncode1K/32768+32768/leopard-gf16-32            53570005       50222577       -6.25%
    BenchmarkDecode1K/4+4/cauchy-32                          2160           2262           +4.72%
    BenchmarkDecode1K/4+4/cauchy-inv-32                      1369           1461           +6.72%
    BenchmarkDecode1K/4+4/cauchy-single-32                   1300           1304           +0.31%
    BenchmarkDecode1K/4+4/cauchy-single-inv-32               604            633            +4.71%
    BenchmarkDecode1K/4+4/leopard-gf8-32                     4277           4176           -2.36%
    BenchmarkDecode1K/4+4/leopard-gf8-inv-32                 2455           2330           -5.09%
    BenchmarkDecode1K/4+4/leopard-gf8-single-32              3891           3750           -3.62%
    BenchmarkDecode1K/4+4/leopard-gf8-single-inv-32          1974           1914           -3.04%
    BenchmarkDecode1K/4+4/leopard-gf16-32                    794838         792366         -0.31%
    BenchmarkDecode1K/4+4/leopard-gf16-single-32             791991         793335         +0.17%
    BenchmarkDecode1K/8+8/cauchy-32                          5445           5651           +3.78%
    BenchmarkDecode1K/8+8/cauchy-inv-32                      2920           3039           +4.08%
    BenchmarkDecode1K/8+8/cauchy-single-32                   2290           2285           -0.22%
    BenchmarkDecode1K/8+8/cauchy-single-inv-32               788            814            +3.30%
    BenchmarkDecode1K/8+8/leopard-gf8-32                     7270           7134           -1.87%
    BenchmarkDecode1K/8+8/leopard-gf8-inv-32                 5433           5156           -5.10%
    BenchmarkDecode1K/8+8/leopard-gf8-single-32              6168           6123           -0.73%
    BenchmarkDecode1K/8+8/leopard-gf8-single-inv-32          4302           4365           +1.46%
    BenchmarkDecode1K/8+8/leopard-gf16-32                    792261         787290         -0.63%
    BenchmarkDecode1K/8+8/leopard-gf16-single-32             800258         793031         -0.90%
    BenchmarkDecode1K/16+16/cauchy-32                        21492          21664          +0.80%
    BenchmarkDecode1K/16+16/cauchy-inv-32                    7805           8140           +4.29%
    BenchmarkDecode1K/16+16/cauchy-single-32                 5176           5105           -1.37%
    BenchmarkDecode1K/16+16/cauchy-single-inv-32             1211           1194           -1.40%
    BenchmarkDecode1K/16+16/leopard-gf8-32                   16062          15813          -1.55%
    BenchmarkDecode1K/16+16/leopard-gf8-inv-32               14737          13697          -7.06%
    BenchmarkDecode1K/16+16/leopard-gf8-single-32            13899          13393          -3.64%
    BenchmarkDecode1K/16+16/leopard-gf8-single-inv-32        11865          11487          -3.19%
    BenchmarkDecode1K/16+16/leopard-gf16-32                  805833         794984         -1.35%
    BenchmarkDecode1K/16+16/leopard-gf16-single-32           808125         799816         -1.03%
    BenchmarkDecode1K/32+32/cauchy-32                        117646         117088         -0.47%
    BenchmarkDecode1K/32+32/cauchy-inv-32                    23975          24454          +2.00%
    BenchmarkDecode1K/32+32/cauchy-single-32                 14250          13861          -2.73%
    BenchmarkDecode1K/32+32/cauchy-single-inv-32             2018           1994           -1.19%
    BenchmarkDecode1K/32+32/leopard-gf8-32                   31886          32203          +0.99%
    BenchmarkDecode1K/32+32/leopard-gf8-inv-32               30959          30270          -2.23%
    BenchmarkDecode1K/32+32/leopard-gf8-single-32            20726          22461          +8.37%
    BenchmarkDecode1K/32+32/leopard-gf8-single-inv-32        20717          22592          +9.05%
    BenchmarkDecode1K/32+32/leopard-gf16-32                  823015         808697         -1.74%
    BenchmarkDecode1K/32+32/leopard-gf16-single-32           816765         809475         -0.89%
    BenchmarkDecode1K/64+64/cauchy-32                        813174         804646         -1.05%
    BenchmarkDecode1K/64+64/cauchy-inv-32                    80979          82487          +1.86%
    BenchmarkDecode1K/64+64/cauchy-single-32                 45979          44580          -3.04%
    BenchmarkDecode1K/64+64/cauchy-single-inv-32             3432           3286           -4.25%
    BenchmarkDecode1K/64+64/leopard-gf8-32                   78343          74364          -5.08%
    BenchmarkDecode1K/64+64/leopard-gf8-inv-32               73370          63120          -13.97%
    BenchmarkDecode1K/64+64/leopard-gf8-single-32            50130          43959          -12.31%
    BenchmarkDecode1K/64+64/leopard-gf8-single-inv-32        51308          41208          -19.69%
    BenchmarkDecode1K/64+64/leopard-gf16-32                  864012         846280         -2.05%
    BenchmarkDecode1K/64+64/leopard-gf16-single-32           850149         830551         -2.31%
    BenchmarkDecode1K/128+128/cauchy-32                      5929095        5900026        -0.49%
    BenchmarkDecode1K/128+128/cauchy-inv-32                  304087         306298         +0.73%
    BenchmarkDecode1K/128+128/cauchy-single-32               164090         160974         -1.90%
    BenchmarkDecode1K/128+128/cauchy-single-inv-32           5850           5625           -3.85%
    BenchmarkDecode1K/128+128/leopard-gf8-32                 158429         145910         -7.90%
    BenchmarkDecode1K/128+128/leopard-gf8-inv-32             152309         141621         -7.02%
    BenchmarkDecode1K/128+128/leopard-gf8-single-32          112267         94170          -16.12%
    BenchmarkDecode1K/128+128/leopard-gf8-single-inv-32      104646         96275          -8.00%
    BenchmarkDecode1K/128+128/leopard-gf16-32                927823         920083         -0.83%
    BenchmarkDecode1K/128+128/leopard-gf16-single-32         893019         885971         -0.79%
    BenchmarkDecode1K/256+256/leopard-gf16-32                1132479        1105774        -2.36%
    BenchmarkDecode1K/256+256/leopard-gf16-single-32         1017945        1003342        -1.43%
    BenchmarkDecode1K/512+512/leopard-gf16-32                1495247        1457558        -2.52%
    BenchmarkDecode1K/512+512/leopard-gf16-single-32         1276089        1239965        -2.83%
    BenchmarkDecode1K/1024+1024/leopard-gf16-32              2511310        2355124        -6.22%
    BenchmarkDecode1K/1024+1024/leopard-gf16-single-32       1926875        1786114        -7.31%
    BenchmarkDecode1K/2048+2048/leopard-gf16-32              4574758        4051357        -11.44%
    BenchmarkDecode1K/2048+2048/leopard-gf16-single-32       3404487        2936912        -13.73%
    BenchmarkDecode1K/4096+4096/leopard-gf16-32              9917650        9381317        -5.41%
    BenchmarkDecode1K/4096+4096/leopard-gf16-single-32       7439868        6237255        -16.16%
    BenchmarkDecode1K/8192+8192/leopard-gf16-32              27173871       22125130       -18.58%
    BenchmarkDecode1K/8192+8192/leopard-gf16-single-32       19590423       15888578       -18.90%
    BenchmarkDecode1K/16384+16384/leopard-gf16-32            65490106       60630937       -7.42%
    BenchmarkDecode1K/16384+16384/leopard-gf16-single-32     43015162       40455732       -5.95%
    BenchmarkDecode1K/32768+32768/leopard-gf16-32            137665712      121400489      -11.82%
    BenchmarkDecode1K/32768+32768/leopard-gf16-single-32     89620746       84439785       -5.78%
    BenchmarkEncodeLeopard/83840-32                          38754790       34248175       -11.63%
    BenchmarkEncode10x2x10000-32                             3146           3104           -1.34%
    BenchmarkEncode100x20x10000-32                           156083         132442         -15.15%
    BenchmarkEncode17x3x1M-32                                266619         549537         +106.11%
    BenchmarkEncode10x4x16M-32                               8589080        10118010       +17.80%
    BenchmarkEncode5x2x1M-32                                 69710          67893          -2.61%
    BenchmarkEncode10x2x1M-32                                105670         107245         +1.49%
    BenchmarkEncode10x4x1M-32                                163118         176928         +8.47%
    BenchmarkEncode50x20x1M-32                               3118381        14567976       +367.16%
    BenchmarkEncodeLeopard50x20x1M-32                        10595922       11908254       +12.39%
    BenchmarkEncode17x3x16M-32                               10172963       11955679       +17.52%
    BenchmarkEncode_8x4x8M-32                                3728191        3439514        -7.74%
    BenchmarkEncode_12x4x12M-32                              6803056        6497932        -4.49%
    BenchmarkEncode_16x4x16M-32                              10882620       10934306       +0.47%
    BenchmarkEncode_16x4x32M-32                              21758418       21431559       -1.50%
    BenchmarkEncode_16x4x64M-32                              45258777       43288619       -4.35%
    BenchmarkEncode_8x5x8M-32                                4385065        4089020        -6.75%
    BenchmarkEncode_8x6x8M-32                                4826290        4502941        -6.70%
    BenchmarkEncode_8x7x8M-32                                5403098        4968727        -8.04%
    BenchmarkEncode_8x9x8M-32                                6302487        6002242        -4.76%
    BenchmarkEncode_8x10x8M-32                               6913882        6637816        -3.99%
    BenchmarkEncode_8x11x8M-32                               7326503        7059232        -3.65%
    BenchmarkEncode_8x8x05M-32                               144080         311435         +116.15%
    BenchmarkEncode_8x8x1M-32                                291361         274498         -5.79%
    BenchmarkEncode_8x8x8M-32                                5802282        5384482        -7.20%
    BenchmarkEncode_8x8x32M-32                               24160839       23571850       -2.44%
    BenchmarkEncode_24x8x24M-32                              36720647       30780145       -16.18%
    BenchmarkEncode_24x8x48M-32                              63394650       63509811       +0.18%
    BenchmarkVerify800x200/64-32                             40375          38159          -5.49%
    BenchmarkVerify800x200/256-32                            83771          78225          -6.62%
    BenchmarkVerify800x200/1024-32                           271752         240426         -11.53%
    BenchmarkVerify800x200/4096-32                           1082853        929348         -14.18%
    BenchmarkVerify800x200/16384-32                          5725732        4615986        -19.38%
    BenchmarkVerify800x200/65536-32                          32934571       26172560       -20.53%
    BenchmarkVerify800x200/262144-32                         155253386      126563067      -18.48%
    BenchmarkVerify800x200/1048576-32                        561659750      490361633      -12.69%
    BenchmarkVerify10x2x10000-32                             5904           5500           -6.84%
    BenchmarkVerify50x5x100000-32                            211479         171168         -19.06%
    BenchmarkVerify10x2x1M-32                                504104         425526         -15.59%
    BenchmarkVerify5x2x1M-32                                 395883         335088         -15.36%
    BenchmarkVerify10x4x1M-32                                1089718        938154         -13.91%
    BenchmarkVerify50x20x1M-32                               7889230        8431484        +6.87%
    BenchmarkVerify10x4x16M-32                               23219342       17756713       -23.53%
    BenchmarkReconstruct10x2x10000-32                        3222           3117           -3.26%
    BenchmarkReconstruct800x200/64-32                        1179363        1100846        -6.66%
    BenchmarkReconstruct800x200/256-32                       1294499        1214204        -6.20%
    BenchmarkReconstruct800x200/1024-32                      1881303        1737606        -7.64%
    BenchmarkReconstruct800x200/4096-32                      4420506        3860332        -12.67%
    BenchmarkReconstruct800x200/16384-32                     27171535       22810506       -16.05%
    BenchmarkReconstruct800x200/65536-32                     119789411      108652470      -9.30%
    BenchmarkReconstruct800x200/262144-32                    531271300      505694550      -4.81%
    BenchmarkReconstruct800x200/1048576-32                   2918642700     2408563700     -17.48%
    BenchmarkReconstruct50x5x50000-32                        148109         134759         -9.01%
    BenchmarkReconstruct10x2x1M-32                           185262         186595         +0.72%
    BenchmarkReconstruct5x2x1M-32                            129164         126902         -1.75%
    BenchmarkReconstruct10x4x1M-32                           276915         268351         -3.09%
    BenchmarkReconstruct50x20x1M-32                          3541594        6128024        +73.03%
    BenchmarkReconstructLeopard50x20x1M-32                   30241625       29009541       -4.07%
    BenchmarkReconstruct10x4x16M-32                          9437829        8883464        -5.87%
    BenchmarkReconstructData10x2x10000-32                    3060           2988           -2.35%
    BenchmarkReconstructData800x200/64-32                    1147537        1091256        -4.90%
    BenchmarkReconstructData800x200/256-32                   1263484        1212899        -4.00%
    BenchmarkReconstructData800x200/1024-32                  1839242        1725180        -6.20%
    BenchmarkReconstructData800x200/4096-32                  4362936        3849980        -11.76%
    BenchmarkReconstructData800x200/16384-32                 26168836       22671938       -13.36%
    BenchmarkReconstructData800x200/65536-32                 120528889      107130140      -11.12%
    BenchmarkReconstructData800x200/262144-32                563569000      475725700      -15.59%
    BenchmarkReconstructData800x200/1048576-32               2615405500     2429356800     -7.11%
    BenchmarkReconstructData50x5x50000-32                    143226         129456         -9.61%
    BenchmarkReconstructData10x2x1M-32                       172211         237621         +37.98%
    BenchmarkReconstructData5x2x1M-32                        112454         100520         -10.61%
    BenchmarkReconstructData10x4x1M-32                       228400         304907         +33.50%
    BenchmarkReconstructData50x20x1M-32                      2314132        4094795        +76.95%
    BenchmarkReconstructData10x4x16M-32                      7298997        6668680        -8.64%
    BenchmarkReconstructP10x2x10000-32                       826            829            +0.35%
    BenchmarkReconstructP10x5x20000-32                       1474           1574           +6.78%
    BenchmarkSplit10x4x160M-32                               5724818        5133214        -10.33%
    BenchmarkSplit5x2x5M-32                                  185120         158360         -14.46%
    BenchmarkSplit10x2x1M-32                                 30804          26726          -13.24%
    BenchmarkSplit10x4x10M-32                                362056         329514         -8.99%
    BenchmarkSplit50x20x50M-32                               1822737        1658791        -8.99%
    BenchmarkSplit17x3x272M-32                               4211272        3692685        -12.31%
    BenchmarkParallel_8x8x64K-32                             6069           9427           +55.33%
    BenchmarkParallel_8x8x05M-32                             363357         355407         -2.19%
    BenchmarkParallel_20x10x05M-32                           597304         595410         -0.32%
    BenchmarkParallel_8x8x1M-32                              714469         709469         -0.70%
    BenchmarkParallel_8x8x8M-32                              5690194        5753258        +1.11%
    BenchmarkParallel_8x8x32M-32                             22776537       22795771       +0.08%
    BenchmarkParallel_8x3x1M-32                              404962         405967         +0.25%
    BenchmarkParallel_8x4x1M-32                              466195         467154         +0.21%
    BenchmarkParallel_8x5x1M-32                              528804         529565         +0.14%
    BenchmarkStreamEncode10x2x10000-32                       5614           5579           -0.62%
    BenchmarkStreamEncode100x20x10000-32                     270235         254176         -5.94%
    BenchmarkStreamEncode17x3x1M-32                          1517849        1472684        -2.98%
    BenchmarkStreamEncode10x4x16M-32                         19262797       18293434       -5.03%
    BenchmarkStreamEncode5x2x1M-32                           417544         403906         -3.27%
    BenchmarkStreamEncode10x2x1M-32                          823367         821781         -0.19%
    BenchmarkStreamEncode10x4x1M-32                          884722         856973         -3.14%
    BenchmarkStreamEncode50x20x1M-32                         6553097        12518249       +91.03%
    BenchmarkStreamEncode17x3x16M-32                         28583679       27318121       -4.43%
    BenchmarkStreamVerify10x2x10000-32                       8255           8086           -2.05%
    BenchmarkStreamVerify50x5x50000-32                       659940         636418         -3.56%
    BenchmarkStreamVerify10x2x1M-32                          1239043        1195332        -3.53%
    BenchmarkStreamVerify5x2x1M-32                           765149         737183         -3.65%
    BenchmarkStreamVerify10x4x1M-32                          1617370        1584146        -2.05%
    BenchmarkStreamVerify50x20x1M-32                         9435350        11808476       +25.15%
    BenchmarkStreamVerify10x4x16M-32                         29359025       27062443       -7.82%
    
    benchmark                                                old MB/s      new MB/s      speedup
    BenchmarkGalois128K-32                                   57379.29      58153.95      1.01x
    BenchmarkGalois1M-32                                     47824.71      55066.66      1.15x
    BenchmarkGaloisXor128K-32                                46639.10      47116.53      1.01x
    BenchmarkGaloisXor1M-32                                  43287.93      46160.20      1.07x
    BenchmarkEncode2x1x1M-32                                 80724.26      94994.70      1.18x
    BenchmarkEncode800x200/64-32                             2206.35       2278.42       1.03x
    BenchmarkEncode800x200/256-32                            3887.13       3953.86       1.02x
    BenchmarkEncode800x200/1024-32                           4931.12       5021.94       1.02x
    BenchmarkEncode800x200/4096-32                           5078.24       5185.38       1.02x
    BenchmarkEncode800x200/16384-32                          4006.88       4442.00       1.11x
    BenchmarkEncode800x200/65536-32                          2405.70       2718.79       1.13x
    BenchmarkEncode800x200/262144-32                         2173.51       2306.62       1.06x
    BenchmarkEncode800x200/1048576-32                        2323.13       2492.33       1.07x
    BenchmarkEncode1K/4+4/cauchy-32                          24475.48      23772.24      0.97x
    BenchmarkEncode1K/4+4/leopard-gf8-32                     12789.95      12961.00      1.01x
    BenchmarkEncode1K/4+4/leopard-gf16-32                    18014.46      18797.28      1.04x
    BenchmarkEncode1K/8+8/cauchy-32                          14909.02      15154.24      1.02x
    BenchmarkEncode1K/8+8/leopard-gf8-32                     8946.12       9142.99       1.02x
    BenchmarkEncode1K/8+8/leopard-gf16-32                    10190.76      10330.89      1.01x
    BenchmarkEncode1K/16+16/cauchy-32                        7549.49       7494.83       0.99x
    BenchmarkEncode1K/16+16/leopard-gf8-32                   9840.18       9990.30       1.02x
    BenchmarkEncode1K/16+16/leopard-gf16-32                  12423.98      12533.28      1.01x
    BenchmarkEncode1K/32+32/cauchy-32                        3797.58       3767.08       0.99x
    BenchmarkEncode1K/32+32/leopard-gf8-32                   6654.34       6810.61       1.02x
    BenchmarkEncode1K/32+32/leopard-gf16-32                  7361.49       7442.01       1.01x
    BenchmarkEncode1K/64+64/cauchy-32                        1908.67       1916.98       1.00x
    BenchmarkEncode1K/64+64/leopard-gf8-32                   7169.14       7285.10       1.02x
    BenchmarkEncode1K/64+64/leopard-gf16-32                  8424.57       8433.80       1.00x
    BenchmarkEncode1K/128+128/cauchy-32                      967.74        968.94        1.00x
    BenchmarkEncode1K/128+128/leopard-gf8-32                 5285.04       5364.05       1.01x
    BenchmarkEncode1K/128+128/leopard-gf16-32                5679.29       5731.83       1.01x
    BenchmarkEncode1K/256+256/leopard-gf16-32                6221.66       6292.60       1.01x
    BenchmarkEncode1K/512+512/leopard-gf16-32                4456.75       4524.11       1.02x
    BenchmarkEncode1K/1024+1024/leopard-gf16-32              4807.28       4866.02       1.01x
    BenchmarkEncode1K/2048+2048/leopard-gf16-32              3416.49       3784.32       1.11x
    BenchmarkEncode1K/4096+4096/leopard-gf16-32              3260.03       3689.60       1.13x
    BenchmarkEncode1K/8192+8192/leopard-gf16-32              2274.19       2603.83       1.14x
    BenchmarkEncode1K/16384+16384/leopard-gf16-32            1673.88       1925.51       1.15x
    BenchmarkEncode1K/32768+32768/leopard-gf16-32            1252.73       1336.23       1.07x
    BenchmarkDecode1K/4+4/cauchy-32                          3792.93       3622.19       0.95x
    BenchmarkDecode1K/4+4/cauchy-inv-32                      5983.86       5606.62       0.94x
    BenchmarkDecode1K/4+4/cauchy-single-32                   6300.83       6280.54       1.00x
    BenchmarkDecode1K/4+4/cauchy-single-inv-32               13551.36      12940.92      0.95x
    BenchmarkDecode1K/4+4/leopard-gf8-32                     1915.14       1961.54       1.02x
    BenchmarkDecode1K/4+4/leopard-gf8-inv-32                 3337.34       3516.48       1.05x
    BenchmarkDecode1K/4+4/leopard-gf8-single-32              2105.18       2184.65       1.04x
    BenchmarkDecode1K/4+4/leopard-gf8-single-inv-32          4150.58       4281.05       1.03x
    BenchmarkDecode1K/4+4/leopard-gf16-32                    10.31         10.34         1.00x
    BenchmarkDecode1K/4+4/leopard-gf16-single-32             10.34         10.33         1.00x
    BenchmarkDecode1K/8+8/cauchy-32                          3009.11       2899.32       0.96x
    BenchmarkDecode1K/8+8/cauchy-inv-32                      5611.04       5390.93       0.96x
    BenchmarkDecode1K/8+8/cauchy-single-32                   7155.24       7171.21       1.00x
    BenchmarkDecode1K/8+8/cauchy-single-inv-32               20782.54      20118.64      0.97x
    BenchmarkDecode1K/8+8/leopard-gf8-32                     2253.61       2296.49       1.02x
    BenchmarkDecode1K/8+8/leopard-gf8-inv-32                 3015.54       3177.67       1.05x
    BenchmarkDecode1K/8+8/leopard-gf8-single-32              2656.40       2675.83       1.01x
    BenchmarkDecode1K/8+8/leopard-gf8-single-inv-32          3808.73       3753.53       0.99x
    BenchmarkDecode1K/8+8/leopard-gf16-32                    20.68         20.81         1.01x
    BenchmarkDecode1K/8+8/leopard-gf16-single-32             20.47         20.66         1.01x
    BenchmarkDecode1K/16+16/cauchy-32                        1524.69       1512.59       0.99x
    BenchmarkDecode1K/16+16/cauchy-inv-32                    4198.25       4025.31       0.96x
    BenchmarkDecode1K/16+16/cauchy-single-32                 6330.39       6419.31       1.01x
    BenchmarkDecode1K/16+16/cauchy-single-inv-32             27068.69      27452.92      1.01x
    BenchmarkDecode1K/16+16/leopard-gf8-32                   2040.15       2072.28       1.02x
    BenchmarkDecode1K/16+16/leopard-gf8-inv-32               2223.54       2392.34       1.08x
    BenchmarkDecode1K/16+16/leopard-gf8-single-32            2357.57       2446.68       1.04x
    BenchmarkDecode1K/16+16/leopard-gf8-single-inv-32        2761.64       2852.71       1.03x
    BenchmarkDecode1K/16+16/leopard-gf16-32                  40.66         41.22         1.01x
    BenchmarkDecode1K/16+16/leopard-gf16-single-32           40.55         40.97         1.01x
    BenchmarkDecode1K/32+32/cauchy-32                        557.06        559.72        1.00x
    BenchmarkDecode1K/32+32/cauchy-inv-32                    2733.51       2679.96       0.98x
    BenchmarkDecode1K/32+32/cauchy-single-32                 4599.08       4728.23       1.03x
    BenchmarkDecode1K/32+32/cauchy-single-inv-32             32476.37      32874.63      1.01x
    BenchmarkDecode1K/32+32/leopard-gf8-32                   2055.34       2035.06       0.99x
    BenchmarkDecode1K/32+32/leopard-gf8-inv-32               2116.85       2165.02       1.02x
    BenchmarkDecode1K/32+32/leopard-gf8-single-32            3162.09       2917.71       0.92x
    BenchmarkDecode1K/32+32/leopard-gf8-single-inv-32        3163.38       2900.87       0.92x
    BenchmarkDecode1K/32+32/leopard-gf16-32                  79.63         81.04         1.02x
    BenchmarkDecode1K/32+32/leopard-gf16-single-32           80.24         80.96         1.01x
    BenchmarkDecode1K/64+64/cauchy-32                        161.19        162.89        1.01x
    BenchmarkDecode1K/64+64/cauchy-inv-32                    1618.58       1589.00       0.98x
    BenchmarkDecode1K/64+64/cauchy-single-32                 2850.67       2940.14       1.03x
    BenchmarkDecode1K/64+64/cauchy-single-inv-32             38190.74      39891.62      1.04x
    BenchmarkDecode1K/64+64/leopard-gf8-32                   1673.05       1762.57       1.05x
    BenchmarkDecode1K/64+64/leopard-gf8-inv-32               1786.44       2076.54       1.16x
    BenchmarkDecode1K/64+64/leopard-gf8-single-32            2614.63       2981.72       1.14x
    BenchmarkDecode1K/64+64/leopard-gf8-single-inv-32        2554.62       3180.75       1.25x
    BenchmarkDecode1K/64+64/leopard-gf16-32                  151.70        154.88        1.02x
    BenchmarkDecode1K/64+64/leopard-gf16-single-32           154.18        157.81        1.02x
    BenchmarkDecode1K/128+128/cauchy-32                      44.21         44.43         1.00x
    BenchmarkDecode1K/128+128/cauchy-inv-32                  862.07        855.85        0.99x
    BenchmarkDecode1K/128+128/cauchy-single-32               1597.56       1628.49       1.02x
    BenchmarkDecode1K/128+128/cauchy-single-inv-32           44812.17      46602.36      1.04x
    BenchmarkDecode1K/128+128/leopard-gf8-32                 1654.65       1796.62       1.09x
    BenchmarkDecode1K/128+128/leopard-gf8-inv-32             1721.13       1851.02       1.08x
    BenchmarkDecode1K/128+128/leopard-gf8-single-32          2335.01       2783.74       1.19x
    BenchmarkDecode1K/128+128/leopard-gf8-single-inv-32      2505.05       2722.86       1.09x
    BenchmarkDecode1K/128+128/leopard-gf16-32                282.54        284.91        1.01x
    BenchmarkDecode1K/128+128/leopard-gf16-single-32         293.55        295.88        1.01x
    BenchmarkDecode1K/256+256/leopard-gf16-32                462.96        474.14        1.02x
    BenchmarkDecode1K/256+256/leopard-gf16-single-32         515.05        522.54        1.01x
    BenchmarkDecode1K/512+512/leopard-gf16-32                701.27        719.41        1.03x
    BenchmarkDecode1K/512+512/leopard-gf16-single-32         821.71        845.65        1.03x
    BenchmarkDecode1K/1024+1024/leopard-gf16-32              835.08        890.46        1.07x
    BenchmarkDecode1K/1024+1024/leopard-gf16-single-32       1088.37       1174.14       1.08x
    BenchmarkDecode1K/2048+2048/leopard-gf16-32              916.84        1035.28       1.13x
    BenchmarkDecode1K/2048+2048/leopard-gf16-single-32       1231.99       1428.13       1.16x
    BenchmarkDecode1K/4096+4096/leopard-gf16-32              845.83        894.18        1.06x
    BenchmarkDecode1K/4096+4096/leopard-gf16-single-32       1127.52       1344.92       1.19x
    BenchmarkDecode1K/8192+8192/leopard-gf16-32              617.40        758.29        1.23x
    BenchmarkDecode1K/8192+8192/leopard-gf16-single-32       856.40        1055.93       1.23x
    BenchmarkDecode1K/16384+16384/leopard-gf16-32            512.36        553.42        1.08x
    BenchmarkDecode1K/16384+16384/leopard-gf16-single-32     780.06        829.41        1.06x
    BenchmarkDecode1K/32768+32768/leopard-gf16-32            487.48        552.79        1.13x
    BenchmarkDecode1K/32768+32768/leopard-gf16-single-32     748.81        794.75        1.06x
    BenchmarkEncodeLeopard/83840-32                          2163.35       2448.01       1.13x
    BenchmarkEncode10x2x10000-32                             38138.59      38660.50      1.01x
    BenchmarkEncode100x20x10000-32                           7688.19       9060.59       1.18x
    BenchmarkEncode17x3x1M-32                                78657.27      38162.16      0.49x
    BenchmarkEncode10x4x16M-32                               27346.47      23214.15      0.85x
    BenchmarkEncode5x2x1M-32                                 105293.15     108112.25     1.03x
    BenchmarkEncode10x2x1M-32                                119077.13     117328.97     0.99x
    BenchmarkEncode10x4x1M-32                                89996.44      82971.98      0.92x
    BenchmarkEncode50x20x1M-32                               23537.96      5038.47       0.21x
    BenchmarkEncodeLeopard50x20x1M-32                        6927.22       6163.82       0.89x
    BenchmarkEncode17x3x16M-32                               32983.93      28065.68      0.85x
    BenchmarkEncode_8x4x8M-32                                27000.57      29266.72      1.08x
    BenchmarkEncode_12x4x12M-32                              29593.55      30983.18      1.05x
    BenchmarkEncode_16x4x16M-32                              30833.05      30687.30      1.00x
    BenchmarkEncode_16x4x32M-32                              30842.71      31313.10      1.02x
    BenchmarkEncode_16x4x64M-32                              29655.62      31005.32      1.05x
    BenchmarkEncode_8x5x8M-32                                24868.94      26669.44      1.07x
    BenchmarkEncode_8x6x8M-32                                24333.50      26080.85      1.07x
    BenchmarkEncode_8x7x8M-32                                23288.33      25324.21      1.09x
    BenchmarkEncode_8x9x8M-32                                22626.99      23758.85      1.05x
    BenchmarkEncode_8x10x8M-32                               21839.39      22747.68      1.04x
    BenchmarkEncode_8x11x8M-32                               21754.38      22578.03      1.04x
    BenchmarkEncode_8x8x05M-32                               58222.07      26935.36      0.46x
    BenchmarkEncode_8x8x1M-32                                57582.31      61119.59      1.06x
    BenchmarkEncode_8x8x8M-32                                23131.88      24926.77      1.08x
    BenchmarkEncode_8x8x32M-32                               22220.71      22775.93      1.02x
    BenchmarkEncode_24x8x24M-32                              21930.61      26163.18      1.19x
    BenchmarkEncode_24x8x48M-32                              25406.13      25360.06      1.00x
    BenchmarkVerify800x200/64-32                             1585.15       1677.20       1.06x
    BenchmarkVerify800x200/256-32                            3055.95       3272.60       1.07x
    BenchmarkVerify800x200/1024-32                           3768.14       4259.11       1.13x
    BenchmarkVerify800x200/4096-32                           3782.60       4407.39       1.17x
    BenchmarkVerify800x200/16384-32                          2861.47       3549.40       1.24x
    BenchmarkVerify800x200/65536-32                          1989.88       2504.00       1.26x
    BenchmarkVerify800x200/262144-32                         1688.49       2071.25       1.23x
    BenchmarkVerify800x200/1048576-32                        1866.92       2138.37       1.15x
    BenchmarkVerify10x2x10000-32                             20326.26      21818.19      1.07x
    BenchmarkVerify50x5x100000-32                            26007.30      32132.24      1.24x
    BenchmarkVerify10x2x1M-32                                24960.93      29570.24      1.18x
    BenchmarkVerify5x2x1M-32                                 18540.92      21904.78      1.18x
    BenchmarkVerify10x4x1M-32                                13471.44      15647.82      1.16x
    BenchmarkVerify50x20x1M-32                               9303.86       8705.50       0.94x
    BenchmarkVerify10x4x16M-32                               10115.75      13227.73      1.31x
    BenchmarkReconstruct10x2x10000-32                        37249.17      38497.52      1.03x
    BenchmarkReconstruct800x200/64-32                        54.27         58.14         1.07x
    BenchmarkReconstruct800x200/256-32                       197.76        210.84        1.07x
    BenchmarkReconstruct800x200/1024-32                      544.30        589.32        1.08x
    BenchmarkReconstruct800x200/4096-32                      926.59        1061.05       1.15x
    BenchmarkReconstruct800x200/16384-32                     602.98        718.27        1.19x
    BenchmarkReconstruct800x200/65536-32                     547.09        603.17        1.10x
    BenchmarkReconstruct800x200/262144-32                    493.43        518.38        1.05x
    BenchmarkReconstruct800x200/1048576-32                   359.27        435.35        1.21x
    BenchmarkReconstruct50x5x50000-32                        37134.76      40813.72      1.10x
    BenchmarkReconstruct10x2x1M-32                           67919.63      67434.39      0.99x
    BenchmarkReconstruct5x2x1M-32                            56827.25      57840.28      1.02x
    BenchmarkReconstruct10x4x1M-32                           53012.91      54704.77      1.03x
    BenchmarkReconstruct50x20x1M-32                          20725.22      11977.81      0.58x
    BenchmarkReconstructLeopard50x20x1M-32                   2427.13       2530.21       1.04x
    BenchmarkReconstruct10x4x16M-32                          24887.19      26440.25      1.06x
    BenchmarkReconstructData10x2x10000-32                    39218.24      40158.36      1.02x
    BenchmarkReconstructData800x200/64-32                    55.77         58.65         1.05x
    BenchmarkReconstructData800x200/256-32                   202.61        211.06        1.04x
    BenchmarkReconstructData800x200/1024-32                  556.75        593.56        1.07x
    BenchmarkReconstructData800x200/4096-32                  938.82        1063.90       1.13x
    BenchmarkReconstructData800x200/16384-32                 626.09        722.66        1.15x
    BenchmarkReconstructData800x200/65536-32                 543.74        611.74        1.13x
    BenchmarkReconstructData800x200/262144-32                465.15        551.04        1.18x
    BenchmarkReconstructData800x200/1048576-32               400.92        431.63        1.08x
    BenchmarkReconstructData50x5x50000-32                    38400.98      42485.55      1.11x
    BenchmarkReconstructData10x2x1M-32                       73066.67      52953.80      0.72x
    BenchmarkReconstructData5x2x1M-32                        65271.49      73020.37      1.12x
    BenchmarkReconstructData10x4x1M-32                       64273.61      48146.03      0.75x
    BenchmarkReconstructData50x20x1M-32                      31718.30      17925.27      0.57x
    BenchmarkReconstructData10x4x16M-32                      32179.90      35221.52      1.09x
    BenchmarkReconstructP10x2x10000-32                       145267.14     144760.51     1.00x
    BenchmarkReconstructP10x5x20000-32                       203564.07     190537.34     0.94x
    BenchmarkParallel_8x8x64K-32                             172764.77     111229.29     0.64x
    BenchmarkParallel_8x8x05M-32                             23086.39      23602.81      1.02x
    BenchmarkParallel_20x10x05M-32                           26332.73      26416.51      1.00x
    BenchmarkParallel_8x8x1M-32                              23482.07      23647.57      1.01x
    BenchmarkParallel_8x8x8M-32                              23587.55      23329.00      0.99x
    BenchmarkParallel_8x8x32M-32                             23571.23      23551.34      1.00x
    BenchmarkParallel_8x3x1M-32                              28482.50      28411.98      1.00x
    BenchmarkParallel_8x4x1M-32                              26990.67      26935.25      1.00x
    BenchmarkParallel_8x5x1M-32                              25777.97      25740.90      1.00x
    BenchmarkStreamEncode10x2x10000-32                       17811.48      17925.46      1.01x
    BenchmarkStreamEncode100x20x10000-32                     3700.48       3934.28       1.06x
    BenchmarkStreamEncode17x3x1M-32                          11744.12      12104.29      1.03x
    BenchmarkStreamEncode10x4x16M-32                         8709.65       9171.17       1.05x
    BenchmarkStreamEncode5x2x1M-32                           12556.48      12980.44      1.03x
    BenchmarkStreamEncode10x2x1M-32                          12735.21      12759.79      1.00x
    BenchmarkStreamEncode10x4x1M-32                          11852.03      12235.81      1.03x
    BenchmarkStreamEncode50x20x1M-32                         8000.61       4188.19       0.52x
    BenchmarkStreamEncode17x3x16M-32                         9978.17       10440.42      1.05x
    BenchmarkStreamVerify10x2x10000-32                       12114.06      12367.66      1.02x
    BenchmarkStreamVerify50x5x50000-32                       7576.44       7856.47       1.04x
    BenchmarkStreamVerify10x2x1M-32                          8462.79       8772.25       1.04x
    BenchmarkStreamVerify5x2x1M-32                           6852.10       7112.05       1.04x
    BenchmarkStreamVerify10x4x1M-32                          6483.22       6619.19       1.02x
    BenchmarkStreamVerify50x20x1M-32                         5556.64       4439.93       0.80x
    BenchmarkStreamVerify10x4x16M-32                         5714.50       6199.45       1.08x 
    
  • Avoid copy on Leopard GF8

    WIP: Still a subtle bug

    Speed is encouraging though:

    BenchmarkEncodeLeopard50x20x1M-32    100    10201946 ns/op    7194.74 MB/s    3403 B/op     3 allocs/op
    BenchmarkEncodeLeopard50x20x1M-32    147     8052620 ns/op    9115.09 MB/s     816 B/op    34 allocs/op
    