The Snappy compression format in the Go programming language.

To download and install from source:
$ go get github.com/golang/snappy

Unless otherwise noted, the Snappy-Go source files are distributed
under the BSD-style license found in the LICENSE file.
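
A minimal usage sketch of the block-format API (Encode and Decode are the package's
exported functions; the sample data here is arbitrary):

package main

import (
    "fmt"

    "github.com/golang/snappy"
)

func main() {
    // Encode compresses src using the Snappy block format; passing nil lets
    // the package allocate the destination buffer.
    compressed := snappy.Encode(nil, []byte("hello, hello, hello, snappy"))

    // Decode reverses it; a non-nil error means the input is corrupt.
    decompressed, err := snappy.Decode(nil, compressed)
    if err != nil {
        panic(err)
    }
    fmt.Printf("%s\n", decompressed)
}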



Benchmarks.

The golang/snappy benchmarks include compressing (Z) and decompressing (U) ten
or so files, the same set used by the C++ Snappy code (github.com/google/snappy
and note the "google", not "golang"). On an "Intel(R) Core(TM) i7-3770 CPU @
3.40GHz", Go's GOARCH=amd64 numbers as of 2016-05-29:

"go test -test.bench=."

_UFlat0-8         2.19GB/s ± 0%  html
_UFlat1-8         1.41GB/s ± 0%  urls
_UFlat2-8         23.5GB/s ± 2%  jpg
_UFlat3-8         1.91GB/s ± 0%  jpg_200
_UFlat4-8         14.0GB/s ± 1%  pdf
_UFlat5-8         1.97GB/s ± 0%  html4
_UFlat6-8          814MB/s ± 0%  txt1
_UFlat7-8          785MB/s ± 0%  txt2
_UFlat8-8          857MB/s ± 0%  txt3
_UFlat9-8          719MB/s ± 1%  txt4
_UFlat10-8        2.84GB/s ± 0%  pb
_UFlat11-8        1.05GB/s ± 0%  gaviota

_ZFlat0-8         1.04GB/s ± 0%  html
_ZFlat1-8          534MB/s ± 0%  urls
_ZFlat2-8         15.7GB/s ± 1%  jpg
_ZFlat3-8          740MB/s ± 3%  jpg_200
_ZFlat4-8         9.20GB/s ± 1%  pdf
_ZFlat5-8          991MB/s ± 0%  html4
_ZFlat6-8          379MB/s ± 0%  txt1
_ZFlat7-8          352MB/s ± 0%  txt2
_ZFlat8-8          396MB/s ± 1%  txt3
_ZFlat9-8          327MB/s ± 1%  txt4
_ZFlat10-8        1.33GB/s ± 1%  pb
_ZFlat11-8         605MB/s ± 1%  gaviota



"go test -test.bench=. -tags=noasm"

_UFlat0-8          621MB/s ± 2%  html
_UFlat1-8          494MB/s ± 1%  urls
_UFlat2-8         23.2GB/s ± 1%  jpg
_UFlat3-8         1.12GB/s ± 1%  jpg_200
_UFlat4-8         4.35GB/s ± 1%  pdf
_UFlat5-8          609MB/s ± 0%  html4
_UFlat6-8          296MB/s ± 0%  txt1
_UFlat7-8          288MB/s ± 0%  txt2
_UFlat8-8          309MB/s ± 1%  txt3
_UFlat9-8          280MB/s ± 1%  txt4
_UFlat10-8         753MB/s ± 0%  pb
_UFlat11-8         400MB/s ± 0%  gaviota

_ZFlat0-8          409MB/s ± 1%  html
_ZFlat1-8          250MB/s ± 1%  urls
_ZFlat2-8         12.3GB/s ± 1%  jpg
_ZFlat3-8          132MB/s ± 0%  jpg_200
_ZFlat4-8         2.92GB/s ± 0%  pdf
_ZFlat5-8          405MB/s ± 1%  html4
_ZFlat6-8          179MB/s ± 1%  txt1
_ZFlat7-8          170MB/s ± 1%  txt2
_ZFlat8-8          189MB/s ± 1%  txt3
_ZFlat9-8          164MB/s ± 1%  txt4
_ZFlat10-8         479MB/s ± 1%  pb
_ZFlat11-8         270MB/s ± 1%  gaviota



For comparison (Go's encoded output is byte-for-byte identical to C++'s), here
are the numbers from C++ Snappy's

make CXXFLAGS="-O2 -DNDEBUG -g" clean snappy_unittest.log && cat snappy_unittest.log

BM_UFlat/0     2.4GB/s  html
BM_UFlat/1     1.4GB/s  urls
BM_UFlat/2    21.8GB/s  jpg
BM_UFlat/3     1.5GB/s  jpg_200
BM_UFlat/4    13.3GB/s  pdf
BM_UFlat/5     2.1GB/s  html4
BM_UFlat/6     1.0GB/s  txt1
BM_UFlat/7   959.4MB/s  txt2
BM_UFlat/8     1.0GB/s  txt3
BM_UFlat/9   864.5MB/s  txt4
BM_UFlat/10    2.9GB/s  pb
BM_UFlat/11    1.2GB/s  gaviota

BM_ZFlat/0   944.3MB/s  html (22.31 %)
BM_ZFlat/1   501.6MB/s  urls (47.78 %)
BM_ZFlat/2    14.3GB/s  jpg (99.95 %)
BM_ZFlat/3   538.3MB/s  jpg_200 (73.00 %)
BM_ZFlat/4     8.3GB/s  pdf (83.30 %)
BM_ZFlat/5   903.5MB/s  html4 (22.52 %)
BM_ZFlat/6   336.0MB/s  txt1 (57.88 %)
BM_ZFlat/7   312.3MB/s  txt2 (61.91 %)
BM_ZFlat/8   353.1MB/s  txt3 (54.99 %)
BM_ZFlat/9   289.9MB/s  txt4 (66.26 %)
BM_ZFlat/10    1.2GB/s  pb (19.68 %)
BM_ZFlat/11  527.4MB/s  gaviota (37.72 %)
Comments
  • Weird failure when building on Raspbian / Debian 10.11

    Getting this error upon attempting to run:

    go get github.com/golang/snappy

    github.com/golang/snappy

    asm: 00001 (/root/go/src/github.com/golang/snappy/encode_arm64.s:30) TEXT "".emitLiteral(SB), NOSPLIT, $32-56: unaligned frame size 32 - must be 8 mod 16 (or 0)
    asm: 00119 (/root/go/src/github.com/golang/snappy/encode_arm64.s:264) TEXT "".encodeBlock(SB), $32896-56: unaligned frame size 32896 - must be 8 mod 16 (or 0)
    asm: assembly failed

  • Fast skipping on uncompressible data

    This simple change allows the encoder to skip match checks after many consecutive misses. It costs roughly a 3% performance decrease on ordinary data, and a compression loss that is usually around 0.1% and in some cases up to 1%. However, the speedup on uncompressible data is >20x.

    Considering the typical use case of Snappy, I think this is a reasonable tradeoff.
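
    As a rough illustration of the heuristic (a sketch, not the actual patch): the
    stride between candidate match positions grows as consecutive misses accumulate,
    so uncompressible input is traversed in ever larger steps.

    package main

    import "fmt"

    // scanPositions models the skipping heuristic; the constants are illustrative.
    func scanPositions(n int) []int {
        var positions []int
        skip := 32 // in 1/32-byte units, so the stride starts at 1
        for s := 0; s < n; {
            positions = append(positions, s)
            s += skip >> 5 // the stride grows as misses accumulate
            skip++
        }
        return positions
    }

    func main() {
        p := scanPositions(1 << 16)
        fmt.Println("hash lookups:", len(p), "final stride:", p[len(p)-1]-p[len(p)-2])
    }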

    I have added an additional "random data" test, as well as a benchmark that prints the compression ratio.

    Benchmark and size comparison:

    benchmark               old ns/op     new ns/op     delta
    Benchmark_ZFlat0-4      386453        395917        +2.45%
    Benchmark_ZFlat1-4      5243380       5430495       +3.57%
    Benchmark_ZFlat2-4      1219780       41927         -96.56%
    Benchmark_ZFlat3-4      1219781       41859         -96.57%
    Benchmark_ZFlat4-4      876567        323408        -63.11%
    Benchmark_ZFlat5-4      1511665       1558024       +3.07%
    Benchmark_ZFlat6-4      1297856       1317876       +1.54%
    Benchmark_ZFlat7-4      1134592       1161773       +2.40%
    Benchmark_ZFlat8-4      3508325       3570376       +1.77%
    Benchmark_ZFlat9-4      4529655       4626386       +2.14%
    Benchmark_ZFlat10-4     392659        403983        +2.88%
    Benchmark_ZFlat11-4     1060197       1071549       +1.07%
    Benchmark_ZFlat12-4     10637050      226674        -97.87%
    
    benchmark               old MB/s     new MB/s     speedup
    Benchmark_ZFlat0-4      264.97       258.64       0.98x
    Benchmark_ZFlat1-4      133.90       129.29       0.97x
    Benchmark_ZFlat2-4      100.91       2935.84      29.09x
    Benchmark_ZFlat3-4      100.91       2940.62      29.14x
    Benchmark_ZFlat4-4      116.82       316.63       2.71x
    Benchmark_ZFlat5-4      270.96       262.90       0.97x
    Benchmark_ZFlat6-4      117.18       115.40       0.98x
    Benchmark_ZFlat7-4      110.33       107.75       0.98x
    Benchmark_ZFlat8-4      121.64       119.53       0.98x
    Benchmark_ZFlat9-4      106.38       104.15       0.98x
    Benchmark_ZFlat10-4     302.01       293.55       0.97x
    Benchmark_ZFlat11-4     173.85       172.01       0.99x
    Benchmark_ZFlat12-4     98.58        4625.91      46.93x
    

    This is the compression loss - percentage added to compressed size:

    | Dataset | Loss %  |
    | ------- | ------- |
    | html    | 0.04%   |
    | urls    | 0.07%   |
    | jpg     | 0.00%   |
    | jpg_200 | 0.00%   |
    | pdf     | 0.74%   |
    | html4   | 0.02%   |
    | txt1    | -0.06%  |
    | txt2    | 0.16%   |
    | txt3    | 0.04%   |
    | txt4    | 0.15%   |
    | pb      | 0.13%   |
    | gaviota | 0.00%   |
    | random  | 0.00%   |

    Sheet of compression loss data

  • Not working on apple m1

    Versions 0.0.2 and 0.0.3: crash. Version 0.0.1: OK. The crash occurs in encode_arm64.s.

    The cause of the error is related to snappy on arm64:
    runtime.sigpanic()
            /usr/local/go/src/runtime/signal_unix.go:741 +0x230 fp=0x1400159e140 sp=0x1400159e100 pc=0x10303b9f0
    github.com/golang/snappy.encodeBlock(0x14000dc7502, 0x1304, 0x1304, 0x14000dc6000, 0x102f, 0x13bc, 0x14000195b01)
            $HOME/go/pkg/mod/github.com/golang/[email protected]/encode_arm64.s:666 +0x360 fp=0x140015a61e0 sp=0x1400159e150 pc=0x1037118c0
    github.com/golang/snappy.Encode(0x14000dc7500, 0x1306, 0x1306, 0x0, 0x0, 0x0, 0x2, 0x4, 0x140015a62f8)
           $HOME/go/pkg/mod/github.com/golang/[email protected]/encode.go:39 +0x17c fp=0x140015a6230 sp=0x140015a61e0 pc=0x103710dfc
    
  • port amd64 assembly to arm64

    This change was produced by taking the amd64 assembly and reproducing it as closely as possible for the arm64 arch.

    The main differences:

    • arm64 uses registers R1-R17 which are mapped directly onto an amd64 counterpart
    • arm64 requires 8 additional bytes of stack so callee args are displaced by 8 bytes from amd64
    • operands to CMP instructions are reversed except in a few cases where arm64 uses a BLS (branch less-same) instead of JAE (jump above-equal)
    • immediates in some cases have to be split to a separate MOVD instruction
    • shifts can be combined with another instruction, such as an ADD, in some cases
    • The amd64 BSFQ instruction is implemented with a bit reversal and a leading zero count instruction (see the sketch after this list)
    • memclear on arm64 makes use of the SIMD instructions to clear 64 bytes at a time and uses a pointer comparison instead of a counter to reduce the number of instructions in the loop
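
    For the BSFQ point above, a hedged Go illustration of why a bit reversal followed
    by a leading-zero count is equivalent to the trailing-zero count that BSFQ computes:

    package main

    import (
        "fmt"
        "math/bits"
    )

    func main() {
        // Trailing zeros of x equal leading zeros of the bit-reversed x,
        // which is what RBIT+CLZ computes on arm64.
        for _, x := range []uint64{1, 0x8000, 0xf0f0f0f0f0f0f0f0} {
            fmt.Println(bits.TrailingZeros64(x) == bits.LeadingZeros64(bits.Reverse64(x)))
        }
    }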

    Tested on an AWS m6g.large (ARMv8.2):

    name              old time/op    new time/op     delta
    WordsDecode1e1-2    29.2ns ± 0%     26.2ns ± 1%   -10.51%  (p=0.000 n=9+10)
    WordsDecode1e2-2     187ns ± 0%      107ns ± 0%   -42.78%  (p=0.000 n=7+10)
    WordsDecode1e3-2    2.16µs ± 1%     0.95µs ± 0%   -55.85%  (p=0.000 n=10+10)
    WordsDecode1e4-2    30.1µs ± 0%     10.4µs ± 2%   -65.40%  (p=0.000 n=10+10)
    WordsDecode1e5-2     348µs ± 0%      168µs ± 0%   -51.86%  (p=0.000 n=10+9)
    WordsDecode1e6-2    3.47ms ± 0%     1.71ms ± 0%   -50.66%  (p=0.000 n=10+10)
    WordsEncode1e1-2    19.4ns ± 0%     21.7ns ± 1%   +12.06%  (p=0.000 n=8+10)
    WordsEncode1e2-2    2.09µs ± 0%     0.25µs ± 0%   -88.14%  (p=0.000 n=9+10)
    WordsEncode1e3-2    6.67µs ± 1%     2.49µs ± 0%   -62.63%  (p=0.000 n=10+10)
    WordsEncode1e4-2    63.5µs ± 1%     29.4µs ± 1%   -53.63%  (p=0.000 n=10+9)
    WordsEncode1e5-2     722µs ± 0%      345µs ± 0%   -52.21%  (p=0.000 n=10+10)
    WordsEncode1e6-2    7.17ms ± 0%     3.41ms ± 0%   -52.46%  (p=0.000 n=10+8)
    RandomEncode-2       106µs ± 2%       78µs ± 0%   -26.02%  (p=0.000 n=10+10)
    _UFlat0-2            152µs ± 0%       69µs ± 1%   -54.90%  (p=0.000 n=10+9)
    _UFlat1-2           1.57ms ± 0%     0.77ms ± 0%   -51.10%  (p=0.000 n=9+10)
    _UFlat2-2           6.84µs ± 0%     6.55µs ± 0%    -4.25%  (p=0.000 n=10+8)
    _UFlat3-2            312ns ± 0%      183ns ± 0%   -41.35%  (p=0.000 n=10+9)
    _UFlat4-2           15.4µs ± 1%      9.7µs ± 1%   -36.79%  (p=0.000 n=10+10)
    _UFlat5-2            625µs ± 0%      301µs ± 1%   -51.88%  (p=0.000 n=9+10)
    _UFlat6-2            570µs ± 0%      278µs ± 0%   -51.18%  (p=0.000 n=10+9)
    _UFlat7-2            490µs ± 0%      240µs ± 1%   -50.95%  (p=0.000 n=10+10)
    _UFlat8-2           1.52ms ± 0%     0.74ms ± 0%   -51.01%  (p=0.000 n=8+7)
    _UFlat9-2           2.00ms ± 0%     1.01ms ± 0%   -49.49%  (p=0.000 n=10+10)
    _UFlat10-2           132µs ± 0%       62µs ± 2%   -53.19%  (p=0.000 n=10+10)
    _UFlat11-2           497µs ± 0%      258µs ± 0%   -48.11%  (p=0.000 n=10+9)
    _ZFlat0-2            346µs ± 1%      136µs ± 5%   -60.70%  (p=0.000 n=10+9)
    _ZFlat1-2           3.63ms ± 0%     1.76ms ± 0%   -51.60%  (p=0.000 n=10+8)
    _ZFlat2-2           13.2µs ± 0%      9.5µs ± 0%   -27.62%  (p=0.000 n=8+9)
    _ZFlat3-2           2.49µs ± 0%     0.45µs ± 0%   -81.96%  (p=0.002 n=8+10)
    _ZFlat4-2           50.5µs ± 0%     15.7µs ± 1%   -68.96%  (p=0.000 n=10+9)
    _ZFlat5-2           1.40ms ± 0%     0.56ms ± 0%   -60.20%  (p=0.000 n=9+9)
    _ZFlat6-2           1.13ms ± 0%     0.54ms ± 0%   -52.39%  (p=0.000 n=10+9)
    _ZFlat7-2            961µs ± 0%      472µs ± 0%   -50.83%  (p=0.000 n=10+10)
    _ZFlat8-2           3.03ms ± 0%     1.43ms ± 0%   -52.90%  (p=0.000 n=9+10)
    _ZFlat9-2           3.88ms ± 0%     1.95ms ± 0%   -49.72%  (p=0.000 n=10+10)
    _ZFlat10-2           339µs ± 0%      123µs ± 3%   -63.82%  (p=0.000 n=10+10)
    _ZFlat11-2           973µs ± 0%      433µs ± 0%   -55.49%  (p=0.000 n=10+10)
    ExtendMatch-2       22.1µs ± 1%      9.8µs ± 0%   -55.63%  (p=0.000 n=10+10)
    
    name              old speed      new speed       delta
    WordsDecode1e1-2   342MB/s ± 0%    382MB/s ± 1%   +11.77%  (p=0.000 n=9+10)
    WordsDecode1e2-2   535MB/s ± 0%    934MB/s ± 0%   +74.43%  (p=0.000 n=10+10)
    WordsDecode1e3-2   463MB/s ± 1%   1049MB/s ± 0%  +126.52%  (p=0.000 n=10+10)
    WordsDecode1e4-2   333MB/s ± 0%    961MB/s ± 2%  +189.04%  (p=0.000 n=10+10)
    WordsDecode1e5-2   287MB/s ± 0%    597MB/s ± 0%  +107.72%  (p=0.000 n=10+9)
    WordsDecode1e6-2   288MB/s ± 0%    584MB/s ± 0%  +102.67%  (p=0.000 n=10+10)
    WordsEncode1e1-2   515MB/s ± 0%    460MB/s ± 0%   -10.70%  (p=0.000 n=10+10)
    WordsEncode1e2-2  47.8MB/s ± 0%  403.3MB/s ± 0%  +743.40%  (p=0.000 n=10+10)
    WordsEncode1e3-2   150MB/s ± 1%    401MB/s ± 0%  +167.66%  (p=0.000 n=10+9)
    WordsEncode1e4-2   157MB/s ± 1%    340MB/s ± 1%  +115.66%  (p=0.000 n=10+9)
    WordsEncode1e5-2   138MB/s ± 0%    290MB/s ± 0%  +109.24%  (p=0.000 n=10+10)
    WordsEncode1e6-2   139MB/s ± 0%    293MB/s ± 0%  +110.35%  (p=0.000 n=10+8)
    RandomEncode-2    9.93GB/s ± 2%  13.42GB/s ± 0%   +35.15%  (p=0.000 n=10+10)
    _UFlat0-2          672MB/s ± 0%   1489MB/s ± 1%  +121.75%  (p=0.000 n=10+9)
    _UFlat1-2          446MB/s ± 0%    913MB/s ± 0%  +104.48%  (p=0.000 n=9+10)
    _UFlat2-2         18.0GB/s ± 0%   18.8GB/s ± 0%    +4.44%  (p=0.000 n=8+8)
    _UFlat3-2          641MB/s ± 0%   1091MB/s ± 0%   +70.19%  (p=0.000 n=10+10)
    _UFlat4-2         6.66GB/s ± 1%  10.53GB/s ± 1%   +58.19%  (p=0.000 n=10+10)
    _UFlat5-2          655MB/s ± 0%   1362MB/s ± 1%  +107.80%  (p=0.000 n=9+10)
    _UFlat6-2          267MB/s ± 0%    547MB/s ± 0%  +104.82%  (p=0.000 n=10+9)
    _UFlat7-2          255MB/s ± 0%    521MB/s ± 1%  +103.89%  (p=0.000 n=10+10)
    _UFlat8-2          281MB/s ± 0%    574MB/s ± 0%  +104.14%  (p=0.000 n=8+7)
    _UFlat9-2          241MB/s ± 0%    478MB/s ± 0%   +97.97%  (p=0.000 n=10+10)
    _UFlat10-2         896MB/s ± 0%   1914MB/s ± 2%  +113.64%  (p=0.000 n=10+10)
    _UFlat11-2         371MB/s ± 0%    715MB/s ± 0%   +92.72%  (p=0.000 n=10+9)
    _ZFlat0-2          296MB/s ± 1%    754MB/s ± 5%  +154.57%  (p=0.000 n=10+9)
    _ZFlat1-2          194MB/s ± 0%    400MB/s ± 0%  +106.63%  (p=0.000 n=10+8)
    _ZFlat2-2         9.35GB/s ± 0%  12.92GB/s ± 0%   +38.17%  (p=0.000 n=8+10)
    _ZFlat3-2         80.3MB/s ± 0%  445.6MB/s ± 0%  +454.64%  (p=0.000 n=10+10)
    _ZFlat4-2         2.03GB/s ± 0%   6.54GB/s ± 1%  +222.19%  (p=0.000 n=10+9)
    _ZFlat5-2          292MB/s ± 0%    733MB/s ± 0%  +151.25%  (p=0.000 n=9+9)
    _ZFlat6-2          135MB/s ± 0%    284MB/s ± 0%  +110.05%  (p=0.000 n=10+9)
    _ZFlat7-2          130MB/s ± 0%    265MB/s ± 0%  +103.38%  (p=0.000 n=10+10)
    _ZFlat8-2          141MB/s ± 0%    299MB/s ± 0%  +112.30%  (p=0.000 n=9+10)
    _ZFlat9-2          124MB/s ± 0%    247MB/s ± 0%   +98.90%  (p=0.000 n=10+10)
    _ZFlat10-2         350MB/s ± 0%    967MB/s ± 3%  +176.44%  (p=0.000 n=10+10)
    _ZFlat11-2         189MB/s ± 0%    426MB/s ± 0%  +124.65%  (p=0.000 n=10+10)
    
  • slice bounds out of range in decode

    I'm still trying to get a full working repro, but my current best guess is that this happened while reading a file that had experienced a partial write of the compressed data:

    panic: runtime error: slice bounds out of range
    
    goroutine 69 [running]:
    panic(0xf4cf00, 0xc82000e0b0)
        /go/src/runtime/panic.go:464 +0x3e6
    github.com/golang/snappy.(*Reader).Read(0xc89eb43730, 0xcb1ae08000, 0x10000, 0x10000, 0xa, 0x0, 0x0)
        /gopath/src/github.com/golang/snappy/decode.go:198 +0xba4
    bufio.(*Reader).fill(0xc9eb4603c0)
        /go/src/bufio/bufio.go:97 +0x1e9
    bufio.(*Reader).Read(0xc9eb4603c0, 0xc93f20326d, 0xd93, 0xd93, 0xc82b48fd00, 0x0, 0x0)
        /go/src/bufio/bufio.go:207 +0x260
    io/ioutil.(*nopCloser).Read(0xc91fe04ec0, 0xc93f20326d, 0xd93, 0xd93, 0x20e1843c1a693, 0x0, 0x0)
        <autogenerated>:4 +0x82
    bufio.(*Scanner).Scan(0xc837df2500, 0x11b6370)
        /go/src/bufio/scan.go:208 +0x97b
    ...
    
  • Integrated sync.Pool + Multiple Optimizations

    I've made the following optimizations:

    1. sync.Pool is now used to avoid reallocating buffers for both Reads and Writes. This should drastically speed up all cases where snappy is used more than once within a reasonable time. There is very little disadvantage, even if the buffers are never reused and always expire from the pool; the overhead for that is negligible.
    2. I prepared the header slice of bytes to avoid it being recalculated every time.
    3. I added additional constants to avoid runtime calculations.
    4. I reduced the number of writes to the underlying writer from 2 to 1 on an average write by including the chunk header in the same slice of bytes as the chunk body (uncompressed chunks are still written with 2 writes, but compressed chunks are now written in 1 write). In a normal case this halves the number of writes to the underlying writer. Writes can be expensive, especially to disk, so this should give a significant performance increase.
    5. I added a buffer on the Writer which is used to group small writes into one larger write. Writes that are already over half of the max chunk size are written directly without copying to the buffer. Small writes are copied to the buffer and written together as one chunk. This was done carefully so as not to give a performance decrease in any circumstance: only small writes are buffered, large writes are written directly, and a combination of small and large writes will work fine. This feature should enhance both write speed and compression for many small writes without any decrease in performance for large writes.
    6. I added the function WriteOnce, which makes encoding a single slice of bytes with snappy more efficient. This is the same but much faster than NewWriter -> Write -> Close.
    7. I made a number of minor optimizations, including separating Encode into two functions, which saves some redundant calculations when Encode is called via the Write function.

    Note on compatibility: due to the addition of the buffer on the Writer it is now necessary to Close() the writer. This makes this version of snappy incompatible with the previous version. However, the performance gain is significant and so I believe it is reasonable to expect people to update their code. The Reader does not have to be closed, but not doing so will mean the buffers are not repooled.
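
    As a rough sketch of the sync.Pool idea in point 1 (my illustration under stated
    assumptions, not the code from this change; the 64 KiB sizing is arbitrary):

    package main

    import (
        "fmt"
        "sync"

        "github.com/golang/snappy"
    )

    // bufPool holds reusable destination buffers so repeated encodes avoid
    // reallocating; the buffer size here is an illustrative assumption.
    var bufPool = sync.Pool{
        New: func() interface{} {
            return make([]byte, snappy.MaxEncodedLen(1<<16))
        },
    }

    func encodePooled(src []byte) []byte {
        buf := bufPool.Get().([]byte)
        dst := snappy.Encode(buf, src)     // dst aliases buf when buf is large enough
        out := append([]byte(nil), dst...) // copy so buf can be returned safely
        bufPool.Put(buf)
        return out
    }

    func main() {
        fmt.Println(len(encodePooled([]byte("small, repetitive, repetitive payload"))))
    }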

  • all: simpler import path

    This CL simplifies the import path from github.com/golang/snappy/snappy to github.com/golang/snappy.

    It also adds the github.com/golang/snappy "vanity" import path to make sure we get only one version of this code.

  • Fix wrong arm64 scaled register format

    arm64 does not have a scaled register format, which causes the snappy test to fail with the current Go tip:

    	$ go version
    	go version devel go1.17-24875e3880 Tue Apr 20 15:14:05 2021 +0000 darwin/arm64
    	$ go test
    	# github.com/golang/snappy
    	./encode_arm64.s:385: arm64 doesn't support scaled register format
    	./encode_arm64.s:675: arm64 doesn't support scaled register format
    	asm: assembly of ./encode_arm64.s failed
    	FAIL	github.com/golang/snappy [build failed]
    

    See https://go-review.googlesource.com/c/go/+/289589

  • Avoid allocating table on stack.

    The current implementation allocates a 64-128KB table on the stack on every call to Encode.

    While this is reasonably fast, with a slight modification we can reuse this table across calls. This is a particular gain for the framing encoder, but even for the stateless Encode function this gives a slight speedup.

    Furthermore, the table is now int32, which saves 50% of its memory on 64-bit systems. This change does not itself affect speed, only memory usage.

    To be able to reuse buffers across calls to Encode we store the used tables in a sync.Pool.

    The table size is now fixed. We can experiment with the size later.
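
    A minimal sketch of the idea (illustrative table size and clearing strategy, not
    the actual change):

    package main

    import (
        "fmt"
        "sync"
    )

    const tableSize = 1 << 14 // illustrative; the real table size differs

    // tablePool reuses a fixed-size int32 hash table across calls instead of
    // allocating (or stack-allocating) it on every Encode.
    var tablePool = sync.Pool{
        New: func() interface{} { return new([tableSize]int32) },
    }

    func withTable(f func(t *[tableSize]int32)) {
        t := tablePool.Get().(*[tableSize]int32)
        for i := range t {
            t[i] = 0 // a reused table must be cleared before use
        }
        f(t)
        tablePool.Put(t)
    }

    func main() {
        withTable(func(t *[tableSize]int32) { fmt.Println(len(t)) })
    }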

    >go test -bench=ZFlat >new.txt && benchcmp old.txt new.txt
    benchmark               old ns/op     new ns/op     delta
    Benchmark_ZFlat0-8      347674        334371        -3.83%
    Benchmark_ZFlat1-8      4742511       4677414       -1.37%
    Benchmark_ZFlat2-8      1098529       1043970       -4.97%
    Benchmark_ZFlat3-8      1101338       1052056       -4.47%
    Benchmark_ZFlat4-8      789205        769194        -2.54%
    Benchmark_ZFlat5-8      1330515       1349258       +1.41%
    Benchmark_ZFlat6-8      1145645       1139680       -0.52%
    Benchmark_ZFlat7-8      1051315       988243        -6.00%
    Benchmark_ZFlat8-8      3153910       3096453       -1.82%
    Benchmark_ZFlat9-8      3997054       4006660       +0.24%
    Benchmark_ZFlat10-8     348186        339939        -2.37%
    Benchmark_ZFlat11-8     935487        925993        -1.01%
    
    benchmark               old MB/s     new MB/s     speedup
    Benchmark_ZFlat0-8      294.53       306.25       1.04x
    Benchmark_ZFlat1-8      148.04       150.10       1.01x
    Benchmark_ZFlat2-8      112.05       117.91       1.05x
    Benchmark_ZFlat3-8      111.77       117.00       1.05x
    Benchmark_ZFlat4-8      129.75       133.13       1.03x
    Benchmark_ZFlat5-8      307.85       303.57       0.99x
    Benchmark_ZFlat6-8      132.75       133.45       1.01x
    Benchmark_ZFlat7-8      119.07       126.67       1.06x
    Benchmark_ZFlat8-8      135.31       137.82       1.02x
    Benchmark_ZFlat9-8      120.55       120.26       1.00x
    Benchmark_ZFlat10-8     340.59       348.85       1.02x
    Benchmark_ZFlat11-8     197.03       199.05       1.01x
    
  • asm: invalid instruction

    Hi,

    This is a follow-up to #30. The same build, but with Go 1.6, fails for another reason:

    # github.com/golang/snappy
    asm: invalid instruction: 00228 (/home/travis/gopath/src/github.com/golang/snappy/encode_amd64.s:338)   MOVWQZX table+120(SP)(R11*2), R15
    asm: invalid instruction: 00234 (/home/travis/gopath/src/github.com/golang/snappy/encode_amd64.s:343)   MOVW    AX, table+120(SP)(R11*2)
    asm: invalid instruction: 00589 (/home/travis/gopath/src/github.com/golang/snappy/encode_amd64.s:506)   MOVW    AX, table+120(SP)(R11*2)
    asm: invalid instruction: 00606 (/home/travis/gopath/src/github.com/golang/snappy/encode_amd64.s:515)   MOVWQZX table+120(SP)(R11*2), R15
    asm: invalid instruction: 00610 (/home/travis/gopath/src/github.com/golang/snappy/encode_amd64.s:519)   MOVW    AX, table+120(SP)(R11*2)
    asm: asm: assembly of ../../golang/snappy/encode_amd64.s failed
    

    Full log is located here, with the error message starting on line 348 - https://gist.github.com/serejja/b0c40abc844ab4ce1c8630afa988e3bb

    Just to provide a bit more info (which is in that log anyway):

    1. go version go1.4 linux/amd64
    2. Ubuntu 12.04.5 LTS, 3.13.0-29-generic
    3. gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3

    Please let me know if this is not related to golang/snappy itself and I should open a ticket on Travis CI. Thanks!

  • Fix max block size check

    Since Go 1.1, int has been 64 bits on 64-bit platforms instead of 32 bits. This patch fixes the check to make sure the uncompressed length is at most 2^32-1 bytes.

    Fixes #15
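
    A hedged illustration of the bug class (my example, not the patch itself): on
    64-bit platforms int is 64 bits, so the length must be compared in a width that
    cannot truncate.

    package main

    import (
        "errors"
        "fmt"
    )

    var errTooLarge = errors.New("block too large") // illustrative error value

    func checkLen(n int) error {
        if n < 0 || uint64(n) > 0xffffffff {
            return errTooLarge
        }
        return nil
    }

    func main() {
        fmt.Println(checkLen(1 << 20))      // <nil>
        fmt.Println(checkLen(int(1) << 33)) // block too large on 64-bit platforms
    }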

  • README: update instructions to install CLI

    • Fix the go get command, explaining that it now only works for installing libraries, and add the now-mandatory @<version> suffix.
    • Add instructions for using go install to run the binary.
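
    For example, the updated commands would look roughly like this (the @latest suffix
    is standard Go tooling; the cmd/snappytool path is my assumption about the layout):

    $ go get github.com/golang/snappy@latest
    $ go install github.com/golang/snappy/cmd/snappytool@latest
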
  • How should I validate encoded data?

    Hey,

    I'm trying to use snappy from Go. I see that the JS and Java libraries have a function named isValidCompressed that can validate an encoded buffer, but I can't find that function in the Go package. I'm thinking of calling the decode function directly and treating an error as meaning the encoded buffer is invalid, but I'm not sure whether that's safe/correct. In Java it looks like: byte[] bytes = Snappy.isValidCompressedBuffer(recordBytes) ? Snappy.uncompress(recordBytes) : recordBytes; Does anyone have ideas on how to safely and quickly validate the encoded buffer?
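
    A minimal sketch of the decode-and-check-error approach described above (my
    illustration; as far as I know the Go package has no isValidCompressed equivalent):

    package main

    import (
        "fmt"

        "github.com/golang/snappy"
    )

    // isValidCompressed reports whether b decodes without error. A full Decode is
    // the practical validity check, since the package exports no cheaper one.
    func isValidCompressed(b []byte) bool {
        _, err := snappy.Decode(nil, b)
        return err == nil
    }

    func main() {
        enc := snappy.Encode(nil, []byte("some record bytes"))
        fmt.Println(isValidCompressed(enc))          // true
        fmt.Println(isValidCompressed([]byte{0x7f})) // false: truncated/corrupt input
    }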

  • Test failure on 32 bits arches

    With version 0.0.2 on Go 1.15, the tests fail on ARMv7hl and i686:

    Testing    in: /builddir/build/BUILD/snappy-0.0.2/_build/src
             PATH: /builddir/build/BUILD/snappy-0.0.2/_build/bin:/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/sbin
           GOPATH: /builddir/build/BUILD/snappy-0.0.2/_build:/usr/share/gocode
      GO111MODULE: off
          command: go test -buildmode pie -compiler gc -ldflags " -X github.com/golang/snappy/version=0.0.2 -extldflags '-Wl,-z,relro -Wl,--as-needed  -Wl,-z,now -specs=/usr/lib/rpm/redhat/redhat-hardened-ld  '"
          testing: github.com/golang/snappy
    github.com/golang/snappy
    --- FAIL: TestDecode (0.00s)
        snappy_test.go:365: #29 (decodedLen=0; tagCopy4, 4 extra length|offset bytes; with msb set (0x93); discovered by go-fuzz):
            got  "", snappy: unsupported literal length
            want "", snappy: corrupt input
    FAIL
    exit status 1
    FAIL	github.com/golang/snappy	0.628s
    
  • How come snappytool does not support stream format?

    I looked at the history of snappytool and it seems like the tool was purposely rewritten to use block format instead of stream format. What's the reason behind this? Why can't the snappytool support both stream and block format?

    I would love to have stream format support because then I could use the tool to decode a snappy-encoded stream (stored in some DB) for troubleshooting and debugging purposes.
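
    For reference, a hedged sketch of decoding the stream (framing) format from Go,
    which is essentially what the tool would need to do for this use case:

    package main

    import (
        "bytes"
        "fmt"
        "io"

        "github.com/golang/snappy"
    )

    func main() {
        // Produce a framed stream in memory (standing in for the bytes stored in the DB).
        var framed bytes.Buffer
        w := snappy.NewBufferedWriter(&framed)
        w.Write([]byte("payload stored in stream format"))
        w.Close()

        // Decode the stream format with snappy.NewReader.
        r := snappy.NewReader(&framed)
        out, err := io.ReadAll(r)
        if err != nil {
            panic(err)
        }
        fmt.Printf("%s\n", out)
    }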
