LZ4 compression and decompression in pure Go

Overview

This package provides a streaming interface to LZ4 data streams as well as low-level compress and uncompress functions for LZ4 data blocks. The implementation is based on the reference C implementation.

Install

Assuming you have the Go toolchain installed:

go get github.com/pierrec/lz4

There is a command-line tool to compress and decompress LZ4 files.

go install github.com/pierrec/lz4/cmd/lz4c

Usage

Usage of lz4c:
  -version
        print the program version

Subcommands:
Compress the given files or from stdin to stdout.
compress [arguments] [<file name> ...]
  -bc
        enable block checksum
  -l int
        compression level (0=fastest)
  -sc
        disable stream checksum
  -size string
        block max size [64K,256K,1M,4M] (default "4M")

Uncompress the given files or from stdin to stdout.
uncompress [arguments] [<file name> ...]
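
For example, both subcommands accept input on stdin and write to stdout, so a round trip could look like this (file names are only illustrative):

lz4c compress -size 64K < file.txt > file.txt.lz4
lz4c uncompress < file.txt.lz4 > file.copy.txt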

Example

package lz4_test

import (
	"io"
	"os"
	"strings"

	"github.com/pierrec/lz4"
)

// Example compresses and uncompresses an input string through an io.Pipe.
func Example() {
	s := "hello world"
	r := strings.NewReader(s)

	// The pipe will uncompress the data from the writer.
	pr, pw := io.Pipe()
	zw := lz4.NewWriter(pw)
	zr := lz4.NewReader(pr)

	go func() {
		// Compress the input string.
		_, _ = io.Copy(zw, r)
		_ = zw.Close() // Make sure the writer is closed
		_ = pw.Close() // Terminate the pipe
	}()

	_, _ = io.Copy(os.Stdout, zr)

	// Output:
	// hello world
}

Contributing

Contributions are very welcome, whether for bug fixes, performance improvements, or anything else!

  • Open an issue with a proper description
  • Send a pull request with appropriate test case(s)

Contributors

Thanks to all contributors so far!

Special thanks to @Zariel for his asm implementation of the decoder.

Special thanks to @klauspost for his work on optimizing the code.

Comments
  • The latest commit (move stuff to v2) breaks package with go 1.10 and earlier

    Everything is in the name.

    go get github.com/pierrec/lz4
    
    package github.com/pierrec/lz4/v2/internal/xxh32: cannot find package "github.com/pierrec/lz4/v2/internal/xxh32" in any of:
            /usr/lib/go-1.9/src/github.com/pierrec/lz4/v2/internal/xxh32 (from $GOROOT)
            /***/gopath/src/github.com/pierrec/lz4/v2/internal/xxh32 (from $GOPATH)
    
  • Fix previous commit that needs to convert pool returns.

    Can't go get v4 because it doesn't compile.

    Also, I noted 2 staticcheck issues:

    • One is fixed in this PR. (int32 can never be negative).
    • The second one is more complex.

    It seems that you want to reuse hash tables ([]int) with a sync.Pool:

    		chainTable = HashTablePool.Get().([]int)
    		defer HashTablePool.Put(chainTable)
    

    This triggers https://staticcheck.io/docs/checks#SA6002. The problem is that you need to store a pointer; otherwise the slice header is copied into the interface value, which is not a big deal but still an allocation of 3 ints (that could be avoided). I think you could create your own type or do a pointer dance here.
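
    For illustration, here is a minimal standalone sketch of the pointer-based pool that SA6002 suggests (hypothetical names, not this repository's actual code):

    package main

    import "sync"

    // hashTablePool stores *[]int so that Get/Put only move a pointer through the
    // interface value instead of copying the 3-word slice header (see SA6002).
    var hashTablePool = sync.Pool{
    	New: func() interface{} {
    		t := make([]int, 1<<16)
    		return &t
    	},
    }

    func compressWithPooledTable(src, dst []byte) {
    	tp := hashTablePool.Get().(*[]int)
    	defer hashTablePool.Put(tp)
    	hashTable := *tp
    	_ = hashTable // the compression code would use hashTable here
    	_, _ = src, dst
    }

    func main() {
    	compressWithPooledTable([]byte("example"), make([]byte, 64))
    }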

    I wanted to help you fix this one, but I realized you might want to remove those parameters from the function since you are always passing nil? e.g. CompressBlockHC(src, dst []byte, depth CompressionLevel, hashTable, chainTable []int) => CompressBlockHC(src, dst []byte, depth CompressionLevel)

    WDYT ?

    Signed-off-by: Cyril Tovena [email protected]

  • Multithreading support

    I noticed that this library uses only one CPU core. That may be a problem when processing large data. Do you have any ideas about how to add multi-threading support?

  • module name in go.mod for v2

    It should have v2 in go.mod

    module github.com/pierrec/lz4/v2
    

    and then this example works,

    import "github.com/pierrec/lz4/v2"
    
    func test() {
        // ...
        text := lz4.NewReader(compressed)
        // ...
    }
    
  • [v4] Adds arm64 acceleration to decoder.

    Solves #142. Adapted from @greatroar's work on arm32.

      daisy  lizf  …  lz4  internal  lz4block  uname -a
    Linux daisy 5.10.0-1008-oem #9+lx2k1 SMP Sat Dec 26 01:51:36 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux
      daisy  lizf  …  lz4  internal  lz4block  go test ./...
    ok  	github.com/pierrec/lz4/v4/internal/lz4block	(cached)
    
  • LZ4 Legacy support?

    I noticed that the package doesn't have legacy format support. There is this legacy magic value here: https://github.com/lz4/lz4/blob/dev/programs/lz4io.c#L80

    Is there any plan to support this?

  • Non-deterministic output observed

    Hi,

    We are observing a weird issue where different servers running the same version of the LZ4 package (github.com/pierrec/lz4 v2.3.0+incompatible) produce different compressed output from the same input, and I'm trying to understand what is going on.

    In our codebase we reuse *lz4.Writer instances and call (*lz4.Writer).Reset when starting a new output. The writer receives a single Write call with the entire input all at once. Inputs are ~2 MB; we use 64 KB block buffers in LZ4, mostly to reduce memory usage during decompression.
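
    For context, a minimal sketch of the reuse pattern described above, assuming the v2 API ((*lz4.Writer).Reset plus the public Header field; illustrative only):

    package main

    import (
    	"bytes"

    	"github.com/pierrec/lz4"
    )

    // compressOne reuses a single *lz4.Writer across outputs.
    func compressOne(zw *lz4.Writer, input []byte) []byte {
    	var out bytes.Buffer
    	zw.Reset(&out)                                 // start a new output, keep the writer
    	zw.Header = lz4.Header{BlockMaxSize: 64 << 10} // 64 KB blocks
    	_, _ = zw.Write(input)                         // single Write with the whole input
    	_ = zw.Close()
    	return out.Bytes()
    }

    func main() {
    	zw := lz4.NewWriter(nil)
    	_ = compressOne(zw, bytes.Repeat([]byte("example data "), 1000))
    	_ = compressOne(zw, bytes.Repeat([]byte("more data "), 1000))
    }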

    When reading two files with lz4 debug enabled, I see these outputs:

    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:103 header block max size id=4 size=65536
    LZ4: reader.go:132 header read: lz4.Header{BlockMaxSize: 65536 }
    LZ4: reader.go:152 header read OK compressed buffer 65536 / 131072 uncompressed buffer 65536 : 65536 index=65536
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:203 raw block size 15501
    LZ4: reader.go:238 compressed block size 15501
    LZ4: reader.go:274 current frame checksum 30916118
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:203 raw block size 15400
    LZ4: reader.go:238 compressed block size 15400
    LZ4: reader.go:274 current frame checksum fd477d82
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:203 raw block size 15396
    LZ4: reader.go:238 compressed block size 15396
    LZ4: reader.go:274 current frame checksum 564efa4
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:203 raw block size 15388
    LZ4: reader.go:238 compressed block size 15388
    LZ4: reader.go:274 current frame checksum 20424901
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:203 raw block size 13886
    LZ4: reader.go:238 compressed block size 13886
    LZ4: reader.go:274 current frame checksum bba29999
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:295 copied 25551 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:185 frame checksum got=bba29999 / want=bba29999
    

    Second:

    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:103 header block max size id=4 size=65536
    LZ4: reader.go:132 header read: lz4.Header{BlockMaxSize: 65536 }
    LZ4: reader.go:152 header read OK compressed buffer 65536 / 131072 uncompressed buffer 65536 : 65536 index=65536
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:203 raw block size 15501
    LZ4: reader.go:238 compressed block size 15501
    LZ4: reader.go:274 current frame checksum 30916118
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:203 raw block size 15400
    LZ4: reader.go:238 compressed block size 15400
    LZ4: reader.go:274 current frame checksum fd477d82
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:203 raw block size 15399
    LZ4: reader.go:238 compressed block size 15399
    LZ4: reader.go:274 current frame checksum 564efa4
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:203 raw block size 15386
    LZ4: reader.go:238 compressed block size 15386
    LZ4: reader.go:274 current frame checksum 20424901
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:203 raw block size 13887
    LZ4: reader.go:238 compressed block size 13887
    LZ4: reader.go:274 current frame checksum bba29999
    LZ4: reader.go:295 copied 32768 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:295 copied 25551 bytes to input
    LZ4: reader.go:145 Read buf len=32768
    LZ4: reader.go:164 reading block from writer
    LZ4: reader.go:185 frame checksum got=bba29999 / want=bba29999
    

    The only difference is that the raw block sizes for the last two blocks are 15388, 13886 and 15386, 13887 respectively; otherwise both files decompress back to the same data.

    Is there anything that can make LZ4 writers generate slightly different output? The Reset call seems to reset everything except the hash table – could that be the issue?

    Thank you.

    ps: So far I've been unable to reproduce the issue on my machine. :-(

  • CompressBlock error returns are confusing

    CompressBlock and CompressBlockHC docs currently state:

    // If the destination buffer size is lower than CompressBlockBound and
    // the compressed size is 0 and no error, then the data is incompressible.
    //
    // An error is returned if the destination buffer is too small.
    

    This was introduced to fix #71, but I find it confusing. What is the difference between incompressible and the destination buffer being too small?
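
    For reference, a minimal sketch of how a caller can interpret these returns under the documented contract (v2-style CompressBlock signature assumed; names illustrative):

    package main

    import (
    	"bytes"
    	"fmt"

    	"github.com/pierrec/lz4"
    )

    func compress(src []byte) []byte {
    	// dst is smaller than CompressBlockBound(len(src)), so n == 0 with a nil
    	// error means "incompressible" per the quoted docs rather than a failure.
    	dst := make([]byte, len(src))
    	hashTable := make([]int, 1<<16)
    	n, err := lz4.CompressBlock(src, dst, hashTable)
    	if err != nil {
    		panic(err) // destination buffer too small for the compressed data
    	}
    	if n == 0 {
    		// Incompressible: store the data uncompressed instead.
    		return append([]byte(nil), src...)
    	}
    	return dst[:n]
    }

    func main() {
    	fmt.Println(len(compress(bytes.Repeat([]byte("abcd"), 64))))
    }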

  • go test results in error: invalid frame checksum

    To reproduce I created a larger file:

    for i in {1..100};do cat Mark.Twain-Tom.Sawyer.txt >> longer.txt; done

    And added the "longer.txt" into the _test files.

    Result:

    testdata/longer.txt : 24394635 / 18475893 / 38785100
    --- FAIL: TestWriter (0.00s)
        --- FAIL: TestWriter/testdata/longer.txt/lz4.Header{CompressionLevel:10} (3.72s)
            writer_test.go:65: lz4: invalid frame checksum: got cdbda0bd; expected f1d62b24
    FAIL
    exit status 1
    FAIL	github.com/pierrec/lz4	18.054s

    I believe it's got something to do with io.Reader and its quirk of not returning everything in a single Read call.
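
    If that's the cause, the quirk in question is just the io.Reader contract: a single Read may return fewer bytes than requested, so code that expects a full buffer per call needs io.ReadFull or a loop. A tiny illustration (hypothetical, unrelated to the actual test code):

    package main

    import (
    	"fmt"
    	"io"
    	"strings"
    )

    func main() {
    	r := strings.NewReader(strings.Repeat("x", 100))
    	// src hands out at most 10 bytes on the first Read, then the rest.
    	src := io.MultiReader(io.LimitReader(r, 10), r)

    	buf := make([]byte, 64)
    	n, _ := src.Read(buf)
    	fmt.Println("single Read:", n) // 10, even though buf has room for 64

    	n, _ = io.ReadFull(src, buf)
    	fmt.Println("io.ReadFull:", n) // 64, keeps reading until buf is full
    }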

  • Increase general compression efficiency

    tl;dr increased compression speed for small messages 95%, for large ones ~60%, but parallelism has died.


    I like optimizing things. For one project I'm working on we wanted to support lz4 in a network compression scenario, where we are encoding small standalone sequences in a single data stream. I wrote a couple of reference implementations; flate/gzip performed quite well initially, but lz4 didn't do so well. I set about fixing that.

    The first adjustment I did was to remove the parallelism in the Writer. I expect this is the most controversial change in this PR and totally understand if you don't want to accept it as a result. But, I think it makes sense... Go's built-in gzip implementation is single threaded and in most applications where lz4 support would be embedded, threading compression itself is unnecessary because the application is already serving other requests on all available threads; concurrency comes at the request level rather than the operation level. Also this let me do some other optimizations further below...

    This change directly allowed me to make tweaks in the writer so that no memory allocations are necessary during compression. Previously Go needed to make some heap allocations for memory that was shared between threads. These changes cut about 50% off the time of compressing the lorem string, and are what is included in the first commit in this PR.

    After that I noticed that each write was suspiciously slow given that we weren't allocating anything. It turns out most of the time was spent clearing the hashTable, as it was allocated on the stack. I toyed around with a few solutions to avoid this but ultimately ended up with a "generational" table, where each call to CompressBlock is given its own unique generation ID and operations only affect values written during the current generation. This, combined with the single-threaded approach, lets us cache the hashTable. This change cut 75% off of the time that was remaining.
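
    To make the "generational" idea concrete, here is a rough standalone sketch of the trick (illustrative only, not the code in this PR):

    package main

    import "fmt"

    // genTable avoids clearing the whole table between blocks: each slot records
    // the generation it was written in, and reads from an older generation are
    // treated as empty.
    type genTable struct {
    	gen     uint32
    	gens    [1 << 16]uint32
    	entries [1 << 16]int
    }

    // nextBlock "clears" the table in O(1) by bumping the generation counter.
    func (t *genTable) nextBlock() { t.gen++ }

    func (t *genTable) put(h uint16, v int) {
    	t.gens[h] = t.gen
    	t.entries[h] = v
    }

    func (t *genTable) get(h uint16) (int, bool) {
    	if t.gens[h] != t.gen {
    		return 0, false // written during a previous block, ignore
    	}
    	return t.entries[h], true
    }

    func main() {
    	var t genTable
    	t.nextBlock() // start at generation 1 so zeroed slots read as empty
    	t.put(42, 1234)
    	fmt.Println(t.get(42)) // 1234 true

    	t.nextBlock() // next block: previous entries become invisible
    	fmt.Println(t.get(42)) // 0 false
    }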

    Finally, pprof showed me that quite a bit of time was being spent in binary.LittleEndian.Uint32. I used a little bit of scary unsafe that avoids this call on most common computers (x86/amd64/some ARM) which natively have little endian byte order. This cut 50% off of the remaining time.
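
    Roughly, on little-endian machines the unsafe trick amounts to something like this (a sketch, not the exact code in this PR):

    package main

    import (
    	"encoding/binary"
    	"fmt"
    	"unsafe"
    )

    // loadUint32 reads 4 bytes at b[i] as a uint32 with a single unaligned load.
    // On natively little-endian CPUs (x86/amd64, most ARM) this yields the same
    // value as binary.LittleEndian.Uint32.
    func loadUint32(b []byte, i int) uint32 {
    	return *(*uint32)(unsafe.Pointer(&b[i]))
    }

    func main() {
    	b := []byte{0x01, 0x02, 0x03, 0x04, 0x05}
    	fmt.Println(loadUint32(b, 1), binary.LittleEndian.Uint32(b[1:5]))
    }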

    All in all, compressing lorem in the included benchmark now runs about 95% faster on my machine. I was concerned that removing the parallelism would hurt compression speed on large sequences but it seems that, at least on the VM I was testing on, the optimizations which become possible as a result were a good tradeoff.

    benchmark                          old ns/op     new ns/op     delta
    BenchmarkCompressEndToEnd          44258         2421          -94.53%
    BenchmarkCompressEndToEnd-10MB     24691717      10113045      -59.04%
    

    Look ma, no mallocs!

    BenchmarkCompressEndToEnd	  500000	      2434 ns/op	       8 B/op	       0 allocs/op
    

    I think I got most of the low-hanging fruit in this PR. The majority of the remaining time is spent in branch mispredictions, some memmoves, hashing, bitwise operators, and assignments.

  • lz4 reader read block until buf is full?

    For an io.Reader, with n, err := lz4Reader.Read(buf), I expect Read to return whatever data is currently available together with the number of bytes read, but it blocks until buf is full. This causes a problem: https://go.dev/play/p/uFyyFbCL7Yp
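
    As a point of comparison, a caller following the io.Reader contract would loop on Read and accept short reads (a hypothetical sketch, not the playground code above):

    package main

    import (
    	"bytes"
    	"fmt"
    	"io"

    	"github.com/pierrec/lz4"
    )

    func main() {
    	// Build a small compressed stream in memory.
    	var compressed bytes.Buffer
    	zw := lz4.NewWriter(&compressed)
    	_, _ = zw.Write(bytes.Repeat([]byte("hello "), 1000))
    	_ = zw.Close()

    	zr := lz4.NewReader(&compressed)
    	buf := make([]byte, 32<<10)
    	for {
    		// Per the io.Reader contract, n may be less than len(buf); the report
    		// above is that this Read instead blocks until buf is completely full.
    		n, err := zr.Read(buf)
    		if n > 0 {
    			fmt.Println("read", n, "bytes")
    		}
    		if err == io.EOF {
    			break
    		}
    		if err != nil {
    			panic(err)
    		}
    	}
    }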

  • avoid uncompressed duplicates in testdata to make module smaller

    I was investigating what the largest dependencies were in a project by measuring how large the module zips were, and yours stood out at about thirty megabytes. This isn't terrible per se, but it's still surprisingly large, and still slows down some builds depending on the internet speed.

    I see that the testdata directory contains some files in both the compressed and uncompressed forms. Have you thought about only keeping the compressed forms?

  • Add lz4.AppendOption for NewWriter

    Add lz4.AppendOption: if NewWriter is given lz4.AppendOption(true), then it will not write the frame header.

    In this way, you can continue to append new data to an existing lz4 file.

    Like this:

    	// Create the file and write the first chunk.
    	fw, _ := os.OpenFile(path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0644)
    	w := lz4.NewWriter(fw)
    	w.Apply(lz4.ChecksumOption(false))
    	w.Write(d)
    	w.Flush()

    	// Do some other things, or a long time later...

    	// Reopen the file and append more data without writing a new frame header.
    	fw, _ = os.OpenFile(path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0644)
    	w = lz4.NewWriter(fw)
    	w.Apply(lz4.ChecksumOption(false), lz4.AppendOption(true))
    	w.Write(d)
    	w.Flush()
    

    Thanks!

  • jsonlz4: lz4: invalid source or destination buffer too short

    I tried using pierrec/lz4 to read Mozilla's jsonlz4 files using the approach outlined here: https://github.com/pierrec/lz4/issues/28, but it always shows this error:

    lz4: invalid source or destination buffer too short

    mozlz4a.py can decompress them without problems.

    Demo: https://go.dev/play/p/A32NWBYkchg

    package main
    
    import (
    	"fmt"
    	"crypto/md5"
    	"github.com/pierrec/lz4"
    )
    
    // Payload created with this Python script: https://gist.github.com/Tblue/62ff47bef7f894e92ed5
    //
    // $ printf 'mozLz40\x00!\x00\x00\x00\xF0\x12{\"version\":[\"sessionrestore\",1]}\n'  | md5sum
    // 12c5a86eaafe57bbb0345f52505610bf  -
    // printf 'mozLz40\x00!\x00\x00\x00\xF0\x12{\"version\":[\"sessionrestore\",1]}\n'  | python3.7 mozlz4a.py  -d -
    // {"version":["sessionrestore",1]}
    
    func md5sum(s string) (r string) {
    	digest := md5.New()
    	digest.Write([]byte(s))
    	return fmt.Sprintf("%x", digest.Sum(nil))
    }
    
    var payload string = "mozLz40\x00!\x00\x00\x00\xF0\x12{\"version\":[\"sessionrestore\",1]}\n"
    
    func main() {
    	fmt.Println(md5sum(payload))
    	out := make([]byte, len(payload)*1000)
    	_, e := lz4.UncompressBlock([]byte(payload), out)
    	if e != nil {
    		panic(e)
    	}
    	fmt.Print(string(out))
    }
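
    For what it's worth, a minimal sketch of one way to get this payload to decompress, assuming the mozLz4 layout of an 8-byte magic "mozLz40\0" followed by a 4-byte little-endian uncompressed size and then a raw LZ4 block (that layout is an assumption, not something this package documents):

    package main

    import (
    	"encoding/binary"
    	"fmt"

    	"github.com/pierrec/lz4"
    )

    var payload = "mozLz40\x00!\x00\x00\x00\xF0\x12{\"version\":[\"sessionrestore\",1]}\n"

    func main() {
    	data := []byte(payload)
    	// Skip the 8-byte magic; read the 4-byte little-endian uncompressed size.
    	size := binary.LittleEndian.Uint32(data[8:12])
    	out := make([]byte, size)
    	// UncompressBlock expects only the raw LZ4 block, not the mozLz4 header.
    	n, err := lz4.UncompressBlock(data[12:], out)
    	if err != nil {
    		panic(err)
    	}
    	fmt.Print(string(out[:n]))
    }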
    
  • lz4: invalid source or destination buffer too short

    I have a compressed lz4 file (produced by a C++ program using the C lz4 library). I can decode it normally using the lz4 command-line tool, but with the golang lz4 cmd tool it fails with this error message: lz4: invalid source or destination buffer too short

    What could be the possible cause of this issue?

  • [RSVP] What is the source of xxh32zero.go?

    Hi @pierrec,

    What is the source of xxh32zero.go? I'm guessing you ported the reference implementation, and that the unreachable URL in xxh32zero.go is a form of attribution. Or is this someone else's work that you then built off of?

    I was able to circumvent the other issues by excluding files mentioned in #178 and disabling associated tests. Debian ftpmasters are treating #194, this issue, as a hard blocker. Finally this issue is also blocking work on reverse dependencies, so please reply asap :)

    Thank you, Nicholas

  • Data race when using concurrency > 1

    In the v4 branch, the lz4 library may cause data corruption if you have set concurrency > 1. The race happens at

    https://github.com/pierrec/lz4/blob/v4/writer.go#L100 https://github.com/pierrec/lz4/blob/v4/writer.go#L159

    Suppose this scenario: the calling sequence is Write, Flush, Write, Flush repeatedly, which is a very common pattern in a web server. When w.data is filled with data but doesn't reach the buffer size, calling Flush sends w.data[:w.idx] to a channel and resets w.idx to zero. Here comes the problem: w.data[:w.idx] is a shallow copy of the slice header, which means a new call to Write and the compression goroutine write and read the same underlying byte array with an overlapping index.
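
    A minimal standalone sketch of the aliasing pattern being described (hypothetical code, not the library's actual writer):

    package main

    import "fmt"

    // writer mimics the buffering scheme above: Flush hands the buffered prefix
    // to a worker goroutine and resets the index, while the next Write keeps
    // appending into the very same backing array.
    type writer struct {
    	data []byte
    	idx  int
    	work chan []byte
    }

    func (w *writer) Write(p []byte) {
    	w.idx += copy(w.data[w.idx:], p)
    }

    func (w *writer) Flush() {
    	w.work <- w.data[:w.idx] // slice header copy; backing array is shared
    	w.idx = 0                // the next Write reuses the same bytes
    }

    func main() {
    	w := &writer{data: make([]byte, 8), work: make(chan []byte, 1)}

    	w.Write([]byte("AAAA"))
    	w.Flush()
    	block := <-w.work // a compression goroutine would read this asynchronously

    	w.Write([]byte("BBBB")) // overwrites the bytes block still refers to

    	fmt.Printf("%s\n", block) // prints "BBBB", not the flushed "AAAA"
    }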
