Unsigned Integer 32 Byte Packing Compression

dbp32

Unsigned Integer 32 Byte Packing Compression. Inspired by lemire/FastPFor.

Package bp32 is an implementation of the binary packing integer compression algorithm in Go using unsigned 32 integer blocks. It is mostly suitable for sorted arrays containing small positive 32bit-integers like IPv4 addresses or timestamp. Given a list of sorted integers, it computes the successive differences prior for compression purposes. For details, check Daniel Lemire and Leonid Boytsov, Decoding billions of integers per second.

Usage

This is the link to the go playground although time.Since does not work there.

package main

import (
	"fmt"
	"math/rand"
	"time"

	"github.com/0xc0d/dbp32"
)

func main() {
	input := make([]uint32, 1e6)
	randomSortedUint32(input, 1e3)
	output := make([]uint32, 1e6)

	s := time.Now()
	n, err := dbp32.Compress(input, output)
	if err != nil {
		panic(err)
	}
	e := time.Since(s)

	fmt.Println("Compressed in:", e.String(), "Ratio: ", float64(1e6)/float64(n))

	s = time.Now()
	n, err = dbp32.Decompress(output[:n], input)
	if err != nil {
		panic(err)
	}
	if n != 1e6 {
		panic("decompression failed")
	}
	e = time.Since(s)

	fmt.Println("Decompressed in:", e.String(), "ns")
}

func randomSortedUint32(in []uint32, maxDist uint32) {
	last := rand.Uint32() % (maxDist)
	for i := range in {
		in[i] = last + rand.Uint32()%(maxDist)
		last = in[i]
	}
}

Benchmark

cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkCompress
BenchmarkCompress/standard_1000
BenchmarkCompress/standard_1000-16         	  488390	        2488 ns/op	     896 B/op	       1 allocs/op
BenchmarkCompress/standard_100000
BenchmarkCompress/standard_100000-16       	    7816	      143335 ns/op	     256 B/op	       1 allocs/op
BenchmarkCompress/standard_1000000
BenchmarkCompress/standard_1000000-16      	     880	     1733248 ns/op	     512 B/op	       1 allocs/op
BenchmarkCompress/standard_100000000
BenchmarkCompress/standard_100000000-16    	       7	   161584835 ns/op	       0 B/op	       0 allocs/op
BenchmarkCompress/scored_1000
BenchmarkCompress/scored_1000-16           	  602750	        1730 ns/op	     896 B/op	       1 allocs/op
BenchmarkCompress/scored_100000
BenchmarkCompress/scored_100000-16         	    8806	      120738 ns/op	     256 B/op	       1 allocs/op
BenchmarkCompress/scored_1000000
BenchmarkCompress/scored_1000000-16        	     980	     1244335 ns/op	     512 B/op	       1 allocs/op
BenchmarkCompress/scored_100000000
BenchmarkCompress/scored_100000000-16      	       8	   150365733 ns/op	       0 B/op	       0 allocs/op
BenchmarkDecompress
BenchmarkDecompress/standard_1000
BenchmarkDecompress/standard_1000-16         	 1000000	      1158 ns/op	       0 B/op	       0 allocs/op
BenchmarkDecompress/standard_100000
BenchmarkDecompress/standard_100000-16       	   18090	     66249 ns/op	       0 B/op	       0 allocs/op
BenchmarkDecompress/standard_1000000
BenchmarkDecompress/standard_1000000-16      	    1636	    659632 ns/op	       0 B/op	       0 allocs/op
BenchmarkDecompress/standard_100000000
BenchmarkDecompress/standard_100000000-16    	      16	  71385376 ns/op	       0 B/op	       0 allocs/op
BenchmarkDecompress/scored_1000
BenchmarkDecompress/scored_1000-16           	 1367968	     869.7 ns/op	       0 B/op	       0 allocs/op
BenchmarkDecompress/scored_100000
BenchmarkDecompress/scored_100000-16         	   18804	     63270 ns/op	       0 B/op	       0 allocs/op
BenchmarkDecompress/scored_1000000
BenchmarkDecompress/scored_1000000-16        	    1640	    637747 ns/op	       0 B/op	       0 allocs/op
BenchmarkDecompress/scored_100000000
BenchmarkDecompress/scored_100000000-16      	      18	  70347136 ns/op	       0 B/op	       0 allocs/op

Owner
Ali Josie
Software Engineer
Ali Josie
Similar Resources

Wrap byte read options with uniform interface for io.Reader and byte slice

nibbler Nibble chunks from Reader streams and slice in a common way Overview This is a golang module that provides an interface for treating a Reader

Dec 23, 2021

Integer Compression Libraries for Go

Encoding This is a set of integer compression algorithms implemented in Go. It is an (incomplete) port of the JavaFastPFOR by Dr. Daniel Lemire. For m

Dec 16, 2022

sign Apple’s mobileconfig file to solve the ‘unsigned’ problem

sign Apple’s mobileconfig file to solve the ‘unsigned’ problem

amcs(apple mobile config signature) sign Apple’s mobileconfig file to solve the ‘unsigned’ problem the project rely openssl https://github.com/openssl

Nov 25, 2022

Go structure annotations that supports encoding and decoding; similar to C-style bitfields. Supports bitfield packing, self-describing layout parameters, and alignment.

Go structure annotations that supports encoding and decoding; similar to C-style bitfields. Supports bitfield packing, self-describing layout parameters, and alignment.

STRUCTure EXtensions structex provides annotation rules that extend Go structures for implementation of encoding and decoding of byte backed data fram

Oct 13, 2022

Membin is an in-memory database that can be stored on disk. Data model smiliar to key-value but values store as JSON byte array.

Membin Docs | Contributing | License What is Membin? The Membin database system is in-memory database smiliar to key-value databases, target to effici

Jun 3, 2021

ByNom is a Go package for parsing byte sequences, suitable for parsing text and binary data

ByNom is a Go package for parsing byte sequences. Its goal is to provide tools to build safe byte parsers without compromising the speed or memo

May 5, 2021

fast integer log base 10

Package log10 calculates log base 10 of an integer, fast. It is inspired by Daniel Lemire's blog post on this topic. TODO: Add implementations for oth

Mar 3, 2022

Bitwise AND on two byte-slices using SIMD instructions

This package provides a vectorised function which performs bitwise AND operation on all pairs of elements in two byte-slices. It detects CPU instruction set and chooses the available best one (AVX512, AVX2, SSE2).

Oct 17, 2022

golang struct 或其他对象向 []byte 的序列化或反序列化

bytecodec 字节流编解码 这个库实现 struct 或其他对象向 []byte 的序列化或反序列化 可以帮助你在编写 tcp 服务,或者需要操作字节流时,简化数据的组包、解包 这个库的组织逻辑 copy 借鉴了标准库 encoding/json 🙏 安装 使用 go get 安装最新版本

Jun 30, 2022

A simple evaluator for arithmetic integer expressions.

The expr package provides a simple evaluator for arithmetic integer expressions. The syntax and operations are the same as in Go. Operands are the nat

Dec 13, 2022

Simple go package which converts roman strings to integer

romanparse Simple go package which converts roman strings

Aug 11, 2022

Fast integer map for uint32-to-uint32

Fast integer map for uint32-to-uint32

Uint32-to-Uint32 Map This repository contains an implementation of uint32-to-uint32 map which is ~20-50% faster than Go standard map for the same type

Sep 21, 2022

Simple gc using integer vectors to simulate

gcint Simple gc using integer vectors to simulate Iterate primarily over what should be the shorter vector (readers) removing unused references in fro

Nov 24, 2021

A byte buffer library for GO

gbytebuffer A byte buffer library to read/write simple variable types from/into byte slices. Conversions happen in little endian format by default but

Nov 27, 2021

Algorithms for various integer sequences from the OEIS site.

OEIS The ongoing quest to program every sequence in the OEIS database (in Golang) Content sequences -- The folder containing the seq package, which co

Dec 23, 2021

Advent of Code Input Loader, provide a session cookie and a problem date, returns a string or []byte of the input

Advent of Code Get (aocget) A small lib to download your puzzle input for a given day. Uses your session token to authenticate to obtain your personal

Dec 9, 2021

Shuffle bits in byte slice, written in golang

Shuffle 'em It is a library for bit shuffling, written in golang. The usecases a

Dec 27, 2021

Guess-number-game - Computer thoughts of some integer number, you must guess it with limited number of attempts

Guess-number-game - Computer thoughts of some integer number, you must guess it with limited number of attempts

Guess number game Rules Computer has thought of some integer number. You must guess it, you have numberOfAttempts attempts. How to run Just type in co

Dec 31, 2021

Takes an integer array, returns the array sorted and the number of inversions in the array

Takes an integer slice, returns the slice sorted and the number of inversions in the slice

Jan 25, 2022
Related tags
Go wrapper for LZO compression library

This is a cgo wrapper around the LZO real-time compression library. LZO is available at http://www.oberhumer.com/opensource/lzo/ lzo.go is the go pack

Mar 4, 2022
Port of LZ4 lossless compression algorithm to Go

go-lz4 go-lz4 is port of LZ4 lossless compression algorithm to Go. The original C code is located at: https://github.com/Cyan4973/lz4 Status Usage go

Jun 14, 2022
LZ4 compression and decompression in pure Go

lz4 : LZ4 compression in pure Go Overview This package provides a streaming interface to LZ4 data streams as well as low level compress and uncompress

Dec 27, 2022
Go parallel gzip (de)compression

pgzip Go parallel gzip compression/decompression. This is a fully gzip compatible drop in replacement for "compress/gzip". This will split compression

Dec 29, 2022
Bzip2 Compression Tool written in Go

Bzip2 Compression Tool written in Go

Dec 28, 2021
Slipstream is a method for lossless compression of power system data.

Slipstream Slipstream is a method for lossless compression of power system data. Design principles The protocol is designed for streaming raw measurem

Apr 14, 2022
An easy-to-use CLI-based compression tool.

Easy Compression An easy-to-use CLI-based compression tool. Usage NAME: EasyCompression - A CLI-based tool for (de)compression USAGE: EasyCompr

Jan 1, 2022
zlib compression tool for modern multi-core machines written in Go

zlib compression tool for modern multi-core machines written in Go

Jan 21, 2022
Program that counts the bits in an unsigned integer

popcountloop This is an exercise of the book The Go Programming Language, by Ala

Dec 18, 2021
Kitex byte-dance internal Golang microservice RPC framework with high performance and strong scalability, customized extensions for byte internal.
Kitex byte-dance internal Golang microservice RPC framework with high performance and strong scalability, customized extensions for byte internal.

Kitex 字节跳动内部的 Golang 微服务 RPC 框架,具有高性能、强可扩展的特点,针对字节内部做了定制扩展。

Jan 9, 2023