Go package implementing bitsets

bitset

Go language library to map between non-negative integers and boolean values

Test Master Coverage Status Go Report Card PkgGoDev

Description

Package bitset implements bitsets, a mapping between non-negative integers and boolean values. It should be more efficient than map[uint] bool.

It provides methods for setting, clearing, flipping, and testing individual integers.

But it also provides set intersection, union, difference, complement, and symmetric operations, as well as tests to check whether any, all, or no bits are set, and querying a bitset's current length and number of positive bits.

BitSets are expanded to the size of the largest set bit; the memory allocation is approximately Max bits, where Max is the largest set bit. BitSets are never shrunk. On creation, a hint can be given for the number of bits that will be used.

Many of the methods, including Set, Clear, and Flip, return a BitSet pointer, which allows for chaining.

Example use:

package main

import (
	"fmt"
	"math/rand"

	"github.com/willf/bitset"
)

func main() {
	fmt.Printf("Hello from BitSet!\n")
	var b bitset.BitSet
	// play some Go Fish
	for i := 0; i < 100; i++ {
		card1 := uint(rand.Intn(52))
		card2 := uint(rand.Intn(52))
		b.Set(card1)
		if b.Test(card2) {
			fmt.Println("Go Fish!")
		}
		b.Clear(card1)
	}

	// Chaining
	b.Set(10).Set(11)

	for i, e := b.NextSet(0); e; i, e = b.NextSet(i + 1) {
		fmt.Println("The following bit is set:", i)
	}
	if b.Intersection(bitset.New(100).Set(10)).Count() == 1 {
		fmt.Println("Intersection works.")
	} else {
		fmt.Println("Intersection doesn't work???")
	}
}

As an alternative to BitSets, one should check out the 'big' package, which provides a (less set-theoretical) view of bitsets.

Package documentation is at: https://pkg.go.dev/github.com/willf/bitset?tab=doc

Memory Usage

The memory usage of a bitset using N bits is at least N/8 bytes. The number of bits in a bitset is at least as large as one plus the greatest bit index you have accessed. Thus it is possible to run out of memory while using a bitset. If you have lots of bits, you might prefer compressed bitsets, like the Roaring bitmaps and its Go implementation.

Implementation Note

Go 1.9 introduced a native math/bits library. We provide backward compatibility to Go 1.7, which might be removed.

It is possible that a later version will match the math/bits return signature for counts (which is int, rather than our library's unit64). If so, the version will be bumped.

Installation

go get github.com/willf/bitset

Contributing

If you wish to contribute to this project, please branch and issue a pull request against master ("GitHub Flow")

Running all tests

Before committing the code, please check if it passes tests, has adequate coverage, etc.

go test
go test -cover
Owner
Will Fitzgerald
Words and numbers
Will Fitzgerald
Comments
  • automation and improvements

    automation and improvements

    • merging bpowers changes to add constructor + accessor for underlying bit array;
    • add Makefile to automate repetitive tasks;
    • fix "golint" and "vet" errors;
    • add Travis-CI configuration for automatic QA checks;
    • updated README (you may want to remove the travis-ci and coveralls badges or update the links with your account.
  • Added methods for set indices

    Added methods for set indices

    Hi,

    I had to use your library for a little project of mine. Since I needed an easy way to list indices of the bits that were set, I implemented it and I figured you might want it; feel free to reject or ask for changes.

  • Overreading with v1.3.1

    Overreading with v1.3.1

    It seems you are still over-reading with the latest release. Here is a reproducer test:

    func TestWriteTo(t *testing.T) {
    	const length = 958506
    	addBuf := []byte(`1234`)
    	bs := New(length)
    	var buf bytes.Buffer
    	_, err := bs.WriteTo(&buf)
    	if err != nil {
    		t.Fatal(err)
    	}
    	buf.Write(addBuf)
    	
    	// Generate test input for regression tests:
    	if false {
    		gzout := bytes.NewBuffer(nil)
    		gz, err := gzip.NewWriterLevel(gzout, 9)
    		if err != nil {
    			t.Fatal(err)
    		}
    		gz.Write(buf.Bytes())
    		gz.Close()
    		t.Log("Encoded:", base64.StdEncoding.EncodeToString(gzout.Bytes()))
    	}
    	bs = New(length)
    	_, err = bs.ReadFrom(&buf)
    	if err != nil {
    		t.Fatal(err)
    	}
    	more, err := io.ReadAll(&buf)
    	if err != nil {
    		t.Fatal(err)
    	}
    	if !bytes.Equal(more, addBuf) {
    		t.Fatalf("extra mismatch. got %v, want %v", more, addBuf)
    	}
    }
    

    Works with v1.2.0, but outputs:

    === RUN   TestWriteTo
        bitset_test.go:1684: extra mismatch. got [], want [49 50 51 52]
    --- FAIL: TestWriteTo (0.00s)
    

    You can extend it for more sizes.

  • Copy: bug: destination is smaller than source

    Copy: bug: destination is smaller than source

    https://github.com/bits-and-blooms/bitset/blob/5a829244ffd64e4120015ce1bf8285b0b6168d55/bitset.go#L535

    It seems that here it assumes that the destination c.set will have len >= len(b.set).

    Otherwise the copy() will only copy the length of the smaller, c.set.

    Shouldn't c.set be extended here if it's shorter??

  • Makefile, unit tests and code cleanup improvements

    Makefile, unit tests and code cleanup improvements

    • Automatic GOPATH;
    • Default make to the help target;
    • New Makefile static analysis tools and improvements;
    • Separated benchmarks from unit tests;
    • New unit tests for better code coverage;
    • Merged liaojie fix;
    • Code cleanup.
  • Performance question

    Performance question

    I've compared willf/bitset to std::bitset , and found that the go version has virtually performed an order of magnitude slower than the c++ counterpart.

    Is c++ generally still far superior to go for CPU-bound tasks?

  • use bufio.Writer instead of binary.Write to reduce memory footprint while serializing

    use bufio.Writer instead of binary.Write to reduce memory footprint while serializing

    While writing (or reading) a very big bitset (few GBs) into a file, the memory usage will be doubled, crashing the program. This is because you are converting the whole set into another big []byte slice. (through binary.Write) This pull request fixs the issue by not trying to convert the whole set ([]uint64) into []byte, but convert one by one item and push it to a buffered stream. The memory usage is the same after calling WriteTo or ReadFrom

    Refs: bufio is the recomented way to read and writing buffered data. (https://pkg.go.dev/bufio#pkg-overview) A loop while reading out from bufio.Reader is also quite common (https://pkg.go.dev/bufio#example-Scanner.Bytes)

  • CopyFull: add func to duplicate a bitset to a target

    CopyFull: add func to duplicate a bitset to a target

    Adds a new function to copy completely to a destination bitset.

    This is because the Copy() function does not copy the lengths, so there is no way to do an in-place copy from another bitset.

  • add Shrink method

    add Shrink method

    add Insert method add tests for new methods

    These methods were added to overcome a specific problem we are facing in a project, but I thought they might be useful to other people as well. The Shrink method is similar to what was discussed in #64 but I don't think it satisfies all of those requirements.

  • integer limit

    integer limit

    Since the bitset accepts uint, I think the upper limit of integers should be 2^64-1=18446744073709551615. However, for an integer that is much smaller than that value (i.e. 123456789012345), my program panics.

    panic: runtime error: makeslice: len out of range goroutine 1 [running]: github.com/willf/bitset.(*BitSet).extendSetMaybe(0xc4200980a0, 0xbf980761e9d1fc7)

    Is there anything I misunderstand in using this package?

  • Add Reset function to reuse existing BitSet

    Add Reset function to reuse existing BitSet

    I have a use case where I need to decompress the bit mask on a hot path, would really like to reset the underlying words in the BitSet rather than calling From and create a new BitSet every time.

    If there is a better way to reuse the BitSet, please let me know.

Go package implementing an indexable ordered multimap

PACKAGE package skiplist import "github.com/glenn-brown/skiplist" Package skiplist implements fast indexable ordered multimaps. This sk

Jul 2, 2022
Go package implementing Bloom filters

Bloom filters A Bloom filter is a concise/compressed representation of a set, where the main requirement is to make membership queries; i.e., whether

Dec 28, 2022
Go library implementing xor filters
Go library implementing xor filters

xorfilter: Go library implementing xor filters Bloom filters are used to quickly check whether an element is part of a set. Xor filters are a faster a

Dec 30, 2022
Package set is a small wrapper around the official reflect package that facilitates loose type conversion and assignment into native Go types.

Package set is a small wrapper around the official reflect package that facilitates loose type conversion and assignment into native Go types. Read th

Dec 27, 2022
Go package for mapping values to and from space-filling curves, such as Hilbert and Peano curves.
Go package for mapping values to and from space-filling curves, such as Hilbert and Peano curves.

Hilbert Go package for mapping values to and from space-filling curves, such as Hilbert and Peano curves. Documentation available here This is not an

Dec 23, 2022
Package ring provides a high performance and thread safe Go implementation of a bloom filter.

ring - high performance bloom filter Package ring provides a high performance and thread safe Go implementation of a bloom filter. Usage Please see th

Nov 20, 2022
A package for Go that can be used for range queries on large number of intervals

go-stree go-stree is a package for Go that can be used to process a large number of intervals. The main purpose of this module is to solve the followi

May 14, 2022
parody of some of the basic python core features (collections package)

collections import "github.com/marcsantiago/collections" Overview Index Subdirectories Overview Index func StringEncoder(encoder *bytes.Buffer, data D

Jan 14, 2022
Package mafsa implements Minimal Acyclic Finite State Automata in Go, essentially a high-speed, memory-efficient, Unicode-friendly set of strings.

MA-FSA for Go Package mafsa implements Minimal Acyclic Finite State Automata (MA-FSA) with Minimal Perfect Hashing (MPH). Basically, it's a set of str

Oct 27, 2022
Package for indexing zip files and storing a compressed index

zipindex zipindex provides a size optimized representation of a zip file to allow decompressing the file without reading the zip file index. It will o

Nov 30, 2022
peanut is a Go package to write tagged data structs to disk in a variety of formats.

peanut peanut is a Go package to write tagged data structs to disk in a variety of formats. Its primary purpose is to provide a single consistent inte

Dec 16, 2022
graph package by golang

graph-go sample package main import ( "fmt" "github.com/Iovesophy/graph-go" ) func main() { samplePlace := []graph.Node{ {ID: 1, Element: "plac

Oct 24, 2021
Package iter provides generic, lazy iterators, functions for producing them from primitive types, as well as functions and methods for transforming and consuming them.

iter Package iter provides generic, lazy iterators, functions for producing them from primitive types, as well as functions and methods for transformi

Dec 16, 2022
Advanced linked list package for go.

golib/list ライブラリ 可変長の連結リストを提供するライブラリーです。 状況によらず、メモリ開放処理を一貫性した書き方で実装できるので、メモリ解放をプログラマが管理しやすい作りになっています。 list.List 片方向連結リストを提供するモジュールです。 list.Nodeが使われていま

Jan 21, 2022
GoIntervalTree - An IntervalTree package for Go

GoIntervalTree An IntervalTree package for Go Inspired by Centered Interval Tree implementation in Python This package provides functionality for inde

Feb 5, 2022
Drop-in replacement for Go's flag package, implementing POSIX/GNU-style --flags.

Description pflag is a drop-in replacement for Go's flag package, implementing POSIX/GNU-style --flags. pflag is compatible with the GNU extensions to

Dec 30, 2022
Drop-in replacement for Go's flag package, implementing POSIX/GNU-style --flags.

Description pflag is a drop-in replacement for Go's flag package, implementing POSIX/GNU-style --flags. pflag is compatible with the GNU extensions to

Nov 25, 2022
Go package implementing Bloom filters

Bloom filters A Bloom filter is a representation of a set of n items, where the main requirement is to make membership queries; i.e., whether an item

Dec 30, 2022
Go package implementing an indexable ordered multimap

PACKAGE package skiplist import "github.com/glenn-brown/skiplist" Package skiplist implements fast indexable ordered multimaps. This sk

Jul 2, 2022
Go package implementing Bloom filters

Go package implementing Bloom filters

Dec 30, 2022