Type-safe, zero-allocation sets for Go

Set GoDoc Go Report Card Build Status

Package set is a type-safe, zero-allocation port of the excellent package fatih/set. It contains sets for most of the basic types and you can generate sets for your own types with ease.

Example

Example code using the generated string set:

import "github.com/scylladb/go-set/strset"

s1 := strset.New("entry 1", "entry 2")
s2 := strset.New("entry 2", "entry 3")
s3 := strset.Intersection(s1, s2)
// s3 now contains only "entry 2"

The library exposes a number of top level factory functions that can be used to create a specific instances of the set type you want to use. For example to create a set to store int you could do like this:

import "github.com/scylladb/go-set"

s := set.NewIntSet()
// use the set...

Usage

In every subpackage Set is the main set structure that holds all the data and methods used to working with the set.

func Difference

func Difference(set1 *Set, sets ...*Set) *Set

Difference returns a new set which contains items which are in in the first set but not in the others.

func Intersection

func Intersection(sets ...*Set) *Set

Intersection returns a new set which contains items that only exist in all given sets.

func New

func New(ts ...T) *Set

New creates and initializes a new Set.

func NewWithSize

func NewWithSize(size int) *Set

NewWithSize creates a new Set and gives make map a size hint.

func SymmetricDifference

func SymmetricDifference(s *Set, t *Set) *Set

SymmetricDifference returns a new set which s is the difference of items which are in one of either, but not in both.

func Union

func Union(sets ...*Set) *Set

Union is the merger of multiple sets. It returns a new set with all the elements present in all the sets that are passed.

func (*Set) Add

func (s *Set) Add(items ...T)

Add includes the specified items (one or more) to the Set. The underlying Set s is modified. If passed nothing it silently returns.

func (*Set) Clear

func (s *Set) Clear()

Clear removes all items from the Set.

func (*Set) Copy

func (s *Set) Copy() *Set

Copy returns a new Set with a copy of s.

func (*Set) Each

func (s *Set) Each(f func(item T) bool)

Each traverses the items in the Set, calling the provided function for each Set member. Traversal will continue until all items in the Set have been visited, or if the closure returns false.

func (*Set) Has

func (s *Set) Has(items ...T) bool

Has looks for the existence of items passed. It returns false if nothing is passed. For multiple items it returns true only if all of the items exist.

func (*Set) IsEmpty

func (s *Set) IsEmpty() bool

IsEmpty reports whether the Set is empty.

func (*Set) IsEqual

func (s *Set) IsEqual(t *Set) bool

IsEqual test whether s and t are the same in size and have the same items.

func (*Set) IsSubset

func (s *Set) IsSubset(t *Set) bool

IsSubset tests whether t is a subset of s.

func (*Set) IsSuperset

func (s *Set) IsSuperset(t *Set) bool

IsSuperset tests whether t is a superset of s.

func (*Set) List

func (s *Set) List() []T

List returns a slice of all items. There is also StringSlice() and IntSlice() methods for returning slices of type string or int.

func (*Set) Merge

func (s *Set) Merge(t *Set)

Merge is like Union, however it modifies the current Set it's applied on with the given t Set.

func (*Set) Pop

func (s *Set) Pop() T

Pop deletes and returns an item from the Set. The underlying Set s is modified. If Set is empty, the zero value is returned.

func (*Set) Pop2

func (s *Set) Pop2() (T, bool)

Pop2 tries to delete and return an item from the Set. The underlying Set s is modified. The second value is a bool that is true if the item existed in the set, and false if not. If Set is empty, the zero value and false are returned.

func (*Set) Remove

func (s *Set) Remove(items ...T)

Remove deletes the specified items from the Set. The underlying Set s is modified. If passed nothing it silently returns.

func (*Set) Separate

func (s *Set) Separate(t *Set)

Separate removes the Set items containing in t from Set s. Please aware that it's not the opposite of Merge.

func (*Set) Size

func (s *Set) Size() int

Size returns the number of items in a Set.

func (*Set) String

func (s *Set) String() string

String returns a string representation of s

Performance

The improvement in performance by using concrete types over interface{} is notable. Below you will find benchmark results comparing type-safe sets to fatih/set counterparts for string, int64, int32, float64 and float32 on a local machine, Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz.

pkg: github.com/scylladb/go-set/strset
BenchmarkTypeSafeSetHasNonExisting-4            200000000                7.02 ns/op            0 B/op          0 allocs/op
BenchmarkInterfaceSetHasNonExisting-4           20000000                60.0 ns/op            32 B/op          2 allocs/op
BenchmarkTypeSafeSetHasExisting-4               200000000                9.02 ns/op            0 B/op          0 allocs/op
BenchmarkInterfaceSetHasExisting-4              20000000                97.0 ns/op            32 B/op          2 allocs/op
BenchmarkTypeSafeSetHasExistingMany-4           100000000               16.8 ns/op             0 B/op          0 allocs/op
BenchmarkInterfaceSetHasExistingMany-4          30000000               106 ns/op              32 B/op          2 allocs/op
BenchmarkTypeSafeSetAdd-4                        3000000               469 ns/op              58 B/op          0 allocs/op
BenchmarkInterfaceSetAdd-4                       2000000               909 ns/op             117 B/op          2 allocs/op

pkg: github.com/scylladb/go-set/i64set
BenchmarkTypeSafeSetHasNonExisting-4            300000000                5.51 ns/op            0 B/op          0 allocs/op
BenchmarkInterfaceSetHasNonExisting-4           30000000                49.4 ns/op            24 B/op          2 allocs/op
BenchmarkTypeSafeSetHasExisting-4               200000000                7.53 ns/op            0 B/op          0 allocs/op
BenchmarkInterfaceSetHasExisting-4              20000000                68.5 ns/op            24 B/op          2 allocs/op
BenchmarkTypeSafeSetHasExistingMany-4           100000000               11.0 ns/op             0 B/op          0 allocs/op
BenchmarkInterfaceSetHasExistingMany-4          20000000                74.5 ns/op            24 B/op          2 allocs/op
BenchmarkTypeSafeSetAdd-4                       10000000               225 ns/op              40 B/op          0 allocs/op
BenchmarkInterfaceSetAdd-4                       3000000               403 ns/op              82 B/op          2 allocs/op

pkg: github.com/scylladb/go-set/i32set
BenchmarkTypeSafeSetHasNonExisting-4            300000000                5.61 ns/op            0 B/op          0 allocs/op
BenchmarkInterfaceSetHasNonExisting-4           30000000                48.8 ns/op            20 B/op          2 allocs/op
BenchmarkTypeSafeSetHasExisting-4               200000000                7.07 ns/op            0 B/op          0 allocs/op
BenchmarkInterfaceSetHasExisting-4              20000000                69.3 ns/op            20 B/op          2 allocs/op
BenchmarkTypeSafeSetHasExistingMany-4           100000000               11.7 ns/op             0 B/op          0 allocs/op
BenchmarkInterfaceSetHasExistingMany-4          20000000                71.1 ns/op            20 B/op          2 allocs/op
BenchmarkTypeSafeSetAdd-4                       10000000               206 ns/op              25 B/op          0 allocs/op
BenchmarkInterfaceSetAdd-4                       3000000               394 ns/op              78 B/op          2 allocs/op

pkg: github.com/scylladb/go-set/f64set
BenchmarkTypeSafeSetHasNonExisting-4            300000000                5.82 ns/op            0 B/op          0 allocs/op
BenchmarkInterfaceSetHasNonExisting-4           30000000                49.8 ns/op            24 B/op          2 allocs/op
BenchmarkTypeSafeSetHasExisting-4               50000000                26.8 ns/op             0 B/op          0 allocs/op
BenchmarkInterfaceSetHasExisting-4              20000000                77.6 ns/op            24 B/op          2 allocs/op
BenchmarkTypeSafeSetHasExistingMany-4           50000000                27.6 ns/op             0 B/op          0 allocs/op
BenchmarkInterfaceSetHasExistingMany-4          20000000                82.3 ns/op            24 B/op          2 allocs/op
BenchmarkTypeSafeSetAdd-4                       10000000               270 ns/op              40 B/op          0 allocs/op
BenchmarkInterfaceSetAdd-4                       3000000               428 ns/op              82 B/op          2 allocs/op

pkg: github.com/scylladb/go-set/f32set
BenchmarkTypeSafeSetHasNonExisting-4            300000000                5.78 ns/op            0 B/op          0 allocs/op
BenchmarkInterfaceSetHasNonExisting-4           30000000                49.3 ns/op            20 B/op          2 allocs/op
BenchmarkTypeSafeSetHasExisting-4               50000000                24.9 ns/op             0 B/op          0 allocs/op
BenchmarkInterfaceSetHasExisting-4              20000000                78.6 ns/op            20 B/op          2 allocs/op
BenchmarkTypeSafeSetHasExistingMany-4           50000000                25.1 ns/op             0 B/op          0 allocs/op
BenchmarkInterfaceSetHasExistingMany-4          20000000                81.1 ns/op            20 B/op          2 allocs/op
BenchmarkTypeSafeSetAdd-4                       10000000               246 ns/op              24 B/op          0 allocs/op
BenchmarkInterfaceSetAdd-4                       3000000               408 ns/op              78 B/op          2 allocs/op

Code generation

For code generation we use Google go_generics tool that we forked to provide bazel-free installation, to install run:

go get -u github.com/mmatczuk/go_generics/cmd/go_generics

Once you have go_generics installed properly you can regenerate the code using go generate in the top level directory.

Your custom types

If you have types that you would like to use but the are not amenable for inclusion in this library you can simply generate code on your own and put it in your package.

For example, to generate a set for SomeType in package sometypeset call:

./gen_set.sh SomeType sometypeset

this would generate a new directory sometypeset in current working directory.

If you think your addition belongs here we are open to accept pull requests.

License

Copyright (C) 2018 ScyllaDB

This project is distributed under the Apache 2.0 license. See the LICENSE file for details. It contains software from:

GitHub star is always appreciated!

Owner
ScyllaDB
ScyllaDB, the The Real-Time Big Data Database
ScyllaDB
Comments
  • Add Pop2 function

    Add Pop2 function

    As discussed in issue #11, there is no way to find out if a zero value was returned because it was part of the Set or because the Set was empty. This adds the Pop2 function with an extra boolean return value to make the distinction.

    Let me know if you want changes in the signature or the documentation. I'm not that happy with the docs for Pop2 to be honest.

    Closes #11.

  • Zero value in pop

    Zero value in pop

    Documentation of the Pop function in strset says:

    If Set is empty, nil is returned.

    The signature of the function is func (s *Set) Pop() string so a nil return is not possible. It instead returns the zero value, in this case an empty string. An empty string, however, could be a valid member of a string set.

    What do you think of changing the signature of Pop to add a booleanv value indicating if there was an element or not, such as https://godoc.org/github.com/standupdev/strset#Set.Pop. It would be similar to the way maps are natively implemented.

  • strset build fail on 386 arch

    strset build fail on 386 arch

    https://github.com/scylladb/go-set/blob/e560bb8f49bb7f34d4f59b7e771f6e1307c329da/strset/strset.go#L255

    The usage of untyped numeric constant math.MaxInt64 along the repo makes it hard to cross-compile for GOARCH=386.

    Build error message of mine below:

    release failed after 1.31s error=failed to build for linux_386: # github.com/scylladb/go-set/strset
    ../../go/pkg/mod/github.com/scylladb/[email protected]/strset/strset.go:248:10: constant 9223372036854775807 overflows int
    ../../go/pkg/mod/github.com/scylladb/[email protected]/strset/strset.go:255:13: constant 9223372036854775807 overflows int
    
  • add optimizations in original set impl

    add optimizations in original set impl

    • Add() + Remove() methods: not sure you gain something with testing len() here, but you surely loose as most of the time your users will pass at least one item;
    • Has(): almost the same here. You can avoid testing len() but initializing "has" to false. In your case, initializing it to true is useless, as its value is immediately overwritten in the loop;
    • IsEqual(): testing for same size is a bit complicated, why not if len(s.m) != t.Size() {?
    • IsSubSet() & IsSuperSet() seem bad named. When I read a.IsSubSet(b) I understand I will test whether a is subset of b but in fact it is the opposite;
    • Copy() could be far more efficient as you already know the size of the new set;
    • Separate() could avoid creating a list of items to remove, by removing them directly without any allocation;
    • Intersection(): result := all.Copy() is more efficient than recomputing the same union;
    • Difference(): you say "separate is thread safe" in a comment, but it is not true. See https://golang.org/doc/faq#atomic_maps

    source: https://www.reddit.com/r/golang/comments/953dnz/ann_goset_typesafe_sets_for_go/

  • Possible to do zero allocation for this too? https://github.com/VictoriaMetrics/fastcache

    Possible to do zero allocation for this too? https://github.com/VictoriaMetrics/fastcache

    do you have an alternative to kv lru cache that has zero allocation? fastcache has 2b allocation per operation https://github.com/VictoriaMetrics/fastcache

Go concurrent-safe, goroutine-safe, thread-safe queue
Go concurrent-safe, goroutine-safe, thread-safe queue

goconcurrentqueue - Concurrent safe queues The package goconcurrentqueue offers a public interface Queue with methods for a queue. It comes with multi

Dec 31, 2022
Zero allocation Nullable structures in one library with handy conversion functions, marshallers and unmarshallers

nan - No Allocations Nevermore Package nan - Zero allocation Nullable structures in one library with handy conversion functions, marshallers and unmar

Dec 20, 2022
dagger is a fast, concurrency safe, mutable, in-memory directed graph library with zero dependencies
dagger is a fast, concurrency safe, mutable, in-memory directed graph library with zero dependencies

dagger is a blazing fast, concurrency safe, mutable, in-memory directed graph implementation with zero dependencies

Dec 19, 2022
Null Types, Safe primitive type conversion and fetching value from complex structures.

Typ Typ is a library providing a powerful interface to impressive user experience with conversion and fetching data from built-in types in Golang Feat

Sep 26, 2022
Sync distributed sets using bloom filters

goSetReconciliation An implementation to sync distributed sets using bloom filters. Based on the paper "Low complexity set reconciliation using Bloom

Jan 4, 2022
Distributed merge sort for large sets across nodes in a network

distMergeSort An implementation of mergesort distributed across nodes used to sort large sets. Introduction Merge sort partitions sets so that they ca

Jul 8, 2022
Generic types that are missing from Go, including sets, trees, sorted lists, etc.

go-typ Generic types that are missing from Go, including sets, trees, sorted lists, etc. All code is implemented with 0 dependencies and in pure Go co

Dec 4, 2022
Package ring provides a high performance and thread safe Go implementation of a bloom filter.

ring - high performance bloom filter Package ring provides a high performance and thread safe Go implementation of a bloom filter. Usage Please see th

Nov 20, 2022
A thread safe map which has expiring key-value pairs

~ timedmap ~ A map which has expiring key-value pairs. go get github.com/zekroTJA/timedmap Intro This package allows to set values to a map which will

Dec 29, 2022
go.fifo provides a simple fifo thread-safe queue for the Go programming language

go.fifo Description go.fifo provides a simple FIFO thread-safe queue. *fifo.Queue supports pushing an item at the end with Add(), and popping an item

Aug 29, 2022
A Golang lock-free thread-safe HashMap optimized for fastest read access.

hashmap Overview A Golang lock-free thread-safe HashMap optimized for fastest read access. Usage Set a value for a key in the map: m := &HashMap{} m.S

Dec 30, 2022
A simple and efficient thread-safe sharded hashmap for Go

shardmap A simple and efficient thread-safe sharded hashmap for Go. This is an alternative to the standard Go map and sync.Map, and is optimized for w

Dec 17, 2022
Smartsort - A smart sorting algorithm for Go to sort filename containing digits that is not zero padded

smartsort A smart sorting algorithm for Go to sort filename containing digits th

Mar 26, 2022
A simple set type for the Go language. Trusted by Docker, 1Password, Ethereum and Hashicorp.

golang-set The missing set collection for the Go language. Until Go has sets built-in...use this. Coming from Python one of the things I miss is the s

Jan 8, 2023
ID type with marshalling to/from hash to prevent sending IDs to clients.
ID type with marshalling to/from hash to prevent sending IDs to clients.

Hide IDs Hide is a simple package to provide an ID type that is marshalled to/from a hash string. This prevents sending technical IDs to clients and c

Dec 10, 2022
Type-agnostic partitioning for Go's indexable collections and strings.

Go Type-Agnostic Collection Partitioning Type-agnostic partitioning for anything that can be indexed in Go - slices, arrays,strings. Inspired by Guava

Aug 11, 2021
flexible data type for Go
flexible data type for Go

Generic flexible data type for Go support: Go 1.12+ Install standard go get: go get -u github.com/usk81/generic/v2 Usage encode/decode: package main

Dec 31, 2022
Package set is a small wrapper around the official reflect package that facilitates loose type conversion and assignment into native Go types.

Package set is a small wrapper around the official reflect package that facilitates loose type conversion and assignment into native Go types. Read th

Dec 27, 2022
A faster method to get elements from an interface (Struct or Slice type) for Go.

A faster method to get elements from an interface (Struct or Slice type) for Go.

May 13, 2022