A fast, threadsafe skip list in Go

fast-skiplist

Purpose

As the basic building block of an in-memory data structure store, I needed an implementation of skip lists in Go. It needed to be easy to use and thread-safe while maintaining the properties of a classic skip list.

There are several skip list implementations in Go. However, they all are implemented in slightly different ways with sparse optimizations and occasional shortcomings. Please see the skiplist-survey repo for a comparison of Go skip list implementations (including benchmarks).

The purpose of this repo is to offer a new, fast implementation with an easy-to-use interface that will suit general data storage purposes.

Operation Time Complexity
Insertion O(log N)
Removal O(log N)
Check if contains O(log N)
Enumerate in order O(N)

Quickstart

To start using the library right away, just do:

go get github.com/sean-public/fast-skiplist

There are no external dependencies, so you can start using it right away:

import github.com/sean-public/fast-skiplist

list := skiplist.New()
list.Set(123, "This string data is stored at key 123!")
fmt.Println(list.Get(123).value)
fmt.Println(list.Length)	// prints 1
list.Remove(123)
fmt.Println(list.Length)	// prints 0

Of course there are tests, including benchmarks and race condition detection with concurrency:

$ go test -cover
PASS
coverage: 100.0% of statements
ok      github.com/sean-public/fast-skiplist    0.006s

$ go test -race
Structure sizes: SkipList is 136, Element is 48 bytes
PASS
ok  	github.com/sean-public/fast-skiplist	41.530s

$ go test -bench=.
Structure sizes: SkipList is 136, Element is 48 bytes
goos: darwin
goarch: amd64
pkg: github.com/sean-public/fast-skiplist
BenchmarkIncSet-8   5000000    370 ns/op    13484040.32 MB/s    62 B/op    3 allocs/op
BenchmarkIncGet-8   10000000   205 ns/op    48592107.58 MB/s    0 B/op     0 allocs/op
BenchmarkDecSet-8   10000000   281 ns/op    35547886.82 MB/s    62 B/op    3 allocs/op
BenchmarkDecGet-8   10000000   212 ns/op    47124462.78 MB/s    0 B/op     0 allocs/op
PASS
ok  	github.com/sean-public/fast-skiplist	21.709s

About fast-skiplist

"Perfection is achieved not when there is nothing more to add, but rather when there is nothing more to take away" — Antoine de Saint-Exupery

If fast-skiplist is faster than other packages with the same features, it's because it does less wherever possible. It locks less, it blocks less, and it traverses less data. Even with these tricks up its sleeve, it has fewer lines of code than most implementations.

Calculating the Height of New Nodes

When inserting, it calculates "height" directly instead of consecutive "coin tosses" to add levels. Additionally, it uses a local PRNG source that isn't blocked globally for improved concurrent insert performance.

The probability of adding new nodes to each level of the structure (it's height) is determined by successively "rolling the dice" at each level until it doesn't meet a fixed value P. The default P values for skip lists in the wild range from 0.25 to 0.5. In this implementation, the default is 1/e, which is optimal for a general-purpose skip list. To find the derivation of this number, see Analysis of an optimized search algorithm for skip lists Kirschenhofer et al (1995).

Almost all other implementations are using common functions in math/rand, which will block because querying the PRNG to determine the height of new nodes waits then acquires a lock via the system-wide random number generator. We get around this by assigning a new rand source to each skip list instantiated, so each skip list can only ever block itself. This significantly speeds up insert times when you are managing multiple lists with high concurrency.

Additionally, this implementation always requests just one number from the PRNG. A pre-computed probability table is used to look up what the height of the new node will be. This is faster and offers a fixed calculation time compared to successive "dice rolls" for each level. The table is computed for each level L using the default P value of 1/e: math.Pow(1.0/math.E, L-1). It is consulted during inserts by querying for a random number in range [0.0,1.0) and finding the highest level in the table where the random number is less than or equal to the computed number.

For example, let's say math.Float64() returned r=0.029 and the table was pre-computed to contain (with a maximum height of 6):

height probability
1 1.000000000
2 0.367879441
3 0.135335283
4 0.049787068
5 0.018315639
6 0.006737947

So the height for the new node would be 5 because p5 > r ≥ p6, or 0.018315639 > 0.029 ≥ 0.006737947.

I believe this fast new node height calculation to be novel and faster than any others with user-defined P values. Ticki, for example, proposes an O(1) calculation but it is fixed to P=0.5 and I haven't encountered any other optimizations of this calculation. In local benchmarks, this optimization saves 10-25ns per insert.

Better Cooperative Multitasking

Why not a lock-free implementation? The overhead created is more than the time spent in contention of a locking version under normal loads. Most research on lock-free structures assume manual alloc/free as well and have separate compaction processes running that are unnecessary in Go (particularly with improved GC as of 1.8). The same is true for the newest variation, the rotating skip list, which claims to be the fastest to date for C/C++ and Java because the compared implementations have maintenance threads with increased overhead for memory management.

Caching and Search Fingers

As Pugh described in A Skip List Cookbook, search "fingers" can be retained after each lookup operation. When starting the next operation, the finger will point to where the last one occurred and afford the opportunity to pick up the search there instead of starting at the head of the list. This offers O(log m) search times, where m is the number of elements between the last lookup and the current one (m is always less than n).

This implementation of a search finger does not suffer the usual problem of "climbing" up in levels when resuming search because it stores pointers to previous nodes for each level independently.

Benchmarks

Speed is a feature! Below is a set of results demonstrating the flat performance (time per operation) as the list grows to millions of elements. Please see the skiplist-survey repo for complete benchmark results from this and other Go skip list implementations.

benchmark results chart

Todo

  • Build more complex test cases (specifically to prove correctness during high concurrency).
  • Benchmark memory usage.
  • Add "span" to each element to store the distance to the next node on every level. This gives each node a calculable index (ZRANK and associated commands in Redis).
Comments
  • i hava a  test to show. skiplist is not fast!

    i hava a test to show. skiplist is not fast!

    package main

    import ( "bytes" "encoding/gob" "io/ioutil" "log" "os" "testing"

    pqueuekey "github.com/474420502/focus/priority_queuekey"
    skiplist "github.com/sean-public/fast-skiplist"
    
    "github.com/474420502/focus/compare"
    "github.com/Pallinder/go-randomdata"
    

    )

    // TestCreateData Before Benchmark, Execute the func for test func TestCreateData(t *testing.T) {

    f, err := os.OpenFile("l.log", os.O_CREATE|os.O_TRUNC|os.O_WRONLY, 0666)
    if err != nil {
    	log.Println(err)
    }
    
    var l []int
    
    m := make(map[int]int)
    for i := 0; len(l) < CompartorSize; i++ {
    	v := randomdata.Number(0, NumberMax)
    	if _, ok := m[v]; !ok {
    		m[v] = v
    		l = append(l, v)
    	}
    }
    
    var result bytes.Buffer
    encoder := gob.NewEncoder(&result)
    encoder.Encode(l)
    lbytes := result.Bytes()
    f.Write(lbytes)
    

    }

    // BenchmarkSkiplist skiplist Benchmark func BenchmarkSkiplist(b *testing.B) {

    l := loadTestData()
    execCount := 5
    b.N = len(l) * execCount
    
    var temp []float64
    for _, v := range l {
    	temp = append(temp, float64(v))
    }
    
    b.ResetTimer()
    b.StartTimer()
    
    for i := 0; i < execCount; i++ {
    	pq := skiplist.New()
    	for _, v := range temp {
    		pq.Set(v, v)
    	}
    }
    

    }

    // BenchmarkPriorityQueue like skiplist tree func BenchmarkPriorityQueue(b *testing.B) {

    l := loadTestData()
    execCount := 5
    b.N = len(l) * execCount
    
    b.ResetTimer()
    b.StartTimer()
    
    for i := 0; i < execCount; i++ {
    	pq := pqueuekey.New(compare.Int)
    	for _, v := range l {
    		pq.Push(v, v)
    	}
    }
    

    }

    const CompartorSize = 1000000 const NumberMax = 50000000

    func loadTestData() []int { data, err := ioutil.ReadFile("./l.log") if err != nil { log.Println(err) } var l []int decoder := gob.NewDecoder(bytes.NewReader(data)) decoder.Decode(&l) return l }

  • Not threadsafe

    Not threadsafe

    There are a number of race conditions in this library, for example getPrevElementNodes is called without a lock, and it reads the value of list.maxLevel which could be concurrently set by list.SetMaxLevel (again without locking).

    It may be worth considering removing the lock entirely and simply not advertising the library as threadsafe - or at least having a separate version of it. I imagine that true threadsafety will come at a significant performance penalty.

  • quit loop earlier when get

    quit loop earlier when get

    quit loop earlier when get

    BenchmarkIncGet/get1-8         	  352358	      3883 ns/op	90745.84 MB/s	       0 B/op	       0 allocs/op
    BenchmarkIncGet/get2-8         	 2053378	      1722 ns/op	1192243.30 MB/s	       0 B/op	       0 allocs/op
    
  • Rare panic for high loads

    Rare panic for high loads

    Golang 1.12.9 linux/amd64 Single thread application Calling code is something like:

    list := skiplist.NewWithMaxLevel(4)
    ....
    // playaround with ~4k of elements
    ....
    element := list.Front()
    value := element.Value()
    

    Gets a panic:

    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x4e291b]
    goroutine 553 [running]:
    github.com/sean-public/fast-skiplist.(*Element).Value(...)
            /home/user/go/pkg/mod/github.com/sean-public/[email protected]/type.go:25
    ......
    
  • Question about locking claims in readme

    Question about locking claims in readme

  • SetMaxLevel can alter existing data?

    SetMaxLevel can alter existing data?

  • Unnecessary lock? and other changes in forks

    Unnecessary lock? and other changes in forks

    Hi, I see a couple possible enhancements in forks with un-PR'd commits:

    • (see "remove locking") https://github.com/sean-public/fast-skiplist/compare/master...pilosa:master
    • (Back()) https://github.com/sean-public/fast-skiplist/compare/master...JakeCooper:master
  • Allow access to the key and value in Element

    Allow access to the key and value in Element

    In order for this to be in any way useful, you need to be able to access the value of the element you retrieve. This change makes the value accessible by exposing Element.Value() and the same for Key() as well.

  • more readability

    more readability

    because node in prevs are smaller than key, so prevs.next must be nil or >= key. so, element.key <= key actually means element.key == key.

    Do I understand correctly?

  • Clarify maxLevel meaning

    Clarify maxLevel meaning

    I've realized that maxLevel meaning is missed in the package doc. So if a library user is not familiar with skiplist algorithm internals he can't set maxLevel properly.

  • skiplist created with NewWithMaxLevel(19+) panics on Set

    skiplist created with NewWithMaxLevel(19+) panics on Set

    panic: runtime error: index out of range
    
    goroutine 1 [running]:
    github.com/sean-public/fast-skiplist.(*SkipList).getPrevElementNodes(0xc42006e120, 0x0, 0xc42006e178, 0xc420067500, 0x7f793c60f000)
            /home/user/go/src/github.com/sean-public/fast-skiplist/skiplist.go:109 +0xd9
    github.com/sean-public/fast-skiplist.(*SkipList).Set(0xc42006e120, 0x0, 0x468de0, 0x4de9b0, 0x0)
            /home/user/go/src/github.com/sean-public/fast-skiplist/skiplist.go:28 +0x8a
    main.main()
            /home/user/test.go:5 +0x59
    exit status 2
    

    The code (test.go):

    package main
    import "github.com/sean-public/fast-skiplist"
    func main(){
      s := skiplist.NewWithMaxLevel(19)
      s.Set(0, struct{}{}) // line 5
    }
    

    DefaultMaxLevel (18) is used instead of MaxLevel in lines 160, 161, and 165:

    https://github.com/sean-public/fast-skiplist/blob/1de4bd306c3f816ec0c2702c89d3e2f279f67422/skiplist.go#L154-L167

A skip list implementation in Go

About This is a library implementing skip lists for the Go programming language (http://golang.org/). Skip lists are a data structure that can be used

Sep 21, 2022
A Go library for an efficient implementation of a skip list: https://godoc.org/github.com/MauriceGit/skiplist
A Go library for an efficient implementation of a skip list: https://godoc.org/github.com/MauriceGit/skiplist

Fast Skiplist Implementation This Go-library implements a very fast and efficient Skiplist that can be used as direct substitute for a balanced tree o

Dec 30, 2022
skipmap is a high-performance concurrent sorted map based on skip list. Up to 3x ~ 10x faster than sync.Map in the typical pattern.
skipmap is a high-performance concurrent sorted map based on skip list. Up to 3x ~ 10x faster than sync.Map in the typical pattern.

Introduction skipmap is a high-performance concurrent map based on skip list. In typical pattern(one million operations, 90%LOAD 9%STORE 1%DELETE), th

Jan 8, 2023
A skip list of arbitrary elements that can be filtered using roaring bitmaps stored in an LRU cache

Skipfilter This package provides a data structure that combines a skiplist with a roaring bitmap cache. This library was created to efficiently filter

Aug 4, 2022
A collection of useful, performant, and threadsafe Go datastructures.

go-datastructures Go-datastructures is a collection of useful, performant, and threadsafe Go datastructures. NOTE: only tested with Go 1.3+. Augmented

Dec 29, 2022
A threadsafe single-value cache for Go with a simple but flexible API

SVCache SVCache is a threadsafe, single-value cache with a simple but flexible API. When there is no fresh value in the cache, an attempt to retrieve

Jan 23, 2022
Most comprehensive list :clipboard: of tech interview questions :blue_book: of companies scraped from Geeksforgeeks, CareerCup and Glassdoor.
Most comprehensive list :clipboard: of tech interview questions :blue_book: of companies scraped from Geeksforgeeks, CareerCup and Glassdoor.

Companies* Companies E Expedia G Grab M MobiKwik N NEC Technologies P PayPal S Samsung Research Institute U Uber Y Yatra.com Z Zomato Announcements ??

Dec 29, 2022
The first generic linked list in go :dancer:

linkedlist.go The first generic linked list in go ?? gotip first of all you need to install the master version of golang. go install golang.org/dl/got

Dec 7, 2022
Advanced linked list package for go.

golib/list ライブラリ 可変長の連結リストを提供するライブラリーです。 状況によらず、メモリ開放処理を一貫性した書き方で実装できるので、メモリ解放をプログラマが管理しやすい作りになっています。 list.List 片方向連結リストを提供するモジュールです。 list.Nodeが使われていま

Jan 21, 2022
Fast ring-buffer deque (double-ended queue)

deque Fast ring-buffer deque (double-ended queue) implementation. For a pictorial description, see the Deque diagram Installation $ go get github.com/

Dec 26, 2022
Go native library for fast point tracking and K-Nearest queries

Geo Index Geo Index library Overview Splits the earth surface in a grid. At each cell we can store data, such as list of points, count of points, etc.

Dec 3, 2022
Fast in-memory key:value store/cache with TTL

MCache library go-mcache - this is a fast key:value storage. Its major advantage is that, being essentially a thread-safe . map[string]interface{} wit

Nov 11, 2022
Data structure and relevant algorithms for extremely fast prefix/fuzzy string searching.

Trie Data structure and relevant algorithms for extremely fast prefix/fuzzy string searching. Usage Create a Trie with: t := trie.New() Add Keys with:

Dec 27, 2022
A fast little LRU cache for Go

tinylru A fast little LRU cache. Getting Started Installing To start using tinylru, install Go and run go get: $ go get -u github.com/tidwall/tinylru

Dec 24, 2022
Fast Raft framework using the Redis protocol for Go
Fast Raft framework using the Redis protocol for Go

This project has been archived. Please check out Uhaha for a fitter, happier, more productive Raft framework. Finn is a fast and simple framework for

Oct 10, 2022
Go implementation of SipHash-2-4, a fast short-input PRF created by Jean-Philippe Aumasson and Daniel J. Bernstein.

SipHash (Go) Go implementation of SipHash-2-4, a fast short-input PRF created by Jean-Philippe Aumasson and Daniel J. Bernstein (http://131002.net/sip

Dec 25, 2022
Fast golang queue using ring-buffer

Queue A fast Golang queue using a ring-buffer, based on the version suggested by Dariusz Górecki. Using this instead of other, simpler, queue implemen

Jan 3, 2023
go/golang: fast bit set Bloom filter

package implements a fast bloom filter with real 'bitset' and JSONMarshal/JSONUnmarshal to store/reload the Bloom filter.

Nov 3, 2022
Implementation of Boyer-Moore fast string search algorithm in Go

boyermoore Implementation of Boyer-Moore fast string search algorithm in Go

Oct 7, 2022