Shuffle bits in a byte slice, written in Go

Shuffle 'em

go-shufflem is a library for bit shuffling, written in Go. The use cases are listed further below in this file.

Installation

go-shufflem uses Go modules. Install it with:

go get github.com/vaibhav-kaushal/go-shufflem

If you can't use Go modules, you can copy the shuffler.go file into your codebase instead. The project has no dependencies outside Go's standard library.

Testing

Checking the output

The main.go file can be used to check the output. It contains a sample implementation of the library. You can run

go run main.go shuffler.go

from the project root to see it for yourself.

Unit testing

To run the unit tests, run:

go test

Performance testing

The code is not super efficient, and performance will vary with the size of the input and the number of shuffles you ask the program to perform. You should therefore test performance for your own use case. A sample benchmark is already present. To run it, run:

go test -bench=. -count=5 -run=^#

Note: The -run=^# part tells go test not to run the unit tests, because no test name matches that pattern. Omitting it is also fine; the unit tests will then run as well.

Usecases

There are multiple places where this library can be useful.

1. Public IDs of objects in a web application

This library is inspired by a possible use of ULIDs. ULIDs let you have decentralized primary keys while guaranteeing uniqueness (how to guarantee uniqueness with ULIDs is described a little further below). At the same time, they keep indexes compact and lookups fast.

However, ULIDs start with a time component, so they might not be the best choice where primary keys are not safe to reveal (such as public URLs), because they automatically indicate when an object was created.

There are two approaches to solve that issue:

  1. Create some kind of public_id for each object whose ID can be exposed publicly, and then map the public_id to the real ID of the object in your application code.
  2. Create a bit-shuffle mapping for each type of object. Shuffle the bits before showing IDs in public URLs and, in your application code, shuffle them back before searching for them in the database.

It is the second approach where this library can come in really handy. You can have one config for each type (e.g. User IDs, Post IDs etc.). Before using IDs in public (such as in URLs), shuffle them. Similarly, when your application code receives them from a user (such as in an API call), shuffle them back before using them.
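The round trip described above can be sketched roughly as follows. This is a standalone illustration, not go-shufflem's API: the names pair, swapBits, toPublic, toInternal and userIDPairs are made up for this example, and bit positions are assumed to count from 0 starting at the leftmost bit of the first byte.

```go
package main

import "fmt"

// pair names two bit positions to exchange. Positions count from 0,
// starting at the leftmost bit of the first byte (an assumed convention).
type pair struct{ a, b int }

// swapBits exchanges bits i and j of data in place.
func swapBits(data []byte, i, j int) {
	mi := byte(0x80) >> uint(i%8)
	mj := byte(0x80) >> uint(j%8)
	if (data[i/8]&mi != 0) != (data[j/8]&mj != 0) {
		// The bits differ, so swapping them means flipping both.
		data[i/8] ^= mi
		data[j/8] ^= mj
	}
}

// toPublic applies the pairs in order to a copy of id.
func toPublic(id []byte, pairs []pair) []byte {
	out := append([]byte(nil), id...)
	for _, p := range pairs {
		swapBits(out, p.a, p.b)
	}
	return out
}

// toInternal undoes toPublic by applying the same pairs in reverse order,
// which also works when two pairs share a bit position.
func toInternal(id []byte, pairs []pair) []byte {
	out := append([]byte(nil), id...)
	for k := len(pairs) - 1; k >= 0; k-- {
		swapBits(out, pairs[k].a, pairs[k].b)
	}
	return out
}

// userIDPairs is a made-up shuffle map for one object type.
var userIDPairs = []pair{{0, 127}, {3, 64}, {64, 90}, {17, 45}}

func main() {
	// A made-up 16-byte internal ID.
	internal := []byte{0x01, 0x86, 0x3f, 0x9a, 0x4c, 0x2b, 0x7d, 0x10,
		0xe5, 0x33, 0xa8, 0x5c, 0x0f, 0x91, 0x6e, 0xd2}
	public := toPublic(internal, userIDPairs)
	fmt.Printf("internal: %x\n", internal)
	fmt.Printf("public:   %x\n", public)
	fmt.Printf("restored: %x\n", toInternal(public, userIDPairs))
}
```

Keeping one such map per object type means the same internal bytes would produce different public values for, say, users and posts.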

The advantage of this approach is that you never have to save a public_id. That:

  1. Saves disk space.
  2. Avoids database lookups on public_ids.
  3. Simplifies queries, especially around joins.
  4. Makes life hard for anyone outside your team (e.g. someone who has a DB dump) who wants to relate an object's public ID to a record: they will never find the public ID in your database and will have a really hard time figuring out how the IDs in public URLs map to what's in the database.

If you are using a (micro)service based architecture, you can have the shuffler in its own service and conceal the shuffle maps of each type in that service.

Implementing uniqueness with ULID

You can place a region- or DC-based component in the ULID's randomness bits and fill the rest using an algorithm that only increments the remaining bits within the same millisecond. Keeping the randomness bits unique within a millisecond has already been implemented well in Go (oklog's ULID library). You can use the same library to define your own entropy source for the behavior described above.
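A minimal sketch of that layout, using only the standard library rather than oklog's library: the 16 bytes follow the ULID shape (a 48-bit millisecond timestamp followed by 80 bits that would normally be random), with the first 16 of those bits holding a region number and the remaining 64 holding a per-millisecond counter. The idGenerator type, the 16/64 split and the region value are assumptions made for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sync"
	"time"
)

// idGenerator builds 16-byte IDs shaped like a ULID. It is a hypothetical
// helper for this sketch, not part of any library.
type idGenerator struct {
	mu      sync.Mutex
	region  uint16 // hypothetical region or data-center number
	lastMs  uint64
	counter uint64
}

// next returns an ID for the given millisecond timestamp.
func (g *idGenerator) next(ms uint64) [16]byte {
	g.mu.Lock()
	defer g.mu.Unlock()
	if ms > g.lastMs {
		g.lastMs, g.counter = ms, 0
	} else {
		// Same millisecond (or a clock that stepped back): keep the last
		// timestamp and bump the counter. Counter overflow is ignored here.
		g.counter++
	}
	var id [16]byte
	t := g.lastMs
	// 48-bit big-endian timestamp.
	id[0], id[1], id[2] = byte(t>>40), byte(t>>32), byte(t>>24)
	id[3], id[4], id[5] = byte(t>>16), byte(t>>8), byte(t)
	// Region component, then the counter, in place of random bits.
	binary.BigEndian.PutUint16(id[6:8], g.region)
	binary.BigEndian.PutUint64(id[8:16], g.counter)
	return id
}

func main() {
	g := &idGenerator{region: 7}
	now := uint64(time.Now().UnixMilli())
	for i := 0; i < 3; i++ {
		fmt.Printf("%x\n", g.next(now))
	}
}
```

Under these assumptions, IDs stay unique as long as each region number is used by only one generator at a time.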

2. Custom symmetric-encryption-like behavior

Most symmetric encryption algorithms are based on three core parts:

  1. Blocks of data: Each encryption mechanism ingests data in blocks of predefined size.
  2. Encryption algorithm: The core algorithm which changes the input blocks to the encrypted (output) blocks.
  3. An encryption key: A set of bytes which changes the way the algorithm encrypts the blocks. Changing the key while keeping the algorithm and the input the same will result in a different output in almost all encryption algorithms.

You can use this library to encrypt your data (in a way), with the shuffle map serving as the encryption key. As with real encryption algorithms, when using bit-shuffling for encryption, guessing the key or the original data becomes more and more difficult as you increase the BitCount and the number of shuffle pairs in Config!

The advantage here is that you can vary the input block size (increase the BitCount; ensure it is a multiple of 8 though) and the strength of encryption (number of entries in ShuffleMap) according to your choice!

For example, if the input (in hexadecimal notation):

676f73687566666c656d6c696240766169626861766b61757368616c2e636f6d

gets changed to

b6f6c62e3686166bae86d66e0d164696866e021a9636b6a6746666ae68cef6e6

as the output of a shuffle, can you really tell what the shuffle map was like? How long would it take for you to make the correct guess and how many such samples are you going to need for the correct guess?
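To make the block idea concrete, here is a rough standalone sketch that splits data into blocks, applies the same swap pairs inside every block, and reverses the process. The field names echo BitCount and ShuffleMap from the text, but blockConfig, process, swapBits, the pair values and the left-to-right bit numbering are assumptions for this example, not the library's actual types or behavior.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

// blockConfig is a hypothetical stand-in for a shuffle configuration.
type blockConfig struct {
	bitCount   int      // block size in bits; must be a multiple of 8
	shuffleMap [][2]int // bit positions to swap inside each block
}

// swapBits exchanges bits i and j of data, counting from the leftmost bit.
func swapBits(data []byte, i, j int) {
	mi := byte(0x80) >> uint(i%8)
	mj := byte(0x80) >> uint(j%8)
	if (data[i/8]&mi != 0) != (data[j/8]&mj != 0) {
		data[i/8] ^= mi
		data[j/8] ^= mj
	}
}

// process shuffles every block of data. With reverse set, the pairs are
// applied in the opposite order, which undoes a previous forward pass.
func process(data []byte, c blockConfig, reverse bool) ([]byte, error) {
	if c.bitCount <= 0 || c.bitCount%8 != 0 {
		return nil, errors.New("bitCount must be a positive multiple of 8")
	}
	blockLen := c.bitCount / 8
	if len(data)%blockLen != 0 {
		return nil, errors.New("data length must be a multiple of the block size")
	}
	for _, p := range c.shuffleMap {
		if p[0] < 0 || p[0] >= c.bitCount || p[1] < 0 || p[1] >= c.bitCount {
			return nil, errors.New("shuffle pair outside the block")
		}
	}
	out := append([]byte(nil), data...)
	n := len(c.shuffleMap)
	for off := 0; off < len(out); off += blockLen {
		block := out[off : off+blockLen]
		for k := 0; k < n; k++ {
			p := c.shuffleMap[k]
			if reverse {
				p = c.shuffleMap[n-1-k]
			}
			swapBits(block, p[0], p[1])
		}
	}
	return out, nil
}

func main() {
	// The sample input from the text, processed in 64-bit blocks.
	in, err := hex.DecodeString("676f73687566666c656d6c696240766169626861766b61757368616c2e636f6d")
	if err != nil {
		panic(err)
	}
	cfg := blockConfig{bitCount: 64, shuffleMap: [][2]int{{0, 63}, {5, 40}, {40, 12}, {22, 23}}}
	enc, err := process(in, cfg, false)
	if err != nil {
		panic(err)
	}
	dec, err := process(enc, cfg, true)
	if err != nil {
		panic(err)
	}
	fmt.Println("shuffled:", hex.EncodeToString(enc))
	fmt.Println("restored:", hex.EncodeToString(dec))
}
```

The made-up map here will not reproduce the output shown above; it only illustrates that whoever holds the map can reverse the shuffle.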

Todo

  1. Add the cyclic shuffling capability.
  2. Add an example in this (README) file.

What is bit swapping

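Bit swapping means exchanging the values of two bits. Assume your input bits are 11001101 10101101 (the space after 8 bits is for readability only) and you want to swap the bits at indexes 1 and 11 and the bits at indexes 7 and 8, where indexes start at 0 and count from the leftmost bit. The result will be 10001101 10111101: bit 1 becomes 0 and bit 11 becomes 1, while swapping bits 7 and 8 changes nothing because both are 1.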
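The same swap can be worked through in a few lines of Go. This is an illustrative sketch, not the library's code; getBit, setBit and swapBits are names invented here, and bits are assumed to be indexed from 0 starting at the leftmost bit of the first byte.

```go
package main

import "fmt"

// getBit returns bit i of data, counting from the leftmost bit of data[0].
func getBit(data []byte, i int) byte {
	return (data[i/8] >> (7 - uint(i%8))) & 1
}

// setBit sets bit i of data to v (0 or 1).
func setBit(data []byte, i int, v byte) {
	mask := byte(1) << (7 - uint(i%8))
	if v == 1 {
		data[i/8] |= mask
	} else {
		data[i/8] &^= mask
	}
}

// swapBits exchanges the values of bits i and j in place.
func swapBits(data []byte, i, j int) {
	bi, bj := getBit(data, i), getBit(data, j)
	setBit(data, i, bj)
	setBit(data, j, bi)
}

func main() {
	data := []byte{0b11001101, 0b10101101}
	swapBits(data, 1, 11)
	swapBits(data, 7, 8)
	fmt.Printf("%08b %08b\n", data[0], data[1]) // prints 10001101 10111101
}
```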

What is meant by cyclic shuffling?

When you interchange the positions of a set of bits so that they are not simply swapped in pairs but rotated amongst themselves, it is called cyclic shuffling.

Assuming the same input again (11001101 10101101), if you want to set the value of bit 1 to the value at bit 6, the value of bit 6 to the value at bit 8, and the value of bit 8 to the original value at bit 1, then your result would be 10001111 10101101. For this input the result could also be produced by simple bit swaps, but across various inputs the results will vary in ways that make the shuffle map much more difficult to predict.
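Since cyclic shuffling is still on the todo list, the following is only a sketch of how such a rotation might look, not an existing feature of the library. rotateBits and its helpers are hypothetical, and bits are again assumed to be indexed from 0 starting at the leftmost bit.

```go
package main

import "fmt"

// getBit returns bit i of data, counting from the leftmost bit of data[0].
func getBit(data []byte, i int) byte {
	return (data[i/8] >> (7 - uint(i%8))) & 1
}

// setBit sets bit i of data to v (0 or 1).
func setBit(data []byte, i int, v byte) {
	mask := byte(1) << (7 - uint(i%8))
	if v == 1 {
		data[i/8] |= mask
	} else {
		data[i/8] &^= mask
	}
}

// rotateBits gives each position in cycle the value of the next position;
// the last position receives the original value of the first.
func rotateBits(data []byte, cycle []int) {
	if len(cycle) < 2 {
		return
	}
	first := getBit(data, cycle[0])
	for k := 0; k < len(cycle)-1; k++ {
		setBit(data, cycle[k], getBit(data, cycle[k+1]))
	}
	setBit(data, cycle[len(cycle)-1], first)
}

func main() {
	data := []byte{0b11001101, 0b10101101}
	rotateBits(data, []int{1, 6, 8})
	fmt.Printf("%08b %08b\n", data[0], data[1]) // prints 10001111 10101101

	// Rotating the reversed cycle restores the input.
	rotateBits(data, []int{8, 6, 1})
	fmt.Printf("%08b %08b\n", data[0], data[1]) // prints 11001101 10101101
}
```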
