Go package for sharding databases ( Supports every ORM or raw SQL )

Octillery GoDoc CircleCI codecov Go Report Card

Octillery is a Go package for sharding databases. It can use with every OR Mapping library ( xorm , gorp , gorm , dbr ...) implementing database/sql interface, or raw SQL.

Currently supports MySQL (for product) and SQLite3 (for testing) .

Motivation

We need database sharding library in Go. Of course, we know some libraries like ( https://github.com/evalphobia/wizard , https://github.com/go-pg/sharding ). But OR Mapping library they support are restricted and we want to write sharding configuration declaratively, also expect to pluggable for sharding algorithm or database adapter, and expect to configurable sharding key or whether use sequencer or not.

Features

  • Supports every OR Mapping library implementing database/sql interface ( xorm , gorp , gorm , dbr , ... )
  • Supports using database/sql ( raw SQL ) directly
  • Pluggable sharding algorithm ( preinstalled algorithms are modulo and hashmap )
  • Pluggable database adapter ( preinstalled adapters are mysql and sqlite3 )
  • Declarative describing for sharding configuration in YAML
  • Configurable sharding algorithm, database adapter, sharding key, whether use sequencer or not.
  • Supports capture read/write queries just before passing to database driver
  • Supports database migration by CLI ( powered by schemalex )
  • Supports import seeds from CSV

Install

Install as a CLI tool

go get go.knocknote.io/octillery/cmd/octillery

Install as a library

go get go.knocknote.io/octillery

How It Works

1. How database sharding works

We explain by using posts table.

posts table schema is

CREATE TABLE `posts` (
  `id` bigint unsigned NOT NULL AUTO_INCREMENT,
  `user_id` bigint unsigned NOT NULL,
  `created_at` datetime NOT NULL,
  `updated_at` datetime NOT NULL,
  PRIMARY KEY (`id`),
  UNIQUE KEY `uq_posts_01` (`user_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

And we want to shard this table to four databases for load distribution.

In this case, we can try to two approach according to requirements.

1. Using Sequencer

If you want id to be unique in all databases, you should use this approach.
Architecture of this approach would be the following.

architecture

Application create SQL ( like insert into posts (id, user_id, ...) values (null, 1, ...) ), in this point, id value is null because still not decide. In accordance with the above graph, insert this query to one of the databases.

  1. Application requests id value to sequencer
  2. Sequencer generates next unique id in all shards
  3. Sequencer returns id value to application ( ex. id = 1 )
  4. Replace id value from null to 1 (ex. insert into posts (id, user_id, ...) values (1, 1, ...) )
  5. Decide target based of id value by sharding algorithm ( default modulo ) and insert record to selected database.

By using sequencer approach, you can get unique id value in all databases. Therefore, if you insert multiple records, database records should looks like the following.

posts_shard_1 posts_shard_2 posts_shard_3 posts_shard_4
id user_id id user_id id user_id id user_id
1 1 2 2 3 3 4 4
5 5 6 6 7 7 8 8

2. Using Sharding Key ( without Sequencer )

If you don't care about uniqueness of id, you can use sharding key approach.
Architecture of this appraoch would be the following.

architecture2

  1. Decide target based of user_id value by sharding algorithm ( default modulo ) and insert record to selected database.

By using sharding key approach, same id value will appear in multiple databases. Therefore, if you insert multiple records, database record should looks like the following.

posts_shard_1 posts_shard_2 posts_shard_3 posts_shard_4
id user_id id user_id id user_id id user_id
1 1 1 2 1 3 1 4
2 5 2 6 2 7 2 8

2. Requirements of database sharding library

We explained how to sharding database at section 1. From this we define requirements of database sharding library.

  • Know about database sharding configuration
  • Capture query just before passing to database driver
  • Parse query and find sharding key
  • If use sequencer, requests id value to sequencer and replace value of id column by it
  • Select sharding target based of sharding key by sharding algorithm

3. How Octillery works

How To Capture Query

Octillery CLI tool supports transpose command. It replace import statement of database/sql to go.knocknote.io/octillery/database/sql.

go.knocknote.io/octillery/database/sql package has compatible interface of database/sql.

Therefore, OR Mapping library call Octillery's interface. and it can capture all queries.

How To Parse SQL

Octillery use github.com/blastrain/vitess-sqlparser as SQL parser. It implements powered by vitess and tidb .

How To Use New Database Adapter

Octillery supports mysql and sqlite3 adapter by default.
If you want to use new database adapter, need to the following two steps.

  1. Write DBAdapter interface. ( see https://godoc.org/go.knocknote.io/octillery/connection/adapter )
  2. Put new adapter file to go.knocknote.io/octillery/plugin directory

How To Use New Database Sharding Algorithm

Octillery supports modulo and hashmap algorithm by default.
If you want to use new algorithm, need to the following two steps.

  1. Write ShardingAlgorithm interface. ( see https://godoc.org/go.knocknote.io/octillery/algorithm )
  2. Put new algorithm file to go.knocknote.io/octillery/algorithm directory

Usage

1. Install CLI tool

$ go get go.knocknote.io/octillery/cmd/octillery

2. Install library

$ go get go.knocknote.io/octillery

3. Replace already imported database/sql statement

$ octillery transpose

--dry-run option confirms without overwriting

4. Install database adapter

$ octillery install --mysql

5. Describe database cofiguration in YAML

databases.yml

default: &default
  adapter: mysql
  encoding: utf8mb4
  username: root
  master:
    - localhost:3306

tables:
  posts:
    shard: true
    shard_key: user_id
    shards:
      - post_shard_1:
          <<: *default
          database: posts_shard_1
      - post_shard_2:
          <<: *default
          database: posts_shard_2

6. Migrate database

$ octillery migrate --config databases.yml /path/to/schema

--dry-run option confirms migration plan

7. Load configuration file

package main

import (
	"go.knocknote.io/octillery"
	"go.knocknote.io/octillery/database/sql"
)

func main() {
	if err := octillery.LoadConfig("databases.yml"); err != nil {
		panic(err)
	}
	db, _ := sql.Open("mysql", "")
	db.QueryRow("...")
}

Document

See GoDoc

Development

Install dependencies

$ make deps

If update dependencies, the following

  1. Modify glide.yaml
  2. Run make update-deps
  3. Commit glide.yaml and glide.lock

Run tests

$ make test

See also

Committers

Masaaki Goshima (@goccy)

LICENSE

MIT

Owner
Comments
  • xorm supporting

    xorm supporting

    Octillery replace import statement of database/sql to go.knocknote.io/octillery/database/sql. Therefore, OR Mapping library call Octillery's interface. and it can capture all queries.

    We don't import the database/sql in project directly but in xorm, so should we replace the xorm pkg? if so how to solve the dependency problem with go mod?

  • Fix importer

    Fix importer

    If CSV includes "abcde\nfg" , we expect to that it is inserted the following.

    abcde
    fg
    

    But, currently inserted value is the following.

    abcde\nfg
    

    So, I fixed this by unquoting with double quotation value.

  • Supports Distributed Transaction

    Supports Distributed Transaction

    As a default, I will support distributed transaction.

    How to use

    Set before/after callback for commit

    octillery.BeforeCommitCallback(func(tx *sql.Tx, writeQueries []*sql.QueryLog) error {
      // write all writeQueries to log (WAL) in application side
      return nil
    }))
    octillery.AfterCommitCallback(func(tx *sql.Tx) error {}, func(tx *sql.Tx, isCriticalError bool, failureWriteQueries []*sql.QueryLog) error {
      if isCriticalError {
        // distributed transaction error
        // record failureWriteQueries and should execute after.
      }
    }))
    

    Also, callback function can set at *sql.Tx scope too.

    tx, _ := db.Begin()
    tx.BeforeCommitCallback(func(writeQueries []*sql.QueryLog) error {
      // write all writeQueries to log (WAL) in application side
      return nil
    }))
    tx.AfterCommitCallback(func() error {}, func(isCriticalError bool, failureWriteQueries []*sql.QueryLog) error {
      if isCriticalError {
        // distributed transaction error
        // record failureWriteQueries and should execute after.
      }
    }))
    

    Exec for multiple databases

    You can use *sql.Tx to multiple databases.

    (tableA and tableB are stored other databases)

    tx, _ := db.Begin()
    tx.Exec("insert into tableA ...") // (1)
    tx.Exec("insert into tableB ...") // (2)
    
    // tx.Commit() commit (1) before it commit (2)
    // if (1) and (2) are success too, call AfterCommitSuccessCallback
    // if (1) and (2) are failure too, call AfterCommitFailureCallback with NOT CriticalError
    // if (1) is failure, instantly call AfterCommitFailureCallback with NOT CriticalError
    // if (1) is OK but (2) is failure, call AfterCommitFailureCallback with CriticalError
    if err := tx.Commit(); err != nil {
      ...
    }
    

    Recover from CriticalError

    writeQueryLogs := queryLogsfromWAL()
    tx, _ := db.Begin()
    for _, writeQueryLog := range writeQueryLogs {
        isCommittedQuery, err := tx.IsAlreadyCommittedQueryLog(writeQueryLog)
        if err != nil {
           ....
        }
        if !isCommittedQuery {
            result, err := tx.ExecWithQueryLog(writeQueryLog)
            ....
        }
    }
    

    If disable distributed transaction support

    set distributed_transaction: false in databases.yml

  • Add nil check for getting read/write queries

    Add nil check for getting read/write queries

    If application does not call tx.Query() or tx.Exec(), *sql.Tx.tx instance is nil . In this case, invalid memory address occur at WriteQueries or ReadQueries.
    So I add nil check for that.

  • Supports Go Modules ( GO111MODULE=on )

    Supports Go Modules ( GO111MODULE=on )

    • Add go.mod and go.sum files to support Go Modules ( GO111MODULE=on )
    • octillery install automatically put source of plugin to [$HOME/go|$GOPATH]/pkg/mod/go.knocknote.io/octillery@*/plugin if module-aware mode is enabled
  • Add configuration parameter for skipping auto create database

    Add configuration parameter for skipping auto create database

    connection.SetConfig calls setupDBFromConfig and creates database automatically.

    In migration phase this function is really useful. But online application I use a strictly restricted database user ( don't have create database or any needless priviledges.

    So I propose the function for skip auto create database.

    default: &default
      adapter: mysql
      encoding: utf8mb4
      username: appuser
      password: ${appuser_password}
    
    skip_auto_setup: ${skip_auto_setup}
    
  • Remove default plugins from plugin directory

    Remove default plugins from plugin directory

    [ Changed Summary ]

    • Remove default plugins ( mysql.go and sqlite3.go ) from plugin directory
    • Add .gitignore
    • Fix typo of document
    • Fix coding style ( it suggested by golint )
  • Improve score of goreportcard

    Improve score of goreportcard

    [Changed Summary]

    • Add document for exported function suggested by golint
    • Fix coding style ( it suggested by golint )
    • Fix some codes by using gocyclo
  • Update badge in README.md and Fix some codes suggested by golint

    Update badge in README.md and Fix some codes suggested by golint

    [Changed Summary]

    Add badge of goreportcard to README.md, and Fix badge style of CircleCI. Also, I fixed some codes by https://goreportcard.com/report/go.knocknote.io/octillery .

    See https://github.com/knocknote/octillery/blob/feature/improve-goreportcard/README.md .

  • Support MySQL8

    Support MySQL8

    this PR resolves the following MySQL 8 problems .

    • https://github.com/schemalex/schemalex/pull/85
    • https://www.pixelstech.net/article/1531316568-Fix-this-authentication-plugin-is-not-supported-issue-while-using-Go-to-connect-MySQL-8
  • Support os environment for database.yml

    Support os environment for database.yml

    Since host and password change depending on the environment such as docker, local, GCP, I want to use the environment variable of OS for yml. ex)

    default: &default
      adapter: mysql
      encoding: utf8mb4
      username: user
      password: ${DB_PASSWORD}
      database: dbname
    
    tables:
      table1:
        shard: true
        shard_key: user_id
        shards:
          - table1_shard_1:
              <<: *default
              master:
                - ${DB_SHARD_1}:3306
          - table2_stone_shard_2:
              <<: *default
              master:
                - ${DB_SHARD_2}:3306
    

    Can I modify config.Load to use ExpandEnv? ?

  • Add PostgreSQL support

    Add PostgreSQL support

    Refs: https://github.com/blastrain/octillery/issues/25

    I'd like to add PostgreSQL support for octillery. I have added an adapter to support the new DB driver, but please let me know if there is anything else I need to do.

  • Fix call timing of defer statement

    Fix call timing of defer statement

    Call defer in a function scope instead of in a loop because a defer statement defers the execution of a function until the surrounding function returns.

  • Add PostgreSQL support

    Add PostgreSQL support

    I'd like to add PostgreSQL support for octillery. If it is ok to work for it, let me and my colleague make pull request .

    It seems that only we need is adding adapter for supporting new DB driver. But please let me know if I'm missing something to add a a driver. https://github.com/blastrain/octillery/tree/master/connection/adapter/plugin

  • Removal of master-slave vocabulary

    Removal of master-slave vocabulary

    For diversity reasons, it would be nice to try to avoid 'master' and 'slave' terminology in this repository which can be associated to slavery. The master-slave terminology could be problematic for people in several countries which has the history of slavery like Romania, USA and many others. Thank you for considering the proposal. Let me know if any changes in the PR are needed, I would be happy to implement them.

🏋️ dbbench is a simple database benchmarking tool which supports several databases and own scripts

dbbench Table of Contents Description Example Installation Supported Databases Usage Custom Scripts Troubeshooting Development Acknowledgements Descri

Dec 30, 2022
Universal command-line interface for SQL databases

usql A universal command-line interface for PostgreSQL, MySQL, Oracle Database, SQLite3, Microsoft SQL Server, and many other databases including NoSQ

Jan 9, 2023
OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.
OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.

OctoSQL OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases, streaming sources and file formats using

Dec 29, 2022
SQL API is designed to be able to run queries on databases without any configuration by simple HTTP call.

SQL API SQL API is designed to be able to run queries on databases without any configuration by simple HTTP call. The request contains the DB credenti

Dec 2, 2022
Use SQL to instantly query instances, networks, databases, and more from Scaleway. Open source CLI. No DB required.
Use SQL to instantly query instances, networks, databases, and more from Scaleway. Open source CLI. No DB required.

Scaleway Plugin for Steampipe Use SQL to query infrastructure servers, networks, databases and more from your Scaleway project. Get started → Document

Nov 16, 2022
Manage SQL databases, users and grant using kubernetes manifests

SqlOperator Operate sql databases, users and grants. This is a WIP project and should not at all be used in production at this time. Feel free to vali

Nov 28, 2021
A go Library for scan database/sql rows to struct、slice、other types. And it support multiple databases connection management

ploto A go Library for scan database/sql rows to struct、slice、other types. And it support multiple databases connection management It's not an ORM. wo

Nov 3, 2022
SQLite extension for accessing other SQL databases

dblite SQLite extension for accessing other SQL databases, in SQLite. Similar to how Postgres Foreign Data Wrappers enable access to other databases i

Dec 23, 2022
Use SQL to query databases, logs and more from PlanetScale

Use SQL to instantly query PlanetScale databases, branches and more. Open source CLI. No DB required.

Sep 30, 2022
This is the code example how to use SQL to query data from any relational databases in Go programming language.

Go with SQL example This is the code example how to use SQL to query data from any relational databases in Go programming language. To start, please m

Mar 12, 2022
A simple auditor of SQL databases.

DBAuditor SQL数据库审计系统,目前支持SQL注入攻击审计 环境配置 sudo apt install golang 运行方式 将待审计语句填入test.txt中,然后运行主程序: 直接运行: go run main.go 编译运行: go build main.go ./main 主要目

Nov 9, 2022
A flexible and powerful SQL string builder library plus a zero-config ORM.

SQL builder for Go Install Usage Basic usage Pre-defined SQL builders Build SQL for MySQL, PostgreSQL or SQLite Using Struct as a light weight ORM Nes

Dec 30, 2022
A tool to run queries in defined frequency and expose the count as prometheus metrics. Supports MongoDB and SQL
A tool to run queries in defined frequency and expose the count as prometheus metrics. Supports MongoDB and SQL

query2metric A tool to run db queries in defined frequency and expose the count as prometheus metrics. Why ? Product metrics play an important role in

Jul 1, 2022
write APIs using direct SQL queries with no hassle, let's rethink about SQL

SQLer SQL-er is a tiny portable server enables you to write APIs using SQL query to be executed when anyone hits it, also it enables you to define val

Jan 7, 2023
Parses a file and associate SQL queries to a map. Useful for separating SQL from code logic

goyesql This package is based on nleof/goyesql but is not compatible with it any more. This package introduces support for arbitrary tag types and cha

Oct 20, 2021
Go-sql-reader - Go utility to read the externalised sql with predefined tags

go-sql-reader go utility to read the externalised sql with predefined tags Usage

Jan 25, 2022
Cross-platform client for PostgreSQL databases

pgweb Web-based PostgreSQL database browser written in Go. Overview Pgweb is a web-based database browser for PostgreSQL, written in Go and works on O

Dec 30, 2022
Redis-shake is a tool for synchronizing data between two redis databases. Redis-shake是一个用于在两个redis之间同步数据的工具,满足用户非常灵活的同步、迁移需求。
Redis-shake is a tool for synchronizing data between two redis databases. Redis-shake是一个用于在两个redis之间同步数据的工具,满足用户非常灵活的同步、迁移需求。

RedisShake is mainly used to synchronize data from one redis to another. Thanks to the Douyu's WSD team for the support. 中文文档 English tutorial 中文使用文档

Dec 29, 2022
Go sqlite3 http vfs: query sqlite databases over http with range headers

sqlite3vfshttp: a Go sqlite VFS for querying databases over http(s) sqlite3vfshttp is a sqlite3 VFS for querying remote databases over http(s). This a

Dec 27, 2022