mygopherhose - Parallel importer for mysqldumps.

What

mygopherhose takes a dump produced by mysqldump and imports it, running the INSERT statements in parallel where it can.

In benchmarks, it appears to run about 3x faster on small instances (and should do even better on high-end machines).

Caveats

  • tables are not locked
  • does not support SET statements
  • does not support stored procedures

Use at your own risk in production.

Usage

mygopherhose [-h host] -u user -p [password] [-P port] [-d dbname] [-b bufsize] dumpfile
        -h defaults to 127.0.0.1
        -P defaults to 3306
        -b defaults to 10485760 bytes
        -d can be omitted if the dump contains a `USE DATABASE foo;` stanza
        -p if the password is left empty, it is asked for interactively

  • 🐑 is displayed at each CREATE TABLE operation
  • a marker is displayed when all transactions for the current table have been sent to the buffered channel for processing
  • a dot is displayed for each INSERT statement applied (those statements almost always contain several rows, see net_buffer_length)
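
The progress markers above suggest a producer feeding a buffered channel that several workers drain. Here is a minimal sketch of that pattern, not mygopherhose's actual code: `applyParallel` and `execFunc` are hypothetical names, the database call is replaced by a stub so the example runs without a server, and the channel capacity is counted in statements, whereas the real `-b` flag is expressed in bytes.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// execFunc stands in for a real database call such as (*sql.DB).Exec.
// It is a hypothetical hook so the sketch runs without a MySQL server.
type execFunc func(stmt string) error

// applyParallel sends every statement to a buffered channel and lets
// `workers` goroutines drain it. It returns how many statements succeeded.
func applyParallel(stmts []string, workers, buffer int, exec execFunc) int64 {
	queue := make(chan string, buffer)
	var applied int64
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for stmt := range queue {
				if err := exec(stmt); err != nil {
					fmt.Printf("\n%s: %v\n", stmt, err)
					continue
				}
				atomic.AddInt64(&applied, 1)
				fmt.Print(".") // one dot per INSERT applied
			}
		}()
	}

	for _, s := range stmts {
		queue <- s
	}
	close(queue) // every statement for this table has been queued
	wg.Wait()
	return atomic.LoadInt64(&applied)
}

func main() {
	inserts := []string{
		"INSERT INTO `foo` VALUES (1),(2);",
		"INSERT INTO `foo` VALUES (3),(4);",
		"INSERT INTO `foo` VALUES (5),(6);",
	}
	noop := func(string) error { return nil } // pretend every INSERT succeeds
	n := applyParallel(inserts, 2, 8, noop)
	fmt.Printf("\n%d statements\n", n)
}
```

Because the workers run concurrently, INSERTs for a table are not applied in dump order, which is one reason the caveats above matter.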

GCP benchmarks

  • Client VM: n2-standard-2
  • CloudSQL: db-n1-standard-8 / 260GB
  • Dump:
    • 20GB
    • 95 tables
    • 68781981 rows
    • data size 23.04G
    • index size 2.48G

description           duration     db CPU
mysql cli piped       30m24.209s   ~11%
mysql -e source       30m30.161s   ~11%
mygopherhose -c 10    13m02.016s   54-87%
mygopherhose -c 20    12m51.301s   60-90%
mygopherhose -c 40    12m51.799s   60-93%
mygopherhose -c 100   13m09.446s   69-97%

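
To turn the durations above into a speedup factor, the arithmetic can be sketched in Go as follows. `speedup` is a hypothetical helper written for this example; it relies only on the standard `time.ParseDuration`, which accepts the `30m24.209s` format used in the table.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// speedup returns baseline/candidate for two durations written in the
// format used by the benchmark table (for example "30m24.209s").
func speedup(baseline, candidate string) (float64, error) {
	b, err := time.ParseDuration(baseline)
	if err != nil {
		return 0, err
	}
	c, err := time.ParseDuration(candidate)
	if err != nil {
		return 0, err
	}
	if c <= 0 {
		return 0, errors.New("candidate duration must be positive")
	}
	return b.Seconds() / c.Seconds(), nil
}

func main() {
	// mysql cli piped vs. mygopherhose -c 10, taken from the table.
	s, err := speedup("30m24.209s", "13m02.016s")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Printf("speedup: %.2fx\n", s)
}
```

For this particular CloudSQL run the ratio comes out at roughly 2.3x, and it barely changes between `-c 10` and `-c 100`.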
Owner
devops.works
Comments
  • honor variable settings in dumps

    mygopherhose should honor the SET statements present in dump headers and footers.

    Since it uses multiple connections, these SET statements must be run on each of them:

    /*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
    /*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
    /*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
    /*!40101 SET NAMES utf8mb4 */;
    /*!40103 SET @OLD_TIME_ZONE=@@TIME_ZONE */;
    /*!40103 SET TIME_ZONE='+00:00' */;
    /*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;
    /*!40014 SET @OLD_FOREIGN_KEY_CHECKS=@@FOREIGN_KEY_CHECKS, FOREIGN_KEY_CHECKS=0 */;
    /*!40101 SET @OLD_SQL_MODE=@@SQL_MODE, SQL_MODE='NO_AUTO_VALUE_ON_ZERO' */;
    /*!40111 SET @OLD_SQL_NOTES=@@SQL_NOTES, SQL_NOTES=0 */;
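
One possible way to do this, sketched under assumptions: collect the SET lines while reading the dump, then replay them on every connection before it starts applying INSERTs. `sessionSets` and `initConn` are hypothetical names, and `exec` stands in for a real per-connection call so the example runs without a server.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// setLine matches plain `SET ...` lines and versioned comments such as
// `/*!40101 SET NAMES utf8mb4 */;`.
var setLine = regexp.MustCompile(`^(?:/\*!\d+\s+)?SET\s`)

// sessionSets returns the SET statements found in a dump, in order.
func sessionSets(dump string) []string {
	var sets []string
	for _, line := range strings.Split(dump, "\n") {
		line = strings.TrimSpace(line)
		if setLine.MatchString(line) {
			sets = append(sets, line)
		}
	}
	return sets
}

// initConn replays the collected statements on one connection.
// exec is a hypothetical stand-in for a per-connection Exec call.
func initConn(sets []string, exec func(stmt string) error) error {
	for _, s := range sets {
		if err := exec(s); err != nil {
			return fmt.Errorf("%s: %w", s, err)
		}
	}
	return nil
}

func main() {
	dump := "/*!40101 SET NAMES utf8mb4 */;\n" +
		"/*!40014 SET @OLD_UNIQUE_CHECKS=@@UNIQUE_CHECKS, UNIQUE_CHECKS=0 */;\n" +
		"DROP TABLE IF EXISTS `foo`;\n"
	sets := sessionSets(dump)
	for conn := 1; conn <= 2; conn++ { // pretend there are two connections
		err := initConn(sets, func(stmt string) error {
			fmt.Printf("conn %d: %s\n", conn, stmt)
			return nil
		})
		if err != nil {
			fmt.Println(err)
		}
	}
}
```

With Go's database/sql, session state belongs to a single connection, so the replay would have to target each dedicated `*sql.Conn` and not the pooled `*sql.DB`.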
    
  • behaviour when database is not set

    When the database is not set, every query sent by mygopherhose fails:

    Host        : 127.0.0.1
    User        : root
    Database    : 
    Buffer size : 10485760 bytes
    File        : dump.sql
    
    DROP TABLE IF EXISTS `foo`;: Error 1046: No database selected
    0 statements
    🐑 foo
    

    If a database name is mandatory, the argument should be checked before the import starts.
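
A minimal sketch of such a check, assuming the rule from the usage text: `-d` may be empty only when the dump selects a database itself. `checkDatabase` and the regular expression are illustrative assumptions, not the tool's actual parsing; the pattern accepts both the README's `USE DATABASE foo;` spelling and a plain ``USE `foo`;``.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
)

// useStmt matches a line that selects a database, with or without the
// DATABASE keyword and with or without backticks around the name.
var useStmt = regexp.MustCompile("(?mi)^USE\\s+(?:DATABASE\\s+)?`?\\w+`?\\s*;")

// checkDatabase fails early when no database can be determined.
func checkDatabase(dbname, dump string) error {
	if dbname != "" || useStmt.MatchString(dump) {
		return nil
	}
	return errors.New("no database selected: pass -d dbname or use a dump that contains a USE stanza")
}

func main() {
	dump := "DROP TABLE IF EXISTS `foo`;\n"
	if err := checkDatabase("", dump); err != nil {
		fmt.Println("error:", err)
	}
}
```

Failing before the first statement is sent avoids the stream of `Error 1046` messages shown in the output above.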

  • Speed improvements

    Before import

    SET FOREIGN_KEY_CHECKS = 0;
    SET UNIQUE_CHECKS = 0;
    SET AUTOCOMMIT = 0;
    

    And for each table (if MyISAM; has no effect on InnoDB):

    ALTER TABLE `table_name` DISABLE KEYS;
    

    After import

    SET FOREIGN_KEY_CHECKS = 1;
    SET UNIQUE_CHECKS = 1;
    SET AUTOCOMMIT = 1;
    

    (probably not necessary since the session is closed afterwards)

    And for each table:

    ALTER TABLE `table_name` ENABLE KEYS;
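
The statements proposed in this issue could be assembled around the import roughly as follows. This is only a sketch: `keysStmt` and `importPlan` are hypothetical helpers, and the plan is returned as a list of strings so it can be inspected without a database.

```go
package main

import (
	"fmt"
	"strings"
)

var preImport = []string{
	"SET FOREIGN_KEY_CHECKS = 0;",
	"SET UNIQUE_CHECKS = 0;",
	"SET AUTOCOMMIT = 0;",
}

var postImport = []string{
	"SET FOREIGN_KEY_CHECKS = 1;",
	"SET UNIQUE_CHECKS = 1;",
	"SET AUTOCOMMIT = 1;",
}

// keysStmt builds the per-table ALTER TABLE statement.
// Backticks inside the name are doubled, as MySQL expects for quoted identifiers.
func keysStmt(table string, enable bool) string {
	action := "DISABLE"
	if enable {
		action = "ENABLE"
	}
	quoted := strings.ReplaceAll(table, "`", "``")
	return fmt.Sprintf("ALTER TABLE `%s` %s KEYS;", quoted, action)
}

// importPlan lists the statements to run before and after the INSERTs.
func importPlan(tables []string) (before, after []string) {
	before = append(before, preImport...)
	for _, t := range tables {
		before = append(before, keysStmt(t, false))
		after = append(after, keysStmt(t, true))
	}
	after = append(after, postImport...)
	return before, after
}

func main() {
	before, after := importPlan([]string{"foo"})
	fmt.Println(strings.Join(before, "\n"))
	fmt.Println("-- INSERT statements run here")
	fmt.Println(strings.Join(after, "\n"))
}
```

With `AUTOCOMMIT = 0`, each connection has to commit its work before it closes, otherwise InnoDB rolls the pending rows back. Since the SET statements are session-scoped, they would also need to run on every connection.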
    