Sync MySQL data into elasticsearch

go-mysql-elasticsearch is a service syncing your MySQL data into Elasticsearch automatically.

It uses mysqldump to fetch the original data first, then syncs data incrementally with the binlog.

Call for Committer/Maintainer

Sorry, I do not have enough time to maintain this project fully. If you like this project and want to help me improve it continuously, please contact me by email ([email protected]).

Requirement: in the email, please include at least the following, so I can believe we can work together:

  • Your GitHub ID
  • Your previous contributions to go-mysql-elasticsearch, including PRs or issues
  • The reason why you can improve go-mysql-elasticsearch

Install

  • Install Go (1.9+) and set your GOPATH
  • go get github.com/siddontang/go-mysql-elasticsearch (it will print some messages to the console; you can ignore them) :-)
  • cd $GOPATH/src/github.com/siddontang/go-mysql-elasticsearch
  • make

How to use?

  • Create a table in MySQL.
  • Create the associated Elasticsearch index, document type, and mappings if possible; if not, Elasticsearch will create them automatically.
  • Configure the base settings, see the example config river.toml.
  • Set the MySQL source in the config file, see Source below.
  • Customize the MySQL to Elasticsearch mapping rules in the config file, see Rule below.
  • Start ./bin/go-mysql-elasticsearch -config=./etc/river.toml and enjoy it.

Notice

  • MySQL supported version < 8.0
  • ES supported version < 6.0
  • The binlog format must be row.
  • The binlog row image must be full for MySQL; you may lose some field data if you update PK data in MySQL with the minimal or noblob binlog row image. MariaDB only supports the full row image.
  • The table schema cannot be altered at runtime.
  • A MySQL table to be synced should have a PK (primary key); multi-column PKs are allowed now, e.g. if the PK is (a, b), we will use "a:b" as the key. The PK data is used as the "id" in Elasticsearch. You can also configure the id's constituent parts with other columns.
  • You should create the associated mappings in Elasticsearch first. Using the default mapping is not a wise decision; you must know how to search accurately.
  • mysqldump must exist on the same node as go-mysql-elasticsearch; if not, go-mysql-elasticsearch will try to sync the binlog only.
  • Don't change too many rows at the same time in one SQL statement.

Source

In go-mysql-elasticsearch, you must decide which tables you want to sync into Elasticsearch in the source config.

The format in the config file is as follows:

[[source]]
schema = "test"
tables = ["t1", "t2"]

[[source]]
schema = "test_1"
tables = ["t3", "t4"]

schema is the database name, and tables lists the tables that need to be synced.

If you want to sync all tables in a database, you can use an asterisk (*):

[[source]]
schema = "test"
tables = ["*"]

# When using an asterisk, it is not allowed to sync multiple tables
# tables = ["*", "table"]

Rule

By default, go-mysql-elasticsearch uses the MySQL table name as the Elasticsearch index and type name, and the MySQL table field name as the Elasticsearch field name.
E.g. for a table named blog, the default index and type in Elasticsearch are both named blog; for a table field named title, the default Elasticsearch field name is also title.

Notice: go-mysql-elasticsearch uses the lower-case name for the ES index and type. E.g. if your table is named BLOG, the ES index and type are both named blog.

A rule lets you change this name mapping. The rule format in the config file is as follows:

[[rule]]
schema = "test"
table = "t1"
index = "t"
type = "t"
parent = "parent_id"
id = ["id"]

    [rule.field]
    mysql = "title"
    elastic = "my_title"

In the example above, we use a new index and type both named "t" instead of the default "t1", and "my_title" instead of the field name "title".

Rule field types

In order to map a MySQL column onto a different Elasticsearch type, you can define the field type as follows:

[[rule]]
schema = "test"
table = "t1"
index = "t"
type = "t"

    [rule.field]
    # This will map column title to Elasticsearch field my_title
    title="my_title"

    # This will map column title to Elasticsearch field my_title and use the array type
    title="my_title,list"

    # This will map column title to Elasticsearch field title and use the array type
    title=",list"

    # If the created_time field type is "int" and you want to convert it to the "date" type in ES, you can do it as below
    created_time=",date"

The "list" modifier translates a MySQL string field like "a,b,c" into an Elasticsearch array type '{"a", "b", "c"}'; this is especially useful if you need to filter on those fields in Elasticsearch.

Wildcard table

go-mysql-elasticsearch only lets you specify which tables to sync, but sometimes, if you split a big table into many sub tables, e.g. 1024 tables named table_0000, table_0001, ..., table_1023, it is very hard to write a rule for every one.

go-mysql-elasticsearch supports wildcard tables, e.g.:

[[source]]
schema = "test"
tables = ["test_river_[0-9]{4}"]

[[rule]]
schema = "test"
table = "test_river_[0-9]{4}"
index = "river"
type = "river"

"test_river_[0-9]{4}" is a wildcard table definition, which represents "test_river_0000" to "test_river_9999"; the table in the rule must be the same as it.

In the above example, if you have 1024 sub tables, all of them will be synced into Elasticsearch with index "river" and type "river".

Parent-Child Relationship

One-to-many joins (parent-child relationships in Elasticsearch) are supported. Simply specify the field name for the parent property.

[[rule]]
schema = "test"
table = "t1"
index = "t"
type = "t"
parent = "parent_id"

Note: you should set up the relationship by creating the mapping manually.

Filter fields

You can use filter to sync only the specified fields, like:

[[rule]]
schema = "test"
table = "tfilter"
index = "test"
type = "tfilter"

# Only sync following columns
filter = ["id", "name"]

In the above example, we only sync the MySQL table tfilter's columns id and name to Elasticsearch.

Ignore table without a primary key

When you sync a table without a primary key, you will see the error message below:

schema.table must have a PK for a column

You can ignore these tables in the configuration like:

# Ignore table without a primary key
skip_no_pk_table = true

Elasticsearch Pipeline

You can use an Ingest Node Pipeline to pre-process documents before indexing, e.g. JSON string decoding, merging fields, and more.

[[rule]]
schema = "test"
table = "t1"
index = "t"
type = "_doc"

# pipeline id
pipeline = "my-pipeline-id"

Note: you should create the pipeline manually, and Elasticsearch >= 5.0 is required.
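As a sketch of creating such a pipeline with the Ingest API (the pipeline id matches the rule above, but the description, field names, and the json processor usage here are assumptions for illustration):

```
PUT _ingest/pipeline/my-pipeline-id
{
  "description": "decode a JSON string column before indexing",
  "processors": [
    { "json": { "field": "payload", "target_field": "payload_obj" } }
  ]
}
```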

Why not other rivers?

Although there are some other MySQL rivers for Elasticsearch, like elasticsearch-river-jdbc and elasticsearch-river-mysql, I still wanted to build a new one with Go. Why?

  • Customization: I want to decide which tables are synced, the associated index and type names, and even the field names in Elasticsearch.
  • Incremental updates via binlog, with the ability to resume from the last sync position when the service restarts.
  • A common sync framework, not only for Elasticsearch but also for others, like memcached, redis, etc.
  • Wildcard table support: we have many sub tables like table_0000 - table_1023, but want to use a single Elasticsearch index and type.

Todo

  • MySQL 8
  • ES 6
  • Statistics.

Donate

If you like the project and want to buy me a cola, you can do so through:

PayPal or WeChat

Feedback

go-mysql-elasticsearch is still in development, and we will try to use it in production later. Any feedback is very welcome.

Email: [email protected]

Comments
  • connection error message

    Hi, I'm having a consistent connection error message, but the data is propagated to Elasticsearch correctly. Does anyone have an idea why?

    Here is the error message from the console:

    2018/06/22 23:56:11 ERROR kill connection 17716 error ERROR 1094 (HY000): Unknown thread id: 17716
    2018/06/22 23:56:11 INFO kill last connection id 17716
    2018/06/22 23:56:11 INFO rotate to (mysql-bin-changelog.000013, 63856)
    2018/06/22 23:56:11 INFO receive EOF packet, retry ReadPacket
    2018/06/22 23:56:11 ERROR connection was bad
    2018/06/22 23:56:11 INFO rotate binlog to (mysql-bin-changelog.000013, 63856)
    2018/06/22 23:56:11 INFO save position (mysql-bin-changelog.000013, 63856)
    2018/06/22 23:56:12 INFO begin to re-sync from (mysql-bin-changelog.000013, 63856)
    2018/06/22 23:56:12 INFO register slave for master server ..........

  • No data synced

    The program runs perfectly, but no data is synced. Below I attached a screenshot.

    I am using XAMPP. This is my river.toml configuration:

    # MySQL address, user and password
    # user must have replication privilege in MySQL.
    my_addr = "127.0.0.1:3306"
    my_user = "root"
    my_pass = ""

    # Elasticsearch address
    es_addr = "127.0.0.1:9200"

    # Path to store data, like master.info, and dump MySQL data
    data_dir = "./var"

    # Inner Http status address
    stat_addr = "127.0.0.1:12800"

    # pseudo server id like a slave
    server_id = 1001

    # mysql or mariadb
    flavor = "mysql"

    # mysqldump execution path
    # if not set or empty, ignore mysqldump.
    mysqldump = "mysqldump"

    # MySQL data source
    [[source]]
    schema = "test"

    # Only below tables will be synced into Elasticsearch.
    # "test_river_[0-9]{4}" is a wildcard table format, you can use it if you have many sub tables, like table_0000 - table_1023
    # I don't think it is necessary to sync all tables in a database.
    #tables = ["test_river", "test_river_[0-9]{4}"]
    tables = ["a"]

    # Below is for special rule mapping
    [[rule]]
    schema = "test"
    table = "a"
    index = "users"
    type = "user"

    # title is MySQL test_river field name, es_title is the customized name in Elasticsearch
    [rule.field]
    # This will map column title to elastic search my_title
    title="es_title"
    # This will map column tags to elastic search my_tags and use array type
    tags="my_tags,list"
    # This will map column keywords to elastic search keywords and use array type
    keywords=",list"

    # wildcard table rule, the wildcard table must be in source tables
    # [[rule]]
    # schema = "test"
    # table = "test_river_[0-9]{4}"
    # index = "river"
    # type = "river"

    # title is MySQL test_river field name, es_title is the customized name in Elasticsearch
    # [[rule.fields]]
    # mysql = "title"
    # elastic = "es_title"

    # table name changed to a

  • Couldn't execute 'FLUSH TABLES WITH READ LOCK': Access denied for user 'root'@'%' (using password: YES) (1045)

    [root@ip-172-31-20-242 go-mysql-elasticsearch]# ./bin/go-mysql-elasticsearch -config=./etc/river.toml
    [2015/10/20 21:55:29] dump.go:107 [Info] try dump MySQL and parse
    [2015/10/20 21:55:29] status.go:52 [Info] run status http server 127.0.0.1:12800
    mysqldump: Couldn't execute 'FLUSH TABLES WITH READ LOCK': Access denied for user 'root'@'%' (using password: YES) (1045)
    [2015/10/20 21:55:29] canal.go:138 [Error] canal dump mysql err: exit status 2

    Will it work with AWS RDS MySQL?

  • Map mysql fields to elasticsearch arrays

    Hey,

    First of all, I'm not sure if this feature fits on your roadmap, but I needed a way to map MySQL data onto an Elasticsearch array type, in order to keep it easy to create specific filters on a search.

    This is a first approach where you can map a comma-separated varchar onto an array field of ES.

    *Caution: this feature is not backward-compatible, so the definitions for rule.fields have changed a bit. Example:

    old definition

    [rule.field]
    title = "my_title"
    

    new definition

    [[rule.fields]]
    mysql = "title"
    elastic = "my_title"
    

    Additionally you'll be able to define the type as "list" if you want to convert your comma separated varchar on an elastic array.

    Defining a type:

    [[rule.fields]]
    mysql = "title"
    elastic = "my_title"
    type = "list"
    

    It will map:

    title = "a,b,c" # on mysql
    

    as:

    title = '{"a","b","c"}' # on elastic
    

    Again, not sure if this fits your current roadmap, but it was useful for me, so feel free to add it to your repo :-)

  • convert table name to lowercase

    When capitalizing the table name in the *.toml file as shown below,

    tables = ["TABLE"]
    

    you will see the error below:

    ERRO[0000] index index: TABLE, type: TABLE, id: 1, status: 400, error: {"type":"invalid_index_name_exception","reason":"Invalid index name [TABLE], must be lowercase","index_uuid":"_na_","index":"TABLE"} 
    

    So this feature forces the table name to be lowercase when it is used as the ES index and type name.

  • I encounter the following errors

    panic: runtime error: comparing uncomparable type []uint8
    
    goroutine 18 [running]:
    github.com/siddontang/go-mysql-elasticsearch/river.(*River).makeUpdateReqData(0xc2080dfc70, 0xc208160000, 0xc2080dfb80, 0xc2080ea2a0, 0xe, 0xe, 0xc2080ea380, 0xe, 0xe)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/river/sync.go:189 +0x1d9
    github.com/siddontang/go-mysql-elasticsearch/river.(*River).makeUpdateRequest(0xc2080dfc70, 0xc2080dfb80, 0xc2080b4270, 0x2, 0x2, 0x0, 0x0, 0x0, 0x0, 0x0)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/river/sync.go:124 +0x552
    github.com/siddontang/go-mysql-elasticsearch/river.(*rowsEventHandler).Do(0xc20802c020, 0xc2080b4300, 0x0, 0x0)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/river/sync.go:38 +0x64d
    github.com/siddontang/go-mysql/canal.(*Canal).travelRowsEventHandler(0xc20803e3c0, 0xc2080b4300, 0x0, 0x0)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/canal/handler.go:32 +0x13a
    github.com/siddontang/go-mysql/canal.(*Canal).handleRowsEvent(0xc20803e3c0, 0xc2080b42a0, 0x0, 0x0)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/canal/sync.go:86 +0x2e3
    github.com/siddontang/go-mysql/canal.(*Canal).startSyncBinlog(0xc20803e3c0, 0x0, 0x0)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/canal/sync.go:49 +0x6ed
    github.com/siddontang/go-mysql/canal.(*Canal).run(0xc20803e3c0, 0x0, 0x0)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/canal/canal.go:144 +0x1a1
    created by github.com/siddontang/go-mysql/canal.(*Canal).Start
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/canal/canal.go:129 +0x67
    
    goroutine 1 [chan receive, 99 minutes]:
    main.main()
        /go/src/github.com/siddontang/go-mysql-elasticsearch/cmd/go-mysql-elasticsearch/main.go:82 +0x666
    
    goroutine 5 [select, 99 minutes]:
    github.com/siddontang/go/log.(*Logger).run(0xc20800c080)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go/log/log.go:100 +0x267
    created by github.com/siddontang/go/log.New
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go/log/log.go:80 +0x1de
    
    goroutine 6 [syscall, 99 minutes]:
    os/signal.loop()
        /usr/src/go/src/os/signal/signal_unix.go:21 +0x1f
    created by os/signal.init·1
        /usr/src/go/src/os/signal/signal_unix.go:27 +0x35
    
    goroutine 17 [IO wait, 99 minutes]:
    net.(*pollDesc).Wait(0xc208010ed0, 0x72, 0x0, 0x0)
        /usr/src/go/src/net/fd_poll_runtime.go:84 +0x47
    net.(*pollDesc).WaitRead(0xc208010ed0, 0x0, 0x0)
        /usr/src/go/src/net/fd_poll_runtime.go:89 +0x43
    net.(*netFD).accept(0xc208010e70, 0x0, 0x7f37d510abb0, 0xc20802abb8)
        /usr/src/go/src/net/fd_unix.go:419 +0x40b
    net.(*TCPListener).AcceptTCP(0xc20802c028, 0xc208020700, 0x0, 0x0)
        /usr/src/go/src/net/tcpsock_posix.go:234 +0x4e
    net.(*TCPListener).Accept(0xc20802c028, 0x0, 0x0, 0x0, 0x0)
        /usr/src/go/src/net/tcpsock_posix.go:244 +0x4c
    net/http.(*Server).Serve(0xc208064480, 0x7f37d510f4e0, 0xc20802c028, 0x0, 0x0)
        /usr/src/go/src/net/http/server.go:1728 +0x92
    github.com/siddontang/go-mysql-elasticsearch/river.(*stat).Run(0xc20803cde0, 0xc20802b2d0, 0xd)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/river/status.go:65 +0x4a5
    created by github.com/siddontang/go-mysql-elasticsearch/river.NewRiver
        /go/src/github.com/siddontang/go-mysql-elasticsearch/river/river.go:62 +0x39a
    
    goroutine 49 [IO wait]:
    net.(*pollDesc).Wait(0xc2080101b0, 0x72, 0x0, 0x0)
        /usr/src/go/src/net/fd_poll_runtime.go:84 +0x47
    net.(*pollDesc).WaitRead(0xc2080101b0, 0x0, 0x0)
        /usr/src/go/src/net/fd_poll_runtime.go:89 +0x43
    net.(*netFD).Read(0xc208010150, 0xc2080a8000, 0x1000, 0x1000, 0x0, 0x7f37d510abb0, 0xc2080e05d8)
        /usr/src/go/src/net/fd_unix.go:242 +0x40f
    net.(*conn).Read(0xc2080a4000, 0xc2080a8000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
        /usr/src/go/src/net/net.go:121 +0xdc
    bufio.(*Reader).fill(0xc2080a6000)
        /usr/src/go/src/bufio/bufio.go:97 +0x1ce
    bufio.(*Reader).Read(0xc2080a6000, 0xc2080e05d0, 0x4, 0x4, 0x5ee, 0x0, 0x0)
        /usr/src/go/src/bufio/bufio.go:174 +0x26c
    io.ReadAtLeast(0x7f37d510add0, 0xc2080a6000, 0xc2080e05d0, 0x4, 0x4, 0x4, 0x0, 0x0, 0x0)
        /usr/src/go/src/io/io.go:298 +0xf1
    io.ReadFull(0x7f37d510add0, 0xc2080a6000, 0xc2080e05d0, 0x4, 0x4, 0xc2081242a0, 0x0, 0x0)
        /usr/src/go/src/io/io.go:316 +0x6d
    github.com/siddontang/go-mysql/packet.(*Conn).ReadPacketTo(0xc20800a440, 0x7f37d510f3a0, 0xc2081242a0, 0x0, 0x0)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/packet/conn.go:81 +0xe1
    github.com/siddontang/go-mysql/packet.(*Conn).ReadPacket(0xc20800a440, 0x0, 0x0, 0x0, 0x0, 0x0)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/packet/conn.go:35 +0x9f
    github.com/siddontang/go-mysql/replication.(*BinlogSyncer).onStream(0xc2080b2000, 0xc2081ec980)
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/replication/binlogsyncer.go:465 +0x9b
    created by github.com/siddontang/go-mysql/replication.(*BinlogSyncer).startDumpStream
        /go/src/github.com/siddontang/go-mysql-elasticsearch/Godeps/_workspace/src/github.com/siddontang/go-mysql/replication/binlogsyncer.go:230 +0x10f
    
    goroutine 8 [IO wait]:
    net.(*pollDesc).Wait(0xc2080aa530, 0x72, 0x0, 0x0)
        /usr/src/go/src/net/fd_poll_runtime.go:84 +0x47
    net.(*pollDesc).WaitRead(0xc2080aa530, 0x0, 0x0)
        /usr/src/go/src/net/fd_poll_runtime.go:89 +0x43
    net.(*netFD).Read(0xc2080aa4d0, 0xc208158000, 0x1000, 0x1000, 0x0, 0x7f37d510abb0, 0xc2080e0410)
        /usr/src/go/src/net/fd_unix.go:242 +0x40f
    net.(*conn).Read(0xc2080a4018, 0xc208158000, 0x1000, 0x1000, 0x0, 0x0, 0x0)
        /usr/src/go/src/net/net.go:121 +0xdc
    net/http.noteEOFReader.Read(0x7f37d510f378, 0xc2080a4018, 0xc208156058, 0xc208158000, 0x1000, 0x1000, 0x6f8280, 0x0, 0x0)
        /usr/src/go/src/net/http/transport.go:1270 +0x6e
    net/http.(*noteEOFReader).Read(0xc20800b000, 0xc208158000, 0x1000, 0x1000, 0xc208013200, 0x0, 0x0)
        <autogenerated>:125 +0xd4
    bufio.(*Reader).fill(0xc2080a67e0)
        /usr/src/go/src/bufio/bufio.go:97 +0x1ce
    bufio.(*Reader).Peek(0xc2080a67e0, 0x1, 0x0, 0x0, 0x0, 0x0, 0x0)
        /usr/src/go/src/bufio/bufio.go:132 +0xf0
    net/http.(*persistConn).readLoop(0xc208156000)
        /usr/src/go/src/net/http/transport.go:842 +0xa4
    created by net/http.(*Transport).dialConn
        /usr/src/go/src/net/http/transport.go:660 +0xc9f
    
    goroutine 9 [select]:
    net/http.(*persistConn).writeLoop(0xc208156000)
        /usr/src/go/src/net/http/transport.go:945 +0x41d
    created by net/http.(*Transport).dialConn
        /usr/src/go/src/net/http/transport.go:661 +0xcbc
    

    What's wrong? How can I fix this?

  • receive signal hangup, closing. And zombie processes.

    Hi siddontang,

    I run this app in an alpine Docker image.

    The process hangs up after about half a day and generates zombie processes.

    /go/src/github.com/siddontang/go-mysql-elasticsearch# tail -15 nohup.out
    time="2018-03-31T13:10:29Z" level=info msg="save position (mysql.000004, 58885216)"
    time="2018-03-31T13:10:53Z" level=info msg="save position (mysql.000004, 58885910)"
    time="2018-03-31T13:11:03Z" level=info msg="save position (mysql.000004, 58886868)"
    time="2018-03-31T13:11:09Z" level=info msg="save position (mysql.000004, 58888518)"
    time="2018-03-31T13:11:24Z" level=info msg="save position (mysql.000004, 58889446)"
    time="2018-03-31T13:11:25Z" level=info msg="receive signal hangup, closing"
    time="2018-03-31T13:11:25Z" level=info msg="closing river"
    time="2018-03-31T13:11:25Z" level=info msg="closing canal"
    time="2018-03-31T13:11:25Z" level=info msg="syncer is closing..."
    time="2018-03-31T13:11:25Z" level=error msg="canal start sync binlog err: context canceled"
    time="2018-03-31T13:11:25Z" level=error msg="start canal err context canceled"
    time="2018-03-31T13:11:25Z" level=error msg="connection was bad"
    time="2018-03-31T13:11:25Z" level=error msg="close sync with err: Sync was closed"
    time="2018-03-31T13:11:25Z" level=info msg="syncer is closed"
    time="2018-03-31T13:11:25Z" level=info msg="save position (mysql.000004, 58889446)"
    
       63     1 root     Z        0   0%   3   0% [go-mysql-elasti]
      365     1 root     Z        0   0%  27   0% [go-mysql-elasti]
      247     1 root     Z        0   0%  15   0% [go-mysql-elasti]
      478     1 root     Z        0   0%  12   0% [go-mysql-elasti]
      438     1 root     Z        0   0%  11   0% [go-mysql-elasti]
      171     1 root     Z        0   0%  13   0% [bash]
      437     1 root     Z        0   0%   5   0% [bash]
      204     1 root     Z        0   0%  11   0% [bash]
      476     1 root     Z        0   0%   0   0% [bash]
      273     1 root     Z        0   0%  10   0% [bash]
    

    I do not know why this happened, or how to kill the zombie processes when their ppid is 1.

    Thanks in advance!!!

  • Doesn't work

    Hello, I have installed this extension, but it doesn't work.

    These are the logs:

    [info] binlogsyncer.go:111 create BinlogSyncer with config {1001 mysql 127.0.0.1 3306 abcerca utf8 false false false false 0 0s 0s 0}
    github.com/siddontang/go-mysql/canal/canal.go:392: binlog must ROW format, but STATEMENT now
    github.com/siddontang/go-mysql/canal/canal.go:83:
    /root/go/src/github.com/siddontang/go-mysql-elasticsearch/river/river.go:108:
    /root/go/src/github.com/siddontang/go-mysql-elasticsearch/river/river.go:58:

    No synchronization is performed. What can it be?

  • mysqldump: Got errno 32 on write

    INFO[0000] create BinlogSyncer with config {1001 mysql 127.0.0.1 3306 root   utf8 false false <nil> false 0 0s 0s} 
    INFO[0000] skip master data, get current binlog position (mysql-bin.000003, 107) 
    INFO[0000] try dump MySQL and parse                     
    panic: runtime error: invalid memory address or nil pointer dereference
    [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x8049a3c]
    
    goroutine 12 [running]:
    sync/atomic.AddUint64(0x18e5a90c, 0x1, 0x0, 0x0, 0x0)
    	/usr/local/go/src/sync/atomic/asm_386.s:112 +0xc
    github.com/siddontang/go-mysql-elasticsearch/vendor/github.com/siddontang/go/sync2.(*AtomicInt64).Add(0x18e5a90c, 0x1, 0x0, 0x18e72280, 0x13)
    	/home/deploy/go/src/github.com/siddontang/go-mysql-elasticsearch/vendor/github.com/siddontang/go/sync2/atomic.go:52 +0x31
    github.com/siddontang/go-mysql-elasticsearch/river.(*River).makeRequest(0x18e4fcc0, 0x18e62230, 0x8311e4d, 0x6, 0x18f32180, 0x1, 0x1, 0x7, 0x82c1c80, 0x0, ...)
    	/home/deploy/go/src/github.com/siddontang/go-mysql-elasticsearch/river/sync.go:193 +0x2f1
    github.com/siddontang/go-mysql-elasticsearch/river.(*River).makeInsertRequest(0x18e4fcc0, 0x18e62230, 0x18f32180, 0x1, 0x1, 0x1, 0x10, 0x18, 0x18e0f030, 0x18e4dc00)
    	/home/deploy/go/src/github.com/siddontang/go-mysql-elasticsearch/river/sync.go:203 +0x53
    github.com/siddontang/go-mysql-elasticsearch/river.(*eventHandler).OnRow(0x18e0ef20, 0x18e4dc00, 0x82c1c80, 0x18e0f030)
    	/home/deploy/go/src/github.com/siddontang/go-mysql-elasticsearch/river/sync.go:73 +0x2fb
    github.com/siddontang/go-mysql-elasticsearch/vendor/github.com/siddontang/go-mysql/canal.(*dumpParseHandler).Data(0x18e4da80, 0x18e12fd5, 0x7, 0x18f640dd, 0x19, 0x18eee200, 0x13, 0x20, 0x4, 0x835cf80)
    	/home/deploy/go/src/github.com/siddontang/go-mysql-elasticsearch/vendor/github.com/siddontang/go-mysql/canal/dump.go:67 +0x77d
    github.com/siddontang/go-mysql-elasticsearch/vendor/github.com/siddontang/go-mysql/dump.Parse(0x847eff0, 0x18e0ef68, 0x84809d0, 0x18e4da80, 0x816f300, 0x1, 0x0)
    	/home/deploy/go/src/github.com/siddontang/go-mysql-elasticsearch/vendor/github.com/siddontang/go-mysql/dump/parser.go:81 +0x26f
    github.com/siddontang/go-mysql-elasticsearch/vendor/github.com/siddontang/go-mysql/dump.(*Dumper).DumpAndParse.func1(0x18e0ef68, 0x84809d0, 0x18e4da80, 0x18e9a900, 0x18f0c1c0)
    	/home/deploy/go/src/github.com/siddontang/go-mysql-elasticsearch/vendor/github.com/siddontang/go-mysql/dump/dump.go:182 +0x4e
    created by github.com/siddontang/go-mysql-elasticsearch/vendor/github.com/siddontang/go-mysql/dump.(*Dumper).DumpAndParse
    	/home/deploy/go/src/github.com/siddontang/go-mysql-elasticsearch/vendor/github.com/siddontang/go-mysql/dump/dump.go:181 +0x11a
    root@server:/usr/local/es# mysqldump: Got errno 32 on write
    
  • cannot sync data

    I don't know which configuration is wrong. mysqldump is OK, but the sync does not work.

    Could you help me to figure out why? Here are some logs:

    dump.go:113 [Info] dump MySQL and parse OK, use 704.79 seconds, start binlog replication at (mariadb-bin.000001, 1478069)
    sync.go:15 [Info] start sync binlog at (mariadb-bin.000001, 1478069)
    canal.go:146 [Error] canal start sync binlog err: connection was bad
    
  • use lowercase for type

    I think converting everything to lowercase makes sense, to fix https://github.com/siddontang/go-mysql-elasticsearch/issues/176.

    @perfectacle Can you try it?

  • How do I connect to an ES cluster

    In the river.toml file, I set the address to connect to ES like this:

    es_addr = "192.168.26.101:9200,192.168.26.102:9200,192.168.26.103:9200"

    The program reported an error when I started it.

    It works fine if I change it to a single address: es_addr = "192.168.26.101:9200"

    So, how should I configure the ES cluster connection address?

  • Add shield mysql ddl command

    I don't know whether this function is required. At present we use this to synchronize data from MySQL to ES, and some tables need to mask the delete operation, so I added the ability to skip certain MySQL DDL commands.

  • Hello, sync reports the following error: sync.go:156 do ES bulk err invalid character '<' looking for beginning of value, close sync

    [2021/09/15 20:41:03] [error] sync.go:156 do ES bulk err invalid character '<' looking for beginning of value, close sync
    [2021/09/15 20:41:03] [info] main.go:95 context is done with context canceled, closing
    [2021/09/15 20:41:03] [info] river.go:312 closing river
    [2021/09/15 20:41:03] [info] canal.go:238 closing canal
    [2021/09/15 20:41:03] [info] binlogsyncer.go:172 syncer is closing...
    [2021/09/15 20:41:03] [info] binlogsyncer.go:187 syncer is closed
    [2021/09/15 20:41:03] [info] master.go:54 save position (, 0)
    mysqldump: Got errno 32 on write
    [2021/09/15 20:41:03] [error] canal.go:224 canal dump mysql err: context canceled
    [2021/09/15 20:41:03] [error] river.go:297 start canal err context canceled
