PgSQL compatible on distributed database TiDB

TiDB for PostgreSQL

LICENSE Language Build Status Go Report Card codecov GitHub commit activity

Introduction

TiDB for PostgreSQL is an open source launched by Digital China Cloud Base to promote and integrate into the open source community of TiDB. In addition to its high availability, horizontal scalability, and cloud-native features, TiDB for PostgreSQL is compatible with the PostgreSQL protocol. On the one hand, you can use the PostgreSQL client to connect to TiDB for PostgreSQL, which we have basically completed. On the other hand, you can also use the unique grammatical features of PostgreSQL in your business system, which is in the development process.

More details can be found in the Design Document

Background

As the general trend of the massive explosion of information in the current data era becomes more and more obvious, the database is more and more likely to become the bottleneck of the upper-level business system. For this problem, TiDB solves most of migration problems of business system based on MySQL database through its highly compatibility of MySQL protocol. In view of its distributed transactions with strong consistency, flexible scalability, and excellent disaster recovery backup architecture, users prefer to use mature solutions of TiDB rather than scale MySQL by sharding. However, there is no mature solution for the upper-level business system based on PostgreSQL to migrate to TiDB. If you want to migrate a PostgreSQL-based business system to TiDB, you have to modify lots of code of business system. To solve this,we try to refactor the underlying source code of TiDB to make it compatible with the PostgreSQL protocol, making it possible to migrate PostgreSQL-based business systems to TiDB without modifying much code.

Development progress

The largest amount of work for the development of TiDB for PostgreSQL is compatibility with PostgreSQL, which has two things to do. One is to implement the PostgreSQL connection protocol, which we have basically achieved, So PostgreSQL client can connect to TiDB for PostgreSQL. The other is the unique PostgreSQL syntax. TiDB for PostgreSQL can support general sql syntax, but there are a certain amount of work to do to make it compatible with PostgreSQL’s unique syntax. This is one of the reasons why TiDB for PostgreSQL is open sourced. We hope that through the open source, engineers interested in our TiDB for PostgreSQL can work together to make our project bigger and better.

Quick start

First, make sure you have a Go environment, because TiDB for PostgreSQL is based on the Go language.

TiDB for PostgreSQL can be started on a single node without pd and tikv.

If there is no pd and tikv, it will create mock pd and mock tikv to maintain the stable operation of the system.

The following is an example of locally compiling and running TiDB for PostgreSQL on localhost.

mkdir -p  $GOPATH/src/github.com/digitalchina

cd  $GOPATH/src/github.com/digitalchina

git clone https://github.com/DigitalChinaOpenSource/TiDB-for-PostgreSQL.git

cd TiDB-for-PostgreSQL/tidb-server

go run main.go

# If you get an error: export ordinal to larger
# Please run the following cmd:
go run -buildmode=exe  main.go

After starting the main program of TiDB for PostgreSQL , it will run on port 4000 of the host.

How to use PostgreSQL client to connect to TiDB for PostgreSQL? Here, we take the command line tool psql that comes with PostgreSQL as an example.

Server [localhost]:
Database [postgres]: test
Port [5433]: 4000
Username [postgres]: root
psql (13.1, server 8.3.11)
Type "help" to get help information.

test=# show tables;
 Tables_in_test
----------------
 t1
 t2
(2 行记录)

Docker

# Log in to your dockerhub account
docker login

# Pull the image and start the container
docker pull dcleeray/tidb-for-pg
docker run -it --name tidbforpg -p 4000:4000 -d  dcleeray/tidb-for-pg:latest

Cluster deployment

TiDB for PostgreSQL also supports cluster deployment.

Our current modification work does not involve the communication module of each component in the cluster. Therefore, the connection of TiDB for PostgreSQL to pd and tikv will not be affected in any way.

We recommend using binary file to deploy TiDB for PostgreSQL cluster.

First, download the official binary package file and unzip it.

wget http://download.pingcap.org/tidb-v4.0.11-linux-amd64.tar.gz
wget http://download.pingcap.org/tidb-v4.0.11-linux-amd64.sha256

sha256sum -c tidb-v4.0.11-linux-amd64.sha256

tar -xzf tidb-v4.0.11-linux-amd64.tar.gz
cd tidb-v4.0.11-linux-amd64/bin

Second, deploy each node in the cluster in order. According to the cluster architecture of TiDB for PostgreSQL, pd nodes are deployed first, then the tikv node, and finally the TiDB for PostgreSQL node. The number of nodes in the cluster is not specified, but there is at least one of each type.

Deploy a pd node

./pd-server --name=pd1 --data-dir=pd1 --client-urls="http://pdhost:2379" --peer-urls="http://host:2380" -L "info" --log-file=pd.log

Deploy three tikv nodes

./tikv-server --pd="pdhost:2379" --addr="kvhost1:20160"  --data-dir=tikv  --log-file=tikv1.log
./tikv-server --pd="pdhost:2379" --addr="kvhost2:20160"  --data-dir=tikv  --log-file=tikv2.log
./tikv-server --pd="pdhost:2379" --addr="kvhost3:20160"  --data-dir=tikv  --log-file=tikv3.log

Deploy a TiDB for PostgreSQL node. You need to compile the TiDB for PostgreSQL project into a binary file named tidb-server and upload it to this server for replace the original tidb-server

./tidb-server --store=tikv  --path="pdhost:2379" --log-file=tidb.log

With the above done, a TiDB for PostgreSQL cluster is successfully deployed, and you can connect to TiDB for PostgreSQL cluster in the same way as you did with single TiDB for PostgreSQL demonstrated earlier.

Contribute code to TiDB for postgresql

We greatly welcome and appreciate developers who are interested in TiDB for PostgreSQL to make contributions. At present, we provide several directions to get started.

Learning Guide

Contribution Guide

Schema structure

The database structure of PostgreSQL is very different from that of MySQL, which is also a big problem we encountered in our development. It involves the realization of database system tables, system views, and system functions. It’s a lot of work and very hard. Developers are required to be very familiar with the database table structure of both PostgreSQL and MySQL.

Postgresql grammatical features

Compared with MySQL, PostgreSQL has many new syntax features. For example, Returning can return specified column or all columns of the modified row. Implementing a syntax involves modifying codes of the Parser module, TiDB for PostgreSQL's internal plan structure, plan optimization, plan execution, and data write-back, and many other parts. It’s really very hard. We recommend that developers start with clauses such as RETURNING and modify them on the basis of the original code. After being familiar with the logic of TiDB for PostgreSQL planning and execution, you can try to implement a brand new statement.

Version notice

To keep our code improvement running stable, we currently choose a fixed version for development. The following is our selected version:

TiDB v4.0.11

PostgreSQL 13

License

Apache-2.0 License

Owner
DigitalChinaOpenSource
Digital China Group's open source projects, we welcome you to show your talents and grow together here.
DigitalChinaOpenSource
Comments
  • Executor: reworked SetInsertParamType

    Executor: reworked SetInsertParamType

    Signed-off-by: David [email protected]

    What's changed

    Reworked SetInsertParamType function in the executor module

    Effects

    Now more unit tests can be passed, in particular, the following tests should no longer fail or panic:

    • testSuite.TestBatchInsertWithOnDuplicate
    • testSuite8.TestShardRowIDBits
    • seqTestSuite.TestPrepared
    • seqTestSuite.TestPreparedIssue8644
    • testSessionSuite.TestAutoIncrementWithRetry
    • testSessionSuite.TestLastInsertID
    • testSessionSuite.TestPrepareZero
    • testSessionSuite.TestPrimaryKeyAutoIncrement
  • core: add the implemet of setparamtype at show, showddl, datasource, limit plan ; executor: add new case 'LogicalLimit' at Next()

    core: add the implemet of setparamtype at show, showddl, datasource, limit plan ; executor: add new case 'LogicalLimit' at Next()

    What problem does this PR solve?

    Issue Number: close #28

    Problem Summary

    some interface have not be implemented,so some sql statement can't get parameter type。

    What is changed and how it works?

    Proposal: xxx

    What's Changed:

    New implement method 'setparamtype' at show ,showddl, datasource, limit plan

    How it Works:

    set the parameter type from the plan tree

    Related changes

    • Need to cherry-pick to the release branch

    Check List

    Tests

    • Unit test
    • Integration test
    • Manual test (add detailed scripts or steps below)
    • No code

    Side effects

    • Performance regression
      • Consumes more CPU
      • Consumes more MEM
    • Breaking backward compatibility

    Release note

  • makefile: upload unit tests coverage to codecov

    makefile: upload unit tests coverage to codecov

    What problem does this PR solve?

    Problem Summary: upload unit tests coverage to codecov

    What is changed and how it works?

    Proposal: codecov unit test coverage

    What's Changed: add scripts in makefile

    How it Works: make upload-coverage

  • Support for Postgres transaction syntax in Parser

    Support for Postgres transaction syntax in Parser

    Development Task

    Description

    The SQL standard syntax for starting a transaction is START TRANSACTION, which both MySQL and PostgreSQL supports. They also support syntax like START TRANSACTION + [modifier], where modifier can be READ WRITE, READ ONLY, etc. For the ease of use, they also support use BEGIN instead of START TRANSACTION However, they treat BEGIN + [modifier] differently. (like BEGIN READ WRITE)

    In Postgres this is completely fine, BEGIN/BEGIN WORK/BEGIN TRANSACTION are all treated as aliases for START TRANSACTION

    MySQL, on the other hand, does not allow modifier after BEGIN, a limitation which TiDB inherits.

    In TiDB, the standard way to start new transaction via go code is

    txn, err := db.Begin()

    which calls underlying go-mysql-driver via go's database module, and return a transaction handle (the txn above). After we switch the driver from lib/pq, because pq starts transaction by BEGIN READ WRITE when no modifier provided, it will cause parser to complain. image

    Previously in unit tests fixing, we have been solving the problem by skipping Begin()method and execute SQL statement START TRANSACTION; and execute COMMIT; at the end of query block. But since we are not using Begin(), we can't get txn handle. It will cause problem when multiple transactions are nested or intersected.

    The ideal solution is for parser to support begin + modifier syntax.

    References

    PostgreSQL: Documentation: 9.1: BEGIN MySQL :: MySQL 8.0 Reference Manual :: 13.3.1 START TRANSACTION, COMMIT, and ROLLBACK Statements MySQL Difference between BEGIN and START TRANSACTION - Database Administrators Stack Exchange

  • Support return result set after deletion

    Support return result set after deletion

    What problem does this PR solve?

    Issue Number: close #xxx

    Problem Summary:

    What is changed and how it works?

    Proposal: xxx

    What's Changed:

    How it Works:

    Related changes

    • Need to cherry-pick to the release branch

    Check List

    Tests

    • Unit test
    • Integration test
    • Manual test (add detailed scripts or steps below)
    • No code

    Side effects

    • Performance regression
      • Consumes more CPU
      • Consumes more MEM
    • Breaking backward compatibility

    Release note

  • server: fixed a bug that params description will be sent when no params in handleStmtDescription

    server: fixed a bug that params description will be sent when no params in handleStmtDescription

    What problem does this PR solve?

    Fixed a bug that params description will be sent when no params in handleStmtDescription. Translated some comments. Handled flush message.

  • variable,session: add search_path system variable.

    variable,session: add search_path system variable.

    What problem does this PR solve?

    Add a PostgreSQL system variable named "search_path".

    What is changed and how it works?

    What's Changed:

    Add a system variable,named 'search_path',default value is 'public'.

    How it Works:

    Now,we can run the related SQL statements properly,like ' set search_path to public'

  • Executor: Added support for setting isolation level in begin

    Executor: Added support for setting isolation level in begin

    What problem does this PR solve?

    Issue Number: close #53 Added support for setting isolation level in begin statement

    What is changed and how it works?

    Changed the executor for begin statement

    Test

    Manual test

  • executor: uncommented some previously skipped test

    executor: uncommented some previously skipped test

    Signed-off-by: AmoebaProtozoa [email protected]

    What's Changed:

    Uncommented some previously skipped test.

    They work fine as long as fail-point is enabled

    Check List

    Tests

    • make dev-tmp passed
  • Is the TiDB going to support ISOLATION operation?

    Is the TiDB going to support ISOLATION operation?

    Current I run odoo , an open source ERP with postgresql, and it report that

    psycopg2.errors.SyntaxError: syntax error at or near "ISOLATION" LINE 1: BEGIN ISOLATION LEVEL REPEATABLE READ ^

    So I'd like to know if it is going to support this key word ?

  • Explaintest, Server: Changed Explaintest's driver

    Explaintest, Server: Changed Explaintest's driver

    What's changed

    Changed explaintest's driver to lib/pq

    Changed command tag of Begin statement

    Improved mysql server status translation

    Test

    make dev-tmp Passed

  • TiDB4PG panic: runtime error: invalid memory address or nil pointer dereference

    TiDB4PG panic: runtime error: invalid memory address or nil pointer dereference

    Bug Report

    Please answer these questions before submitting your issue. Thanks!

    1. What did you do?

    send the sourcecode to linux. cd to tidb-server directory run: go run main.go &

    2. What did you expect to see?

    It can keep alive long

    3. What did you see instead?

    It can start up ,but panic some hours ago.

    the error: [root@loader bubble]# panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0xc0 pc=0x2e01153]

    goroutine 327269 [running]: github.com/pingcap/tidb/server.(*clientConn).writeError(0xc000113700, 0x3b29c80, 0xc001baedb0, 0x3acad40, 0xc001584050, 0x1, 0x0) /opt/TiDB-for-PostgreSQL/server/conn.go:1138 +0x1b3 github.com/pingcap/tidb/server.(*Server).onConn(0xc0013b46e0, 0xc000113700) /opt/TiDB-for-PostgreSQL/server/server.go:439 +0x147 created by github.com/pingcap/tidb/server.(*Server).Run /opt/TiDB-for-PostgreSQL/server/server.go:370 +0x7c9 exit status 2

    [1]- Exit 1 go run main.go (wd: /opt/TiDB-for-PostgreSQL/tidb-server)

  • Parameter processing in bind message

    Parameter processing in bind message

    Development Task

    In the extend query, backend server will get parameter values through the bind messages. And tidb-server will use the datum structure to store the values of these parameters.

    The datum is only used to store value of mysql data type and each type will have a different datum to store in TiDB.

    We need to parse the bind message and store the parameter values in different datum according to different types. That is to improve the function parseBindArgs.

    Related Files

    server/conn_stmt.go

  • TiDB dashboard can not login in

    TiDB dashboard can not login in

    Bug Report

    I can not open tidb dashboard

    1. What did you do?

    When i deploy tidb for postgresql,i tried to open tidb dashboard on pdid:2379/dashboard。 I can see login web page。but I can not login in it。

    2. What did you expect to see?

    I expect to login in to tidb dashboard

    3. What did you see instead?

    It told that u failed login

  • Update server information from TiDB to TiDB for PostgreSQL

    Update server information from TiDB to TiDB for PostgreSQL

    description

    Update some console info like "Welcome to TiDB"。 When I start TiDB for PostgreSQL source code on my local PC。I see “Welcome to TiDB” on the console output

    task

    Update console inforamtion,It should be like “Welcome to TiDB for PostgreSQL” and so on

    expected

    It will show TiDB for PostgreSQL server information when started locally or connected by cmd or alomost anywhere

  • To support PG compliant SQL DML statement

    To support PG compliant SQL DML statement

    Development Task

    Description

    Since TiDB is Mysql compliant database, we need to adjust DCParser yacc paser files to remove mysql keywords and add pg keywords, and then implement the keywords logic functions in TiDB4PG project.

    Task List

    DCParser:

    Please check the issue: https://github.com/DigitalChinaOpenSource/DCParser/issues/21

    TiDB4PG

    Keywords Implement
    • [ ] Returning Keyword
Dumpling is a fast, easy-to-use tool written by Go for dumping data from the database(MySQL, TiDB...) to local/cloud(S3, GCP...) in multifarious formats(SQL, CSV...).

?? Dumpling Dumpling is a tool and a Go library for creating SQL dump from a MySQL-compatible database. It is intended to replace mysqldump and mydump

Nov 9, 2022
simply SQL Parser for Go ( powered by vitess and TiDB )

vitess-sqlparser Simply SQL and DDL parser for Go (powered by vitess and TiDB ) this library inspired by https://github.com/xwb1989/sqlparser (origina

Jan 4, 2023
A tool for finding corrupted data rows in TiDB

tidb-bad-rows A tool for finding corrupted data rows in TiDB. It scans the target table and using a divide and conquer paradigm to locate all corrupte

Nov 17, 2021
A MySQL-compatible relational database with a storage agnostic query engine. Implemented in pure Go.

go-mysql-server go-mysql-server is a SQL engine which parses standard SQL (based on MySQL syntax) and executes queries on data sources of your choice.

Jan 2, 2023
[mirror] the database client and tools for the Go vulnerability database

The Go Vulnerability Database golang.org/x/vulndb This repository is a prototype of the Go Vulnerability Database. Read the Draft Design. Neither the

Dec 29, 2022
Database - Example project of database realization using drivers and models

database Golang based database realization Description Example project of databa

Feb 10, 2022
An experimental toolkit for injecting alternate authentication strategies into a PostgreSQL-compatible wire format.

PG Auth Proxy This is an experimental toolkit for injecting alternate authentication strategies into a PostgreSQL-compatible wire format. This is a pr

Jan 20, 2022
go mysql driver, support distributed transaction

Go-MySQL-Driver A MySQL-Driver for Go's database/sql package Features Requirements Installation Usage DSN (Data Source Name) Password Protocol Address

Jul 23, 2022
a lightweight distributed transaction management service, support xa tcc saga
a lightweight distributed transaction management service, support xa tcc saga

a lightweight distributed transaction management service, support xa tcc saga

Dec 29, 2022
Distributed tracing using OpenTelemetry and ClickHouse

Distributed tracing backend using OpenTelemetry and ClickHouse Uptrace is a dist

Jan 2, 2023
🏋️ dbbench is a simple database benchmarking tool which supports several databases and own scripts

dbbench Table of Contents Description Example Installation Supported Databases Usage Custom Scripts Troubeshooting Development Acknowledgements Descri

Dec 30, 2022
Database wrapper that manage read write connections

rwdb Database wrapper that manage read write connections Install go get github.com/andizzle/rwdb Create connections package main import "github.com/

Dec 10, 2022
Vitess is a database clustering system for horizontal scaling of MySQL.

Vitess Vitess is a database clustering system for horizontal scaling of MySQL through generalized sharding. By encapsulating shard-routing logic, Vite

Jan 3, 2023
sqlx is a library which provides a set of extensions on go's standard database/sql library

sqlx is a library which provides a set of extensions on go's standard database/sql library. The sqlx versions of sql.DB, sql.TX, sql.Stmt, et al. all leave the underlying interfaces untouched, so that their interfaces are a superset on the standard ones. This makes it relatively painless to integrate existing codebases using database/sql with sqlx.

Jan 7, 2023
Convert data exports from various services to a single SQLite database
Convert data exports from various services to a single SQLite database

Bionic Bionic is a tool to convert data exports from web apps to a single SQLite database. Bionic currently supports data exports from Google, Apple H

Dec 9, 2022
Zero boilerplate database operations for Go
Zero boilerplate database operations for Go

(Now compatible with MySQL and PostgreSQL!) Everyone knows that performing simple DATABASE queries in Go takes numerous lines of code that is often re

Jan 2, 2023
Vitess is a database clustering system for horizontal scaling of MySQL.

Vitess Vitess is a database clustering system for horizontal scaling of MySQL through generalized sharding. By encapsulating shard-routing logic, Vite

Jan 4, 2023
Additions to Go's database/sql for super fast performance and convenience.

gocraft/dbr (database records) gocraft/dbr provides additions to Go's database/sql for super fast performance and convenience. $ go get -u github.com/

Jan 1, 2023