

Vitess

Vitess is a database clustering system for horizontal scaling of MySQL through generalized sharding.

By encapsulating shard-routing logic, Vitess allows application code and database queries to remain agnostic to the distribution of data onto multiple shards. With Vitess, you can even split and merge shards as your needs grow, with an atomic cutover step that takes only a few seconds.

Vitess has been a core component of YouTube's database infrastructure since 2011, and has grown to encompass tens of thousands of MySQL nodes.

For more about Vitess, please visit vitess.io.

Vitess has a growing community. You can view the list of adopters here.

Reporting a Problem, Issue, or Bug

To report a problem, the best way to get attention is to create a GitHub issue with the proper severity level, based on this guide.

For topics that are better discussed live, please join the Vitess Slack workspace. You may post any questions on the #general channel or join some of the special-interest channels.

Follow the Vitess Blog for low-frequency updates such as new features and releases.

Security

Reporting Security Vulnerabilities

To report a security vulnerability, please email vitess-maintainers.

See Security for a full outline of the security process.

Security Audit

A third-party security audit was performed by Cure53. You can see the full report here.

License

Unless otherwise noted, the Vitess source files are distributed under the Apache Version 2.0 license found in the LICENSE file.


Comments
  • workflow: Add horizontal resharding workflow.


This is a beta implementation, which supports creating the horizontal resharding workflow through the UI and updating the progress of each step.

In the end-to-end test (test/workflow_horizontal_resharding.py), only the happy path is tested. To simplify the test, rather than fetching information from the front end and checking it explicitly, I manually interacted with the UI after setting up the environment.

Unit tests will be added later.

Possible Optimizations (awaiting feedback from reviewers; the listed points might be unnecessary):

    • make the UI more friendly, so that the user doesn't have to input the shard list
    • support actions on the workflow, allowing the user to stop and restart it
    • add end-to-end tests for possible corner cases


  • Experimental: automated, scheduled, dependency free online DDL via gh-ost/pt-online-schema-change


This PR (work in progress) introduces zero-dependency online schema changes with gh-ost/pt-online-schema-change.

UPDATE: this comment was edited to reflect support for pt-online-schema-change. Originally this PR only supported gh-ost. Wherever you see gh-ost, consider pt-online-schema-change to apply as well.

    TL;DR

The user will issue:

    alter with 'gh-ost' table example modify id bigint not null;
    
    alter with 'pt-osc' table example modify id bigint not null
    

    or

    $ vtctl -topo_implementation etcd2 -topo_global_server_address localhost:2379 -topo_global_root /vitess/global \
        ApplySchema -sql "alter with 'gh-ost' table example modify id bigint unsigned not null" commerce
    
    $ vtctl -topo_implementation etcd2 -topo_global_server_address localhost:2379 -topo_global_root /vitess/global \
        ApplySchema -sql "alter with 'pt-osc' table example modify id bigint unsigned not null" commerce
    

    and vitess will schedule an online schema change operation to run on all relevant shards, then proceed to apply the change via gh-ost on all shards.

    While this PR is WIP, this flow works. More breakdown to follow, indicating what's been done and what's still missing.

    The ALTER TABLE problem

First, to reiterate the problem: schema changes have always been a problem with MySQL. A straight ALTER is a blocking operation; an ONLINE ALTER is only "online" on the master/primary, but is effectively blocking on replicas. Online schema change tools like pt-online-schema-change and gh-ost overcome these limitations by emulating the ALTER on a "ghost" table, which is populated from the original table and then swapped into its place.
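
    As a minimal sketch (not gh-ost's actual code; table and column names are illustrative), the ghost-table approach boils down to: create an empty shadow table, apply the ALTER to it while it is empty, copy rows while tracking ongoing changes, then atomically swap it in:

    package ghost

    import (
        "database/sql"

        _ "github.com/go-sql-driver/mysql"
    )

    // ghostAlter outlines the phases of a ghost-table schema change.
    func ghostAlter(db *sql.DB) error {
        steps := []string{
            "CREATE TABLE _example_gho LIKE example",             // empty shadow table
            "ALTER TABLE _example_gho MODIFY id bigint NOT NULL", // blocking, but the table is empty
            // ... copy rows in chunks while applying ongoing changes from the binlog ...
            "RENAME TABLE example TO _example_del, _example_gho TO example", // atomic cut-over
        }
        for _, q := range steps {
            if _, err := db.Exec(q); err != nil {
                return err
            }
        }
        return nil
    }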

    For disclosure, I authored gh-ost's code as part of the database infrastructure team at GitHub.

Traditionally, online schema changes are considered to be "risky". Trigger-based migrations add significant load onto the master server, and their cut-over phase is known to be a dangerous point. gh-ost was created at GitHub to address these concerns, and it successfully eliminated those operational risks: with gh-ost, the load on the master is low and well controlled, and the cut-over phase is known to cause no locking issues. gh-ost comes with different risks: it applies data changes programmatically, so data integrity is of utmost importance. Another concern is data traffic: it flows out of MySQL into gh-ost and back into MySQL (as opposed to staying all in MySQL with pt-online-schema-change).

One way or the other, running an online schema change is typically a manual operation. A human being schedules the migration, kicks it off, monitors it, and possibly cuts it over. In a sharded environment, a developer's request to ALTER TABLE explodes into n different migrations, each of which needs to be scheduled, kicked off, monitored, and tracked.

Sharded environments are obviously common for vitess users, so these users feel this pain more than others.

    Schema migration cycle & steps

    Schema management is a process that begins with the user designing a schema change, and ends with the schema being applied in production. This is a breakdown of schema management steps as I know them:

    1. Design code
    2. Publish changes (pull request)
    3. Review
    4. Formalize migration command (the specific ALTER TABLE or pt-online-schema-change or gh-ost command)
    5. Locate: where in production should this migration run?
    6. Schedule
    7. Execute
    8. Audit/monitor
    9. Cut-over/complete
    10. Cleanup
    11. Notify user
    12. Deploy & merge

    What we propose to address

    Vitess's architecture uniquely positions it to be able to automate away much of the process. Specifically:

• Formalize migration command: turning an ALTER TABLE statement into a gh-ost invocation is super useful when done by vitess, since vitess can not only validate schema/params, but can also provide credentials, identify a throttle-control replica, instruct gh-ost on how to communicate progress via hooks, etc.
    • Locate: given a schema/table, vitess just knows where the table is located. It knows whether the schema is sharded. It knows what the shards are and which tablets are the shard masters. It knows where to run gh-ost. Lastly, vitess can tell us which replicas we can use for throttling. (See the sketch after this list.)
    • Schedule: vitess is again in a unique position to schedule migrations. The fact that someone asks for a migration to run does not mean the migration should start right away. For example, a shard may already be running an earlier migration. Running two migrations at a time is less than ideal, and it's best to wait out the first migration before beginning the second. A scheduling mechanism is useful both for running migrations in the optimal order/sequence and for providing feedback to the user ("your migration is on hold because this and that", or "your migration is 2nd in queue to run")
    • Execute: vttablet is the ideal entity to run a migration; it can read instructions from the topo server and write progress back to it. vitess is aware of possible master failovers and can request a re-execution if a migration is interrupted mid-process.
    • Audit/monitor: the vtctld API can offer endpoints to track the status of a migration (e.g. "in progress on -80, in queue on 80-"). It may offer progress percentage and ETA.
    • Cut-over/complete: in my experience with gh-ost, the cut-over phase is safe to automate away.
    • Cleanup: the old table needs to be dropped; vttablet is in an excellent position to automate that away.
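
    As a hedged illustration of the "Locate" point above (the wrapper function is hypothetical, though GetShardNames is an existing topo call), the topo server can tell us which shards a migration must run on:

    package onlineddl

    import (
        "context"

        "vitess.io/vitess/go/vt/topo"
    )

    // locateShards returns the shards a migration for the given keyspace must
    // run on, e.g. ["-80", "80-"] for a two-shard keyspace.
    func locateShards(ctx context.Context, ts *topo.Server, keyspace string) ([]string, error) {
        return ts.GetShardNames(ctx, keyspace)
    }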

    What this PR does, and what we expect to achieve

    The guideline for this PR is: zero added dependencies; everything must be automatically and implicitly available via a normal vitess installation.

    A breakdown:

    User facing

    This PR enables the user to run an online schema migration (aka online DDL) via:

• vtgate: the user connects to vitess with their standard MySQL client, and issues an ALTER WITH 'gh-ost' TABLE ... statement. Notice this isn't valid MySQL syntax -- it's a hint for vitess that we want to run this migration online. vitess still supports synchronous, "normal" ALTER TABLE statements, which IMO should be discouraged.
    • vtctl: the user runs vtctl ApplySchema -sql "alter with 'gh-ost' table ...".

    The response, in both cases, is a migration ID, or a job ID, if you will. Consider the following examples.

    via vtgate:

    
    mysql> create table example(id int auto_increment primary key, name tinytext);
    
    mysql> show create table example \G
    
    CREATE TABLE `example` (
      `id` int NOT NULL AUTO_INCREMENT,
      `name` tinytext,
      PRIMARY KEY (`id`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
    
    mysql> alter with 'gh-ost' table example modify id bigint not null, add column status int, add key status_dx(status);
    +--------------------------------------+
    | uuid                                 |
    +--------------------------------------+
    | 211febfa-da2d-11ea-b490-f875a4d24e90 |
    +--------------------------------------+
    
    -- <wait...>
    
    mysql> show create table example \G
    
    CREATE TABLE `example` (
      `id` bigint NOT NULL,
      `name` tinytext,
      `status` int DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `status_dx` (`status`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
    

    via vtctl:

    $ mysql -e "show create table example\G"
    
    CREATE TABLE `example` (
      `id` bigint NOT NULL,
      `name` tinytext,
      `status` int DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `status_dx` (`status`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
    
    
    $ vtctl -topo_implementation etcd2 -topo_global_server_address localhost:2379 -topo_global_root /vitess/global \
        ApplySchema -sql "alter with 'gh-ost'  table example modify id bigint unsigned not null" commerce
    8ec347e1-da2e-11ea-892d-f875a4d24e90
    
    
    $ mysql -e "show create table example\G"
    
    CREATE TABLE `example` (
      `id` bigint unsigned NOT NULL,
      `name` tinytext,
      `status` int DEFAULT NULL,
      PRIMARY KEY (`id`),
      KEY `status_dx` (`status`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci
    

    In both cases, a UUID is returned, which can be used for tracking (WIP) the progress of the migration across shards.

    Parser

Vitess's parser now accepts the ALTER WITH 'gh-ost' TABLE and ALTER WITH 'pt-osc' TABLE syntax. We have yet to determine whether this is the exact syntax we want to go with.

    Topo

    Whether submitted by vtgate or vtctl, we don't immediately run the migration. As mentioned before, we may wish to postpone the migration. Perhaps the relevant servers are already running a migration.

    Instead, we write the migration request into global topo, e.g.:

    • key: /vitess/global/schema-migration/requests/90c5afd4-da38-11ea-a3ff-f875a4d24e90
    • content:
    {"keyspace":"commerce","table":"example","sql":"alter table example modify id bigint not null","uuid":"90c5afd4-da38-11ea-a3ff-f875a4d24e90","online":true,"time_created":1596701930662801294,"status":"requested"}
    

    Once we create the request in topo, we immediately return the generated UUID/migration ID (90c5afd4-da38-11ea-a3ff-f875a4d24e90 in the above example) to the user.
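
    The Tracking breakdown below mentions a new OnlineDDL struct; as a rough sketch only (field layout inferred from the JSON example above, not the authoritative definition), it might look like:

    package schema

    // OnlineDDL mirrors a migration request as persisted in global topo.
    // The field layout here is illustrative, inferred from the JSON above.
    type OnlineDDL struct {
        Keyspace    string `json:"keyspace"`
        Table       string `json:"table"`
        SQL         string `json:"sql"`
        UUID        string `json:"uuid"`
        Online      bool   `json:"online"`
        TimeCreated int64  `json:"time_created"` // nanoseconds since epoch
        Status      string `json:"status"`       // e.g. "requested"; later statuses assumed
    }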

    vtctld

vtctld gets a conceptual "upgrade" with this PR: it is no longer a purely reactive service. vtctld now actively monitors topo for new schema-migration/requests entries.

~~When it sees such a request, it evaluates which n shards are relevant.~~

    ~~With the current implementation, it writes n "job" entries, one per shard, e.g.~~

    • /vitess/global/schema-migration/jobs/commerce/-80/ce45b84a-da2d-11ea-b490-f875a4d24e90 and /vitess/global/schema-migration/jobs/commerce/80-/ce45b84a-da2d-11ea-b490-f875a4d24e90 for a keyspace with two shards; or just
    • /vitess/global/schema-migration/jobs/commerce/0/1dd17132-da23-11ea-a3d2-f875a4d24e90 for a keyspace with one shard.

DONE: ~~WIP: we will investigate use of the new VExec to actually distribute the jobs to vttablet.~~

What vtctld does now is: once it sees a migration request, it pushes a VExec request for that migration. If the VExec request succeeds, that means all shards have been notified, and vtctld can stow away the migration request (work is complete as far as vtctld is concerned). If VExec returns an error, that means at least one shard did not get the request, and vtctld will keep retrying the push.
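
    A minimal sketch of that push-and-retry loop; vexecPush and stowAwayRequest are hypothetical stand-ins for the real plumbing:

    package vtctld

    import (
        "context"
        "time"
    )

    // vexecPush and stowAwayRequest are hypothetical helpers, illustrative only.
    var (
        vexecPush       func(ctx context.Context, uuid string) error
        stowAwayRequest func(uuid string)
    )

    // pushMigrationRequest keeps pushing a migration via VExec until every
    // shard has acknowledged it, then stows away the request.
    func pushMigrationRequest(ctx context.Context, uuid string) {
        for {
            if err := vexecPush(ctx, uuid); err == nil {
                stowAwayRequest(uuid) // all shards notified; vtctld is done
                return
            }
            select {
            case <-ctx.Done():
                return
            case <-time.After(10 * time.Second): // at least one shard missed it; retry
            }
        }
    }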

    vttablet

    This is where most of the action takes place.

    vttablet runs a migration service which continuously probes for, schedules, and executes migrations.

DONE: ~~With the current implementation, tablets whose tablet_type=MASTER continuously probe for new entries. We look to replace this with VExec.~~

Migration requests are now pushed via VExec; the request includes the INSERT IGNORE query that persists the migration in _vt.schema_migrations. The tablet no longer reads from, nor writes to, global topo.

    A new table is introduced: _vt.schema_migrations, which is how vttablet manages and tracks its own migrations.
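
    As an illustrative sketch only (the column set is assumed from the features discussed in this PR, such as the artifacts column mentioned in the Tracking breakdown; it is not the authoritative schema), such a tracking table could look like:

    package onlineddl

    // sqlCreateSchemaMigrationsTable sketches a per-shard migration tracking
    // table; the columns here are assumed, not the actual schema.
    const sqlCreateSchemaMigrationsTable = `CREATE TABLE IF NOT EXISTS _vt.schema_migrations (
        id bigint unsigned NOT NULL AUTO_INCREMENT,
        migration_uuid varchar(64) NOT NULL,
        keyspace varchar(256) NOT NULL,
        shard varchar(255) NOT NULL,
        mysql_table varchar(128) NOT NULL,
        migration_statement text NOT NULL,
        strategy varchar(128) NOT NULL,         -- 'gh-ost' or 'pt-osc'
        migration_status varchar(128) NOT NULL, -- requested/queued/ready/running/...
        artifacts varchar(1024) NOT NULL,       -- tables to clean up after migration
        PRIMARY KEY (id),
        UNIQUE KEY uuid_idx (migration_uuid)
    ) ENGINE=InnoDB`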

    vttablet will only run a single migration at a time.

vttablet will see if there are unhandled migration requests, and will queue them.

    vttablet will make a migration ready if there's no running migration and no other migration is marked as ready.
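
    Condensing the two scheduling rules above into a sketch (the Executor type, helper methods, and status strings are assumed, not vttablet's actual code):

    package onlineddl

    // Executor stands in for vttablet's migration service; the helpers below
    // are hypothetical and would be backed by _vt.schema_migrations queries.
    type Executor struct{}

    func (e *Executor) countMigrations(status string) int            { return 0 }
    func (e *Executor) oldestMigration(status string) (string, bool) { return "", false }
    func (e *Executor) setMigrationStatus(uuid, status string)       {}

    // maybeMakeReady promotes the oldest queued migration to "ready", but only
    // if no migration is running and none is already marked ready.
    func (e *Executor) maybeMakeReady() {
        if e.countMigrations("running") > 0 || e.countMigrations("ready") > 0 {
            return // vttablet runs a single migration at a time
        }
        if uuid, ok := e.oldestMigration("queued"); ok {
            e.setMigrationStatus(uuid, "ready")
        }
    }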

vttablet will run a ready migration. This is really the interesting part, with lots of goodies (a sketch pulling several of these together follows this list):

    • vttablet will evaluate the gh-ost ... command to run. It will obviously populate --alter=... --database=....
    • vttablet creates a temp directory where it generates a script to run gh-ost.
• vttablet creates a hooks path and auto-generates hook files. The hooks will interact with vttablet.
    • vttablet has an API endpoint by which the hooks can communicate gh-ost's status (started/running/success/failure) with vttablet.
    • vttablet provides gh-ost with --hooks-hint which is the migration's UUID.
    • vttablet automatically generates a gh-ost user on the MySQL server, with a random password. The password is never persisted and does not appear on ps. It is written to, and loaded from, an environment variable.
• vttablet grants the proper privileges on the newly created account.
    • vttablet will destroy the account once the migration completes.
• the vitess repo includes a gh-ost binary. We require gh-ost from openark/gh-ost as opposed to github/gh-ost because we've had to make some special adjustments to gh-ost so as to support this flow. I do not have direct ownership of github/gh-ost and cannot enforce those changes upstream, though I have made the contribution requests upstream.
• make build automatically appends the gh-ost binary, compressed, to the vttablet binary, via Ricebox.
    • vttablet, upon startup, auto extracts gh-ost binary into /tmp/vt-gh-ost. Please note that the user does not need to install gh-ost.
    • WIP: vttablet to report back the job as complete/failed. We look to use VExec. TBD.
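
    Pulling several of these bullets together, a simplified sketch of how vttablet might assemble the gh-ost invocation; the flag values and the environment variable name are illustrative, not vttablet's actual implementation:

    package onlineddl

    import (
        "os"
        "os/exec"
    )

    // buildGhostCommand sketches the gh-ost invocation vttablet generates.
    func buildGhostCommand(hooksDir, migrationUUID, alter, database, table, password string) *exec.Cmd {
        cmd := exec.Command("/tmp/vt-gh-ost", // auto-extracted on vttablet startup
            "--alter="+alter,
            "--database="+database,
            "--table="+table,
            "--hooks-path="+hooksDir,      // auto-generated hook scripts live here
            "--hooks-hint="+migrationUUID, // lets each hook identify its migration
            "--execute",
        )
        // The password for the auto-generated account travels via an environment
        // variable, never on the command line, so it is not visible in `ps`.
        // The variable name here is illustrative.
        cmd.Env = append(os.Environ(), "ONLINE_DDL_PASSWORD="+password)
        return cmd
    }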

    Tracking breakdown

    • [x] New OnlineDDL struct, defines a migration request and its status
    • [x] Parser supports ALTER WITH 'gh-ost' TABLE and ALTER WITH 'pt-osc' TABLE syntax
    • [x] builder and analyzer to create an Online DDL plan (write to topo)
    • [x] vtctl to skip "big changes" check when -online_schema_change is given
    • [x] tablet_executor to submit an online DDL request to topo as opposed to running it on tablets
    • [x] vtctld runs a daemon to monitor for, and review migration requests
    • [x] vtctld evaluates which shards are affected
• [x] _vt.schema_migrations backend table to support migration automation (on each shard)
    • [x] vttablet validates MySQL connection and variables
    • [x] vttablet creates migration command
    • [x] vttablet creates hooks
    • [x] vttablet provides HTTP API for hooks to report their status back
    • [x] vttablet creates gh-ost user with random password
    • [x] vttablet destroys gh-ost user upon completion
    • [x] gh-ost embedded in vttablet binary and auto-extracted by vttablet
    • [x] vttablet runs a dry-run execution
    • [x] vttablet runs a --execute (actual) execution
    • [x] vttablet supports a Cancel request (not used yet) to abort migration
• [x] vttablet as a state machine to work through the migration steps
• [x] counters for gh-ost migration requests, successful and failed migrations
    • [x] use of VExec to apply migrations onto tablets
    • [x] use of VExec to control migrations (abort, retry)
    • [ ] consider flow for retries
• [ ] identify a reparent operation that runs during a migration, probably auto-restart the migration
    • [ ] vttablet to heuristically check for available disk space
    • [x] tracking, auditing of all migrations
    • [x] getting gh-ost logs if necessary
    • [x] what's the best way to suggest we want an online migration? Does current ALTER WITH 'gh-ost' TABLE... and ALTER WITH 'pt-osc' TABLE syntax make sense? Other?
    • [ ] For first iteration, migrations and Reshard operations should be mutually exclusive. Can't run both at the same time. Next iterations will remove this constraint.
    • [x] ~~throttle by replica~~
    • [ ] ~~wait for replica to catch up with new credentials before starting the migration~~
    • [x] Use vttablet throttler
    • [x] pt-online-schema-change bundled inside vttablet binary
    • [x] support pt-online-schema-change
    • [ ] ~~define foreign key flags for pt-online-schema-change execution~~ - user can define as runtime flags
• [x] cleanup online-ddl directory after success
    • [ ] control throttling
    • [x] control termination (panic abort)
    • [x] control termination (panic abort) even after vttablet itself crashes
    • [x] pt-online-schema-change passwords are in cleartext. Can we avoid that?
• [x] vtctl ApplySchema uses the same WITH 'gh-ost' and WITH 'pt-osc' query hints as vtgate
    • [x] support override of gh-ost and pt-online-schema-change paths
    • [x] cleanup pt-osc triggers after migration failure
    • [x] forcibly remove pt-osc triggers on migration cancellation (overlaps with previous bullet, but has stronger guarantee)
    • [x] cleanup pt-osc triggers from stale/zombie pt-osc migration
    • [x] vtctl OnlineDDL command for simple visibility and manipulation. See https://github.com/vitessio/vitess/pull/6547#issuecomment-681879259
    • [x] end to end tests
    • [x] populate artifacts column, suggesting which tables need to be cleaned up after migration

Quite likely more entries will be added.

    Further reading, resources, acknowledgements

We're obviously using gh-ost. I use my own openark/gh-ost fork since I have no ownership of the original https://github.com/github/gh-ost. gh-ost was/is developed by GitHub, 2016-2020.

    pt-online-schema-change is part of the popular Percona Toolkit

The schema migration scheduling and tracking work is based on my previous work at GitHub. The implementation in this PR is new and rewritten, but based on concepts that matured during my work on skeefree. Consider these resources:

    Also:

    • An early presentation on gh-ost

    Initial incarnation of this PR: https://github.com/planetscale/vitess/pull/67; some useful comments on that PR.

    Call for feedback

We're looking for the community's feedback on the above suggestions/flow. Thank you for taking the time to read and respond!

  • Adding TLS support to the Java client and JDBC driver


    re: Issue #2209

    TLS connections are already supported by VTGate, gRPC, and all of the non-Java clients. For Java to reach parity, support is needed in the grpc-client, client, and jdbc modules under src/github.com/youtube/vitess/java.
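
    For reference, a minimal sketch of the gRPC-level TLS setup the non-Java clients effectively get from grpc-go (the CA certificate path is illustrative):

    package example

    import (
        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials"
    )

    // dialTLS shows the gRPC transport security the Java client needs to match.
    func dialTLS(addr string) (*grpc.ClientConn, error) {
        creds, err := credentials.NewClientTLSFromFile("/path/to/ca-cert.pem", "")
        if err != nil {
            return nil, err
        }
        return grpc.Dial(addr, grpc.WithTransportCredentials(creds))
    }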

    This PR is a work in progress... visible for early review and vetting, but not yet ready to merge.

At this point, the VitessJDBCUrl class parses an optional parameter indicating that a connection should use SSL. The parameter follows standard MySQL JDBC conventions.

    Pending commits will:

    1. Update GrpcClientFactory, to expose a method for creating a TLS-enabled RpcClient in addition to the current plaintext one.
    2. Update VitessVTGateManager, to use this new method when the JDBC URL calls for SSL.
    3. Write unit test coverage for the above, using a vtgateclienttest mock in a style similar to the src/github.com/youtube/vitess/test/encrypted_transport.py script.
    4. End-to-end testing. Update examples/local/vtgate-up.sh, to use TLS with a self-signed cert when an optional parameter is passed. Also update the examples/client_jdbc.sh test script, to likewise use a TLS connection when an optional parameter is passed.

    Any feedback about this approach is definitely welcomed!



  • using nginx for vtadmin web


    Signed-off-by: Priya Bibra [email protected]

    Description

We need to move away from using the serve library due to PRISMA-2022-0039 (CVSS 7.5) - minimatch 3.0.4. Its underlying dependency serve-handler uses a vulnerable version of minimatch, and the dependency itself is not well maintained. We recently learned that minimatch is also a dependency of react-dev-utils, which uses recursive-readdir; that one is also not well maintained.

This PR uses a multi-stage docker build: the static content is first built using node/npm, then served by nginx in the final image. This way, our build dependencies and any vulnerabilities they may have do not leak into the deployed image.

    Related Issue

    https://github.com/vitessio/vitess/issues/10676

    Testing Notes

The image was built using docker build -t vtadmin-test ., and it runs successfully on localhost after docker run -p 8080:14201 vtadmin-test:latest

  • Single shard targeting in V3


    There is no way to target a single shard in the V3 APIs.

    We need to target single shards for two reasons:

1. We want to shard some of our processing, in that we have jobs that process an entire single shard at a time.
    2. There's a large amount of scatter/gather functionality that is currently not supported (e.g. limit on joins, order by, and so on). Some of it we can work around at the query level, but for some we have to build our own scatter/gather logic.

Item 2 could eventually be addressed by extending scatter/gather in V3, but item 1 can never be addressed by V3.

  • easier build


I worked on this because I finished a brew formula to install Vitess on macOS with one command: brew install vitess. Also, there was some conversation on the Slack channel about improving the build process and making it easier.

The main change here is: we will push the vendored libs, instead of getting them during the bootstrap.sh script run. Advantages of this approach:

• We ensure the version of our libs not just via the vendor.json file; they will actually follow the history of this repo.
    • It makes the build process a lot easier. With this, I can build Vitess without needing to run bootstrap.sh. This is useful for folks who don't need to change the code but just want to build it.
    • Homebrew's build process doesn't allow us to run go get during the build, as that is insecure. But since we vendor our dependencies, we don't need to go get at build time anymore.

    Drawbacks:

• when changing the vendor.json file, we need to run govendor sync or the bootstrap script again, to upload the new version of the lib.
    • This repository will get bigger in git size (the local size will still be the same, though).

I think the advantages outweigh the disadvantages, as local builds are far more frequent than changes to the vendor.json file.

At least at GitHub, that's what we do: we vendor and upload the library sources to git. Open source projects such as Consul do the same.

    Thoughts? cc @sougou @vmg

  • helm: Allow non-leader Orchestrator instances to accept requests


Orchestrator 3.0.7 added a proxy that forwards master-only requests, so we don't have to work around that by having perpetually unready pods via the /api/leader-check endpoint.

    cc @shlomi-noach @enisoc

  • Adds table name to the field proto


    https://dev.mysql.com/doc/refman/5.7/en/c-api-data-structures.html

    The table and table_length fields are available in the C api but are not passed through to clients for use. One of our core java client libraries depends on http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSetMetaData.html#getTableName(int), which in the existing Vitess Driver was hardcoded to return null.

This PR adds table to the Field proto, populates it in the mysql.Fields() function, and allows access to it in the Java client. All tests have been updated to account for the new field, and they pass.
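
    For illustration, a hedged sketch of how a Go consumer could read the new per-column table name (assuming the era's github.com/youtube/vitess proto import path):

    package example

    import (
        "fmt"

        querypb "github.com/youtube/vitess/go/vt/proto/query"
    )

    // printFieldTables reads the new per-column table name from the Field
    // proto; Name and Table are the generated proto fields.
    func printFieldTables(fields []*querypb.Field) {
        for _, field := range fields {
            fmt.Printf("column %q comes from table %q\n", field.Name, field.Table)
        }
    }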

I've also tested this in our deployment. However, I haven't been able to get the full make test to work in my local dev environment yet; this was tested with make site_test.

    This is my first foray into both go and vitess, so feedback welcome.

  • Preserve case of column names


When Vitess receives a query, the tokenizer lowercases all column names so they can be string-compared safely throughout the system. This means all column names returned in the eventual query result are lowercase as well. This causes a problem in PHP, where users expect to look up columns by name with the same case they used in the original query:

    $result = query('select Column from table');
    $row = $result->fetch();
    echo $row['Column']; // array key doesn't exist
    echo $row['column']; // works
    

    In Java (JDBC) they address this by specifying that column name lookups are case-insensitive. We recently fixed our Java client to conform to this (https://github.com/youtube/vitess/pull/1572).

    We could do something similar in PHP by implementing ArrayAccess or extending ArrayObject. Each of those has its pros and cons, but either way, the object we return would not be a true array, so things like array_keys() wouldn't work.

    @tslater @Rastusik Can you weigh in on whether this makes sense from a PHP app writer's perspective?

    @sougou

  • RFC: Deprecate MariaDB Support in 2022 and Remove in 2023


    Proposal

    Deprecate official MariaDB support in Vitess 14.0 (~June 2022) and remove it entirely in 16.0 (~Feb 2023). This gives Vitess users approximately 2 years where they can continue to use Vitess with MariaDB and offers adequate time to transition to MySQL 8 (MySQL 5.7 is currently scheduled for EOL in 2023).

    Reasoning

    Note: MySQL and Percona Server for MySQL are considered fully equivalent in Vitess.

    The reasons for proposing this are as follows:

    • Vitess only officially supports MariaDB 10.0-10.3 today
    • MariaDB is a hard fork of MySQL and 10.X is a very different database than MySQL 8.X (which is quickly becoming the default with Vitess)
    • Though there was some outreach from the MariaDB Foundation last year to get Vitess working with more recent versions of MariaDB, that effort seems to have been abandoned
• MariaDB's GTID implementation is NOT compatible with MySQL's, which uses auto-generated UUIDs for the unique host identifier. MariaDB instead uses domain_id and server_id together as the "unique" host identifier, and every MariaDB instance defaults to 0-1 for it, which is in NO way unique. MariaDB's GTID implementation also does not use sets (MySQL uses the GTID_EXECUTED and GTID_PURGED sets), but only the last seen sequence value from the given host identifier (e.g. 0-1:20000 vs 15b57a66-e10d-11eb-a4de-7499a366173e:1-20000 in MySQL), so you have no way to know whether two servers have executed the same full set of GTIDs, and you cannot easily detect drift and ensure consistency and correctness. Combined, these two issues are a big problem for Vitess, which relies on fairly complex replication topologies. They make MariaDB unsafe to use with Vitess VReplication, and you will run into problems (surfaced duplicate GTIDs are the least of them; the worse issue is undetected drift and inconsistencies within a shard). This is hardly the only problem with MariaDB usage in Vitess, but I assert that this alone makes MariaDB an unsafe and unsuitable choice.

    And most importantly:

    • Vitess is not well tested on MariaDB today within the project nor the user base
      • There are no known Vitess users with MariaDB in production (please let us know if you do!)

    Given all of this, it seems reasonable to discontinue spending the limited human and capital resources available to the Vitess project — which would otherwise go toward fixing bugs and adding features in Vitess — in trying to maintain official MariaDB support (at least for 10.4+) going forward.

    :warning: While we would not officially support MariaDB as a database within Vitess, we would still want to maintain a clear migration path into Vitess for existing MariaDB installations.

    Feedback

    Please note that we want to do what's best for the Vitess project and its community, so your input is crucial! This is a proposal, it is not an announcement of a decision that's been made. If you have any questions, concerns, or other feedback please let us know!

  • grpc build problem


    While executing ./bootstrap.sh on OSX I get the following error:

    [HOSTCXX] Compiling src/compiler/csharp_generator.cc
    src/compiler/csharp_generator.cc:47:43: error: no member named 'GetReflectionClassName' in namespace 'google::protobuf::compiler::csharp'
    using google::protobuf::compiler::csharp::GetReflectionClassName;
          ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
    src/compiler/csharp_generator.cc:237:26: error: use of undeclared identifier 'GetReflectionClassName'
                 "umbrella", GetReflectionClassName(service->file()), "index",
                             ^
    2 errors generated.
    make: *** [/Users/yfinkelstein/work/vitess_workspace/dist/grpc/grpc/objs/opt/src/compiler/csharp_generator.o] Error 1
    ERROR: gRPC build failed
    

    I'm following build setup instructions line by line as described in your "How to build".

  • Bug Report: gen4 planner causes 'Row count exceeded 10000' errors


    Overview of the Issue

Hi Vitess experts, we are running v13 and trying to upgrade to v15. We noticed an issue that seems to be a bug in the 'gen4' planner; adding a hint to use the v3 planner works around it. Our query is like:

    SELECT t2.* FROM table1 AS t1 INNER JOIN table2 AS t2 ON t1.table2_id = t2.id WHERE t1.id IN (1234, 5678)

    It works fine in v13, but in v15 it fails with the 'Row count exceeded 10000' error, even though the query would return just two rows. table1 and table2 are sharded, and it looks like the v3 planner was able to find the two rows using the WHERE clause before doing the inner join, while gen4 seems to do the inner join first, thus pulling too many rows.
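
    As a hedged sketch of that workaround (assuming the per-query planner comment directive understood by vtgate), the query can be pinned to the v3 planner without changing the vtgate-wide default:

    package example

    import "database/sql"

    // queryWithV3Planner pins a single query to the v3 planner via a vt+
    // comment directive, leaving the global planner setting untouched.
    func queryWithV3Planner(db *sql.DB) (*sql.Rows, error) {
        return db.Query(`SELECT /*vt+ PLANNER=V3 */ t2.*
            FROM table1 AS t1 INNER JOIN table2 AS t2 ON t1.table2_id = t2.id
            WHERE t1.id IN (?, ?)`, 1234, 5678)
    }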

    Reproduction Steps

Below is the vindex info for the two tables, in case it's relevant:

    "vindexes": {
        "table1_keyspace_idx": {
            "type": "consistent_lookup_unique",
            "params": {
                "from": "table2_id",
                "ignore_nulls": "true",
                "table": "table1_keyspace_idx",
                "to": "keyspace_id"
            },
            "owner": "table1"
        },
        ...
    "table1": {
        "columnVindexes": [
            { "column": "id", "name": "hash" },
            { "column": "table2_id", "name": "table1_keyspace_idx" }
        ],
        ...
    "table2": {
        "columnVindexes": [
            { "column": "some_id", "name": "xxhash" }
        ],
        "autoIncrement": { "column": "id", "sequence": "wish.table2_seq" }
    },

    Binary Version

    v15.0.2
    

    Operating System and Environment details

    root@app05dev-temporal-0-repl-0:/# cat /etc/os-release
    PRETTY_NAME="Debian GNU/Linux 10 (buster)"
    NAME="Debian GNU/Linux"
    VERSION_ID="10"
    VERSION="10 (buster)"
    VERSION_CODENAME=buster
    ID=debian
    HOME_URL="https://www.debian.org/"
    SUPPORT_URL="https://www.debian.org/support"
    BUG_REPORT_URL="https://bugs.debian.org/"
    root@app05dev-temporal-0-repl-0:/# uname -sr
    Linux 5.13.0-1031-aws
    root@app05dev-temporal-0-repl-0:/# uname -m
    x86_64
    

    Log Fragments

    No response

  • VReplication: Test Migrations From MariaDB to MySQL


    Description

We EOL'd formal support for MariaDB in Vitess v14+, but we always want to have a migration path into Vitess for current MariaDB users, so they should be able to use VReplication's MoveTables/Migrate workflows for this purpose. The new TestMoveTablesMariaDBToMySQL end-to-end test ensures that we continue to support that use case.

In this PR we consolidate all of the MariaDB 10.x config files (there were no substantial config differences between them) into a single MariaDB 10 config file, and we modify mysqlctl to use this single config file any time MariaDB 10 is detected (MariaDB 11.0 was just announced). This was needed because the test uses the latest GA release of MariaDB available today: 10.10.2.

We also remove the MariaDB 10.3 unit test here, as 10.3 will shortly be EOL and the new end-to-end test confirms the key bit of support that we must maintain in v16+.

    Related Issue(s)

    • Fixes: https://github.com/vitessio/vitess/issues/10732
    • Related to: https://github.com/vitessio/vitess/pull/11985

    Checklist

    • [x] "Backport to:" labels have been added if this change should be back-ported
    • [x] Tests were added
    • [x] Documentation is not required
  • Fix rbac config in the vtop example


    Description

    This PR fixes the rbac config in the vtop example as pointed out in #11665

    Related Issue(s)

    • #11665

    Checklist

    • [x] "Backport to:" labels have been added if this change should be back-ported
    • [x] Tests were added or are not required
    • [x] Documentation was added or is not required

    Deployment Notes

  • Add missing backslash to run.sh script


A backslash was missing, which caused the --schema_dir option not to be applied; as a result, alters were not applied and tables in the databases were not created. This fixes it.

    Signed-off-by: kbslvsk [email protected]

    Description

This is a bug fix for the unapplied --schema_dir option, which was causing alters not to be applied.

    Related Issue(s)

    https://github.com/vitessio/vitess/issues/12032

    Checklist

    • [x] "Backport to:" labels have been added if this change should be back-ported
    • [x] Tests were added or are not required
    • [x] Documentation was added or is not required

    Deployment Notes

    N/A

  • Bug Report: Missing backslash causing --schema_dir not to apply and therefore alters not to apply


    Overview of the Issue

In docker/vttestserver/run.sh on line 45, a newline-escaping backslash is missing, which causes alters not to be applied and tables not to be created. This should be fixed by adding the backslash.

    Reproduction Steps

    Add backslash at the end of line 45 in docker/vttestserver/run.sh

    Binary Version

    N/A
    

    Operating System and Environment details

    N/A
    

    Log Fragments

    No response

  • Add launchable to unit tests as well


    Description

    This PR adds launchable integration to unit tests as well.

    Related Issue(s)

    Checklist

    • [x] "Backport to:" labels have been added if this change should be back-ported
    • [x] Tests were added or are not required
    • [x] Documentation was added or is not required

    Deployment Notes

Related Projects

• Beerus-DB: a database operation framework; currently only supports MySQL, using go-sql-driver/mysql for connections and basic operations (Oct 29, 2022)
• clusteredBigCache: golang bigcache with clustering and individual item expiration, as a library (Sep 26, 2022)
• TiDB: an open source distributed HTAP database compatible with the MySQL protocol (Jan 9, 2023)
• go-mysql-server: a MySQL-compatible relational database with a storage-agnostic query engine, implemented in pure Go (Dec 27, 2022)
• RadonDB: an open source, cloud-native MySQL database for building global, scalable cloud services (Dec 31, 2022)
• Run MySQL Database on Docker (Jan 1, 2022)
• kingbus: a distributed MySQL binlog storage system built on Raft (Dec 31, 2022)
• Owl: a DB manager platform for standardizing data, indexes, and database operations to avoid risks and failures; capabilities include process approval, SQL audit, SQL execution (including as crontab), and data backup and recovery (Nov 9, 2022)
• A simple graph database in SQLite, inspired by "SQLite as a document database" (Jan 3, 2023)
• Hard disk database based on a former database (Nov 1, 2021)
• KValDB: a simple key-value database that uses JSON files to store keys and their values (Nov 13, 2021)
• Prometheus: the Prometheus monitoring system and time series database (Dec 31, 2022)
• Bustub in Golang: a Go reproduction of Bustub, the simple relational database system from the CMU 15-445 course (Dec 18, 2021)
• Xenon: MySQL cluster autopilot management with GTID and Raft (Jan 3, 2023)
• Virsas-mod-db: a quick way to init MySQL, Postgres, and Redis connections from multiple services without duplicating code (Jan 23, 2022)
• bbolt: an embedded key/value database for Go, forked from Ben Johnson's Bolt (Jan 1, 2023)
• BuntDB: an embeddable, in-memory key/value database for Go with custom indexing and geospatial support (Dec 30, 2022)
• CockroachDB: the open source, cloud-native distributed SQL database (Jan 2, 2023)
• Coffer: a simply ACID key-value database (Dec 7, 2022)