vitess-gh/doc/SeparatingVttabletMysql.md

3.8 KiB

Separating VTTablet and MySQL

This document explores the setup where VTTablet and MySQL are not running on the same server, or in the same container. This is mostly useful when using a managed MySQL server (like CloudSQL or RDS).

In this setup, Vitess is not responsible for any replication management or backups. Schema Swap is also not possible, as it relies on backups / restores.

Note: this document is a work in progress, meant to centralize all findings on this subject. Eventually, we want to have an end to end test proving this setup works, probably on Kubernetes / GCE using CloudSQL.

VTTablet Configuration

The following adjustments need to be made to VTTablet command line parameters:

  • Do not specify -mycnf-file: instead, specify the -mycnf_server_id parameter.

  • Do not use -mycnf_socket_file. There is no local MySQL unix socket file.

  • Specify the host and port of the MySQL daemon for the -db\_host and -db\_port command line parameters. Do not specify -db\_socket parameters.

  • Disable restores / backups, by not passing any backup command line parameters. Specifically, --restore_from_backup and -backup_storage_implementation should not be set.

Since cluster management and replication are not handled by Vitess, we just need to make sure the tablet type in the topology is correct before running vttablet. Usually, vttablet can be started with -init_tablet_type replica, even for a primary (as primary/master is not allowed), and it will discover and set its type correctly to primary. When Vitess doesn't manage that at all, running vtctl InitShardPrimary is not possible, so there is no way to start as the primary just using vttablet. There are two solutions:

  • Preferred: Start the desired primary tablet with -init_tablet_type replica, and then run a vtctl TabletExternallyReparented <tablet alias>.

  • Run vtctl InitTablet ... <tablet alias> with the desired primary tablet alias, and then start vttablet with no -init... parameters. Note: InitTablet has been deprecated so the first method is preferred.

Other Configurations

vtctl and vtctld can be run with the -disable_active_reparents flag. This would make all explicit reparent commands unreachable (like InitShardPrimary or PlannedReparentShard).

vtgate and vtworker don't need any special configuration.

Runtime Differences

There are some subtle differences when connecting to MySQL using TCP, as opposed to connecting using a Unix socket. The main one is that when the server closes the connection, the client knows it when writing on a unix socket, but only realizes it when reading the response from a TCP socket. So using TCP, it is harder to distinguish a query that was killed from a query that was sent on a closed connection. It matters because we don't want to retry a query that was killed, but we want to retry a query sent on a closed connection.

If this becomes an issue, we can find ways around it.

Resharding

Given that Vitess doesn't control cluster management or replication, is resharding still possible? The following features are necessary:

  • being able to start / stop replication on a slave. So we can keep a slave in a specific state while we copy or compare data.

  • have access to the current replication position, as MariaDB GTID or MySQL 5.6 GTIDSet.

  • be able to connect as a replica to the primary MySQL, and start from an arbitrary past GTID, to support filtered replication.

Note: in some setups, the primary for a shard is never moved around, and is made durable by a persistent drive. It would be possible in that case to have an alternative implementation of the Vitess GTID code using a filename / file position. Filtered replication would always need to connect to the primary of the source shard then. Let us know if you are in that situation, it is not a lot of work to implement.