* feat: add vttablet rpc to reset replication parameters
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: added end to end testing for the rpc and fixed bug
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: fix typing error
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: add basic full status rpc functionality and add test for it
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: add all the fields needed in full status
Signed-off-by: Manan Gupta <manan@planetscale.com>
* test: moved the test to reparent tests and improved it
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: bug fix for no replication status and no primary status
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: add version to the full status output
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: add binlog information to full status
Signed-off-by: Manan Gupta <manan@planetscale.com>
* docs: fix the comment explaining the binlog information
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: add semi-sync statuses to full status
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: call the correct command
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: add server uuid and id to full status
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: make server_id a uint32 to accept the correct range of values
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: add a few more fields to the full status, such as version comment, semi-sync settings, and binlog_row_image
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: generate vtadmin proto files
Signed-off-by: Manan Gupta <manan@planetscale.com>
* test: add assertion to check binlog row format is read correctly
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: split GTID mode into its own function because mariadb doesn't support it
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: fix parsing of empty mariadb gtid set
Signed-off-by: Manan Gupta <manan@planetscale.com>
* docs: add documentation for existing fields in ReplicationStatus
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: add relay log file position to the replication status output
Signed-off-by: Manan Gupta <manan@planetscale.com>
* test: augmented full status test to check all the different positions stored
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: add additional fields to replication status and read source user
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: read sql delay from show replica status output
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: read ssl allowed from show replica status output
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: read has replication filters from show replica status output
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: read auto position and using gtid from show replica status output
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: add replication lag unknown too to replication status
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: return nil for the replication and primary positions if they are not present
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: rename FileRelayLogPosition in replication status output to RelayLogSourceBinLogEquivalentPosition and augment test to make sure rpc changes are backward compatible
Signed-off-by: Manan Gupta <manan@planetscale.com>
* feat: update vtadmin proto files
Signed-off-by: Manan Gupta <manan@planetscale.com>
* refactor: rename BinLog to binlog in renamed proto field
Signed-off-by: Manan Gupta <manan@planetscale.com>
After #9512 we always attempted to start the replication SQL_Thread(s) when waiting for a given position. The problem with this, however, is that if the SQL_Thread is running but the IO_Thread is not, then the tablet repair does not try to start replication on a replica tablet. So in certain states, such as when initializing a shard, replication may end up in a non-healthy state and never be repaired.
This changes the behavior so that:
1. We only attempt to start the SQL_Thread(s) if it's not already running
2. If we explicitly start the SQL_Thread(s) then we also explicitly reset it to what it was (stopped) as we exit the call
Because the caller should be (or have) a TabletManager, which holds a mutex, the replication manager calls should be serialized. And because we reset the replication state after mutating it, everything should work as it did before #9512, except that while waiting we ensure that the replica at least has the possibility of catching up.
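The two rules above can be sketched roughly as follows. This is only an illustration and not the code in this change: `replicaController`, `fakeReplica` and `waitWithSQLThread` are hypothetical names, and the real calls go through the MySQL daemon rather than an in-memory fake.

```go
package main

// replicaController is a hypothetical stand-in for the calls that inspect
// and toggle the replication SQL thread.
type replicaController interface {
	SQLThreadRunning() (bool, error)
	StartSQLThread() error
	StopSQLThread() error
}

// fakeReplica is an in-memory fake, used here only to exercise the sketch.
type fakeReplica struct {
	sqlRunning    bool
	starts, stops int
}

func (f *fakeReplica) SQLThreadRunning() (bool, error) { return f.sqlRunning, nil }
func (f *fakeReplica) StartSQLThread() error           { f.sqlRunning = true; f.starts++; return nil }
func (f *fakeReplica) StopSQLThread() error            { f.sqlRunning = false; f.stops++; return nil }

// waitWithSQLThread runs wait with the SQL thread running. The thread is
// started only if it was stopped, and in that case it is stopped again on
// exit so the replica is left in the state it was found in.
func waitWithSQLThread(rc replicaController, wait func() error) (err error) {
	running, err := rc.SQLThreadRunning()
	if err != nil {
		return err
	}
	if !running {
		if startErr := rc.StartSQLThread(); startErr != nil {
			return startErr
		}
		defer func() {
			// Restore the original (stopped) state; keep the wait error if any.
			if stopErr := rc.StopSQLThread(); stopErr != nil && err == nil {
				err = stopErr
			}
		}()
	}
	return wait()
}
```

The sketch assumes the caller already holds the TabletManager mutex, so the check, start and restore steps are not interleaved with other replication calls.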
Signed-off-by: Matt Lord <mattalord@gmail.com>
* Added a MasterStatus() method and carried it through the entire gRPC
chain, similar to what we do for SlaveStatus(). Other WIP changes to be
updated after getting approval on the MasterStatus() parts.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Small fixes per review suggestions.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Descriptive comments added per review suggestions.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Added unit tests and small fixes.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Refactored per review suggestion. Under new design, it's not possible to unit test anymore, since we don't have mock connections. Removed unit test because of this.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Fix very odd rebase issue I've never seen before, where rebase didn't show me conflicts in various files that it allowed to continue without fixing known conflicts.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Added new MasterStatus return for DemoteMaster and carried it through the entire call chain.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Remove deprecated field everywhere we possibly can, and ensure compatibility for newer client interacting with older server.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Remove copying of XXX fields, per review suggestion.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Added the option to StopReplicationAndGetStatus() to stop only the IO Thread, and passed it all the way down the call chain to MysqlDaemon, where we can now call a new method (implemented in all flavors) which stops only the io thread, if that's what was requested.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Oops
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Remove hook per review suggestion.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Adjusted stop slave io thread to pass in a ctx because it's a new function. Adjusted StopReplicationAndGetStatus so that it stops the slave before getting slave status. This will ensure that the relay log information is correct.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Add back in logic to bail if slave is already stopped.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* We have to patch these in because after calling stop slave there won't be a master host and master port anymore when we grab slave status. Instead we need to patch in positions so we retain master host and master port, otherwise set master will assume the tablet is the master because it has no master host and master port. In retrospect it's probably a bad idea that we assume no master host and no master port means we've found the master (in set master).
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Refactored per offline discussions. We now return before and after slave status, so we are more explicit, and don't nest business logic into subfields of a hybrid struct.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Fix issues that cropped up after merge conflict
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Change the way we get this state to also pull in the Connecting state. We are either running or attempting to run. Either way, we are not stopped.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Changed references from slave to replica per review suggestion.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Embed StopReplicationStatus into StopReplicationAndGetStatusResponse and rename fields to make it clear.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Various fixes per review suggestions.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Let's try out issuing a stop no matter what and see what happens.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Adding back in bailouts. They are necessary.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Get rid of more slave references.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Changed stopIOThreadOnly to an enum so we can change the way in which we stop replication in the future.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Add a test to ensure that we can stop the io thread only.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Fix incorrect test methodology.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Used a generated enum from proto per convention for the stop replication mode.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
* Scrub more references of slave without obfuscating MySQL statements that are being called under the hood.
Signed-off-by: Peter Farr <Peter@PrismaPhonic.com>
When generating masterGTIDSet in file:pos format, you will most likely have a
topology like the following:
Source A -> Target B (B has a vreplication stream from A)
From the target's perspective, source A is the master, and you want to generate
a GTID that is based on the binlog file position of that server.
As an example, consider this topology:
Master A -> Source B -> Target C (C has vreplication stream from B)
Prior to this change, masterGTIDSet was returning the binlogfile:pos of A. But
in reality, the Target C wants the position of B.
Signed-off-by: Rafael Chacon <rafael@slack-corp.com>
* Prior to this commit, flavorpos was using lexicographical comparison of the gtids.
That was a bug in this context.
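To illustrate why string comparison is wrong for file:pos values, here is a minimal sketch. `filePos`, `parseFilePos` and `compareFilePos` are hypothetical names, not the flavorpos code, and the sketch assumes binlog file names share a prefix and use a zero-padded numeric suffix, so that comparing file names as strings is safe.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// filePos is a hypothetical parsed form of a "file:pos" value.
type filePos struct {
	file string
	pos  uint64
}

// parseFilePos splits "binlog.000001:123" into its file name and numeric offset.
func parseFilePos(s string) (filePos, error) {
	i := strings.LastIndex(s, ":")
	if i < 0 {
		return filePos{}, fmt.Errorf("invalid file:pos %q", s)
	}
	pos, err := strconv.ParseUint(s[i+1:], 10, 64)
	if err != nil {
		return filePos{}, fmt.Errorf("invalid position in %q: %w", s, err)
	}
	return filePos{file: s[:i], pos: pos}, nil
}

// compareFilePos returns -1, 0 or 1. File names are compared as strings
// (an assumption, see above); offsets are compared as numbers.
func compareFilePos(a, b filePos) int {
	if a.file != b.file {
		return strings.Compare(a.file, b.file)
	}
	switch {
	case a.pos < b.pos:
		return -1
	case a.pos > b.pos:
		return 1
	}
	return 0
}
```

With plain string comparison, "binlog.000001:99" sorts after "binlog.000001:100" because '9' > '1'. Comparing the parsed offsets puts 99 before 100.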
Signed-off-by: Rafael Chacon <rafael@slack-corp.com>
Fixes #5532
Fixes #5569
Fixes #5571
With this fix, unit tests pass for all flavors.
Also fix test.go to cover the newer flavors.
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
The vstreamer sent GTIDs "as they came". With the new change,
GTIDs are sent only when they matter: on COMMIT, DDL or OTHER.
This new approach makes the protocol easier to understand. Also,
it makes it easier for filePos to continuously send file and position.
The correct values will get used when significant events like
COMMIT are encountered.
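A rough sketch of the idea, not the vstreamer code itself: the event types and `deferGTIDs` below are hypothetical, and emitting the GTID immediately before the significant event is an assumption made for the illustration.

```go
package main

type eventType int

const (
	evGTID eventType = iota
	evBegin
	evRow
	evCommit
	evDDL
	evOther
)

// event is a hypothetical, simplified stream event. pos is set only for evGTID.
type event struct {
	typ eventType
	pos string
}

// deferGTIDs swallows GTID events as they arrive, remembers the latest
// position, and emits it only right before a COMMIT, DDL or OTHER event.
func deferGTIDs(in []event) []event {
	var out []event
	latest := ""
	for _, ev := range in {
		switch ev.typ {
		case evGTID:
			// Do not forward yet; just track the most recent position.
			latest = ev.pos
		case evCommit, evDDL, evOther:
			if latest != "" {
				out = append(out, event{typ: evGTID, pos: latest})
			}
			out = append(out, ev)
		default:
			out = append(out, ev)
		}
	}
	return out
}
```

Because only the latest position is kept, a source that reports file and position continuously can update it as often as it likes; the consumer sees just the value in effect at each significant event.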
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
In this scheme, the filePos reader detects whether we are in a
transaction or not, and emits appropriate GTID events.
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>