A new SingleDb flag has been added to Begin, and it persists for the
lifetime of the Session. If set, any transaction that spans more than a
single shard will fail.
VTGate gains a command-line flag, transaction_mode, which can be
single, multi or 2pc. In single mode, it will fail Begins that request
multi-shard. In multi mode, it will fail commits that request 2pc.
In 2pc mode, everything will be allowed.
The per-request flags specify what the app wants, and the vtgate flags
specify what it allows.
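As a rough illustration of how the two levels of flags interact, here is a
minimal sketch. The type, the constant names and checkMode are hypothetical
and are not the actual VTGate code; the only assumption taken from the
description above is that the modes are ordered single < multi < 2pc and that
a request fails when it asks for more than vtgate allows.

```go
package main

import "fmt"

// TransactionMode is a hypothetical enum; the constant names are illustrative.
type TransactionMode int

const (
	ModeSingle TransactionMode = iota // single-shard transactions only
	ModeMulti                         // multi-shard transactions
	ModeTwoPC                         // multi-shard transactions committed with 2pc
)

// checkMode returns an error when the mode requested by the app
// exceeds the mode allowed by the vtgate flag.
func checkMode(allowed, requested TransactionMode) error {
	if requested > allowed {
		return fmt.Errorf("transaction mode %d requested, but vtgate allows at most %d", requested, allowed)
	}
	return nil
}
```

With this ordering, checkMode(ModeSingle, ModeMulti) and
checkMode(ModeMulti, ModeTwoPC) both return an error, while any request is
accepted when the allowed mode is ModeTwoPC.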
* 2pc: Go and python clients
For Go, the transaction mode settings are supported through
the context. This means that it will only work with go1.8.
For Python, it's a cursor constructor parameter.
* 2pc: end-to-end test
The 'Resolve' name was misleading. It implied that
some kind of resolution work was being performed. 'Conclude'
is a more accurate description of this action.
This will make the type look more generic, so that it can be used for other
purposes as well, not only by a worker during resharding. In particular, I plan
to use the same tablet type for schema swap, to reserve a seed vttablet that
executes the schema change.
I also add a "drain_reason" tag that will be set by anyone setting the DRAINED
tablet type, so that it's easy to understand why it is drained.
This adds support for an 'allprivs' user, which is supposed to have more
privileges than the 'app' user but, unlike the 'dba' user, won't have the SUPER
privilege. A new method, ExecuteFetchAsAllPrivs(), is added to the TabletManager
service to execute queries as this new 'allprivs' user.
The 'allprivs' user will be used for administrative tasks done by Vitess, such as
the changes to metadata tables that the schema swap process will make. The
'allprivs' user shouldn't have the SUPER privilege, so that schema swap can safely
change replicated metadata on the master without the risk of committing the
changes after the master has already been demoted to a slave.
ResultExtras is part of Query Result. It contains:
- an optional EventToken (returned if ExecuteOptions.include_event_token
is set)
- an optional 'fresher' flag (set if ExecuteOptions.compare_event_token
is set, and that token is older than the current TabletServer event
token.)
The flags are set by TabletServer, and merged by VtGate.ScatterConn.
They are added to both queryservice and vtgateservice.
Right now, they only contain the exclude_field_names option.
exclude_field_names is used by vttablet to strip the name from the
fields record. Note that the clients don't use it yet: I want the ability
to do it first, and then I'll add support for it in the clients.
Allowing ExecuteOptions in vtctl commands. That's how they're tested
now.
Side-changes:
- Removing wantFields from dbclient interface. We never want them.
- Small optimization on query fields for sequences.
The new setting is similar to, but independent of, the existing option "ignore_n_slowest_rdonlys".
For example, if both are set to 1, you can have one slow REPLICA and one slow RDONLY tablet.
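To make the "ignore the N slowest" idea concrete, here is a minimal sketch.
The function name and the use of plain integer lags in seconds are
assumptions for illustration only, not the actual implementation. Calling it
once per tablet type with that type's own N gives the independence described
above.

```go
package main

import "sort"

// maxLagIgnoringSlowest returns the highest lag (in seconds) among the
// tablets that remain after dropping the n slowest ones. It returns 0
// when every tablet is ignored.
func maxLagIgnoringSlowest(lags []int, n int) int {
	if n < 0 {
		n = 0
	}
	sorted := append([]int(nil), lags...) // copy so the caller's slice is untouched
	sort.Ints(sorted)
	if n >= len(sorted) {
		return 0
	}
	return sorted[len(sorted)-1-n]
}
```

For lags of 2, 50 and 1 seconds and n set to 1, the slow tablet at 50 seconds
is ignored and the result is 2.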
It is enabled with the -watch_replication_stream command line
argument. It does two things:
- it remembers the last applied EventToken (and exposes it in vars).
- if the event is a DDL, it forces a schema reload.
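The two behaviors above can be sketched as follows. All type and field names
here are hypothetical stand-ins (including the simplified EventToken and the
IsDDL flag); the real watcher and the way it detects DDLs may differ.

```go
package main

import "sync"

// EventToken is a simplified stand-in for the real structure.
type EventToken struct {
	Timestamp int64
	Position  string
}

// streamEvent is a hypothetical replication event.
type streamEvent struct {
	Token EventToken
	IsDDL bool
}

// replicationWatcher remembers the last token and reloads the schema on DDLs.
type replicationWatcher struct {
	mu           sync.Mutex
	lastToken    EventToken
	reloadSchema func() // called once for every DDL event
}

// handle records the event's token, then forces a schema reload for DDLs.
func (w *replicationWatcher) handle(ev streamEvent) {
	w.mu.Lock()
	w.lastToken = ev.Token
	w.mu.Unlock()
	if ev.IsDDL {
		w.reloadSchema()
	}
}

// LastToken returns the most recently applied token, e.g. for exporting in vars.
func (w *replicationWatcher) LastToken() EventToken {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.lastToken
}
```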
Also ignoring local binlog connections in FindSlaves, as they have no
impact on reparenting.
min_duration_between_changes_sec was split into separate "increases" and "decreases" fields.
This is required because an insert test needs to wait longer than a decrease test.
Additionally, there is a new field "spread_backlog_across_sec". Before this change, the value of "min_duration_between_changes_sec" was used for this.
The vtgate API takes a starting timestamp, or a starting EventToken. It
will only use the starting EventToken if it's relevant. This is mostly
for tests, but could be used by real clients too to avoid the timestamp
search on the servers.
The only restriction in the vtgate routing implementation is that a
query can only end up on one shard. The stream aggregation code inside
vtgate will be added later.
This change includes:
- proto changes.
- implementing the server side interface.
- implementing the client side interface.
- adding a vtctl VtTabletUpdateStream command to stream from a given
tablet. This is used in end-to-end tests.
- using the python vtgate_client update_stream API in all end-to-end
tests.
- removing the python vttablet direct stream_update API.
- vtgate now better preserves remote errors through its API, as
withSuffix and withPrefix will preserve the error codes of every
VtError, not just *VitessError.
- Also adding callerid vtgateclienttest tests for all API calls.
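The error-code preservation mentioned in the list above can be illustrated
with a minimal sketch. The codedError interface, the remoteError type and
this body of withPrefix are all hypothetical; they only show the general
technique of wrapping by interface rather than by concrete type, not the
actual vtgate code.

```go
package main

import "errors"

// codedError is a hypothetical interface for errors that carry a code.
type codedError interface {
	error
	Code() int
}

// remoteError is a sample coded error used for illustration.
type remoteError struct {
	code int
	msg  string
}

func (e remoteError) Error() string { return e.msg }
func (e remoteError) Code() int     { return e.code }

// prefixedError wraps a coded error, changing the message but not the code.
type prefixedError struct {
	prefix string
	inner  codedError
}

func (e prefixedError) Error() string { return e.prefix + e.inner.Error() }
func (e prefixedError) Code() int     { return e.inner.Code() }

// withPrefix prepends prefix to the message. Any error implementing
// codedError keeps its code; other errors become plain errors.
func withPrefix(prefix string, err error) error {
	if ce, ok := err.(codedError); ok {
		return prefixedError{prefix: prefix, inner: ce}
	}
	return errors.New(prefix + err.Error())
}
```

Because the check is against the interface, any error type that exposes a
code survives the wrapping, not only one specific struct.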
The UpdateStream call makes a lot more sense in the QueryService part of
the API. It is meant to be used by vtgate. The Binlog side of the API is
then only used by filtered replication.
Adding vttablet support to start streaming from a timestamp, not just a
replication position. It finds the starting position by looking at the
binary logs.
In the process, change the following things:
- UpdateStream is now added to gateway.Gateway, so it flows through
l2vtgate if needed.
- the python client now uses the new service.
- updated the tests as required.
- using Context instead of sync2.ServiceManager. Removing
sync2.ServiceManager as it is now unused.
- tabletserver.TabletServer now remembers the ongoing UpdateStream
connections, and closes them on state change.
- Adding an EventToken structure in query.proto.
- Using it in Binlog Streamer and filtered replication.
- Using it in Update Stream in POS events as well.
Note the main change there is that an EventToken has a replication
position (GTIDSet), not a transaction ID (GTID). Both server and clients
were computing the position individually anyway by accumulating
transaction IDs, might as well just send the position. And it will make
more sense for later use of EventToken.
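To illustrate the difference between a transaction ID and a position, here is
a deliberately simplified sketch. Real GTID sets track intervals per server;
the map-based Position, the field names and Apply are assumptions made only
for this example.

```go
package main

// GTID is a simplified transaction ID: a server plus a sequence number.
type GTID struct {
	Server   string
	Sequence int64
}

// Position is a simplified replication position: for each server, the
// highest sequence number applied so far.
type Position map[string]int64

// Apply accumulates one transaction ID into the position. This is the
// bookkeeping that both server and clients previously did on their own.
func (p Position) Apply(g GTID) {
	if g.Sequence > p[g.Server] {
		p[g.Server] = g.Sequence
	}
}

// EventToken carries the accumulated position directly, so receivers
// no longer need to call Apply themselves.
type EventToken struct {
	Timestamp int64
	Shard     string // not set yet, see the note on the 'Shard' field
	Position  Position
}
```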
Also, we don't set the 'Shard' field of the EventToken just yet. I'm
still not sure vttablet should do it, as opposed to vtgate.
Doing that required:
- Adding File/Dir APIs to topo.Backend.
- Implementing them for all flavors: memorytopo, zk, etcd, tee.
None of the backward compatible code is there yet,
so these can only be used for new objects.
- Adding corresponding unit tests.
- Fixing a few bugs specific to the fake etcd.
Also now using CancelFunc for all topo watches.
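For readers unfamiliar with the pattern, a watch that is stopped through a
CancelFunc can be sketched like this. The watch function and its string
payload are hypothetical and much simpler than the topo watch API; only
context.WithCancel and context.CancelFunc are real standard library names.

```go
package main

import "context"

// watch forwards values from source until the returned CancelFunc is
// called or source is closed, then closes the output channel.
func watch(source <-chan string) (<-chan string, context.CancelFunc) {
	ctx, cancel := context.WithCancel(context.Background())
	out := make(chan string)
	go func() {
		defer cancel() // release the context if source closes first
		defer close(out)
		for {
			select {
			case <-ctx.Done():
				return
			case v, ok := <-source:
				if !ok {
					return
				}
				select {
				case out <- v:
				case <-ctx.Done():
					return
				}
			}
		}
	}()
	return out, cancel
}
```

The caller reads updates from the returned channel and calls the CancelFunc
when it is no longer interested; the channel being closed is the signal that
the watch has ended.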
Summary of changes:
- Added new methods to Throttler RPC interface.
- Implemented methods for gRPC.
- Wrote RPC regression tests for the new methods (throttlerclient_testsuite.go).
- Wrote unit tests for the actual code (manager_test.go).
- Wrote e2e tests which are tested by worker.py (vtworker) and resharding.py (filtered replication).
When mysqld is started on a database from an older version (the only case in
which mysql_upgrade does anything and is really necessary), it may fail to
start because the mysql.* tables have an old structure. --skip-grant-tables
ensures that mysqld doesn't try to read the mysql.* tables before mysql_upgrade
has fixed their structure. Since anyone can connect to mysqld without a password
when it's started with --skip-grant-tables, we should also pass --skip-networking
to it. To make passing those flags to mysqld possible, I'm extending the notion
of the mysqld_start hook to accept a list of arbitrary arguments that should be
passed to mysqld.
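A minimal sketch of the idea, assuming a launcher that simply appends the
caller's extra arguments to its usual ones. The mysqldArgs helper and the
config path in the usage note are hypothetical; only the two mysqld flags
come from the text above.

```go
package main

// mysqldArgs builds the argument list for a hypothetical mysqld launcher:
// the base arguments followed by any extra ones supplied by the caller.
func mysqldArgs(base []string, extra ...string) []string {
	args := make([]string, 0, len(base)+len(extra))
	args = append(args, base...)
	return append(args, extra...)
}
```

For the upgrade case this would be called as
mysqldArgs([]string{"--defaults-file=/tmp/my.cnf"}, "--skip-grant-tables", "--skip-networking").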
When executing the command, vttablet will completely delete the existing data
and then restore it from the latest backup. The sequence of actions is basically
the same one vttablet already uses to set up a freshly started replica.
Also, the mysqlctl interface is extended with a ReinitConfig function. It's used
by vttablet to change the server_id of the replica before restoring it from
backup. That's necessary to avoid a situation in which the restored replica skips
transactions in the replication stream that have the same server_id.
The new RestoreFromBackup command will be used later in the implementation of
an automated pivot.
This change doesn't add any tests for the new functionality because I didn't
find a place where I could add these tests. Please tell me if there is actually
a place that I missed. Meanwhile I tested this manually on the example
Kubernetes setup.
Changing Shard's db_name to db_name_override.
Rewording a bunch of comments in proto.
Factoring out tablet creation code in vtcombo.
Rebuilding keyspace graph for redirected keyspaces too,
and also setting ShardingColumn{Name,Type} for them.
Lameduck is clearer now.
Re-adding a lameduck case: when going unhealthy.
All lameduck mode is now conditioned on serving_state_grace_period > 0.
Adding reserved obsolete comments to proto for the fields I removed.
We're only interested in the fact that the receiving master has seen a lower delay at some point in time. We don't care about the actual timestamps of the filtered replication statements which were applied.
1. define rules in the Maven build files to compile the data protos at build time.
2. define a new vtgate service interface that uses the proto3 data structures and
defines an abstract service.
In short, the tableacl configuration will be represented as a
tableacl.proto. The tableacl module holds a global "currentACL" instance
which maps a table group to its ACLs.
1. Introduce tableacl.proto which defines acl per table group.
A table group either contains only one table or many tables
that share the same name prefix.
2. Remove "All" and "AllString" funcs from acl.Factory interface.
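As an illustration of the table group notion in item 1, here is a minimal
sketch of looking up the group for a table. The struct, its fields and
findGroup are hypothetical and do not reflect the layout of tableacl.proto.

```go
package main

import "strings"

// tableGroup is a hypothetical in-memory form of one ACL entry.
type tableGroup struct {
	Name    string
	Table   string   // exact table name, or "" when Prefix is used
	Prefix  string   // shared name prefix, or "" when Table is used
	Readers []string // illustrative ACL payload
}

// matches reports whether table belongs to this group.
func (g tableGroup) matches(table string) bool {
	if g.Table != "" {
		return g.Table == table
	}
	return g.Prefix != "" && strings.HasPrefix(table, g.Prefix)
}

// findGroup returns the first group that contains the given table.
func findGroup(groups []tableGroup, table string) (tableGroup, bool) {
	for _, g := range groups {
		if g.matches(table) {
			return g, true
		}
	}
	return tableGroup{}, false
}
```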
Batch requests in vttablet currently allow "begin" and "commit"
statements. They were treated as special cases where vttablet would
start a transaction if such statements were seen.
Initially, this looked flexible, but I now realize that it has drawbacks.
For example, someone could issue a begin without a commit, or one could
issue begin inside begin, or a begin-commit-begin, or a "bEgIn", etc.
Checking for all these corner cases complicates the logic, and does not
add much value to the app.
This change adds an AsTransaction flag to the VTGate Batch
requests. If it's true, then the entire batch is executed in a
transaction. Otherwise, the statements are executed loosely.
Consequently, I'll be outlawing begin and commit as valid statements
inside batch requests.
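The intended behavior can be sketched roughly as follows. The executor
interface, the recorder fake and runBatch are hypothetical; they only show
how a single flag replaces the special-casing of begin and commit statements.

```go
package main

import (
	"fmt"
	"strings"
)

// executor is a hypothetical interface over whatever runs the queries.
type executor interface {
	Begin() error
	Exec(query string) error
	Commit() error
	Rollback() error
}

// recorder is a fake executor that logs the calls it receives.
type recorder struct{ calls []string }

func (r *recorder) Begin() error            { r.calls = append(r.calls, "Begin"); return nil }
func (r *recorder) Exec(query string) error { r.calls = append(r.calls, "Exec:"+query); return nil }
func (r *recorder) Commit() error           { r.calls = append(r.calls, "Commit"); return nil }
func (r *recorder) Rollback() error         { r.calls = append(r.calls, "Rollback"); return nil }

// runBatch rejects begin/commit statements, then runs the batch either
// inside one transaction or statement by statement.
func runBatch(e executor, queries []string, asTransaction bool) error {
	for _, q := range queries {
		switch strings.ToLower(strings.TrimSpace(q)) {
		case "begin", "commit":
			return fmt.Errorf("%q is not allowed inside a batch", q)
		}
	}
	if !asTransaction {
		for _, q := range queries {
			if err := e.Exec(q); err != nil {
				return err
			}
		}
		return nil
	}
	if err := e.Begin(); err != nil {
		return err
	}
	for _, q := range queries {
		if err := e.Exec(q); err != nil {
			_ = e.Rollback() // best effort; the Exec error is the one reported
			return err
		}
	}
	return e.Commit()
}
```

With the flag set, a batch of two statements produces the calls Begin, Exec,
Exec, Commit on the executor; a batch containing "bEgIn" is rejected before
anything runs.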
The next change will be for vttablet.