* PlannedReparentShard: Fix more known-recoverable problems.
PlannedReparentShard should be able to fix replication as long as all
tablets are reachable and all replication positions are in a
mutually-consistent state.
PRS also no longer trusts that the shard record contains up-to-date
information on the master, because we update that record asynchronously
now. Instead, it looks at MasterTermStartTime values stored in each
master tablet's record, so it makes the same choice of master as
vtgates.
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* PlannedReparentShard: Add -lag_threshold flag.
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* Fix expected error in reparent test.
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* PRS: Add test case for graceful recovery.
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
* PRS: Measure replication progress instead of lag.
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
To make this possible, some things are added:
- The capability to lock all tables on a tablet, to momentarily stop updates
- Once the database is locked, we can create multiple consistent snapshot
transactions that all share the same view of the data
- Adds the capability to have replication move forward to a specific point
in the transaction log
This commit also refactors tabletserver and tx_engine, moving logic of
state transitions into the tx engine.
Signed-off-by: Andres Taylor <antaylor@squareup.com>
MigrateServedTypes has been made idempotent: if it fails in
the middle, you can safely retry the operation. If the operation
has previously succeeded, retrying it will be a no-op (except
for master migration).
For master migration, a new Frozen field has been added to the
tablet control record. This field signifies the point of no
return. If a migrate fails before reaching this state, then
we undo everything and re-enable the source shards. Once we
go past the 'frozen' state, you can only go forward. If there
are failures after the frozen state, the migrate can be safely
retried until successful. Once successful, a retry will return
an error saying that there's no resharding in progress.
The resharding end to end test has been updated to demonstrate
these behaviors.
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
Had to make a few changes for this:
* binlog_player reads its own start & stop position. This is
a more correct design. Previously, the caller supplied this.
* On VReplication update, the new controller inherits the previous
controller's blp stats. This gives better continuity.
* split clone workers still need to call refresh state to make
the tablets non-serving.
* MigrateServedTypes/From need to delete the VReplication streams
for master.
Signed-off-by: Sugu Sougoumarane <ssougou@gmail.com>
All Python tests fail on my Macbook without this:
> $ ./test/python_client_test.py
> Traceback (most recent call last):
> File "./test/python_client_test.py", line 24, in <module>
> import utils
> File "/Users/mberlin/vitess/src/vitess.io/vitess/test/utils.py", line 55, in <module>
> socket.getfqdn(), None, 0, 0, 0, socket.AI_CANONNAME)[0][3]
> socket.gaierror: [Errno 8] nodename nor servname provided, or not known
Signed-off-by: Michael Berlin <mberlin@google.com>
Now vtgate accepts the '--enable_forwarding' option. It exposes a
QueryService interface that acts like the l2vtgate process used to.
Changed all the tests to work with vtgate in that mode now.
BUG=71858563
The logic depends on the refresh being smaller than or equal to the
TTL so add a startup assertion that it is configured that way.
Also update all test cases that override the ttl to also override
the refresh interval to avoid hitting this assertion.
Issues #3401, #3412
Now that we always dial asynchronously, we have to make sure that
requests don't fail fast. Otherwise, we sometimes see errors if a
request is made immediately after dialing, because the connection
is not ready yet.
Consequently, dial timeouts are obsolete, so I've removed all
usages of them.
Set the max message size in tests to 5 MB, which is larger than the
4 MB default gRPC max message size. Then add a test that ensures a
message larger than the default can be sent end to end using the
mysql client protocol.
Also ensure that queries larger than the max are properly flagged
as errors.
Please refer to #2694 and #2670 for motivation and reasoning for
this change.
I've tried to follow best practice in inserting the copyright
headers. In other open source projects, not all files carry
the notice. For example documentation doesn't. I've followed
similar ground rules.
I did not change the php because there is a separate LICENSE
file there by Pixel Federation. We'll first need to notify
them of our intent before changing anything there.
As for the presubmit check, it's going to be non-trivial
because of the number of exceptions, like file types,
directories and generated code. So, it will have to be
a separate project.
This addresses #2723.
This change is essentially a new protocol for V3. Although
backward compatible, it changes the connection model.
Basically, the newer V3 features will work only if you use
the new protocol.
The new model deprecates keyspace_shard, tablet_type and options
from ExecuteRequest and moves them into the Session. This means
that the Session is generally not empty, and may be updated
by any call to Execute or ExecuteBatch, even if the statements are
not transactional. Consequently, transactional methods like Begin,
etc. are deprecated in favor of Execute("begin").
Transaction modes will now be supported by new `SET` syntax, which
will correspondingly update the Session variable.
This also makes a connection that contains a session non-multiplexable.
We'll need to resolve whether it's still worth exposing this flexibility
to the clients, and if so, how.
For now, I've updated the Go driver to use the "modern" protocol.
However, the low level rpc (Impl) continues to support the older
functions like Begin, etc. This allows us to test the legacy
functionality.
All other clients: Python, PHP and Java are currently unchanged.
Given that this is a major protocol change, it hints at a 3.0,
but the changes are 2.1 compatible with the following exceptions:
* Go driver uses the new protocol.
* vtclient binary requires `-target` instead of `-tablet-type`, etc.
All tests are passing including PHP & Java clients. In terms
of upgrade:
* PHP and Java can be upgraded by just updating the code.
* Python will probably require a brand new library. The existing
vtdb contains way too much baggage, and it may not be worth
retrofitting this new incompatible protocol onto what's
currently there. I'm looking for a name. `vtdb2`, `vitessdb`?
- Introducing topodata.CellInfo. Stored in the global topology server, it
describes a connection to a topology server for a cell. It has both a
server address and a root directory to use. Also adding vtctl commands
to deal with it. Existing zk and etcd implementations ignore these flags
for now.
- Now passing toplevel server address and root as generic topology
parameters. And using a topo server factory method.
- Changing the topo server registration to use a Factory method. It
creates a topology server implementation with a server address and a
root path. Existing zk and etcd implementations ignore these flags for
now.
- adding a zk2 topology implementation. It doesn't use any of the go/zk
code, and allows the specification of a root directory for both the
global cell, and each individual cell. It also is using a different
directory structure, consistent with what we want all new topo
implementations to use. And it stores the data as protobuf, not json.
- deprecating the legacy 'zookeeper' implementation. Adding instructions
to migrate from old server style to new server style.
- removing old janitor code; it has been replaced by the topo validator workflow.
- Using a more generic vtctld topo explorer, will use it in all servers
soon.
- Fixing the ZK command line to decode protos and use new connection
library.
- topo2topo now also copies VSchema.
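
The factory-based registration described in the list above could look roughly like the following. This is a sketch under assumptions: the `Server` interface, `RegisterFactory`, `OpenServer` and the in-memory implementation are hypothetical and much smaller than the real topo package.

```go
package main

import "fmt"

// Server is a hypothetical, minimal topology server interface.
type Server interface {
	Describe() string
}

// Factory creates a Server from a server address and a root path.
type Factory func(serverAddr, root string) (Server, error)

// factories maps an implementation name to its factory.
var factories = map[string]Factory{}

// RegisterFactory registers an implementation under a name. Registering the
// same name twice is treated as a programming error.
func RegisterFactory(name string, f Factory) {
	if _, dup := factories[name]; dup {
		panic("topo factory registered twice: " + name)
	}
	factories[name] = f
}

// OpenServer looks up the named factory and creates a Server with it.
func OpenServer(name, serverAddr, root string) (Server, error) {
	f, ok := factories[name]
	if !ok {
		return nil, fmt.Errorf("no topo implementation registered as %q", name)
	}
	return f(serverAddr, root)
}

// memServer is a toy implementation used only for this example.
type memServer struct{ addr, root string }

func (s memServer) Describe() string { return s.addr + s.root }

func main() {
	RegisterFactory("mem", func(addr, root string) (Server, error) {
		return memServer{addr: addr, root: root}, nil
	})
	s, err := OpenServer("mem", "localhost:2181", "/vitess/global")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s.Describe())
}
```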
This change adds a set of tests executing schema swaps with different error
conditions. This also fixes a few bugs in the schema swap code discovered while
running these tests.
I've left out the row support functions because most of them are
not supported by the protocol. The most important changes
are supported: contexts and named arguments.
* Deleted the deprecated shard specific connection support.
* Built go1.8, fixed compilation errors and added tests.
* Fixed documentation that was still stating support for v1 API.
* Fixed custom_sharding test that was relying on v1 API.
* Doc comments updated to explain isolation levels and named args.
* Code review comments addressed.
- Including UI display of the redirect, and a fix in zk topology.
- Better Update support in vtctld2.
Now always using a single Update structure, and json-encoding it.
Omitting a bunch of fields if empty in json encoding.
- Adding unit test for long polling.
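
For readers unfamiliar with how empty fields are dropped from JSON in Go, here is a small sketch using the standard `omitempty` struct tag. The `update` struct and its fields are hypothetical and do not reflect the actual Update structure in vtctld2.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// update is a hypothetical, reduced version of a single update structure.
// Fields tagged omitempty are left out of the JSON when they are empty.
type update struct {
	ID       string `json:"id"`
	Message  string `json:"message,omitempty"`
	Progress int    `json:"progress,omitempty"`
}

// encode returns the JSON encoding of u as a string.
func encode(u update) (string, error) {
	b, err := json.Marshal(u)
	return string(b), err
}

func main() {
	s, _ := encode(update{ID: "node-1"})
	fmt.Println(s) // {"id":"node-1"}
	s, _ = encode(update{ID: "node-1", Message: "running"})
	fmt.Println(s) // {"id":"node-1","message":"running"}
}
```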
It turns out we can receive an unrelated PREVIOUS_GTIDS_EVENT when
asking MySQL to stream from a given position. This is confusing our
state machine and has to be ignored.
During a resharding, vtgate traffic is migrated from the old tablets to
the new tablets for each tablet type (RDONLY, REPLICA, MASTER).
Before this change, the query service of the old tablets was shut down
immediately and vtgate served errors until it noticed that it should
use the new tablets instead.
This is not necessary for RDONLY and REPLICA tablets where it is okay to
keep serving from old tablets for a limited time. (The MASTER is not
migrated yet and the old RDONLY and REPLICA tablets still get the latest
changes from the old MASTER.)
Therefore, we're waiting for some time now before we shut down the query
service. During this time, the keyspace remains locked.
This change is a short-term solution. In the near future, it will be
replaced with the "vtctl WaitForDrain" command which actively watches
the QPS on all tablets and stops waiting once the QPS drops to zero or a
timeout is reached.
They are added to both queryservice and vtgateservice.
Right now, they only contain the exclude_field_names option.
exclude_field_names is used by vttablet to strip the name from the
fields record. Note that the clients don't use that yet; I want the ability
to do it first, then I'll add support for it in the clients.
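
As a rough sketch of what stripping the name from the fields record means, assuming a simplified field with only a name and a type. The `field` and `executeOptions` types and the `stripFieldNames` function are hypothetical, not the generated proto types.

```go
package main

import "fmt"

// field is a hypothetical, simplified result-set field description.
type field struct {
	Name string
	Type string
}

// executeOptions is a hypothetical stand-in carrying only the one option
// discussed above.
type executeOptions struct {
	ExcludeFieldNames bool
}

// stripFieldNames returns the fields unchanged unless the option is set, in
// which case it returns copies that keep the type but drop the name.
func stripFieldNames(fields []field, opts executeOptions) []field {
	if !opts.ExcludeFieldNames {
		return fields
	}
	out := make([]field, len(fields))
	for i, f := range fields {
		out[i] = field{Type: f.Type}
	}
	return out
}

func main() {
	fields := []field{{Name: "id", Type: "INT64"}, {Name: "name", Type: "VARCHAR"}}
	fmt.Println(stripFieldNames(fields, executeOptions{}))
	fmt.Println(stripFieldNames(fields, executeOptions{ExcludeFieldNames: true}))
}
```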
Allowing ExecuteOptions in vtctl commands. That's how they're tested
now.
Side-changes:
- Removing wantFields from dbclient interface. We never want them.
- Small optimization on query fields for sequences.