As part of removing googleBinlogEvent parsing support, I reworked the
binlog_streamer_test to use fake events instead of real ones. The actual
parsing of real events is already tested in its own unit test. Using
fake events simplifies the binlog_streamer_test, which really is only
meant to test the logic for converting a stream of events into a stream
of transaction objects.
In order to support split queries for map-reduce jobs,
we need a way to send queries to specific shards. But
we cannot assume that the table has a physical keyspace_id
column, and without one there is no column to specify
a keyrange on.
So, we introduce the KEYRANGE construct which is a boolean
expression that VTGate can use to match the range to a set
of shards. Since the construct is not understood by VTTablet
or MySQL, VTGate will also strip the construct before passing
the query down.
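
A minimal sketch of the idea, not VTGate's actual code: the shard layout,
the keyrange('start', 'end') spelling, and the names SHARDS, overlaps and
route are all assumptions made for illustration. It only handles the
construct when it is appended with "and", and it matches the requested
range against each shard's range before stripping the construct.

```python
import re

# Hypothetical shard layout: name -> [start, end) of keyspace ids in hex.
# An empty string means "unbounded" on that side.
SHARDS = {"-40": ("", "40"), "40-80": ("40", "80"), "80-": ("80", "")}

# Assumed spelling of the construct; only "... and keyrange('a', 'b')" is handled.
KEYRANGE_RE = re.compile(
    r"\s+and\s+keyrange\('([0-9a-f]*)',\s*'([0-9a-f]*)'\)", re.IGNORECASE
)


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if half-open ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return (b_end == "" or a_start < b_end) and (a_end == "" or b_start < a_end)


def route(query):
    """Return (target shards, query with the keyrange construct removed)."""
    m = KEYRANGE_RE.search(query)
    if m is None:
        raise ValueError("no keyrange construct found")
    start, end = m.group(1).lower(), m.group(2).lower()
    shards = sorted(
        name for name, (s, e) in SHARDS.items() if overlaps(start, end, s, e)
    )
    # Strip the construct so the lower layers never see it.
    stripped = query[: m.start()] + query[m.end():]
    return shards, stripped
```

With this layout, a range of '30' to '50' straddles the first two shards,
and the query that goes down to them no longer mentions the keyrange.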
For owned indexes, we have to know which rows are going
to be deleted so that we can delete the corresponding vindex
entries. So, deletes need to run a subquery to identify
these rows. An empty subquery means that the table does
not own any vindexes.
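
The flow can be sketched as follows. This is an illustration only: the
plan dictionary, its keys, the user and name_idx tables, and the use of
sqlite3 are all hypothetical stand-ins, not the real plan format.

```python
import sqlite3


def run_delete(conn, plan, params):
    """Run a delete plan; an empty subquery means no owned vindexes."""
    cur = conn.cursor()
    if plan["subquery"]:
        # Find the rows about to be deleted, then remove their vindex entries.
        rows = cur.execute(plan["subquery"], params).fetchall()
        for row in rows:
            cur.execute(plan["vindex_delete"], row)
    cur.execute(plan["delete"], params)
    deleted = cur.rowcount
    conn.commit()
    return deleted


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        create table user (id integer primary key, name text);
        create table name_idx (name text, user_id integer);
        insert into user values (1, 'a');
        insert into user values (2, 'b');
        insert into name_idx values ('a', 1);
        insert into name_idx values ('b', 2);
        """
    )
    plan = {
        "subquery": "select name, id from user where id = ?",
        "vindex_delete": "delete from name_idx where name = ? and user_id = ?",
        "delete": "delete from user where id = ?",
    }
    deleted = run_delete(conn, plan, (1,))
    remaining = conn.execute("select name, user_id from name_idx").fetchall()
    return deleted, remaining
```

Setting "subquery" to an empty string skips the lookup step entirely,
which mirrors the case where the table owns no vindexes.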
Since each keyspace can have its own ShardKey, it's not a good
idea to treat ShardKey as a special name, because the index may also
need to map to a physical table, which will cause conflicts. Instead,
an index now has a Type field, which can be ShardKey or Lookup.
We're going to see if keyspace_id column based sharding
can be deprecated in favor of an on-the-fly hash based
sharding which achieves the same thing.
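
As a rough sketch of what on-the-fly hashing could look like: the
keyspace id is computed from the sharding key at routing time instead of
being read from a column. The use of MD5, the 8-byte width, the two-shard
layout, and the names keyspace_id and shard_for are assumptions for
illustration; the actual hash function is not specified here.

```python
import hashlib

# Hypothetical layout: (shard name, start, end) with half-open byte ranges.
# Empty bytes mean "unbounded" on that side.
SHARD_RANGES = [("-80", b"", b"\x80"), ("80-", b"\x80", b"")]


def keyspace_id(sharding_key: int) -> bytes:
    """Derive an 8-byte keyspace id from a non-negative integer key."""
    digest = hashlib.md5(sharding_key.to_bytes(8, "big")).digest()
    return digest[:8]


def shard_for(sharding_key: int) -> str:
    """Pick the shard whose range contains the computed keyspace id."""
    ksid = keyspace_id(sharding_key)
    for name, start, end in SHARD_RANGES:
        if ksid >= start and (end == b"" or ksid < end):
            return name
    raise LookupError("no shard covers this keyspace id")
```

Because the id is a pure function of the key, no keyspace_id column has
to be stored or kept in sync with the row.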
Also, we should support auto-inc through lookup so we
can generate systemwide unique ids for our keys.
You can now specify a new syntax for queries that
have an IN clause:
select * from t where id in ::list
The corresponding bind variable would be:
{"list": field_types.List([1,2,3])}
- List bind vars are only allowed for IN and NOT IN clauses.
- A bind variable can be a list if and only if
it's referred to as ::list.
- A list must contain at least one value.
For the python client, you need to supply lists using the
field_types.List class. This is because of legacy behavior
where lists were previously encoded as strings.
We should soon change this API to use native lists once we
confirm that no one will be affected by this change.
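
The three rules above can be checked roughly like this. The List class
below is only a stand-in for field_types.List, and check_bind_vars and
the regular expressions are hypothetical; the sketch does not attempt to
skip colons inside string literals.

```python
import re


class List(list):
    """Stand-in for field_types.List from the python client."""


LIST_REF = re.compile(r"::(\w+)")
SCALAR_REF = re.compile(r"(?<!:):(\w+)")
IN_LIST_REF = re.compile(r"\b(?:not\s+)?in\s+::(\w+)", re.IGNORECASE)


def check_bind_vars(sql, bind_vars):
    """Raise ValueError if the query breaks a list bind var rule."""
    list_refs = LIST_REF.findall(sql)
    # Rule 1: every ::name must directly follow IN or NOT IN.
    if len(list_refs) != len(IN_LIST_REF.findall(sql)):
        raise ValueError("list bind vars are only allowed for IN and NOT IN")
    for name in list_refs:
        value = bind_vars[name]
        # Rule 2 (one direction): ::name must be bound to a list.
        if not isinstance(value, List):
            raise ValueError("::%s must be bound to a List" % name)
        # Rule 3: empty lists are rejected.
        if len(value) == 0:
            raise ValueError("::%s must contain at least one value" % name)
    for name in SCALAR_REF.findall(sql):
        # Rule 2 (other direction): :name must not be bound to a list.
        if isinstance(bind_vars[name], List):
            raise ValueError(":%s cannot be bound to a List" % name)
    return True
```

For example, check_bind_vars("select * from t where id in ::list",
{"list": List([1, 2, 3])}) passes, while binding an empty List or writing
"id = ::list" raises ValueError.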
Row and column tuples are acquiring different semantic
meanings. This is in preparation to allow lists as
bind vars. We're only going to allow lists for column
tuples. There's not much benefit to supporting list
bind vars for row tuples.
The previous rowcache implementation only supported "LIMIT 1"
for primary key fetches. This CL extends this support to
IN clauses and subqueries. It also supports limit clauses
with bind variables, which requires validation to be delayed
till the execution phase.
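
To illustrate why validation has to wait: when the limit is a bind
variable, its value is unknown at plan time, so the check can only run
once the bind vars arrive. The function name resolve_limit and the
":name" representation of the limit are hypothetical, chosen only for
this sketch.

```python
def resolve_limit(limit, bind_vars):
    """Resolve a plan's limit at execution time and validate it.

    limit is None (no limit), an int literal, or a ':name' bind var reference.
    """
    if limit is None:
        return None
    if isinstance(limit, str) and limit.startswith(":"):
        name = limit[1:]
        if name not in bind_vars:
            raise KeyError("missing bind var %s" % name)
        value = bind_vars[name]
    else:
        value = limit
    # The value is only known here, so this is the earliest point to validate it.
    if isinstance(value, bool) or not isinstance(value, int) or value < 0:
        raise ValueError("limit must be a non-negative integer")
    return value
```

A literal such as "limit 1" could still be validated at plan time; the
delayed path is only needed for the bind variable case.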