Adding Sharding.md reference, fixing list markup syntax.

Alain Jobart 2015-05-11 10:46:44 -07:00
Parent dcd87a541d
Commit fea04f1503
9 changed files with 155 additions and 147 deletions

View file

@@ -18,6 +18,7 @@ and a more [detailed presentation from @Scale '14](http://youtu.be/5yDO-tmIoXY).
### Intro
* [Helicopter overview](http://vitess.io):
high level overview of Vitess that should tell you whether Vitess is for you.
* [Sharding in Vitess](http://vitess.io/doc/Sharding)
* [Frequently Asked Questions](http://vitess.io/doc/FAQ).
### Using Vitess

View file

@@ -30,19 +30,19 @@ eventual consistency guarantees), run data analysis tools that take a long time
### Tablet
A tablet is a single server that runs:
- a MySQL instance
- a vttablet instance
- a local row cache instance
- any other per-db process that is necessary for operational purposes
* a MySQL instance
* a vttablet instance
* a local row cache instance
* any other per-db process that is necessary for operational purposes
It can be idle (not assigned to any keyspace), or assigned to a keyspace/shard. If it becomes unhealthy, it is usually changed to scrap.
It has a type. The commonly used types are:
- master: for the mysql master, RW database.
- replica: for a mysql slave that serves read-only traffic, with guaranteed low replication latency.
- rdonly: for a mysql slave that serves read-only traffic for backend processing jobs (like map-reduce type jobs). It has no real guaranteed replication latency.
- spare: for a mysql slave not used at the moment (hot spare).
- experimental, schema, lag, backup, restore, checker, ... : various types for specific purposes.
* master: for the mysql master, RW database.
* replica: for a mysql slave that serves read-only traffic, with guaranteed low replication latency.
* rdonly: for a mysql slave that serves read-only traffic for backend processing jobs (like map-reduce type jobs). It has no real guaranteed replication latency.
* spare: for a mysql slave not used at the moment (hot spare).
* experimental, schema, lag, backup, restore, checker, ... : various types for specific purposes.
Only master, replica and rdonly are advertised in the Serving Graph.
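The type set and the serving rule above can be pictured with a small Go sketch; the constants and helper here are illustrative, not the actual Vitess types:

```go
package main

import "fmt"

// TabletType is a hypothetical enumeration of the tablet types described above.
type TabletType string

const (
	TypeMaster  TabletType = "master"  // RW database
	TypeReplica TabletType = "replica" // low-latency read-only slave
	TypeRdonly  TabletType = "rdonly"  // read-only slave for batch jobs
	TypeSpare   TabletType = "spare"   // hot spare, not serving
)

// IsServing reports whether a tablet of this type is advertised in the Serving Graph.
func (t TabletType) IsServing() bool {
	switch t {
	case TypeMaster, TypeReplica, TypeRdonly:
		return true
	default:
		return false
	}
}

func main() {
	for _, t := range []TabletType{TypeMaster, TypeReplica, TypeRdonly, TypeSpare} {
		fmt.Printf("%s advertised in serving graph: %v\n", t, t.IsServing())
	}
}
```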
@@ -107,11 +107,11 @@ There is one local instance of that service per Cell (Data Center). The goal is
using the remaining Cells. (a Zookeeper instance running on 3 or 5 hosts locally is a good configuration).
The data is partitioned as follows:
- Keyspaces: global instance
- Shards: global instance
- Tablets: local instances
- Serving Graph: local instances
- Replication Graph: the master alias is in the global instance, the master-slave map is in the local cells.
* Keyspaces: global instance
* Shards: global instance
* Tablets: local instances
* Serving Graph: local instances
* Replication Graph: the master alias is in the global instance, the master-slave map is in the local cells.
Clients are designed to just read the local Serving Graph, therefore they only need the local instance to be up.
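A minimal Go sketch of that global/local split, using a hypothetical helper to decide which topology instance holds each kind of record:

```go
package main

import "fmt"

// RecordKind is a hypothetical label for the topology records described above.
type RecordKind string

const (
	Keyspace         RecordKind = "keyspace"
	Shard            RecordKind = "shard"
	Tablet           RecordKind = "tablet"
	ServingGraph     RecordKind = "serving_graph"
	ReplicationGraph RecordKind = "replication_graph"
)

// instanceFor returns which topology instance holds a record kind: "global"
// for Keyspaces and Shards, the given cell for local data. For the Replication
// Graph, the master alias lives in the global instance while the master-slave
// map is local, so callers needing the full graph read both.
func instanceFor(kind RecordKind, cell string) string {
	switch kind {
	case Keyspace, Shard:
		return "global"
	default:
		return cell
	}
}

func main() {
	for _, k := range []RecordKind{Keyspace, Shard, Tablet, ServingGraph, ReplicationGraph} {
		fmt.Printf("%-18s -> %s\n", k, instanceFor(k, "cell1"))
	}
}
```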

View file

@@ -6,30 +6,30 @@ and what kind of effort it would require for you to start using Vitess.
## When do you need Vitess?
- You store all your data in a MySQL database, and have a significant number of
* You store all your data in a MySQL database, and have a significant number of
clients. At some point, you start getting Too many connections errors from
MySQL, so you have to change the max_connections system variable. Every MySQL
connection has a memory overhead, which is just below 3 MB in the default
configuration. If you want 1500 additional connections, you will need over 4 GB
of additional RAM – and this is not going to be contributing to faster queries.
- From time to time, your developers make mistakes. For example, they make your
* From time to time, your developers make mistakes. For example, they make your
app issue a query without setting a LIMIT, which makes the database slow for
all users. Or maybe they issue updates that break statement based replication.
Whenever you see such a query, you react, but it usually takes some time and
effort to get the story straight.
- You store your data in a MySQL database, and your database has grown
* You store your data in a MySQL database, and your database has grown
uncomfortably big. You are planning to do some horizontal sharding. MySQL
doesn't support sharding, so you will have to write the code to perform the
sharding, and then bake all the sharding logic into your app.
- You run a MySQL cluster, and use replication for availability: you have a master
* You run a MySQL cluster, and use replication for availability: you have a master
database and a few replicas, and in the case of a master failure some replica
should become the new master. You have to manage the lifecycle of the databases,
and communicate the current state of the system to the application.
- You run a MySQL cluster, and have custom database configurations for different
* You run a MySQL cluster, and have custom database configurations for different
workloads. There's the master where all the writes go, fast read-only replicas
for web clients, slower read-only replicas for batch jobs, and another kind of
slower replicas for backups. If you have horizontal sharding, this setup is

View file

@@ -122,7 +122,7 @@ vtctl MigrateServedTypes -reverse test_keyspace/0 replica
## Scrap the source shard
If all the above steps were successful, it's safe to remove the source shard, which should no longer be in use (a scripted sketch follows the list):
- For each tablet in the source shard: `vtctl ScrapTablet <source tablet alias>`
- For each tablet in the source shard: `vtctl DeleteTablet <source tablet alias>`
- Rebuild the serving graph: `vtctl RebuildKeyspaceGraph test_keyspace`
- Delete the source shard: `vtctl DeleteShard test_keyspace/0`
* For each tablet in the source shard: `vtctl ScrapTablet <source tablet alias>`
* For each tablet in the source shard: `vtctl DeleteTablet <source tablet alias>`
* Rebuild the serving graph: `vtctl RebuildKeyspaceGraph test_keyspace`
* Delete the source shard: `vtctl DeleteShard test_keyspace/0`
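A possible way to script that sequence is sketched below in Go; it simply shells out to `vtctl` and stops at the first error. The tablet aliases are placeholders, and it assumes `vtctl` is on the PATH and already pointed at your topology server:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// run invokes vtctl with the given arguments, streaming its output.
func run(args ...string) error {
	fmt.Println("running: vtctl", args)
	cmd := exec.Command("vtctl", args...)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	// Placeholder aliases for the source shard tablets; replace with your own.
	sourceTablets := []string{"test-0000000100", "test-0000000101", "test-0000000102"}

	for _, alias := range sourceTablets {
		if err := run("ScrapTablet", alias); err != nil {
			fmt.Fprintln(os.Stderr, "ScrapTablet failed:", err)
			os.Exit(1)
		}
	}
	for _, alias := range sourceTablets {
		if err := run("DeleteTablet", alias); err != nil {
			fmt.Fprintln(os.Stderr, "DeleteTablet failed:", err)
			os.Exit(1)
		}
	}
	if err := run("RebuildKeyspaceGraph", "test_keyspace"); err != nil {
		fmt.Fprintln(os.Stderr, "RebuildKeyspaceGraph failed:", err)
		os.Exit(1)
	}
	if err := run("DeleteShard", "test_keyspace/0"); err != nil {
		fmt.Fprintln(os.Stderr, "DeleteShard failed:", err)
		os.Exit(1)
	}
}
```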

View file

@@ -54,14 +54,14 @@ live system, it errs on the side of safety, and will abort if any
tablet is not responding right.
The actions performed are (the parallel step is sketched in code after this list):
- any existing tablet replication is stopped. If any tablet fails
* any existing tablet replication is stopped. If any tablet fails
(because it is not available or not succeeding), we abort.
- the master-elect is initialized as a master.
- in parallel for each tablet, we do:
- on the master-elect, we insert an entry in a test table.
- on the slaves, we set the master, and wait for the entry in the test table.
- if any tablet fails, we error out.
- we then rebuild the serving graph for the shard.
* the master-elect is initialized as a master.
* in parallel for each tablet, we do:
* on the master-elect, we insert an entry in a test table.
* on the slaves, we set the master, and wait for the entry in the test table.
* if any tablet fails, we error out.
* we then rebuild the serving graph for the shard.
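The parallel step could look roughly like the following Go sketch. The `Tablet` interface and its methods are hypothetical stand-ins for the remote actions; the point is the fan-out and the abort-on-any-error rule:

```go
package reparent

import (
	"errors"
	"fmt"
	"sync"
)

// Tablet is a hypothetical stand-in for the remote actions described above.
type Tablet interface {
	Alias() string
	InsertTestEntry() error          // only meaningful on the master-elect
	SetMasterAndWaitForEntry() error // only meaningful on slaves
}

// validateReparent fans out to the master-elect and the slaves in parallel and
// returns an error if any tablet fails, mirroring the abort-on-error rule.
func validateReparent(masterElect Tablet, slaves []Tablet) error {
	var wg sync.WaitGroup
	errs := make(chan error, len(slaves)+1)

	wg.Add(1)
	go func() {
		defer wg.Done()
		if err := masterElect.InsertTestEntry(); err != nil {
			errs <- fmt.Errorf("master-elect %s: %w", masterElect.Alias(), err)
		}
	}()
	for _, s := range slaves {
		s := s
		wg.Add(1)
		go func() {
			defer wg.Done()
			if err := s.SetMasterAndWaitForEntry(); err != nil {
				errs <- fmt.Errorf("slave %s: %w", s.Alias(), err)
			}
		}()
	}
	wg.Wait()
	close(errs)

	var all []error
	for err := range errs {
		all = append(all, err)
	}
	return errors.Join(all...) // nil if every tablet succeeded
}
```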
### Planned Reparents: vtctl PlannedReparentShard
@@ -69,14 +69,14 @@ This command is used when both the current master and the new master
are alive and functioning properly.
The actions performed are:
- we tell the old master to go read-only. It then shuts down its query
* we tell the old master to go read-only. It then shuts down its query
service. We get its replication position back.
- we tell the master-elect to wait for that replication data, and then
* we tell the master-elect to wait for that replication data, and then
start being the master.
- in parallel for each tablet, we do:
- on the master-elect, we insert an entry in a test table. If that
* in parallel for each tablet, we do:
* on the master-elect, we insert an entry in a test table. If that
works, we update the MasterAlias record of the global Shard object.
- on the slaves (including the old master), we set the master, and
* on the slaves (including the old master), we set the master, and
wait for the entry in the test table. (if a slave wasn't
replicating, we don't change its state and don't start replication
after reparent)
@@ -95,15 +95,15 @@ just make sure the master-elect is the most advanced in replication
within all the available slaves, and reparent everybody.
The actions performed are:
- if the current master is still alive, we scrap it. That will make it
* if the current master is still alive, we scrap it. That will make it
stop what it's doing, stop its query service, and be unusable.
- we gather the current replication position on all slaves.
- we make sure the master-elect has the most advanced position.
- we promote the master-elect.
- in parallel for each tablet, we do:
- on the master-elect, we insert an entry in a test table. If that
* we gather the current replication position on all slaves.
* we make sure the master-elect has the most advanced position.
* we promote the master-elect.
* in parallel for each tablet, we do:
* on the master-elect, we insert an entry in a test table. If that
works, we update the MasterAlias record of the global Shard object.
- on the slaves (excluding the old master), we set the master, and
* on the slaves (excluding the old master), we set the master, and
wait for the entry in the test table. (if a slave wasn't
replicating, we don't change its state and don't start replication
after reparent)
@@ -121,33 +121,33 @@ servers. We then trigger the 'vtctl TabletExternallyReparented'
command.
The flow for that command is as follows:
- the shard is locked in the global topology server.
- we read the Shard object from the global topology server.
- we read all the tablets in the replication graph for the shard. Note
* the shard is locked in the global topology server.
* we read the Shard object from the global topology server.
* we read all the tablets in the replication graph for the shard. Note
we allow partial reads here, so if a data center is down, as long as
the data center containing the new master is up, we keep going.
- the new master performs a 'SlaveWasPromoted' action. This remote
* the new master performs a 'SlaveWasPromoted' action. This remote
action makes sure the new master is not a MySQL slave of another
server (the 'show slave status' command should not return anything,
meaning 'reset slave' should have been called).
- for every host in the replication graph, we call the
* for every host in the replication graph, we call the
'SlaveWasRestarted' action. It takes as parameter the address of the
new master. On each slave, we update the topology server record for
that tablet with the new master, and the replication graph for that
tablet as well.
- for the old master, if it doesn't successfully return from
* for the old master, if it doesn't successfully return from
'SlaveWasRestarted', we change its type to 'spare' (so a dead old
master doesn't interfere).
- we then update the Shard object with the new master.
- we rebuild the serving graph for that shard. This will update the
* we then update the Shard object with the new master.
* we rebuild the serving graph for that shard. This will update the
'master' record for sure, and also keep all the tablets that have
successfully reparented.
Failure cases:
- The global topology server has to be available for locking and
* The global topology server has to be available for locking and
modification during this operation. If not, the operation will just
fail.
- If a single topology server is down in one data center (and it's not
* If a single topology server is down in one data center (and it's not
the master data center), the tablets in that data center will be
ignored by the reparent. When the topology server comes back up,
just re-run 'vtctl InitTablet' on the tablets, and that will fix

View file

@@ -1,35 +1,40 @@
# Resharding
In Vitess, resharding describes the process of re-organizing data dynamically, with very minimal downtime (we manage to
completely perform most data transitions with less than 5 seconds of read-only downtime - new data cannot be written,
existing data can still be read).
In Vitess, resharding describes the process of re-organizing data
dynamically, with very minimal downtime (we manage to completely
perform most data transitions with less than 5 seconds of read-only
downtime - new data cannot be written, existing data can still be
read).
See the description of [how Sharding works in Vitess](Sharding.md) for
higher level concepts on Sharding.
## Process
To follow a step-by-step guide for how to shard a keyspace, you can see [this page](HorizontalReshardingGuide.md).
In general, the process to achieve this goal is composed of the following steps:
- pick the original shard(s)
- pick the destination shard(s) coverage
- create the destination shard(s) tablets (in a mode where they are not used to serve traffic yet)
- bring up the destination shard(s) tablets, with read-only masters.
- backup and split the data from the original shard(s)
- merge and import the data on the destination shard(s)
- start and run filtered replication from original to destination shard(s), catch up
- move the read-only traffic to the destination shard(s), stop serving read-only traffic from original shard(s). This transition can take a few hours. We might want to move rdonly separately from replica traffic.
- in quick succession:
- make original master(s) read-only
- flush filtered replication on all filtered replication source servers (after making sure they were caught up with their masters)
- wait until replication is caught up on all destination shard(s) masters
- move the write traffic to the destination shard(s)
- make destination master(s) read-write
- scrap the original shard(s)
* pick the original shard(s)
* pick the destination shard(s) coverage
* create the destination shard(s) tablets (in a mode where they are not used to serve traffic yet)
* bring up the destination shard(s) tablets, with read-only masters.
* backup and split the data from the original shard(s)
* merge and import the data on the destination shard(s)
* start and run filtered replication from original to destination shard(s), catch up
* move the read-only traffic to the destination shard(s), stop serving read-only traffic from original shard(s). This transition can take a few hours. We might want to move rdonly separately from replica traffic.
* in quick succession:
* make original master(s) read-only
* flush filtered replication on all filtered replication source servers (after making sure they were caught up with their masters)
* wait until replication is caught up on all destination shard(s) masters
* move the write traffic to the destination shard(s)
* make destination master(s) read-write
* scrap the original shard(s)
## Applications
The main applications we currently support:
- in a sharded keyspace, split or merge shards (horizontal sharding)
- in a non-sharded keyspace, break out some tables into a different keyspace (vertical sharding)
* in a sharded keyspace, split or merge shards (horizontal sharding)
* in a non-sharded keyspace, break out some tables into a different keyspace (vertical sharding)
With these supported features, it is very easy to start with a single keyspace containing all the data (multiple tables),
and then as the data grows, move tables to different keyspaces, start sharding some keyspaces, ... without any real
@@ -38,17 +43,17 @@ downtime for the application.
## Scaling Up and Down
Here is a quick summary of what to do with Vitess when a change is required:
- uniformly increase read capacity: add replicas, or split shards
- uniformly increase write capacity: split shards
- reclaim free space: merge shards / keyspaces
- increase geo-diversity: add new cells and new replicas
- cool a hot tablet: if read access, add replicas or split shards, if write access, split shards.
* uniformly increase read capacity: add replicas, or split shards
* uniformly increase write capacity: split shards
* reclaim free space: merge shards / keyspaces
* increase geo-diversity: add new cells and new replicas
* cool a hot tablet: if read access, add replicas or split shards, if write access, split shards.
## Filtered Replication
The cornerstone of Resharding is being able to replicate the right data. MySQL doesn't support any filtering, so the
Vitess project implements it entirely (a sketch of the filtering idea follows the list):
- the tablet server tags transactions with comments that describe what the scope of the statements is (which keyspace_id,
* the tablet server tags transactions with comments that describe what the scope of the statements is (which keyspace_id,
which table, ...). That way the MySQL binlogs contain all filtering data.
- a server process can filter and stream the MySQL binlogs (using the comments).
- a client process can apply the filtered logs locally (they are just regular SQL statements at this point).
* a server process can filter and stream the MySQL binlogs (using the comments).
* a client process can apply the filtered logs locally (they are just regular SQL statements at this point).
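A minimal sketch of the filtering idea, assuming a hypothetical comment tag of the form `/* keyspace_id:<hex> */` on each statement (the exact tag format used by Vitess may differ):

```go
package binlogfilter

import (
	"regexp"
	"strconv"
	"strings"
)

// keyspaceIDTag is a hypothetical comment format; statements in the binlog are
// assumed to be tagged like: UPDATE ... /* keyspace_id:1c2f */
var keyspaceIDTag = regexp.MustCompile(`/\*\s*keyspace_id:([0-9a-fA-F]+)\s*\*/`)

// KeyRange is a half-open [Start, End) range of keyspace ids, as used for shards.
type KeyRange struct {
	Start, End uint64 // End == 0 means "no upper bound" in this sketch
}

func (kr KeyRange) Contains(id uint64) bool {
	return id >= kr.Start && (kr.End == 0 || id < kr.End)
}

// ShouldApply reports whether a tagged statement belongs to the destination
// shard's key range. Untagged statements are skipped in this sketch.
func ShouldApply(statement string, dest KeyRange) bool {
	m := keyspaceIDTag.FindStringSubmatch(statement)
	if m == nil {
		return false
	}
	id, err := strconv.ParseUint(strings.ToLower(m[1]), 16, 64)
	if err != nil {
		return false
	}
	return dest.Contains(id)
}
```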

View file

@@ -42,11 +42,11 @@ $ vtctl -wait-time=30s ValidateSchemaKeyspace user
## Changing the Schema
Goals:
- simplify schema updates on the fleet
- minimize human actions / errors
- guarantee no or very little downtime for most schema updates
- do not store any permanent schema data in Topology Server, just use it for actions.
- only look at tables for now (not stored procedures or grants for instance, although they both could be added fairly easily in the same manner)
* simplify schema updates on the fleet
* minimize human actions / errors
* guarantee no or very little downtime for most schema updates
* do not store any permanent schema data in Topology Server, just use it for actions.
* only look at tables for now (not stored procedures or grants for instance, although they both could be added fairly easily in the same manner)
We're trying to get reasonable confidence that a schema update is going to work before applying it. Since we cannot really apply a change to live tables without potentially causing trouble, we have implemented a Preflight operation: it copies the current schema into a temporary database, applies the change there to validate it, and gathers the resulting schema. After this Preflight, we have a good idea of what to expect, and we can apply the change to any database and make sure it worked.
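A rough outline of that Preflight step, with hypothetical helpers standing in for the real schema management code:

```go
package schemamanager

import "fmt"

// SchemaChangeResult is a hypothetical summary of a Preflight run.
type SchemaChangeResult struct {
	BeforeSchema string
	AfterSchema  string
}

// Database is a hypothetical handle exposing just what this sketch needs.
type Database interface {
	DumpSchema(db string) (string, error)
	CopySchema(srcDB, dstDB string) error
	Apply(db, sql string) error
	Drop(db string) error
}

// Preflight copies the current schema into a temporary database, applies the
// change there to validate it, and gathers the resulting schema.
func Preflight(d Database, liveDB, change string) (*SchemaChangeResult, error) {
	before, err := d.DumpSchema(liveDB)
	if err != nil {
		return nil, err
	}
	tmp := "_vt_preflight"
	if err := d.CopySchema(liveDB, tmp); err != nil {
		return nil, err
	}
	defer d.Drop(tmp) // best-effort cleanup of the temporary database
	if err := d.Apply(tmp, change); err != nil {
		return nil, fmt.Errorf("change does not apply cleanly: %w", err)
	}
	after, err := d.DumpSchema(tmp)
	if err != nil {
		return nil, err
	}
	return &SchemaChangeResult{BeforeSchema: before, AfterSchema: after}, nil
}
```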
@@ -71,35 +71,37 @@ type SchemaChange struct {
```
Along with it comes the associated ApplySchema remote action for a tablet. The steps it performs (sketched in code below the lists) are:
- The database to use is either derived from the tablet dbName if UseVt is false, or is the _vt database. A use dbname is prepended to the Sql.
- (if BeforeSchema is not nil) read the schema, make sure it is equal to BeforeSchema. If not equal: if Force is not set, we will abort, if Force is set, we'll issue a warning and keep going.
- if AllowReplication is false, we'll disable replication (adding SET sql_log_bin=0 before the Sql).
- We will then apply the Sql command.
- (if AfterSchema is not nil) read the schema again, make sure it is equal to AfterSchema. If not equal: if Force is not set, we will issue an error, if Force is set, we'll issue a warning.
* The database to use is either derived from the tablet dbName if UseVt is false, or is the _vt database. A use dbname is prepended to the Sql.
* (if BeforeSchema is not nil) read the schema, make sure it is equal to BeforeSchema. If not equal: if Force is not set, we will abort, if Force is set, we'll issue a warning and keep going.
* if AllowReplication is false, we'll disable replication (adding SET sql_log_bin=0 before the Sql).
* We will then apply the Sql command.
* (if AfterSchema is not nil) read the schema again, make sure it is equal to AfterSchema. If not equal: if Force is not set, we will issue an error, if Force is set, we'll issue a warning.
We will return the following information:
- whether it worked or not (doh!)
- BeforeSchema
- AfterSchema
* whether it worked or not (doh!)
* BeforeSchema
* AfterSchema
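Those steps could be sketched roughly as follows; the callbacks are hypothetical stand-ins for the tablet's schema and query interfaces, not the actual Vitess implementation:

```go
package schemamanager

import (
	"fmt"
	"log"
)

// applySchemaChange mirrors the steps described above for a single tablet:
// optional BeforeSchema check, optional disabling of replication, applying the
// Sql, then an optional AfterSchema check. Empty beforeSchema/afterSchema
// strings stand in for "not set" in this sketch.
func applySchemaChange(
	sql, beforeSchema, afterSchema string,
	force, allowReplication bool,
	readSchema func() (string, error),
	exec func(stmt string) error,
) (before, after string, err error) {
	if beforeSchema != "" {
		if before, err = readSchema(); err != nil {
			return before, after, err
		}
		if before != beforeSchema {
			if !force {
				return before, after, fmt.Errorf("current schema does not match BeforeSchema, aborting")
			}
			log.Println("warning: schema mismatch before change, continuing because Force is set")
		}
	}
	if !allowReplication {
		sql = "SET sql_log_bin=0;\n" + sql // keep the change out of the binlogs
	}
	if err = exec(sql); err != nil {
		return before, after, err
	}
	if afterSchema != "" {
		if after, err = readSchema(); err != nil {
			return before, after, err
		}
		if after != afterSchema {
			if !force {
				return before, after, fmt.Errorf("resulting schema does not match AfterSchema")
			}
			log.Println("warning: schema mismatch after change, Force is set")
		}
	}
	return before, after, nil
}
```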
### Use case 1: Single tablet update:
- we first do a Preflight (to know what BeforeSchema and AfterSchema will be). This can be disabled, but is not recommended.
- we then do the schema upgrade. We will check BeforeSchema before the upgrade, and AfterSchema after the upgrade.
* we first do a Preflight (to know what BeforeSchema and AfterSchema will be). This can be disabled, but is not recommended.
* we then do the schema upgrade. We will check BeforeSchema before the upgrade, and AfterSchema after the upgrade.
### Use case 2: Single Shard update:
- need to figure out (or be told) if it's a simple or complex schema update (does it require the shell game?). For now we'll use a command line flag.
- in any case, do a Preflight on the master, to get the BeforeSchema and AfterSchema values.
- in any case, gather the schema on all databases, to see which ones have been upgraded already or not. This guarantees we can interrupt and restart a schema change. Also, this makes sure no action is currently running on the databases we're about to change.
- if simple:
- nobody has it: apply to master, very similar to a single tablet update.
- some tablets have it but not others: error out
- if complex: do the shell game while disabling replication. Skip the tablets that already have it. Have an option to re-parent at the end.
- Note the Backup and Lag servers won't apply a complex schema change. Only the servers actively in the replication graph will.
- the process can be interrupted at any time, restarting it as a complex schema upgrade should just work.
* need to figure out (or be told) if it's a simple or complex schema update (does it require the shell game?). For now we'll use a command line flag.
* in any case, do a Preflight on the master, to get the BeforeSchema and AfterSchema values.
* in any case, gather the schema on all databases, to see which ones have been upgraded already or not. This guarantees we can interrupt and restart a schema change. Also, this makes sure no action is currently running on the databases we're about to change.
* if simple:
* nobody has it: apply to master, very similar to a single tablet update.
* some tablets have it but not others: error out
* if complex: do the shell game while disabling replication. Skip the tablets that already have it. Have an option to re-parent at the end.
* Note the Backup and Lag servers won't apply a complex schema change. Only the servers actively in the replication graph will.
* the process can be interrupted at any time, restarting it as a complex schema upgrade should just work.
### Use case 3: Keyspace update:
- Similar to Single Shard, but the BeforeSchema and AfterSchema values are taken from the first shard, and used in all shards after that.
- We don't know the new masters to use on each shard, so just skip re-parenting altogether.
* Similar to Single Shard, but the BeforeSchema and AfterSchema values are taken from the first shard, and used in all shards after that.
* We don't know the new masters to use on each shard, so just skip re-parenting altogether.
This translates into the following vtctl commands:

View file

@@ -57,8 +57,8 @@ level picture of all the servers and their current state.
### vtworker
vtworker is meant to host long-running processes. It supports a plugin infrastructure, and offers libraries to easily pick tablets to use. We have developed:
- resharding differ jobs: meant to check data integrity during shard splits and joins.
- vertical split differ jobs: meant to check data integrity during vertical splits and joins.
* resharding differ jobs: meant to check data integrity during shard splits and joins.
* vertical split differ jobs: meant to check data integrity during vertical splits and joins.
It is very easy to add other checker processes for in-tablet integrity checks (verifying foreign key-like relationships), and cross shard data integrity (for instance, if a keyspace contains an index table referencing data in another keyspace).
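As an illustration of what such a differ job boils down to, here is a simplified Go sketch that compares two primary-key-sorted row streams; the real jobs also deal with key-range routing, chunking, throttling and retries, which this sketch ignores:

```go
package checker

// Row is a hypothetical (primary key, serialized value) pair, read in primary
// key order from a source and a destination tablet.
type Row struct {
	PK    string
	Value string
}

// Diff compares two primary-key-sorted row streams and reports mismatches:
// rows missing on either side and rows whose values differ.
func Diff(source, destination []Row) (missingOnDest, extraOnDest, mismatched []string) {
	i, j := 0, 0
	for i < len(source) && j < len(destination) {
		s, d := source[i], destination[j]
		switch {
		case s.PK < d.PK:
			missingOnDest = append(missingOnDest, s.PK)
			i++
		case s.PK > d.PK:
			extraOnDest = append(extraOnDest, d.PK)
			j++
		default:
			if s.Value != d.Value {
				mismatched = append(mismatched, s.PK)
			}
			i++
			j++
		}
	}
	for ; i < len(source); i++ {
		missingOnDest = append(missingOnDest, source[i].PK)
	}
	for ; j < len(destination); j++ {
		extraOnDest = append(extraOnDest, destination[j].PK)
	}
	return missingOnDest, extraOnDest, mismatched
}
```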

View file

@@ -41,12 +41,12 @@ An entire Keyspace can be locked. We use this during resharding for instance, wh
### Shard
A Shard contains a subset of the data for a Keyspace. The Shard record in the global topology contains:
- the MySQL Master tablet alias for this shard
- the sharding key range covered by this Shard inside the Keyspace
- the tablet types this Shard is serving (master, replica, batch, …), per cell if necessary.
- if during filtered replication, the source shards this shard is replicating from
- the list of cells that have tablets in this shard
- shard-global tablet controls, like blacklisted tables no tablet should serve in this shard
* the MySQL Master tablet alias for this shard
* the sharding key range covered by this Shard inside the Keyspace
* the tablet types this Shard is serving (master, replica, batch, …), per cell if necessary.
* if during filtered replication, the source shards this shard is replicating from
* the list of cells that have tablets in this shard
* shard-global tablet controls, like blacklisted tables no tablet should serve in this shard
A Shard can be locked. We use this during operations that affect either the Shard record, or multiple tablets within a Shard (like reparenting), so multiple jobs don't concurrently alter the data.
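Spelled out as a data structure, the Shard record described above might look like this illustrative Go sketch (the field names are not the actual Vitess topology types):

```go
package topo

// KeyRange is the sharding key range covered by a Shard, as a half-open
// interval of hex-encoded keyspace ids.
type KeyRange struct {
	Start string
	End   string
}

// SourceShard describes a shard this shard is replicating from during
// filtered replication.
type SourceShard struct {
	Keyspace string
	Shard    string
	KeyRange KeyRange
}

// Shard is an illustrative version of the global Shard record described above.
type Shard struct {
	MasterAlias       string              // tablet alias of the MySQL master
	KeyRange          KeyRange            // key range covered inside the Keyspace
	ServedTypesByCell map[string][]string // tablet types served, per cell if necessary
	SourceShards      []SourceShard       // set during filtered replication
	Cells             []string            // cells that have tablets in this shard
	BlacklistedTables []string            // shard-global control: tables no tablet should serve
}
```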
@@ -61,19 +61,19 @@ This section describes the data structures stored in the local instance (per cel
### Tablets
The Tablet record has a lot of information about a single vttablet process running inside a tablet (along with the MySQL process):
- the Tablet Alias (cell+unique id) that uniquely identifies the Tablet
- the Hostname, IP address and port map of the Tablet
- the current Tablet type (master, replica, batch, spare, …)
- which Keyspace / Shard the tablet is part of
- the health map for the Tablet (if in degraded mode)
- the sharding Key Range served by this Tablet
- user-specified tag map (to store per installation data for instance)
* the Tablet Alias (cell+unique id) that uniquely identifies the Tablet
* the Hostname, IP address and port map of the Tablet
* the current Tablet type (master, replica, batch, spare, …)
* which Keyspace / Shard the tablet is part of
* the health map for the Tablet (if in degraded mode)
* the sharding Key Range served by this Tablet
* user-specified tag map (to store per installation data for instance)
A Tablet record is created before a tablet can be running (either by `vtctl InitTablet` or by passing the `init_*` parameters to vttablet). The only way a Tablet record will be updated is one of:
- The vttablet process itself owns the record while it is running, and can change it.
- At init time, before the tablet starts
- After shutdown, when the tablet gets scrapped or deleted.
- If a tablet becomes unresponsive, it may be forced to spare to remove it from the serving graph (such as when reparenting away from a dead master, by the `vtctl ReparentShard` action).
* The vttablet process itself owns the record while it is running, and can change it.
* At init time, before the tablet starts
* After shutdown, when the tablet gets scrapped or deleted.
* If a tablet becomes unresponsive, it may be forced to spare to remove it from the serving graph (such as when reparenting away from a dead master, by the `vtctl ReparentShard` action).
### Replication Graph
@@ -86,16 +86,16 @@ The Serving Graph is what the clients use to find which EndPoints to send querie
#### SrvKeyspace
It is the local representation of a Keyspace. It contains information on what shard to use for getting to the data (but not information about each individual shard):
- the partitions map is keyed by the tablet type (master, replica, batch, …) and the values are lists of shards to use for serving.
- it also contains the global Keyspace fields, copied for fast access.
* the partitions map is keyed by the tablet type (master, replica, batch, …) and the values are lists of shards to use for serving.
* it also contains the global Keyspace fields, copied for fast access.
It can be rebuilt by running `vtctl RebuildKeyspaceGraph`. It is not automatically rebuilt when adding new tablets in a cell, as this would cause too much overhead and is only needed once per cell/keyspace. It may also be changed during horizontal and vertical splits.
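To make the "what shard to use" lookup concrete, here is a hypothetical sketch of resolving a keyspace id against such a partitions map (the types and the string comparison of fixed-width hex key ranges are simplifications):

```go
package serving

import "fmt"

// ShardReference names a shard and the keyspace-id range it covers
// (fixed-width hex strings, half-open range; empty End means unbounded).
type ShardReference struct {
	Name  string
	Start string
	End   string
}

// SrvKeyspace is an illustrative per-cell keyspace record: the partitions map
// is keyed by tablet type and lists the shards to use for serving.
type SrvKeyspace struct {
	Partitions map[string][]ShardReference
}

// ShardFor returns the shard serving the given keyspace id for a tablet type.
// Fixed-width hex strings compare correctly as plain strings in this sketch.
func (sk *SrvKeyspace) ShardFor(tabletType, keyspaceID string) (string, error) {
	for _, ref := range sk.Partitions[tabletType] {
		if keyspaceID >= ref.Start && (ref.End == "" || keyspaceID < ref.End) {
			return ref.Name, nil
		}
	}
	return "", fmt.Errorf("no %s shard covers keyspace id %s", tabletType, keyspaceID)
}
```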
#### SrvShard
It is the local representation of a Shard. It contains information on details internal to this Shard only, but not to any tablet running in this shard:
- the name and sharding Key Range for this Shard.
- the cell that has the master for this Shard.
* the name and sharding Key Range for this Shard.
* the cell that has the master for this Shard.
It is possible to lock a SrvShard object, to massively update all EndPoints in it.
@@ -104,10 +104,10 @@ It can be rebuilt (along with all the EndPoints in this Shard) by running `vtctl
#### EndPoints
For each possible serving type (master, replica, batch), in each Cell / Keyspace / Shard, we maintain a rolled-up EndPoint list. Each entry in the list has information about one Tablet:
- the Tablet Uid
- the Host on which the Tablet resides
- the port map for that Tablet
- the health map for that Tablet
* the Tablet Uid
* the Host on which the Tablet resides
* the port map for that Tablet
* the health map for that Tablet
## Workflows Involving the Topology Server
@@ -144,13 +144,13 @@ For locking, we use an auto-incrementing file name in the `/action` subdirectory
Note the paths used to store global and per-cell data do not overlap, so a single ZK can be used for both global and local ZKs. This is however not recommended, for reliability reasons.
- Keyspace: `/zk/global/vt/keyspaces/<keyspace>`
- Shard: `/zk/global/vt/keyspaces/<keyspace>/shards/<shard>`
- Tablet: `/zk/<cell>/vt/tablets/<uid>`
- Replication Graph: `/zk/<cell>/vt/replication/<keyspace>/<shard>`
- SrvKeyspace: `/zk/<cell>/vt/ns/<keyspace>`
- SrvShard: `/zk/<cell>/vt/ns/<keyspace>/<shard>`
- EndPoints: `/zk/<cell>/vt/ns/<keyspace>/<shard>/<tablet type>`
* Keyspace: `/zk/global/vt/keyspaces/<keyspace>`
* Shard: `/zk/global/vt/keyspaces/<keyspace>/shards/<shard>`
* Tablet: `/zk/<cell>/vt/tablets/<uid>`
* Replication Graph: `/zk/<cell>/vt/replication/<keyspace>/<shard>`
* SrvKeyspace: `/zk/<cell>/vt/ns/<keyspace>`
* SrvShard: `/zk/<cell>/vt/ns/<keyspace>/<shard>`
* EndPoints: `/zk/<cell>/vt/ns/<keyspace>/<shard>/<tablet type>`
We provide the 'zk' utility for easy access to the topology data in ZooKeeper. For instance:
```
@@ -171,11 +171,11 @@ We use the `_Data` filename to store the data, JSON encoded.
For locking, we store a `_Lock` file with various contents in the directory that contains the object to lock.
We use the following paths (composed as in the sketch after the list):
- Keyspace: `/vt/keyspaces/<keyspace>/_Data`
- Shard: `/vt/keyspaces/<keyspace>/<shard>/_Data`
- Tablet: `/vt/tablets/<cell>-<uid>/_Data`
- Replication Graph: `/vt/replication/<keyspace>/<shard>/_Data`
- SrvKeyspace: `/vt/ns/<keyspace>/_Data`
- SrvShard: `/vt/ns/<keyspace>/<shard>/_Data`
- EndPoints: `/vt/ns/<keyspace>/<shard>/<tablet type>`
* Keyspace: `/vt/keyspaces/<keyspace>/_Data`
* Shard: `/vt/keyspaces/<keyspace>/<shard>/_Data`
* Tablet: `/vt/tablets/<cell>-<uid>/_Data`
* Replication Graph: `/vt/replication/<keyspace>/<shard>/_Data`
* SrvKeyspace: `/vt/ns/<keyspace>/_Data`
* SrvShard: `/vt/ns/<keyspace>/<shard>/_Data`
* EndPoints: `/vt/ns/<keyspace>/<shard>/<tablet type>`
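Assuming that layout, the paths can be composed mechanically, as in this small Go sketch (the zero-padding of the tablet uid is an assumption, not something this layout specifies):

```go
package topopaths

import "fmt"

// Path helpers for the `_Data` topology layout described above. Only the
// Tablet path embeds the cell; for the other records the cell selects which
// topology server instance to talk to.

func keyspacePath(keyspace string) string {
	return fmt.Sprintf("/vt/keyspaces/%s/_Data", keyspace)
}

func shardPath(keyspace, shard string) string {
	return fmt.Sprintf("/vt/keyspaces/%s/%s/_Data", keyspace, shard)
}

// tabletPath assumes the uid is zero-padded to a fixed width; the exact
// formatting is an implementation detail.
func tabletPath(cell string, uid uint32) string {
	return fmt.Sprintf("/vt/tablets/%s-%010d/_Data", cell, uid)
}

func srvKeyspacePath(keyspace string) string {
	return fmt.Sprintf("/vt/ns/%s/_Data", keyspace)
}

func endPointsPath(keyspace, shard, tabletType string) string {
	return fmt.Sprintf("/vt/ns/%s/%s/%s", keyspace, shard, tabletType)
}
```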