doc: Proof-read Life of a Query and made some corrections.

This commit is contained in:
Michael Berlin 2015-02-18 18:11:06 -08:00
Родитель 3e54278192
Коммит a23df51dd5
1 изменённых файлов: 20 добавлений и 19 удалений

Просмотреть файл

@ -4,7 +4,7 @@ Life of A Query
* [From Client to VtGate](#from-client-to-vtgate)
* [From VtGate to VtTablet](#from-vtgate-to-vttablet)
* [From VtTablet to MySQL](#from-vttablet-to-mysql)
* [Put it all together](#put-it-all-together)
* [Putting it all together](#putting-it-all-together)
* [TopoServer](#toposerver)
* [Streaming Query](#streaming-query)
* [Scatter Query](#scatter-query)
@ -13,57 +13,57 @@ Life of A Query
* [VtGate to VtTablet Code Path](#vtgate-to-vttablet-code-path)
* [VtTablet to MySQL Code Path](#vttablet-to-mysql-code-path)
A query means a request for information from database and it involves four componenets in the case of Vitess, including client application, VtGate, VtTablet and MySQL instance. This doc explains interaction happens between and within components.
A query means a request for information from database and it involves four components in the case of Vitess, including the client application, VtGate, VtTablet and MySQL instance. This doc explains the interaction which happens between and within components.
![](https://raw.githubusercontent.com/youtube/vitess/master/doc/life_of_a_query.png)
At a very high level, as the graph shows, first client sends a query to VtGate. VtGate then resolves the query and routes it to the right VtTablets. For each VtTablet that receives the query, it does necessary validations and passes the query to underlying MySQL instance. After gathering results from MySQL, VtTablet sends response back to VtGate. Once VtGate receives response from all VtTablets, it sends the combined result to client. In the presence of VtTablet errors, VtGate will retry the query if errors are recoverable and it only fails the query if either errors are unrecoverable or maximum retry times has been reached.
At a very high level, as the graph shows, first the client sends a query to VtGate. VtGate then resolves the query and routes it to the right VtTablets. For each VtTablet that receives the query, it does necessary validations and passes the query to the underlying MySQL instance. After gathering results from MySQL, VtTablet sends the response back to VtGate. Once VtGate receives responses from all VtTablets, it sends the combined result to the client. In the presence of VtTablet errors, VtGate will retry the query if errors are recoverable and it only fails the query if either errors are unrecoverable or the maximum number of retries has been reached.
## From Client to VtGate
A client application first sends a bson rpc with an embedded sql query to VtGate. VtGate's rpc server unmarshals this rpc request, call appropriate VtGate method and return its result back to client. As following graph shows, VtGate has a rpc server that listens to localhost:port/\_bson\_rpc\_ for http requests and localhost:port/\_bson\_rpc\_/auth for https requests.
A client application first sends a bson rpc with an embedded sql query to VtGate. VtGate's rpc server unmarshals this rpc request, calls the appropriate VtGate method and return its result back to client. VtGate has an rpc server that listens to localhost:port/\_bson\_rpc\_ for http requests and localhost:port/\_bson\_rpc\_/auth for https requests.
![](https://raw.githubusercontent.com/youtube/vitess/master/doc/life_of_a_query_client_to_vtgate.png)
VtGate keeps an in-memory table that stores all available rpc methods for each service, e.g. VtGate uses "VTGate" as its service name and most its methods defined in [go/vt/vtgate/vtgate.go](../go/vt/vtgate/vtgate.go) are used to serve rpc request [go/rpcplus/server.go](../go/rpcplus/server.go).
VtGate keeps an in-memory table that stores all available rpc methods for each service, e.g. VtGate uses "VTGate" as its service name and most of its methods defined in [go/vt/vtgate/vtgate.go](../go/vt/vtgate/vtgate.go) are used to serve rpc request [go/rpcplus/server.go](../go/rpcplus/server.go).
## From VtGate to VtTablet
![](https://raw.githubusercontent.com/youtube/vitess/master/doc/life_of_a_query_vtgate_to_vttablet.png)
After receiving a rpc call from client and one of its Execute* method being invoked, VtGate needs to figure out which shards should receive the query and send query to each of them. In addition, VtGate talks to topo server to get necessary information to create a VtTablet connection for each shard. At this point, VtGate is able to send query to the right VtTablets in parallel. VtGate also does retry if timeout happens or some VtTablets return recoverable errors.
After receiving an rpc call from the client and one of its Execute* method being invoked, VtGate needs to figure out which shards should receive the query and send it to each of them. In addition, VtGate talks to the topo server to get necessary information to create a VtTablet connection for each shard. At this point, VtGate is able to send the query to the right VtTablets in parallel. VtGate also does retry if timeout happens or some VtTablets return recoverable errors.
Internally, VtGate has a ScatterConn instance and uses it to execute queries across multiple ShardConn connections. A ScatterConn performs the query on selected shards in parallel. It first obtains a ShardConn connection for every shard and sends query use ShardConn's execute method. If the requested session is in a transaction, it will open a new transactions on the connection, and updates the Session with the transaction id. If the session already contains a transaction id for the shard, it reuses it. If there are any unrecoverable errors during a transaction, it rolls back the transaction for all shards.
Internally, VtGate has a ScatterConn instance and uses it to execute queries across multiple ShardConn connections. A ScatterConn performs the query on selected shards in parallel. It first obtains a ShardConn connection for every shard and sends the query using the ShardConn's execute method. If the requested session is in a transaction, it will open a new transaction on the connection, and update the Session with the transaction id. If the session already contains a transaction id for the shard, it reuses it. If there are any unrecoverable errors during a transaction, it rolls back the transaction for all shards.
A ShardConn object represents a load balanced connection to a group of VtTablets that belong the same shard. ShardConn can be concurrently used across goroutines.
A ShardConn object represents a load balanced connection to a group of VtTablets that belong to the same shard. ShardConn can be concurrently used across goroutines.
## From VtTablet to MySQL
![](https://raw.githubusercontent.com/youtube/vitess/master/doc/life_of_a_query_vttablet_to_mysql.png)
Once received a rpc call from VtGate, VtTablet do a few checks before passing query to MySQL. It first validates the current VtTablet state including sessions id, then generates a query plan and applies predefined query rules and do ACL check. It also checks whether the query hits row cache and returns result immediately if so. In addition, VtTablet consolidates duplicate queries from executing simultaneously and shares results between them. At this point, VtTablet has no way but pass the query down to MySQL layer and waits for the result.
Once VtTablet received an rpc call from VtGate, it does a few checks before passing the query to MySQL. First, it validates the current VtTablet state including the session id, then generates a query plan and applies predefined query rules and does ACL checks. It also checks whether the query hits the row cache and returns the result immediately if so. In addition, VtTablet consolidates duplicate queries from executing simultaneously and shares results between them. At this point, VtTablet has no way but pass the query down to MySQL layer and wait for the result.
## Put it all together
## Putting it all together
![](https://raw.githubusercontent.com/youtube/vitess/master/doc/life_of_a_query_all.png)
## TopoServer
A topo server stores information to help VtGate navigate query to the right VtTablets. It contains keyspace to shards mappings, keyspace id to shard mapping and ports that a VtTablet listens to (EndPoint). VtGates caches those information in the memory and periodically do updates if there are changes happened in the topo server.
A topo server stores information to help VtGate navigate the query to the right VtTablets. It contains keyspace to shards mappings, keyspace id to shards mappings and ports that a VtTablet listens to (EndPoint). VtGate caches this information in memory and periodically updates it if changes have happened in the topo server.
## Streaming Query
In general speaking, a streaming query means query results will be returned as a stream. In Vitess's case, both VtGate and VtTablet will send result back as soon as it is available. VtTablet by default will collect a fixed number of rows returned from MySQL, send them back to VtGate and repeats the above step until all rows have been returned.
Generally speaking, a streaming query means query results will be returned as a stream. In Vitess' case, both VtGate and VtTablet will send results back as soon as it is available. VtTablet by default will collect a fixed number of rows returned from MySQL, send them back to VtGate and repeats the above step until all rows have been returned.
## Scatter Query
A scatter query, as its name indicates, will hit multiple shards. In Vitess, a scatter query is recognized once VtGate determines a query need to hit multiple VtTablets. VtGate then sends query to these VtTablets, assembles the result after receiving all response and returns the combined result to the client.
A scatter query, as its name indicates, will hit multiple shards. In Vitess, a scatter query is recognized once VtGate determines a query needs to hit multiple VtTablets. VtGate then sends the query to these VtTablets, assembles the result after receiving all responses and returns the combined result to the client.
## Misc
### Rpc Server Code Path (VtGate)
Init a rpc server
Init an rpc server
```
go/cmd/vtgate/vtgate.go: main() ->
@ -74,18 +74,19 @@ go/cmd/vtgate/vtgate.go: main() ->
go/rpcwrap/rpcwrap.go: ServeRPC("bson", NewServerCodec) -> // common code to register rpc server
```
ServeRPC("bson", NewServerCodec) register a rpcHandler instance whose ServeHTTP(http.ResponseWriter, *http.Request) will be called for every http request
ServeRPC("bson", NewServerCodec) registers an rpcHandler instance whose ServeHTTP(http.ResponseWriter, *http.Request) will be called for every http request
Rpc server handle http request
The rpc server handles the http request:
```
go/rpcwrap/rpcwrap.go rpcHandler.ServeHTTP ->
go/rpcwrap/rpcwrap.go rpcHandler.server.ServeCodecWithContext ->
go/rpcplus/server.go Server.ServeCodecWithContext(context.Context, ServerCodec) (note: rpcHandler uses a global DefaultServer instance defined in the sever.go) ->
go/rpcplus/server.go Server.readRequest(ServeCodec) will use a given codec to extract (service, methodType, request, request arguments, reply value, keep reading), go/rpcplus/server.go
Finally we do "service.call(..)" with parameters provided in the request. In the current setup, service.call will always call some method in VtGate (go/vt/vtgate/vtgate.go).
go/rpcplus/server.go Server.ServeCodecWithContext(context.Context, ServerCodec) (note: rpcHandler uses a global DefaultServer instance defined in the server.go) ->
go/rpcplus/server.go Server.readRequest(ServeCodec) will use a given codec to extract (service, methodType, request, request arguments, reply value, keep reading)
```
Finally we do "service.call(..)" with parameters provided in the request. In the current setup, service.call will always call some method in VtGate (go/vt/vtgate/vtgate.go).
### VtGate to VtTablet Code Path
Here is the code path for a query with keyspace id.