Currently, the worker tests are hanging from time to time.
Travis kills these tests automatically after 10 minutes (of not
receiving any output). We can be more aggressive about it and kill them
ourselves after 5 minutes.
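A minimal sketch of the idea, not the actual change: a wrapper that kills a test command once it exceeds a time budget. The helper name `run_with_timeout` is hypothetical, and note that this limits total runtime, whereas Travis' own limit applies to time without output.

```python
import subprocess
import sys

TIMEOUT_SECONDS = 5 * 60  # stricter than Travis' 10-minute no-output limit


def run_with_timeout(cmd, timeout=TIMEOUT_SECONDS):
    """Run cmd; return its exit code, or None if it was killed on timeout.

    Hypothetical helper for illustration only.
    """
    try:
        return subprocess.run(cmd, timeout=timeout).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() has already killed the child at this point.
        return None
```

A caller would treat a `None` result as a hung test and report it as a failure.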
1. move common dependencies into parent pom.
2. upgrade com.google.guava to 18.0.
3. rename java build target to java_client_test.
4. fix makefile so java client test will run tests from each sub package.
We sporadically see that the worker.py test times out on Travis CI after 10 minutes. Enabling logging will give us better visibility to debug the problem.
1. Get rid of SetupCommand.java and move its logic to TestEnv.
2. Add getRpcClientFactory method in TestEnv.
3. Use setters & getters in TestEnv.
4. define vtgate.test.env and vtgate.rpcclient.factory properties in vtgate-client/pom.xml.
I removed functions from client.py because they need
to be rethought in light of the new API, but missed
removing them from client_test.py.
I've now added client_test to our integration tests
to make sure it doesn't get missed.
I'll submit this right away because it breaks import.
Moved tests out of the "ci_skip_integration_test" target because they don't seem to be so flaky.
Included "ci_skip_integration_test" target in Travis because the tests in there don't seem to be so flaky and we want maximum coverage.
Of the values 1, 2 and 4, 4 resulted in the shortest total duration. 4 was also better than 2, which is the capped number of available cores in a Travis CI container.
I suspect that the default is 32 (the number of cores) and that by setting it explicitly we effectively reduce the value. That reduces the stress on the system, and therefore everything runs faster.
With this change, the following dependencies are also no longer installed:
- New Relic monitoring (no longer necessary)
- Java dependencies (no longer necessary since we killed most of the old Java code)
Use a single-line shell command to generate the proto files instead of
hard-coding the command for each proto.
Steps:
1. list all proto files.
2. remove 'proto/' prefix and '.proto' suffix.
3. run protoc for each proto and put in go/vt/proto/${proto_file_name}
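The three steps above can be sketched as follows. This is an illustration in Python rather than the shell one-liner itself; the function name `protoc_commands` is hypothetical, and the `--proto_path` and `--go_out` flags are an assumption about how protoc is invoked here. The commands are only built, not executed.

```python
def protoc_commands(proto_files):
    """Build one protoc invocation per proto file (returned, not run).

    Hypothetical helper; the real build may pass different flags.
    """
    commands = []
    for path in sorted(proto_files):
        # Step 2: strip the 'proto/' prefix and the '.proto' suffix.
        name = path[len("proto/"):-len(".proto")]
        # Step 3: each proto gets its own output directory.
        out_dir = f"go/vt/proto/{name}"
        # Assumption: these flags are illustrative only.
        commands.append(
            ["protoc", "--proto_path=proto", f"--go_out={out_dir}", path]
        )
    return commands
```

Step 1 (listing the files) would then be something like `protoc_commands(glob.glob("proto/*.proto"))`, with each resulting command passed to `subprocess.run`.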
cannot have the same <package name>.<data type> as these, or we cannot
load them at the same time. So to fix this:
- rename the conflicting ones from xxx.proto to xxxdata.proto.
- rename vtgateservice.VTGate to vtgateservice.Vitess.
Note we can still change the names I chose here, just not back to
conflicting ones. If anyone has better ideas, we can implement
in subsequent changes. This is to get the import to google3 unstuck.
The automation framework allows automating cluster operations which
require a series of manual steps, e.g. resharding.
A Cluster Operation has a list of task containers which are processed
sequentially. Each task container can contain one or more tasks which
will be executed in parallel.
Here's an example of a cluster operation with two task containers. The
second task container has two tasks:
- step 1
- step 2a | step 2b
If the task container contains one task, the task can emit new task
containers which will be inserted after the current task container. This
mechanism is used to fully expand Cluster Operations by special tasks
which emit new task containers e.g. "ReshardingTask".
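The scheduling model described above can be sketched in a few lines. This is not the framework's code: every name is hypothetical, a task is modeled as a callable that returns the task containers it emits (or an empty list), and a task container is a plain list of tasks.

```python
from concurrent.futures import ThreadPoolExecutor


def run_cluster_operation(containers):
    """Process task containers in order; tasks within one run in parallel.

    Hypothetical model: each task is a callable returning a list of new
    task containers to insert after the current one (possibly empty).
    Returns the fully expanded list of containers.
    """
    queue = list(containers)
    i = 0
    while i < len(queue):
        container = queue[i]
        workers = max(1, len(container))
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(lambda task: task(), container))
        # Only a container holding exactly one task may emit new containers.
        if len(container) == 1 and results[0]:
            queue[i + 1:i + 1] = results[0]
        i += 1
    return queue


def make_task(name, trace, emits=None):
    """Create a task that records its name and emits the given containers."""
    def task():
        trace.append(name)
        return emits or []
    return task
```

For example, a single "ReshardingTask"-style task that emits the two containers from the example above (step 1, then step 2a and step 2b in parallel) expands the operation from one container to three.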
This patchset implements the minimal steps to automate "resharding",
but the task implementations for "vtctl" and "vtworker" are still missing.
These will be added in later, separate commits.
With this change, developers must have "mvn" installed or "make test"
will fail.
Travis CI already ran the test. Now "make test" and the Travis CI config are in sync again.