Previously, with workspace=temp-schema (the default), all workspace operations
were written to the binary log, meaning that replicas would execute them. This
can be undesirable for performance or operational cleanliness reasons, since
running the operations on replicas serves no real purpose: the temp schema is
dropped after each run, and Skeema won't interact with the replicas anyway.
This commit adds a new enum option, temp-schema-binlog, which can be set to
either "on" (keep the old behavior of writing these queries to binlog), "off"
(skip binlogging of workspace queries -- error if non-super-user), or the
default of "auto" (skip binlogging of workspace queries only if super-user).
This relates to issue #93.
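
The three values can be read as a small decision table. The following is a minimal sketch of that table only, not Skeema's actual implementation: the function name skipBinlog and the isSuper parameter are hypothetical, and how the super-user check is performed is left out.

```go
package main

import (
	"errors"
	"fmt"
)

// skipBinlog reports whether workspace queries should bypass the binary log.
// The mode values mirror the temp-schema-binlog option; the function name and
// the isSuper parameter are hypothetical and used only for illustration.
func skipBinlog(mode string, isSuper bool) (bool, error) {
	switch mode {
	case "on":
		// Old behavior: always write workspace queries to the binlog.
		return false, nil
	case "off":
		// Skipping is mandatory, so a non-super-user is an error.
		if !isSuper {
			return false, errors.New("temp-schema-binlog=off requires super-user privileges")
		}
		return true, nil
	case "auto":
		// Skip only when the user is able to.
		return isSuper, nil
	default:
		return false, fmt.Errorf("invalid temp-schema-binlog value %q", mode)
	}
}

func main() {
	for _, mode := range []string{"on", "off", "auto"} {
		for _, isSuper := range []bool{true, false} {
			skip, err := skipBinlog(mode, isSuper)
			fmt.Printf("mode=%s super=%t skip=%t err=%v\n", mode, isSuper, skip, err)
		}
	}
}
```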
This commit adds a new option, temp-schema-threads, which controls the
concurrency level for workspace CREATE and DROP queries when using
workspace=temp-schema. The default value for this option is 5 threads, which
is a reduction from the previous hardcoded default concurrency of 10 threads.
Users may wish to raise this value to improve Skeema's performance in some
scenarios, or lower this value to prevent mutex contention in other scenarios.
Refer to the option reference for more information about tuning this option.
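
As an illustration of what a concurrency limit of this kind means, here is a minimal sketch of a bounded worker pattern using a buffered channel as a semaphore. It is not Skeema's code: runBounded and its parameters are hypothetical, and the sleep stands in for a CREATE or DROP query.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runBounded calls fn once per item with at most `threads` calls in flight,
// and returns the peak number of concurrent calls it observed.
// Hypothetical helper for illustration only.
func runBounded(items []string, threads int, fn func(string)) int {
	if threads < 1 {
		threads = 1 // a zero-capacity semaphore would block forever
	}
	sem := make(chan struct{}, threads)
	var wg sync.WaitGroup
	var mu sync.Mutex
	inFlight, peak := 0, 0

	for _, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while `threads` calls are already running
		go func(item string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done

			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			fn(item)

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(item)
	}
	wg.Wait()
	return peak
}

func main() {
	tables := []string{"t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8"}
	peak := runBounded(tables, 5, func(name string) {
		time.Sleep(10 * time.Millisecond) // stand-in for a CREATE or DROP query
	})
	fmt.Println("peak concurrency:", peak)
}
```

A lower limit means fewer statements compete for the same server-side locks at once, while a higher limit reduces total wall-clock time when each statement is dominated by network latency.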
This relates to #93. That issue remains open for now, since an additional
related improvement is also planned (the ability to disable binlogging for
temp-schema operations).
This commit deprecates the reuse-temp-schema option, since it is not commonly
used, and doesn't provide any real advantages over the default behavior of
dropping and recreating the temp schema each time.
This commit also fixes two edge case bugs in reuse-temp-schema's behavior:
* When used in an environment with multiple schemas per instance and
differing schema-level default charset or collation values, a previously-
processed schema's defaults could remain in-place on the temp schema,
inadvertently affecting the introspected workspace for subsequent schemas.
* When used in the presence of stored procs/funcs, the procs/funcs would not
be cleaned up between runs, typically causing the subsequent run to fail
with duplicate routine errors.
Integration testing coverage has been added to ensure that the temp schema is
truly empty of all supported object types prior to use. This will avoid an
undetected regression if e.g. support for views or triggers is added to a
future version of Skeema.
* Despite being enums, lint-* options now treat falsey values (including
--skip- prefix) as equivalent to "ignore"
* Enum values are now lowercase throughout all usage messages and docs. In
terms of functionality, enums have always been case-insensitive, but the
use of all-caps in usage/docs led many to think it was required.
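
A minimal sketch of the normalization described in these two points, under stated assumptions: the function name normalizeLintValue is hypothetical, the exact set of strings treated as falsey ("", "0", "off", "false") is an assumption, and "WARNING" is used only as an illustrative enum value.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeLintValue lowercases an enum value and maps falsey values to
// "ignore". The list of falsey strings here is an assumption made for this
// example, not a statement of what Skeema accepts.
func normalizeLintValue(raw string) string {
	v := strings.ToLower(strings.TrimSpace(raw))
	switch v {
	case "", "0", "off", "false":
		return "ignore"
	}
	return v
}

func main() {
	for _, raw := range []string{"WARNING", "Ignore", "false", "0"} {
		fmt.Printf("%q -> %q\n", raw, normalizeLintValue(raw))
	}
}
```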
When using workspace=docker, if Skeema finds a pre-existing container created
from a previous run, the image of the container is verified. The previous
logic was overly strict, erroring in two valid situations:
* If the local Docker tag for the major.minor version now points to a
different patch release, due to other non-Skeema Docker usage on the system
* If the version of Docker is several years old and uses the previous
hashing scheme for image identification
In either of these cases, Skeema no longer errors, as long as the Dockerized
instance's flavor (vendor and major.minor version) can be introspected and
correctly matches that of the desired image (via Skeema's flavor option).
The host-wrapper and schema options both support shelling out to an external
script, whose output is parsed to obtain a list of host:port or schema names,
respectively. When parsing the output, these options support a few different
delimiters (in order from highest precedence: newlines, commas, tabs, spaces)
and attempt to split the output on the highest-precedence delimiter found in
the output string.
However, since newline has the highest precedence, this means output with a
trailing newline would cause Skeema to assume newline is the delimiter, even
if the output was actually using some other delimiter.
This commit fixes the bug by removing leading and trailing whitespace of any
type, prior to searching the output for delimiters.
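
The fixed parsing can be sketched as follows. This is a minimal illustration rather than Skeema's actual code: splitOutput is a hypothetical name, and trimming each individual token and dropping empty ones is an extra assumption beyond what is described above.

```go
package main

import (
	"fmt"
	"strings"
)

// splitOutput trims surrounding whitespace, then splits on the
// highest-precedence delimiter present in the remaining text.
// Precedence order: newline, comma, tab, space.
func splitOutput(output string) []string {
	output = strings.TrimSpace(output)
	if output == "" {
		return nil
	}
	for _, delim := range []string{"\n", ",", "\t", " "} {
		if !strings.Contains(output, delim) {
			continue
		}
		var tokens []string
		for _, tok := range strings.Split(output, delim) {
			// Assumption: per-token trimming and skipping of empty tokens.
			if tok = strings.TrimSpace(tok); tok != "" {
				tokens = append(tokens, tok)
			}
		}
		return tokens
	}
	return []string{output} // no delimiter found: a single value
}

func main() {
	// Without the trim, the trailing newline would win over the commas.
	fmt.Printf("%q\n", splitOutput("db1:3306,db2:3306\n"))
}
```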
Additionally, this commit fixes how errors for invalid host and host-wrapper
values are surfaced. Previously, in some cases they were inadvertently
obscured by a generic message ("dir maps to an empty list of instances").
* New field Timeout provides a way to terminate the process after a specified
amount of time has elapsed
* New field CombineOutput provides a way to redirect STDERR to STDOUT
These are not used in Skeema yet, but may prove useful in the future.
This commit fixes edge cases where the behavior of Skeema's password option
did not exactly match the behavior of the MySQL command-line client. In
particular, this caused problems with Travis CI's default ~/.my.cnf file.
These command-line invocations now correctly use no password, rather than
prompting from a TTY:
* --password=
* --password=''
* --password=""
Ditto for these option file lines:
* password=
* password=''
* password=""
In other words, Skeema only prompts for a password on STDIN if no equals sign
is found immediately after the password option.
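
For the option file case, the rule can be sketched as below. This is a minimal illustration, not Skeema's parser: parsePasswordLine is a hypothetical helper that assumes it is only handed a line already identified as the password option, and it does not cover comments or other option file syntax.

```go
package main

import (
	"fmt"
	"strings"
)

// parsePasswordLine interprets one option file line for the password option.
// It returns the password value and whether the user should be prompted.
// A prompt is requested only when no equals sign is present.
func parsePasswordLine(line string) (password string, prompt bool) {
	line = strings.TrimSpace(line)
	idx := strings.Index(line, "=")
	if idx < 0 {
		return "", true // bare "password": prompt
	}
	value := strings.TrimSpace(line[idx+1:])
	// Strip one pair of matching surrounding quotes, so that '' and ""
	// both mean an empty password.
	if len(value) >= 2 {
		first, last := value[0], value[len(value)-1]
		if first == last && (first == '\'' || first == '"') {
			value = value[1 : len(value)-1]
		}
	}
	return value, false
}

func main() {
	for _, line := range []string{"password", "password=", "password=''", `password=""`} {
		pw, prompt := parsePasswordLine(line)
		fmt.Printf("%-12s password=%q prompt=%t\n", line, pw, prompt)
	}
}
```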
* Add options.md index link at top
* Alphabetize in options.md
* Other related doc tweaks
* README: Update contributors list
* Tests: move skip-my-cnf logic from TestConfigIgnoreOptions into existing
TestAddGlobalConfigFiles (achieves same test coverage)
Changes logic to dynamically ignore my-cnf based on the flag, meaning it
can potentially be disabled by an earlier configuration option.
Adds documentation for the flag.
* linter.Result now includes a map field, storing pointers to all schemas
that were introspected during linting. The schemas are keyed by dir path.
* util.ShellOut now includes a Dir field, allowing callers to specify the
initial working directory of the process.
* util.NewShellOut() has been removed, as it was unused and would not support
adding additional fields to the ShellOut struct. Callers should create
&ShellOut{} values directly with the desired fields.
Skeema parses the standard per-user MySQL option file, ~/.my.cnf, for username
and password values. The intended behavior, though, is to ignore the host
option if it is present in this specific file. However, a regression in Skeema
v1.0.6 broke this, and the presence of host in ~/.my.cnf triggered a fatal
error.
This commit fixes the regression, so that host is now ignored in ~/.my.cnf
again. See discussion in #64 for more context.
BACKGROUND:
Skeema's behavior does not rely on parsing SQL DDL, as this can be too brittle across various MySQL versions and vendors, which have subtle differences in features and functionality. Instead, Skeema uses metadata reported directly from the database to introspect schemas, using information_schema as well as various SHOW commands.
In order to accurately introspect the schemas represented in your filesystem's *.sql files, Skeema actually runs the files' CREATE TABLE statements in a temporary location, now called a "workspace." Previously (and still by default), Skeema creates, uses, and then drops a temporary schema on each database it interacts with.
WHAT'S NEW:
This PR adds the ability to instead use a local Docker container for workspace operations. Two new options control this behavior:
* `workspace=docker` tells Skeema to dynamically manage local Docker container(s) for workspace operations, instead of using a temporary schema on each live DB.
* `docker-cleanup` controls how to manage the container lifecycle as Skeema is exiting. The default, `docker-cleanup=none`, leaves containers running so that subsequent invocations of Skeema are faster. Setting `docker-cleanup=stop` stops containers but does not remove them, and `docker-cleanup=destroy` deletes them entirely.
This functionality is especially useful when running Skeema from a different region/datacenter than your database -- for example, running Skeema on your laptop, when your databases are in AWS. Using `workspace=docker` greatly reduces painful network latency in this scenario, especially if you have a large number of tables. See discussion in #25 for background.
* Fix `skeema help`, `skeema --help`, etc., which were broken by refactor in #44
* Fix `skeema add-environment --help`, which was always broken due to having a
required positional arg (although other forms like `skeema help
add-environment` worked previously)
* Add tests to help handlers to ensure no error is returned
* When `skeema` exits, gracefully close all connection pools, to avoid aborted
connection counter/logging in some versions of MySQL
* `skeema diff`: If the only differences for a dir are schema-level DDL, the
exit code now reflects this as a difference
* applier.TargetGroupChanForDir: skipCount return value is no longer a pointer
* cmd_init.go: Remove unnecessary createOptionFile() function
* cmd_pull.go: Track skipCount by return value, rather than a pointer arg
This PR moves much of Skeema's logic out of the main package and into several
new sub-packages, which can be imported by other applications if desired.
Functionality is largely unchanged, and no new features have been added. But a
few foundational benefits of this work include:
* The codebase no longer assumes a 1:1 mapping between *.sql files and tables.
This will eventually permit non-table object types (views, procs, grants, etc)
to be stored in the same repo as schemas, if desired. See #41 for background.
* In the upcoming Skeema 1.1.x series, it will be possible to use a local Docker
instance for temp schema operations. This performs better in high-latency,
high-table-count scenarios; see #25 for background.
* The limit on max *.sql file size has been removed. Closes #34.
* `skeema pull` now performs much better than before, as long as --normalize is
enabled (which it is by default).
* The code supporting `skeema push --concurrent-instances` is now much cleaner
and more idiomatic.
* Test coverage has been improved.