Make sure the `OneShotOptimizer.suggest()` call always returns the same
configuration.
Changes related to `MockEnv` determinism moved to #769.
Closes #755
---------
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
Builds off of #762, #763, and #764.
Prepares rules and configs to enable isort and black formatters and
checks but doesn't enable them yet.
After these are enabled (next PR) we will reformat all files and ignore
that revision in git blame configs.
Then, we can convert configs stored in `setup.cfg` and `.pylintrc` to
the top level `pyproject.toml` and remove the older configs.
This PR builds off of #762 and #763.
These are in part:
1. follow-up fixups for #746 (e.g., to allow setuptools-scm to be pulled
in at build time as a build dependency only when conda and pip have
mismatched version issues), and
2. modernization improvements to allow us to make better use of other
tools that only accept `pyproject.toml` files as their configuration
files (e.g., `black`).
To do so, we move some configs from `setup.py` to `pyproject.toml` for
each module.
However, to retain the ability to rewrite URLs in published README.md
files on PyPI as well as consistent version dependencies across modules
without the need to manually specify version numbers (e.g., using
`setuptools-scm`) we mark a few dependencies as dynamic and leave our
existing logic inside the `setup.py` file.
Finally, we reorganize the `version.py` file to be inside the module and
fix a few previous omissions for `mlos_viz`.
* Use `mock_env_range`, `mock_env_seed`, and `mock_env_metrics` instead
of just `range`, `seed`, and `metrics` to avoid naming conflicts.
* Use integer `mock_env_seed` values to distinguish between
deterministic behavior, random seed, and user-specified seed for
`MockEnv`.
* Update unit tests and configs to validate the seed behavior.
* Add new unit tests for different initializations of the tunable values
---------
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
Adds metadata to the response from `suggest` and allows it to be passed into `register`.
This is in support of adding multi-fidelity support (#751)
---------
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
Co-authored-by: Brian Kroth <bpkroth@microsoft.com>
- Introduce a new `make format` target that currently only calls
`licenseheaders` but will call `isort` and `black` in the future
- Introduce new variables to help make dependency tracking easier
- Simplify `pytest` rules and allow running only a single module at a
time.
This PR is a precursor to pyproject.toml related build changes as well.
---------
Co-authored-by: Sergiy Matusevych <sergiym@microsoft.com>
Minor tweak to address this warning in pytest output:
```
mlos_core/mlos_core/tests/optimizers/optimizer_multiobj_test.py::test_multi_target_opt_wrong_weights[SmacOptimizer-kwargs2]
C:\Users\bpkroth\.conda\envs\mlos\Lib\site-packages\_pytest\unraisableexception.py:80: PytestUnraisableExceptionWarning: Exception ignored in: <function SmacOptimizer.__del__ at 0x0000021F5FDC42C0>
Traceback (most recent call last):
File "C:\Users\bpkroth\src\MLOS\mlos_core\mlos_core\optimizers\bayesian_optimizers\smac_optimizer.py", line 208, in __del__
self.cleanup()
File "C:\Users\bpkroth\src\MLOS\mlos_core\mlos_core\optimizers\bayesian_optimizers\smac_optimizer.py", line 339, in cleanup
if self._temp_output_directory is not None:
^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'SmacOptimizer' object has no attribute '_temp_output_directory'
warnings.warn(pytest.PytestUnraisableExceptionWarning(msg))
mlos_core/mlos_core/tests/optimizers/optimizer_test.py: 135 warnings
```
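One common way to avoid this class of warning is to assign the attribute before any code in `__init__` that can raise, so that `__del__` always finds it. The sketch below uses a hypothetical `TempDirOwner` class for illustration; it is not the actual `SmacOptimizer` code.

```python
import tempfile
from typing import Optional


class TempDirOwner:
    """Hypothetical stand-in for a class that owns a temporary directory."""

    def __init__(self, fail: bool = False) -> None:
        # Assign the attribute first so that cleanup() is safe to call
        # even if the rest of __init__ raises.
        self._temp_output_directory: Optional[tempfile.TemporaryDirectory] = None
        if fail:
            raise ValueError("simulated constructor failure")
        self._temp_output_directory = tempfile.TemporaryDirectory()

    def cleanup(self) -> None:
        """Remove the temporary directory if one was created."""
        if self._temp_output_directory is not None:
            self._temp_output_directory.cleanup()
            self._temp_output_directory = None

    def __del__(self) -> None:
        self.cleanup()
```

With this ordering, a failed constructor still leaves a partially built object that `__del__` can clean up without raising `AttributeError`.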
Co-authored-by: Sergiy Matusevych <sergiym@microsoft.com>
This is a simple PR that makes all arguments explicit for
optimizer-related function calls in preparation to add additional
arguments in #751 and make it easier to review.
---------
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
Co-authored-by: Brian Kroth <bpkroth@microsoft.com>
Summary of changes:
* Pass optional weights for optimization targets in mlos_core
* Implement weighted average for multi-objective optimization in FLAML
* Add more unit tests for multi-objective optimization on mlos_core side
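As a rough illustration of the weighted-average idea, the sketch below collapses several objective scores into one scalar. The function name `weighted_score`, the default of equal weights, and the normalization by the sum of weights are assumptions made for this example, not the actual mlos_core or FLAML code.

```python
from typing import Dict, Optional


def weighted_score(scores: Dict[str, float],
                   weights: Optional[Dict[str, float]] = None) -> float:
    """Collapse several objective scores into a single weighted average."""
    if weights is None:
        # Assumption: equal weights when none are given.
        weights = {name: 1.0 for name in scores}
    if set(weights) != set(scores):
        raise ValueError("weights must match the optimization targets")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(scores[name] * weights[name] for name in scores) / total
```

For example, `weighted_score({"latency": 10.0, "cost": 20.0}, {"latency": 3.0, "cost": 1.0})` weights latency three times as heavily as cost.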
Merge after ~#730~
---------
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
A new git version outputs a different date format that older versions of
Python's `datetime` module don't recognize.
https://github.com/pypa/setuptools_scm/issues/1038
setuptools-scm cut a new version to address that, but it's not available
in the conda main channel yet, which is required, since without that the
conda pip phase can't execute in a single transaction.
For now, we install things via conda-forge and adjust the channel
priority order so that the full set of dependencies could be resolved.
That of course brought in additional changes (e.g., `python=3.12` by
default, new `pylint`, `pycodestyle`, etc.), so this now also includes
some additional linting changes.
However, longer term, we need to switch to a pyproject.toml file to fix
this properly.
There we can specify prereqs for even loading the setup.py as well as
fix some other config complexities, though that is a broader change.
---------
Co-authored-by: Sergiy Matusevych <sergiym@microsoft.com>
* [x] Pass multi-column DataFrame instead of Sequence to
`BaseOptimizer.register()` and other methods that deal with scores
* [x] Update mlos_bench `MlosCoreOptimizer` to support the new mlos_core
API
* [x] Update unit tests to work with the new API
* [x] Add unit tests for end-to-end multi-target optimization
Merge after ~#726~
---------
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
* [x] Store multiple optimization targets and directions in base
`Optimizer` class
* [x] Support multiple optimization targets in mock and grid search
optimizers
* [x] Check for single objective in `MlosCoreOptimizer` class (will add
support for multiple objectives after implementing that feature in
`mlos_core` in subsequent PRs)
* [x] Update unit tests in mlos_bench
Merge after ~#723~ and ~#725~
---------
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
* [x] Update JSON schema to use `"optimizer_target": {"score": "min"}`
format
* [x] Update base Optimizer class to use the new config format (throw
`NotImplementedError` if > 1 target)
* [x] Modify all unit tests to use the new format
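A minimal sketch of how such a config could be parsed is shown below. The helper name `parse_optimizer_target` and the acceptance of `"max"` as a direction are assumptions for illustration; only the `{"score": "min"}` shape and the `NotImplementedError` for more than one target come from the description above.

```python
from typing import Any, Dict, Tuple


def parse_optimizer_target(config: Dict[str, Any]) -> Tuple[str, str]:
    """Return the (target name, direction) pair from an optimizer config."""
    targets = config.get("optimizer_target", {})
    if len(targets) > 1:
        raise NotImplementedError("Multiple optimization targets are not supported yet")
    if not targets:
        raise ValueError("No optimization target specified")
    # Exactly one entry is left at this point.
    ((name, direction),) = targets.items()
    if direction not in {"min", "max"}:  # "max" is an assumed counterpart to "min"
        raise ValueError(f"Invalid optimization direction: {direction}")
    return name, direction
```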
Part of #692
Merge after #723
---------
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
We will need this for multi-objective optimization as well as for
training the optimizers that can take multi-dimensional input.
* [x] Make `Storage.Experiment.load()` return multiple scores
* [x] Fix unit tests to check for loading multi-dimensional scores from
the DB
* [x] Make `Optimizer.register()` and `.bulk_register()` take
multi-dimensional trial scores
* [x] Fix the Optimizer unit tests to check for registering
multi-dimensional scores
* [x] Check the Scheduler and optimization loop unit tests to see if we
need to adjust the types etc.
**NOTE:** In this PR, we _do not_ change mlos_core: we will still pass a
single scalar into it and do not change the API on mlos_core side. We
will change mlos_core in the subsequent PR to minimize the diff.
Part of #692
makes optimizers and schedulers a bit simpler. Part of issue #715.
Closes #711
Note: the move from `--max_iterations` to `--max_suggestions` is a
breaking change, so we will need to cut a new release for this.
---------
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
Addresses some deprecation warnings about use of `datetime.utcnow()`.
Additionally, sqlite does not store timezone info even if provided, so
retrieving the data may be interpreted according to the host machine's
*local* timezone, which may or may not be UTC.
To mitigate this, this change ensures that all timestamps are
1. first converted to UTC before storing into the DB,
2. converted (or augmented with zoneinfo) to UTC on retrieval
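The two steps can be sketched with the standard library alone. The helper names `to_utc` and `from_db` are hypothetical and are not taken from the mlos_bench code; the sketch also assumes that naive values read back from the DB were stored as UTC.

```python
from datetime import datetime, timezone


def to_utc(timestamp: datetime) -> datetime:
    """Convert to UTC before storing; naive values are treated as local time."""
    return timestamp.astimezone(timezone.utc)


def from_db(timestamp: datetime) -> datetime:
    """Tag naive values read from the DB as UTC; convert aware ones to UTC."""
    if timestamp.tzinfo is None:
        # Assumption: everything in the DB was written via to_utc().
        return timestamp.replace(tzinfo=timezone.utc)
    return timestamp.astimezone(timezone.utc)
```

For new timestamps, `datetime.now(timezone.utc)` returns a timezone-aware value and avoids the deprecated `datetime.utcnow()`.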
Additionally, we expand the tests to check for this behavior: first with
some additional conversion matrices for when telemetry or status data is
received with implicit local vs. explicit timezone data, and second with
executions where the implicit local timezone has been overridden with a
`TZ` environment variable, to simulate different default timezone hosts.
Closes #718
---------
Co-authored-by: Brian Kroth <bpkroth@microsoft.com>
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
* [x] Make Scheduler class loadable from JSON configs
* [x] Update the Launcher and `run.py` to instantiate Scheduler from
JSON
* [x] Create JSON schema for the Scheduler config
* [x] Add unit tests for the new Scheduler JSON configs
Closes #700
---------
Co-authored-by: Brian Kroth <bpkroth@microsoft.com>
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
Roll back one of the updates from #705
It breaks tests on both Windows and WSL DevContainer on my PC.
* On Windows, `environ` may not have `PATH` at all
* On my WSL DevContainer, passing non-null `env` to `subprocess.run`
prevents it from finding the right python interpreter
Closes #688
- Introduces `GridSearchOptimizer` to `mlos_bench`
- Generates and stores a set of `tuple(dict.values())` from
`ConfigSpace` to track elements of the config grid to search
- If `max_iterations` > `len(grid)` can refill the grid if desired
(e.g., by calling `suggest()` after `not_converged()` returns `False`).
- If `max_iterations` < `len(grid)` (i.e., we don't have enough
iterations to complete the grid) will issue a warning.
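The grid bookkeeping described above can be sketched without `ConfigSpace` by using a plain dict of candidate values per parameter. The function name `build_grid` and the dict-based search space are assumptions for this example, not the actual `GridSearchOptimizer` implementation.

```python
import itertools
import warnings
from typing import Any, Dict, List, Set, Tuple


def build_grid(space: Dict[str, List[Any]],
               max_iterations: int) -> Set[Tuple[Any, ...]]:
    """Enumerate every config in the grid as a tuple of its values."""
    configs = (dict(zip(space, values))
               for values in itertools.product(*space.values()))
    # Tuples are hashable, so a set can track which grid points remain.
    grid = {tuple(config.values()) for config in configs}
    if max_iterations < len(grid):
        warnings.warn(
            f"max_iterations ({max_iterations}) is smaller than "
            f"the grid size ({len(grid)}); the grid will not be completed."
        )
    return grid
```

Storing tuples rather than dicts keeps the entries hashable, which makes removing an already suggested point from the pending set cheap.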
---------
Co-authored-by: Sergiy Matusevych <sergiym@microsoft.com>
Co-authored-by: Sergiy Matusevych <sergiy.matusevych@gmail.com>
- Update the base image and make sure related dependencies are
consistently from either pip or conda, but not both (follow on fixes to
#699)
- Minor tweak to include the PATH in the local script exec environment
variables
- Another fix to the CI issue in #703
Initial implementation of a separate `Scheduler` class from the core
`run` loop.
---------
Co-authored-by: Brian Kroth <bpkroth@users.noreply.github.com>
Co-authored-by: Brian Kroth <bpkroth@microsoft.com>
A new base devcontainer image was
[released](https://github.com/devcontainers/images/blob/main/src/miniconda/history/0.203.7.md)
recently that introduces the following error in our environment when
running `make test`:
```
import pandas._libs.window.aggregations as window_aggregations\n'
'ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version '
"`GLIBCXX_3.4.29' not found (required by "
'/opt/conda/envs/mlos/lib/python3.11/site-packages/pandas/_libs/window/aggregations.cpython-311-x86_64-linux-gnu.so)\n')
```
A simplified reproduction involves running `conda run -n mlos pytest
mlos_bench/mlos_bench/launcher_run_test.py`.
Unfortunately, it appears that under the latest version, somewhere
between `conda` and `pytest`, the `libstdc++.so.6` library found is the
one under `/usr/lib` instead of `/opt/conda/envs/mlos/lib`.
I haven't quite seen where that's happening, but I've confirmed that
rolling back to the previous release for the base devcontainer image
fixes it.
Note that even then, we update `conda` to the latest and no-cache update
our own dependencies nightly, so we are getting the latest conda at
least, if not the latest Debian base packages.
This will be somewhat of a security issue eventually, but for now it
frees up our CI pipeline to get moving again.