# Change Log

## [Unreleased]

### Added
- `--poolid` parameter for `pool del` to specify a specific pool to delete

### Changed
- Prompt for confirmation for `jobs cmi`
- Updated to latest dependencies

### Fixed
- Remote FS allocation issue with `vm_count` deprecation check
- Better handling of package index refresh errors
- `pool udi` over SSH and no registry logins (#92)
- Duplicate volume checks between job and task definitions

## [2.7.0rc1] - 2017-05-24
### Added
- `pool listimages` command which will list all common Docker images on
all nodes and provide a warning for mismatched images amongst compute nodes.
This functionality requires a provisioned SSH user and private key.
- `max_wall_time` option for both jobs and tasks. Please consult the
documentation for the difference when specifying this option at either the
job or task level.
- `--poll-until-tasks-complete` option for `jobs listtasks` to block the CLI
from exiting until all tasks under jobs for which the command is run have
completed
- `--tty` option for `pool ssh` and `fs cluster ssh` to enable allocation
of a pseudo-tty for the SSH session

### Changed
- `remove_container_after_exit`, `retention_time`, `shm_size`, `infiniband`,
`gpu` can now be specified at the job-level and overridden at the task-level
in the jobs configuration
- `data_volumes` and `shared_data_volumes` can now be specified at the
job-level and any volumes specified at the task level will be *merged* with
the job-level volumes to be exposed for the container

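
The job-level and task-level behavior described in the two 2.7.0rc1 "Changed" entries can be illustrated with a minimal sketch. It assumes the job and task settings are plain dictionaries; the helper name `effective_task_settings` is hypothetical and is not part of Batch Shipyard, and the de-duplication of merged volumes is an assumption of this sketch rather than documented behavior.

```python
# Hypothetical helper illustrating override vs. merge semantics.
# This is not Batch Shipyard code.
OVERRIDABLE = ("remove_container_after_exit", "retention_time", "shm_size",
               "infiniband", "gpu")
MERGED = ("data_volumes", "shared_data_volumes")


def effective_task_settings(job, task):
    """Return the settings a task would run with, given job-level defaults."""
    result = {}
    for key in OVERRIDABLE:
        # A task-level value, when present, overrides the job-level value.
        if key in task:
            result[key] = task[key]
        elif key in job:
            result[key] = job[key]
    for key in MERGED:
        # Volumes are merged: job-level first, then task-level.
        # Dropping duplicates is an assumption of this sketch.
        merged = []
        for name in list(job.get(key, [])) + list(task.get(key, [])):
            if name not in merged:
                merged.append(name)
        result[key] = merged
    return result


job = {"gpu": True, "shm_size": "256m", "data_volumes": ["scratch"]}
task = {"shm_size": "1g", "data_volumes": ["scratch", "models"]}
settings = effective_task_settings(job, task)
print(settings)
```

Here the task keeps the job-level `gpu` value, replaces `shm_size`, and sees both volumes.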

### Fixed
- Add missing deprecation path for `pool_specification_vm_count` for
multi-instance tasks. Please upgrade your jobs configuration to explicitly
use either `pool_specification_vm_count_dedicated` or
`pool_specification_vm_count_low_priority`.
- Speed up task collection additions by caching last task id
- Issues with pool resize and wait logic with low priority

## [2.7.0b2] - 2017-05-18
### Changed
- Allow the prior `vm_count` behavior, but provide a deprecation warning. The
old `vm_count` behavior will be removed in a future release. (#84)
- Add tasks via collection (#86)
- Log if node is dedicated in `pool listnodes`
- Updated all recipes with new `vm_count` changes

### Fixed
- Improve pool resize wait logic for pools with mixed node types
- Do not override workdir if specified (#87)
- Prevent container scanning for data ingress from Azure Storage if include
filter contains no wildcards (#88)

## [2.7.0b1] - 2017-05-12
### Added
- Support for [Low Priority Batch Compute Nodes](https://docs.microsoft.com/en-us/azure/batch/batch-low-pri-vms)
- `resize_timeout` can now be specified on the pool specification
- `--clear-tables` option to `storage del` command which will delete
blob containers and queues but clear table entries
- `--ssh` option to `pool udi` command which will force the update Docker
images command to update over SSH instead of through a Batch job. This is
useful if you want to perform an out-of-band update of Docker image(s), e.g.,
your pool is currently busy processing tasks and would not be able to
accommodate another task.

### Changed
- **Breaking Change:** `vm_count` in the pool specification is now a
complex property consisting of the properties `dedicated` and `low_priority`
- Updated all dependencies to latest

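
As a rough illustration of the `vm_count` breaking change, the sketch below normalizes either form into the new complex property. The function name `normalize_vm_count` is hypothetical, and treating a legacy integer as a count of dedicated nodes is an assumption made for this example, not something this change log states.

```python
import warnings


def normalize_vm_count(vm_count):
    """Return vm_count as a dict with `dedicated` and `low_priority` keys."""
    if isinstance(vm_count, int):
        # Assumption: a legacy integer count means dedicated nodes only.
        warnings.warn("integer vm_count is deprecated", DeprecationWarning)
        return {"dedicated": vm_count, "low_priority": 0}
    # New form: a complex property; missing keys default to zero here.
    return {
        "dedicated": int(vm_count.get("dedicated", 0)),
        "low_priority": int(vm_count.get("low_priority", 0)),
    }


legacy = normalize_vm_count(4)
current = normalize_vm_count({"dedicated": 2, "low_priority": 6})
print(legacy, current)
```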

### Fixed
- Improve node startup time for GPU NC-series by removing extraneous
dependencies
- `fs cluster ssh` storage cluster id and command argument ordering was
inverted. This has been corrected to be as intended where the command
is the last argument, e.g., `fs cluster ssh mynfs -- df -h`

## [2.6.2] - 2017-05-05
### Added
- Docker image build for `develop` branch

### Changed
- Allow NVIDIA license agreement to be auto-confirmed via `-y` option
- Use requests for file downloading since it is already being installed
as a dependency
- Update dependencies to latest versions

### Fixed
- TensorFlow image not being set if no suitable image is found for
`misc tensorboard` command
- Authentication for running images not present in global config sourced
from a private registry

## [2.6.1] - 2017-05-01
### Added
- `misc tensorboard` command added which automatically instantiates a
Tensorboard instance on the compute node which is running or has run a
task that has generated TensorFlow summary operation compatible logs. An
SSH tunnel is then created so you can view Tensorboard locally on the
machine running Batch Shipyard. This requires a valid SSH user that has been
provisioned via Batch Shipyard with private keys available. This command
will work on Windows if `ssh.exe` is available in `%PATH%` or the current
working directory. Please see the usage guide for more information about
this command.
- Pool-level `resource_files` support

### Changed
- Added optional `COMMAND` argument to `pool ssh` and `fs cluster ssh`
commands. If `COMMAND` is specified, the command is run non-interactively
with SSH on the target node.
- Added some additional sanity checks in the node prep script
- Updated TensorFlow-CPU and TensorFlow-GPU recipes to 1.1.0. Removed
specialized Docker build for TensorFlow-GPU. Added `jobs-tb.json` files
to TensorFlow-CPU and TensorFlow-GPU recipes as Tensorboard samples.
- Optimize some Batch calls

### Fixed
- Site extension issues
- SSH user add exception on Windows
- `jobs del --termtasks` will now disable the job prior to running task
termination to prevent active tasks in job from running while tasks are
being terminated
- `jobs listtasks` and `data listfiles` will now accept a `--jobid` that
does not have to be in `jobs.json`
- Data ingress on pool create issue with single node

## [2.6.0] - 2017-04-20
### Changed
- Update to latest dependencies

### Fixed
- Checks that prevented ssh/scp/openssl interaction on Windows
- SSH private key regression in data ingress direct to compute node

## [2.6.0rc1] - 2017-04-14
### Added
- Richer SSH options with new `ssh_public_key_data` and `ssh_private_key`
properties in `ssh` configuration blocks (for both `pool.json` and
`fs.json`).
  - `ssh_public_key_data` allows direct embedding of SSH public keys in
    OpenSSH format into the config files.
  - `ssh_private_key` specifies where the private key is located with
    respect to pre-created public keys (either `ssh_public_key` or
    `ssh_public_key_data`). This allows transparent `pool ssh` or
    `fs cluster ssh` commands with pre-created keys.
- RemoteFS-GlusterFS+BatchPool recipe

### Changed
- Docker installations are now pinned to a specific Docker version which
should reduce sudden breaking changes introduced upstream by Docker and/or
the distribution
- Fault domains for multi-vm storage clusters are now set to 2 by default but
can be configured using the `fault_domains` property. This was lowered from
the prior default of 3 due to managed disks and availability set restrictions
as some regions do not support 3 fault domains with this combination.
- Updated NC-series Tesla driver to 375.51

### Fixed
- Broken Docker installations due to gpgkey changes
- Possible race condition between disk setup and glusterfs volume create
- Forbid SSH username to be the same as the samba username
- Allow smbd.service to auto-restart with delay
- Data ingress to glusterfs on compute with no remotefs settings

### Removed
- Host support for OpenSUSE 13.2 and SLES 12

## [2.6.0b3] - 2017-04-03
### Added
- Created [Azure App Service Site Extension](https://www.siteextensions.net/packages/batch-shipyard).
You can now one-click install Batch Shipyard as a site extension (after you
have Python installed) and use Batch Shipyard from an Azure Function trigger.
- Samba support on storage cluster servers
- Add sample RemoteFS recipes for NFS and GlusterFS
- `install.cmd` installer for Windows. `install_conda_windows.cmd` has been
replaced by `install.cmd`. Please see the install doc for more information.

### Changed
- **Breaking Change:** `multi_instance_auto_complete` under
`job_specifications` is now named `auto_complete`. This property will apply
to all types of jobs and not just multi-instance tasks. The default is now
`false` (instead of `true` for the old `multi_instance_auto_complete`).
- **Breaking Change:** `static_public_ip` has been replaced with a `public_ip`
complex property. This is to accommodate situations where public IP for
RemoteFS is disabled. Please see the Remote FS configuration doc for more
info.
- `install.sh` now handles Anaconda Python environments
- `--cardinal 0` is now implicit if no `--hostname` or `--nodeid` is specified
for `fs cluster ssh` or `pool ssh` commands, respectively
- Allow `docker_images` in `global_resources` to be empty. Note that it is
always recommended to pre-load images onto pools for consistent scheduling
latencies from pool idle.

### Fixed
- Removed requirement of a `batch` credential section for pure `fs` operations
- Multi-instance auto complete setting not being properly read
- `install.sh` virtual environment issues
- Fix pool ingress data calls with remotefs (#62)
- Move additional node prep commands to last set of commands to execute in
start task (#63)
- `glusterfs_on_compute` shared data volume issues
- future and pathlib compat issues
- Python2 unicode/str issues with management libraries

## [2.6.0b2] - 2017-03-22
### Added
- Added virtual environment install option for `install.sh` which is now
the recommended way to install Batch Shipyard. Please see the install
guide for more information. (#55)

### Changed
- Force SSD optimizations for btrfs with premium storage

### Fixed
- Incorrect FS server options parsing at script time
- KeyVault client not initialized in `fs` contexts (#57)
- Check pool current node count prior to executing `pool udi` task (#58)
- Initialization with KeyVault uri on commandline (#59)

## [2.6.0b1] - 2017-03-16
### Added
- Support for provisioning storage clusters via the `fs cluster` command
  - Support for NFS (single VM, scale up)
  - Support for GlusterFS (multi VM, scale up and out)
- Support for provisioning managed disks via the `fs disks` command
- Support for data ingress to provisioned storage clusters
- Support for
[UserSubscription Batch accounts](https://docs.microsoft.com/en-us/azure/batch/batch-account-create-portal#user-subscription-mode)
- Azure Active Directory authentication support for Batch accounts
- Support for specifying a virtual network to use with a compute pool
- `allow_run_on_missing_image` option to jobs that allows tasks to execute
under jobs with Docker images that have not been pre-loaded via the
`global_resources`:`docker_images` setting in config.json. Note that, if
possible, you should attempt to specify all Docker images that you intend
to run in the `global_resources`:`docker_images` property in the global
configuration to minimize scheduling to task execution latency.
- Support for running containers as a different user identity (uid/gid)
- Support for Canonical/UbuntuServer/16.04-LTS. 16.04-LTS should be used over
the old 16.04.0-LTS sku due to
[issue #31](https://github.com/Azure/batch-shipyard/issues/31); the old sku
is also no longer receiving updates.

### Changed
- **Breaking Change:** `glusterfs` `volume_driver` for `shared_data_volumes`
should now be named as `glusterfs_on_compute`. This is to distinguish
co-located GlusterFS on compute nodes from the standalone GlusterFS
`storage_cluster` remote mounted distributed file system.
- Logging now has less verbose details (call origin) by default. Prior
behavior can be restored with the `-v` option.
- Pool existence is now checked prior to job submission, and job submission
can now proceed without an active pool.
- Batch `account` (name) is now an optional property in the credentials config
- Configuration doc broken up into multiple pages
- Update all recipes using Canonical/UbuntuServer/16.04.0-LTS to use
Canonical/UbuntuServer/16.04-LTS instead
- Configuration is no longer shown with `-v`. Use `--show-config` to dump
the complete configuration being used for the command.
- Precompile Python files during build for Docker images
- All dependencies updated to latest versions
- Update Batch API call compatibility for `azure-batch 2.0.0`

### Fixed
- Logging time format and incorrect Zulu time designation.
- `scp` and `multinode_scp` data movement capability is now supported on
Windows provided that `ssh.exe` and `scp.exe` can be found in `%PATH%` or the
current working directory. `rsync` methods are not supported on Windows.
- Credential encryption is now supported on Windows provided that
`openssl.exe` can be found in `%PATH%` or the current working directory.

## [2.5.4] - 2017-03-08
### Changed
- Downloaded files are now verified via SHA256 instead of MD5
- Updated NC-series Tesla driver to 375.39

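
For readers unfamiliar with digest verification, the following is a minimal sketch of checking a downloaded file against an expected SHA256 value using Python's standard `hashlib`. It is an illustration of the general technique only; the function name `sha256_matches` and the temporary demo file are made up for this example and do not reflect how Batch Shipyard implements the check.

```python
import hashlib
import os
import tempfile


def sha256_matches(path, expected_hex, chunk_size=65536):
    """Return True if the SHA256 digest of the file equals expected_hex."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large downloads are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hex.lower()


# Demonstration with a temporary file standing in for a downloaded artifact.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"batch shipyard")
    demo_path = tmp.name
expected = hashlib.sha256(b"batch shipyard").hexdigest()
ok = sha256_matches(demo_path, expected)
bad = sha256_matches(demo_path, "0" * 64)
os.remove(demo_path)
print(ok, bad)
```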

### Fixed
- `nvidia-docker` updated to 1.0.1 for compatibility with Docker CE

## [2.5.3] - 2017-03-01
### Added
- `pool rebootnode` command added which allows single node reboot control.
Additionally, the option `--all-start-task-failed` will reboot all nodes in
the specified pool with the start task failed state.
- `jobs del` and `jobs term` now provide a `--termtasks` option to
allow the logic of `jobs termtasks` to precede the delete or terminate
action on the job. This option requires a valid SSH user to the remote nodes
as specified in the `ssh` configuration property in `pool.json`. This new
option is normally not needed if all tasks within the jobs have completed.

### Changed
- The Docker image used for blobxfer is now tied to the specific Batch
Shipyard release
- Default SSH user expiry time if not specified is now 30 days
- All recipes now have the default config.json storage account set to the
link as named in the provided credentials.json file. Now, only the credentials
file needs to be modified to run a recipe.

## [2.5.2] - 2017-02-23
### Added
- Chainer-CPU and Chainer-GPU recipes
- [Troubleshooting guide](docs/96-troubleshooting-guide.md)

### Changed
- Perform automatic container path substitution with host path for
GlusterFS data ingress/egress from/to Azure Storage (#37)
- Allow NAMD-TCP recipe to be run on a single node

### Fixed
- CNTK-GPU-OpenMPI run script fixed to allow multinode+singlegpu executions
- TensorFlow recipes updated for 1.0.0 release
- blobxfer data ingress on Windows (#39)
- Minor delete job and terminate tasks fixes

## [2.5.1] - 2017-02-01
### Added
- Support for max task retries (#23). See configuration doc for more
information.
- Support for task data retention time (#30). See configuration doc for
more information.

### Changed
- **Breaking Change:** `environment_variables_secret_id` was erroneously
named and has been renamed to `environment_variables_keyvault_secret_id` to
follow the other properties with similar behavior.
- Include Python 3.6 Travis CI target

### Fixed
- Automatically assigned task ids are now in the format `dockertask-NNNNN`
and will increment properly past 99999 but will not be padded after that (#27)
- Defect in list tasks for tasks that have not run (#28)
- Docker temporary directory not being set properly
- SLES-HPC will now install all Intel MPI related rpms
- Defect in task file mover for unencrypted credentials (#29)

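
The `dockertask-NNNNN` format from the first 2.5.1 "Fixed" entry can be sketched as follows. The helper names `format_task_id` and `next_task_id` are hypothetical; the sketch only shows zero-padding to five digits and growing past that width without padding, as the entry describes.

```python
def format_task_id(number, prefix="dockertask-", width=5):
    """Zero-pad the numeric suffix to `width` digits; wider numbers are kept as-is."""
    return prefix + str(number).zfill(width)


def next_task_id(last_task_id, prefix="dockertask-"):
    """Derive the next id from the previous one by incrementing its suffix."""
    number = int(last_task_id[len(prefix):])
    return format_task_id(number + 1, prefix)


first = format_task_id(7)
rollover = next_task_id("dockertask-99999")
print(first, rollover)
```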

## [2.5.0] - 2017-01-19
### Added
- Support for
[Task Dependency Id Ranges](https://docs.microsoft.com/en-us/azure/batch/batch-task-dependencies#task-id-range)
with the `depends_on_range` property under each task json property in `tasks`
in the jobs configuration file. Please see the configuration doc for more
information.
- Support for `environment_variables_secret_id` in job and task definitions.
Specifying these properties will fetch manually added secrets (in the form of
a string representation of a json key-value dictionary) from the specified
KeyVault using AAD credentials. Please see the configuration doc for more
information.

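
The secret format mentioned in the `environment_variables_secret_id` entry, a string representation of a JSON key-value dictionary, can be parsed as in the sketch below. Fetching the secret from KeyVault is deliberately left out; the function name `env_vars_from_secret` and the sample secret text are invented for illustration, and coercing values to strings is an assumption of this sketch.

```python
import json


def env_vars_from_secret(secret_value):
    """Parse a secret holding a JSON object into a dict of environment variables."""
    parsed = json.loads(secret_value)
    if not isinstance(parsed, dict):
        raise ValueError("secret must be a JSON key-value dictionary")
    # Environment variable values are strings, so coerce non-string values.
    return {str(key): str(value) for key, value in parsed.items()}


# The secret text below is made up for illustration.
secret_text = '{"API_TOKEN": "abc123", "WORKERS": 4}'
env = env_vars_from_secret(secret_text)
print(env)
```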

### Fixed
- Remove extraneous import (#12)
- Defect in handling per key secret ids (#13)
- Defect in environment variable dict merge (#17)
- Update Nvidia Docker to 1.0.0 (#21)

## [2.4.0] - 2017-01-11
### Added
- Support for credentials stored in Azure KeyVault
  - `keyvault` command added. Please see the usage doc for more information.
  - `*_keyvault_secret_id` properties added for keys and passwords in
    credentials json. Please see the configuration doc for more information.
- Using Azure KeyVault with Batch Shipyard guide

### Changed
- Updated NC-series Tesla driver to 375.20

## [2.3.1] - 2017-01-03
### Added
- Add support for nvidia-docker with ssh docker tunnel

### Fixed
- Fix multi-job bug with jpcmd

## [2.3.0] - 2016-12-15
### Added
- `pool ssh` command. Please see the usage doc for more information.
- `shm_size` json property added to the json object within the `tasks` array
of a job. Please see the configuration doc for more information.
- SSH, Interactive Sessions and Docker SSH Tunnel guide

### Changed
- Improve usability of the generated SSH docker tunnel script

## [2.2.0] - 2016-12-09
### Added
- CNTK-CPU-Infiniband-IntelMPI recipe

### Changed
- `/opt/intel` is now automatically mounted once again for infiniband-enabled
containers on SUSE SLES-HPC hosts.

### Fixed
- Fix masked KeyErrors on `input_data` and `output_data`
- Fix SAS key generation for data movement
- Typo in ssh public key check on Windows prevented pool add actions
- Pin version of tfm docker image on data transfers

## [2.1.0] - 2016-11-30
### Added
- Allow `--configdir`, `--credentials`, `--config`, `--jobs`, `--pool` config
options to be specified as environment variables. Please see the usage doc
for more information.
- Added subcommand `listskus` to the `pool` command to list available
VM configurations (publisher, offer, sku) for the Batch account

### Changed
- Nodeprep now references cascade and tfm docker images by version instead
of latest to prevent breaking changes affecting older versions. Docker builds
of cascade and tfm based on latest commits are now disabled.

### Fixed
- Cascade docker image run not propagating exit code

## [2.0.0] - 2016-11-23
### Added
- Support for any Internet accessible container registry, including
[Azure Container Registry](https://azure.microsoft.com/en-us/services/container-registry/).
Please see the configuration doc for information on how to integrate with
a private container registry.

### Changed
- GPU driver for `STANDARD_NC` instances defined in the
`gpu`:`nvidia_driver`:`source` property is no longer required. If omitted,
an NVIDIA driver will be downloaded automatically with an NVIDIA License
agreement prompt. For `STANDARD_NV` instances, a driver URL is still required.
- Docker container name auto-tagging now prepends the job id in order to
prevent conflicts in case of un-named simultaneous tasks from multiple jobs
- Update CNTK docker images to 2.0beta4 and optimize GPU images for use
with NVIDIA K80/M60
- Update Caffe docker image, default to using OpenBLAS over ATLAS, and
optimize GPU images for use with NVIDIA K80/M60
- Update MXNet GPU docker image optimized for use with NVIDIA K80/M60
- Update TensorFlow docker images to 0.11.0 and optimize GPU images for use
with NVIDIA K80/M60

### Fixed
- Cascade thread exceptions will terminate with non-zero exit code
- Some improvements with node prep and reboots
- Task termination will only issue `docker rm` if the container exists

## [2.0.0rc3] - 2016-11-14 (SC16 Edition)
### Added
- `install_conda_windows.cmd` helper script for installing Batch Shipyard
under Anaconda for Windows
- Added `relative_destination_path` json property for `files` ingress into
node destinations. This allows arbitrary specification of where ingressed
files should be placed relative to the destination path.
- Added ability to ingress directly into the host without the requirement
of GlusterFS for pools with one compute node. A GlusterFS shared volume is
required for pools with more than one compute node for direct to pool data
ingress.
- New commands and options:
  - `pool udi`: Update docker images on all compute nodes in a pool. `--image`
    and `--digest` options can restrict the scope of the update.
  - `data stream`: `--disk` will stream the file as binary to disk instead
    of as text to the local console
  - `data listfiles`: `--jobid` and `--taskid` allow scoping of the list
    files action
  - `jobs listtasks`: `--jobid` allows scoping of list tasks to a specific job
  - `jobs add`: `--tail` allows tailing the specified file for the last job
    and task added
- Keras+Theano-CPU and Keras+Theano-GPU recipes
- Keras+Theano-CPU added as an option in the quickstart guide

### Changed
- **Breaking Change:** Properties of `docker_registry` have changed
significantly to support eventual integration with the Azure Container
Registry service. Credentials for docker logins have moved to the credentials
json file. Please see the configuration doc for more information.
- `files` data ingress no longer creates, at the destination, the directory
in which the files to be uploaded reside. For example, if uploading from a
path `/a/b/c`, the directory `c` is no longer created at the destination.
Instead, all files found in `/a/b/c` will be placed directly at the
destination path with sub-directories preserved. This behavior can be
modified with the `relative_destination_path` property.
- `CUDA_CACHE_*` variables are now set for GPU jobs such that compiled targets
pass through to the host. This allows subsequent container invocations on
the same node to reuse cached PTX JIT targets.
- `batch_shipyard`:`storage_entity_prefix` is now optional and defaults to
`shipyard` if not specified.
- Major internal configuration/settings refactor

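
The `files` ingress path behavior described in the 2.0.0rc3 "Changed" section can be sketched as a path mapping. This is an illustration under stated assumptions: the function name `ingress_destination` and the `/mnt/ingress` destination are hypothetical, and joining `relative_destination_path` between the destination and the file's relative path is this sketch's reading of the entry, not confirmed behavior.

```python
import posixpath


def ingress_destination(source_dir, source_file, destination,
                        relative_destination_path=None):
    """Map a local file under source_dir to its path at the destination."""
    # Path of the file relative to the uploaded directory: sub-directories
    # are preserved, but the uploaded directory itself is not recreated.
    relative = posixpath.relpath(source_file, source_dir)
    parts = [destination]
    if relative_destination_path:
        # Assumption: the property is a path segment under the destination.
        parts.append(relative_destination_path)
    parts.append(relative)
    return posixpath.join(*parts)


flat = ingress_destination("/a/b/c", "/a/b/c/data/x.bin", "/mnt/ingress")
nested = ingress_destination("/a/b/c", "/a/b/c/data/x.bin", "/mnt/ingress", "c")
print(flat, nested)
```

With no `relative_destination_path`, the directory `c` does not appear at the destination; passing `"c"` restores a layout similar to the older behavior.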

### Fixed
- Pool resize down with wait
- More Python2/3 compatibility issues
- Ensure pools that deploy GlusterFS volumes have more than 1 node

## [2.0.0rc2] - 2016-11-02
### Added
- `install.sh` install/setup helper script
- `shipyard` execution helper script created via `install.sh`
- `generated_sas_expiry_days` json property to config json for the ability to
override the default number of days generated SAS keys are valid for.
- New options on commands/subcommands:
  - `jobs add`: `--recreate` recreate any jobs which have completed and use
    the same id
  - `jobs termtasks`: `--force` force docker kill to tasks even if they are
    in completed state
  - `pool resize`: `--wait` wait for completion of resize
- HPCG-Infiniband-IntelMPI and HPLinpack-Infiniband-IntelMPI recipes

2016-10-29 21:00:31 +03:00
|
|
|
### Changed
|
2016-10-31 18:45:05 +03:00
|
|
|
- Default SAS expiry time used for resource files and data movement changed
|
|
|
|
from 7 to 30 days.
|
2016-10-29 21:00:31 +03:00
|
|
|
- Pools failing to start will now automatically retrieve stdout.txt and
|
2016-10-31 18:45:05 +03:00
|
|
|
stderr.txt to the current working directory under
|
|
|
|
`poolid/<node ids>/std{out,err}.txt`. These files can be inspected
|
|
|
|
locally and submitted as context for GitHub issues if pertinent.
|
2016-11-02 03:10:03 +03:00
|
|
|
- Pool resizing will now attempt to add an SSH user on the new nodes if
|
|
|
|
an SSH public key is referenced or found in the invocation directory
|
2016-10-30 00:37:28 +03:00
|
|
|
- Improve installation doc
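The interaction between `generated_sas_expiry_days` and the 30-day default noted above can be sketched as follows. This is only an illustration under stated assumptions: the helper `sas_expiry` is hypothetical, and the placement of the property under a `batch_shipyard` section of the config is assumed rather than confirmed by these notes.

```python
from datetime import datetime, timedelta, timezone

# Default taken from this release's notes (changed from 7 to 30 days)
DEFAULT_SAS_EXPIRY_DAYS = 30


def sas_expiry(config, now=None):
    """Return the UTC expiry time for a newly generated SAS key.

    Hypothetical helper: the property location is an assumption.
    """
    # Assumption: the property lives under the `batch_shipyard` section
    section = config.get('batch_shipyard', {})
    days = section.get('generated_sas_expiry_days', DEFAULT_SAS_EXPIRY_DAYS)
    if now is None:
        now = datetime.now(timezone.utc)
    return now + timedelta(days=days)


start = datetime(2016, 11, 2, tzinfo=timezone.utc)
# No override: falls back to the default number of days
print(sas_expiry({}, start))
# Explicit override via the config property
print(sas_expiry({'batch_shipyard': {'generated_sas_expiry_days': 90}}, start))
```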

### Fixed
- Improve Python2/3 compatibility
- Unicode literals warning with Click
- Config file loading issue in some contexts
- Documentation typos

## [2.0.0rc1] - 2016-10-28
### Added
- Comprehensive data movement support. Please see the data movement guide
and configuration doc for more information.
    - Ingress from local machine with `files` in global configuration
        - To GlusterFS shared volume
        - To Azure Blob Storage
        - To Azure File Storage
    - Ingress from Azure Blob Storage, Azure File Storage, or another Azure
    Batch Task with `input_data` in pool and jobs configuration
        - Pool-level: to compute nodes
        - Job-level: to compute nodes prior to running the specified job
        - Task-level: to compute nodes prior to running a task of a job
    - Egress to local machine as actions
        - Single file from compute node
        - Entire task-level directories from compute node
        - Entire node-level directories from compute node
    - Egress to Azure Blob or File Storage with `output_data` in jobs
    configuration
        - Task-level: to Azure Blob or File Storage on successful completion of a
        task
- Credential encryption support. Please see the credential encryption guide
and configuration doc for more information.
- Experimental support for OpenSSH with HPN patches on Ubuntu
- Support pool resize up with GlusterFS
- Support GlusterFS volume options
- Configurable path to place files generated by `pool add` or `pool asu`
commands
- MXNet-CPU and Torch-CPU as options in the quickstart guide
- Update CNTK recipes for 1.7.2 and switch multinode/multigpu samples to
MNIST
- MXNet-CPU and MXNet-GPU recipes

### Changed
- **Breaking Change:** All new CLI experience with proper multilevel commands.
Please see usage doc for more information.
    - Added new commands: `cert`, `data`
    - Added many new convenience subcommands
    - `--filespec` is now delimited by `,` instead of `:`
- **Breaking Change:** `ssh_docker_tunnel` in the `pool_specification` has
been replaced by the `ssh` property. `generate_tunnel_script` has been renamed
to `generate_docker_tunnel_script`. Please see the configuration doc for
more information.
- The `name` property of a task json object in the jobs specification is no
longer required for multi-instance tasks. If not specified, `name` defaults
to `id` for all task types.
- `data stream` no longer has an arbitrary max streaming time; the action will
stream the file indefinitely until the task completes
- Validate container with `storage_entity_prefix` for length issues
- `pool del` action now cleans up and deletes some storage containers
immediately afterwards (with confirmation prompts)
- `/opt/intel` is no longer automatically mounted for infiniband-enabled
containers on SUSE SLES-HPC hosts. Please see the configuration doc
on how to manually map this directory if required. OpenLogic CentOS-HPC
hosts remain unchanged.
- Modularized code base
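To make the `--filespec` delimiter change listed above concrete, here is a minimal parsing sketch. It assumes a three-field layout of job id, task id and file name, which these notes do not spell out; `parse_filespec` is a hypothetical helper and not the tool's actual implementation.

```python
def parse_filespec(filespec):
    """Split a comma-delimited filespec into (job_id, task_id, filename).

    Hypothetical helper: the three-field layout is an assumption.
    """
    # maxsplit=2 keeps any further commas as part of the file name
    parts = filespec.split(',', 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            'expected <jobid>,<taskid>,<filename>, got %r' % filespec)
    return tuple(parts)


print(parse_filespec('myjob,task-000,stdout.txt'))
```

A value that still uses the old `:` delimiter, such as `myjob:task-000:stdout.txt`, is rejected with a `ValueError` in this sketch rather than being silently misparsed.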

### Fixed
- GlusterFS mount ownership/permissions fixed such that SSH users can
read/write
- Azure File shared volume setup when invoked from Windows
- Python2 compatibility issues with file encoding
- Allow shipyard.py to be invoked outside of the root of the GitHub cloned
base directory
- TensorFlow-Distributed recipe issues

## [1.1.0] - 2016-10-05
### Added
- Transparent Infiniband assist for SUSE SLES-HPC 12-SP1 image
- Add version for shipyard.py script
- NAMD-GPU, OpenFOAM-Infiniband-IntelMPI, Torch-CPU, Torch-GPU recipes

### Changed
- GlusterFS mountpoint is now within `$AZ_BATCH_NODE_SHARED_DIR` so files can
be viewed/downloaded with Batch APIs
- NAMD-Infiniband-IntelMPI recipe now contains a real Docker image link

### Fixed
- GlusterFS not properly starting on Ubuntu

## [1.0.0] - 2016-09-22
### Added
- Automated GlusterFS support
- Added `configdir` argument for convenience in loading configuration files.
Please see the usage documentation for more details.
- Ability to retrieve files from live compute nodes in addition to streaming
- Added `filespec` argument for non-interactive `streamfile` and `gettaskfile`
actions
- Added .gitattributes to designate Unix line-endings for text files
- Sample configuration files for each recipe
- Caffe-CPU, OpenFOAM-TCP-OpenMPI, TensorFlow-CPU, TensorFlow-Distributed
recipes

### Changed
- Updated configuration docs to detail which properties are required vs. those
that are optional
- SSH tunnel user is now added with a default expiry time of 7 days which can
be modified through the pool configuration file
- Configuration is not output to console by default; `-v` flag added for
verbose output
- Deterministic remote login settings output (node, ip, port) that can be
easily parsed
- Update Azurefile Docker Volume Driver plugin to 0.5.1

### Fixed
- Cascade (container-only) start issue with no private registry
- Non-shipyard docker image node prep with new azure-storage package
- Inter-node communication not specified key error on addpool
- Cross-platform fixes:
    - Temp file creation used for environment variables
    - SSH tunnel creation disabled on Windows if public key is not supplied
- Batch Shipyard Docker container not getting cleaned up if peer-to-peer is
disabled

### Removed
- `gpu`:`nvidia_driver`:`version` property removed from pool configuration
and is no longer required as the version is now automatically detected

## [0.2.0] - 2016-09-08
### Added
- Transparent GPU support for Azure N-Series VMs
- New recipes added: Caffe-GPU, CNTK-CPU-OpenMPI, CNTK-GPU-OpenMPI,
FFmpeg-GPU, NAMD-Infiniband-IntelMPI, NAMD-TCP, TensorFlow-GPU

### Changed
- Multi-instance tasks now automatically complete their job by default. This
removes the need to run the `cleanmijobs` action in the shipyard tool.
Please refer to the
[multi-instance documentation](docs/80-batch-shipyard-multi-instance-tasks.md)
for more information and limitations.
- Dumb back-off policy for DHT router convergence
- Optimized Docker image storage location for Azure VMs
- Prompts added for destructive operations in the shipyard tool

### Fixed
- Incorrect file location of node prep finished
- Blocking wait for global resource on pool can now be disabled
- Incorrect process call to query for docker image size when peer-to-peer
transfer is disabled
- Use azure-storage 0.33.0 to fix Edm.Int64 overflow issue

## [0.1.0] - 2016-09-01
### Added
- Initial release

[Unreleased]: https://github.com/Azure/batch-shipyard/compare/2.7.0rc1...HEAD
[2.7.0rc1]: https://github.com/Azure/batch-shipyard/compare/2.7.0b2...2.7.0rc1
[2.7.0b2]: https://github.com/Azure/batch-shipyard/compare/2.7.0b1...2.7.0b2
[2.7.0b1]: https://github.com/Azure/batch-shipyard/compare/2.6.2...2.7.0b1
[2.6.2]: https://github.com/Azure/batch-shipyard/compare/2.6.1...2.6.2
[2.6.1]: https://github.com/Azure/batch-shipyard/compare/2.6.0...2.6.1
[2.6.0]: https://github.com/Azure/batch-shipyard/compare/2.6.0rc1...2.6.0
[2.6.0rc1]: https://github.com/Azure/batch-shipyard/compare/2.6.0b3...2.6.0rc1
[2.6.0b3]: https://github.com/Azure/batch-shipyard/compare/2.6.0b2...2.6.0b3
[2.6.0b2]: https://github.com/Azure/batch-shipyard/compare/2.6.0b1...2.6.0b2
[2.6.0b1]: https://github.com/Azure/batch-shipyard/compare/2.5.4...2.6.0b1
[2.5.4]: https://github.com/Azure/batch-shipyard/compare/2.5.3...2.5.4
[2.5.3]: https://github.com/Azure/batch-shipyard/compare/2.5.2...2.5.3
[2.5.2]: https://github.com/Azure/batch-shipyard/compare/2.5.1...2.5.2
[2.5.1]: https://github.com/Azure/batch-shipyard/compare/2.5.0...2.5.1
[2.5.0]: https://github.com/Azure/batch-shipyard/compare/2.4.0...2.5.0
[2.4.0]: https://github.com/Azure/batch-shipyard/compare/2.3.1...2.4.0
[2.3.1]: https://github.com/Azure/batch-shipyard/compare/2.3.0...2.3.1
[2.3.0]: https://github.com/Azure/batch-shipyard/compare/2.2.0...2.3.0
[2.2.0]: https://github.com/Azure/batch-shipyard/compare/2.1.0...2.2.0
[2.1.0]: https://github.com/Azure/batch-shipyard/compare/2.0.0...2.1.0
[2.0.0]: https://github.com/Azure/batch-shipyard/compare/2.0.0rc3...2.0.0
[2.0.0rc3]: https://github.com/Azure/batch-shipyard/compare/2.0.0rc2...2.0.0rc3
[2.0.0rc2]: https://github.com/Azure/batch-shipyard/compare/2.0.0rc1...2.0.0rc2
[2.0.0rc1]: https://github.com/Azure/batch-shipyard/compare/1.1.0...2.0.0rc1
[1.1.0]: https://github.com/Azure/batch-shipyard/compare/1.0.0...1.1.0
[1.0.0]: https://github.com/Azure/batch-shipyard/compare/0.2.0...1.0.0
[0.2.0]: https://github.com/Azure/batch-shipyard/compare/0.1.0...0.2.0
[0.1.0]: https://github.com/Azure/batch-shipyard/compare/ab1fa4d...0.1.0