Fred Park
172d752938
Add RDP support
...
- Improve command hierarchy
2017-11-03 16:20:50 -07:00
Fred Park
6da607c9b9
Multi-instance/IB support for Singularity tasks
...
- Make cascade work in Docker container
2017-10-22 13:59:35 -07:00
Fred Park
48172e115e
Add initial Singularity task support
...
- Auto-GPU
- Fix ownership issues with Singularity image pre-load
2017-10-20 23:10:12 -07:00
Fred Park
4e5d5abf6b
Add Singularity support into cascade
...
- Remove Singularity support in native container support pools as it's
impossible to execute a Singularity container in this mode
2017-10-17 18:51:27 -07:00
Fred Park
adc0c865ea
docker_volumes is now volumes
2017-10-17 12:57:02 -07:00
Fred Park
298b00d946
Mount Azure file shares to host (#123)
...
- Allow multiple file shares per pool
- Move root mount point for all shared data volumes
2017-10-04 17:59:30 -07:00
Fred Park
49a374416f
Add unusable node recovery option
2017-10-04 09:25:57 -07:00
Fred Park
6315be3a6b
Transition to blobxfer 1.x command structure
...
- Data ingress/egress changes
- Task factory file changes
- Resolves #47
2017-10-03 18:24:49 -07:00
Fred Park
e783744e00
Container registry logic overhaul
...
- Remove support for private registries backed by Azure storage blobs (#44)
- Require fully qualified Docker image names (#106)
- Support multiple public/private registries on a single pool (#127)
2017-10-03 18:24:42 -07:00
Fred Park
cbddcdfbff
Use docker_image in favor of image in tasks
2017-10-03 10:05:17 -07:00
Fred Park
60b4fc446f
Support ARM Images for custom images (#126)
2017-10-03 10:05:17 -07:00
Fred Park
238982db77
Add ARM VNet support in Batch service mode (#126)
...
- Support "global" aad property in credentials
- Add Virtual Network guide
2017-10-03 10:05:17 -07:00
Fred Park
afdad167a8
Add native support to custom images
...
- Update TSG doc for native container support
2017-10-03 10:04:03 -07:00
Fred Park
e2ddf3b750
Add YAML configuration support
...
- Resolves #122
2017-10-03 10:04:03 -07:00
Fred Park
9c700dfbd5
Fix jrtask and suppress CUDA vars for native
...
- Doc updates
- Suppress coordination/task commands that are empty
2017-10-03 10:03:20 -07:00
Fred Park
c13793dd57
Support version in platform image
...
- Override UbuntuServer 16.04-LTS latest to prior version due to
linux-azure kernel issues
2017-09-22 19:41:33 -07:00
Fred Park
745082029f
Misc doc updates
...
- Update requests
- Check task id length
- Drop Python 3.3 support due to cryptography
2017-08-10 08:40:07 -07:00
Fred Park
44a1f14b31
Add monitor_task_completion for recurring jobs
2017-08-09 07:57:56 -07:00
Fred Park
9add2444ec
Change autogen task id property to complex
...
- Update job recurrence docs
2017-08-08 08:45:15 -07:00
Fred Park
be530e63c0
Job recurrence support
2017-08-07 19:42:09 -07:00
Fred Park
99e72c0c3f
Add custom task factory support (#93)
2017-08-07 10:38:08 -07:00
Fred Park
c5fa85adcb
Add file task factory (#93)
...
- Split out task factory settings into separate file
- Change uniform to be a, b instead of min, max
- Update blobxfer script for single target ingress to place file
directly to destination
2017-08-04 11:02:33 -07:00
Fred Park
1650ce4a95
Add random task factory (#93)
2017-08-03 20:10:56 -07:00
Fred Park
ed8ca2d225
Add autogen task id setting
2017-07-31 13:40:18 -07:00
Fred Park
4105acc2f8
Add task factory (parameter sweep) support
...
- Resolves #93
2017-07-28 14:36:42 -07:00
Fred Park
e32fc4d93e
Add Autopool support
...
- Resolves #33
- Add --poolid to storage clear and storage del
- jobs del and jobs term now clean up storage data if autopool is
detected
2017-07-21 11:10:03 -07:00
Fred Park
3b65ba684f
Support job priorities
...
- Resolves #109
2017-07-21 11:10:03 -07:00
Fred Park
23e9584852
Add compute node fill type support
...
- Resolves #107
2017-07-21 11:10:03 -07:00
Fred Park
82a46a615a
Basic Autoscale functionality
...
- Allow pools to be added with zero target nodes
- Add pool autoscale commands
2017-07-21 11:10:03 -07:00
Fred Park
5291ff1130
Move to blob leasing for download ticketing
...
- Greatly increase resource file SAS expiry timedelta
- Make concurrent_source_downloads generic, remove non-p2p option
- Update Dockerfiles
- Update to latest azure-storage
2017-07-21 11:10:03 -07:00
Fred Park
8397b411c5
Initial custom image support
2017-06-06 08:43:33 -07:00
Fred Park
d80d938063
More inheritable job to task properties
...
- Add max_wall_time property
- Resolves #69
2017-05-23 09:29:00 -07:00
Fred Park
7ed7429a24
Add Low Priority Batch VM support
...
- Resolves #82
- Resolves #83
2017-05-12 14:42:55 -07:00
Fred Park
f9912b7a52
Pool-level resource file support
2017-05-01 10:17:09 -07:00
Fred Park
741a0bdd85
Add fault_domains property
...
- Add RemoteFS-GlusterFS+BatchPool recipe
- Various fixes
2017-04-14 08:14:13 -07:00
Fred Park
0d974fa0aa
Add additional SSH options
...
- Fix samba to auto-restart
2017-04-13 09:31:35 -07:00
Fred Park
f61f91423e
multi_instance_auto_complete -> auto_complete
...
- Resolves #61
2017-04-03 10:48:54 -07:00
Fred Park
b426ce9c39
Add Samba NSG rules and stat
2017-03-30 19:48:17 -07:00
Fred Park
130401af75
Add samba support on storage cluster nodes
2017-03-30 15:03:11 -07:00
Fred Park
db16e4cb7e
Allow public IP to be disabled
...
- Fix fs cluster status --detail
- Expand non-retry on async ops to include all 400-level status codes
2017-03-28 20:49:09 -07:00
Fred Park
b269ea7f06
Add multi-volume/server support
2017-03-16 15:18:29 -07:00
Fred Park
5325395522
Add glusterfs local mount option and NSG rule
...
- Add --hosts option for fs cluster status to print required hosts
changes on the local machine to mount the remote fs
2017-03-14 22:07:51 -07:00
Fred Park
ca2f9d73ab
Add support for docker run uid/gid
...
- Resolves #54
2017-03-14 08:52:09 -07:00
Fred Park
89b722df54
Populate the fs config doc
...
- Update base README
- Rename disk_ids to disk_names in fs.json
2017-03-12 13:07:56 -07:00
Fred Park
e0490cf0b4
Doc updates for global config
...
- Create empty doc holder for remote fs config settings
2017-03-11 16:32:27 -08:00
Fred Park
cb7b42a231
Support glusterfs <-> pool autolinking
...
- Support glusterfs expand (additional disks)
- Provide `mount_options` for `file_server` which applies to local mount
on the file server of the disks
- Allow gluster volume name to be specified
- Provide stronger cross-checking between pool virtual network and
storage cluster virtual network
- Increase ud/fd in AS to maximums
- Install acl tools for nfsv4 and glusterfs
2017-03-11 15:23:55 -08:00
Fred Park
675c6c37f8
Glusterfs support for add/suspend/start
...
- Simple logging by default
- Fix logging format
2017-03-10 22:54:16 -08:00
Fred Park
3f47fda0b9
Checkpoint multi-vm glusterfs support
...
- Allow resource_group overrides in managed_disks and storage_cluster
- Add server_options to file_server
- Add named resource group support to disk deletion
- Fix Batch and ARM client issues in non-AAD mode
2017-03-10 15:10:31 -08:00
Fred Park
33291504c2
Support missing image tasks, pool check
...
- Break out configs into separate pages
- Update all configs using 16.04.0-LTS to 16.04-LTS
- Remove Batch `account` from recipe credentials
2017-03-09 15:07:37 -08:00
Fred Park
e349a004cd
Support pool <-> storage cluster auto-linkage
...
- Update to latest batch management client library supporting
UserSubscription
- Begin breakout of config doc into multiple pages
2017-03-09 09:40:16 -08:00
Fred Park
5fcddad7ea
Pool <-> storage cluster linkage checkpoint
2017-03-08 23:43:16 -08:00
Fred Park
91403de98f
Add pool vnet spec
...
- Refactor vnet/subnet creation so pool creation can use it
- Allow read of fs.json for pool add
- Rename "glusterfs" volume_driver to "glusterfs_on_compute"
2017-03-08 20:23:05 -08:00
Fred Park
66d90dde90
Prep for add pool with vnet changes
...
- Centralize various client creation logic
2017-03-08 14:56:39 -08:00
Fred Park
8f7aee3a2f
Support AAD auth for Batch accounts
2017-03-08 11:13:09 -08:00
Fred Park
c118b7e2d9
Allow custom inbound network security rules
2017-03-08 09:52:21 -08:00
Fred Park
587ab7faa4
Fix suspend/start issues with software raid
...
- Disallow expand action with mdadm-based arrays on RAID-0
- Change "remotefs" to "fs" for commands
2017-03-08 09:52:21 -08:00
Fred Park
748cf64bfb
Refactor and unify AAD settings across commands
...
- Allow KeyVault AAD endpoints to be specified
2017-03-08 09:52:21 -08:00
Fred Park
f8e3fa52ed
Add stat script
...
- Better organize some remotefs json settings
- Reduce redundant lookups in ssh path
- Create --output-config option to separate from --verbose
2017-03-08 09:52:21 -08:00
Fred Park
0b172eccce
Add remotefs bootstrap script
2017-03-08 09:52:21 -08:00
Fred Park
94bde5b076
Add first version of cluster add and del commands
...
- Modify remotefs json for more properties
2017-03-08 09:52:21 -08:00
Fred Park
cba7086511
Add disk add command
...
- Add first iteration of remotefs.json
- Modify set of TCP no tune VMs
2017-03-08 09:52:21 -08:00
Fred Park
cd2cb4352a
Scaffold base changes for remotefs
2017-03-08 09:52:21 -08:00
Fred Park
78fad1c3e3
Add support for task retention time
...
- Resolves #30
2017-01-31 09:40:16 -08:00
Fred Park
270ef0c7b1
Fix Docker tmpdir
...
- Fix typo with ev secret id ref to keyvault
- Add travis py36 env
2017-01-24 14:43:44 -08:00
Derrick Liu
5fabd07fef
Add max_task_retry_count to job and task definitions (#23)
...
* Add `max_task_retry_count` to json template as reference
* Add job-level and task-level max_task_retry_count properties
If set, we create an `azure.batch.models.JobConstraints` or `azure.batch.models.TaskConstraints` object and pass it into the call to `JobAddParameter` or `TaskAddParameter` as the constraints argument.
* Update configuration documentation to include `max_task_retry_count`
* Fixed various minor issues and linting
Squashed commit:
[d794908] No retry means retry_count is 0, not 1
[29de812] Forgot to define these earlier
[8336700] Don't check for empty since it's an int
[c59d52a] Fix flake8 linting line length (+2 squashed commit)
Squashed commit:
[8336700] Don't check for empty since it's an int
[c59d52a] Fix flake8 linting line length
* Rename `max_task_retry_count` to `max_task_retries` and fix other PR comments
2017-01-24 07:46:52 -08:00
Fred Park
fa4e1f847c
Add env var secret id support
...
- Tag for 2.5.0 release
- Resolves #12
- Partially resolves #15
2017-01-19 10:16:42 -08:00
Fred Park
9b6dbef19f
Add task dependency id range support
2017-01-12 09:30:41 -08:00
Fred Park
348ceebc65
Add AAD X.509 cert auth support (#10)
...
- AAD/Keyvault credential support in credentials.json
2017-01-10 11:48:39 -08:00
Fred Park
be04d89410
Update docs for KeyVault support (#10)
2017-01-06 08:03:26 -08:00
Fred Park
57b47b353f
Add pool ssh command, resolves #9
...
- Make the ssh docker tunnel script much easier to use
- Add an ssh guide to docs
2016-12-15 07:39:04 -08:00
Fred Park
38ba61245d
Add /dev/shm option, resolves #8
2016-12-14 08:36:51 -08:00
Fred Park
c7744f95bf
Support for internet accessible private registries
2016-11-19 09:00:01 -08:00
Fred Park
4f41d95e32
Finish settings refactor
...
- Change recipes to use current_dedicated for multi-instance count
2016-11-12 22:13:55 -08:00
Fred Park
e700ee05b7
Add docker login prior to image update
...
- Move docker hub creds to credentials json
- Begin refactor of configuration settings retrieval
2016-11-11 09:30:14 -08:00
Fred Park
da573524de
Preliminary steps for ACR support
...
- Fix update docker images with private registry
- Automatically clean dangling image refs on update
- Remove private registry file/image id support
- Refactor fleet initialization steps to one entry point
- Simplify shipyard context init
2016-11-10 09:48:00 -08:00
Fred Park
f8ac2ccc40
Add support for single node direct ingress
...
- Add missing support for relative_destination_path in single node
transfers
2016-11-09 15:38:19 -08:00
Fred Park
9db24df307
Add relative destination path
...
- Check for vm_count for glusterfs setup
2016-11-07 10:23:43 -08:00
Fred Park
fa1024d191
Improve Python2/3 compatibility
...
- Add generated sas key expiry config option
2016-10-31 08:45:05 -07:00
Fred Park
beeb118b19
Add generated_file_export_path option
...
- Add Dockerfile for cli
- Update docs for docker cli
- Update travis build to include tfm
2016-10-25 22:15:35 -07:00
Fred Park
705ae40065
Add support for pool resize up with GlusterFS
...
- Update azure-batch dependency to 1.1.0
2016-10-24 10:08:13 -07:00
Fred Park
92464b3b54
Add Azure Batch Task data ingress
...
- Rearrange Dockerfiles
- Update TensorFlow-Distributed recipe
- Rename CASCADE env vars to SHIPYARD
2016-10-20 21:18:31 -07:00
Fred Park
481d298e7c
Add credential encryption guide
2016-10-20 10:48:25 -07:00
Fred Park
bb515e3812
Add Data Movement guide
2016-10-17 13:33:12 -07:00
Fred Park
bd7101df16
Rename generate_tunnel_script property
...
- Add Torch-CPU to quickstart
2016-10-15 17:54:16 -07:00
Fred Park
dc1c8d46a3
Add include pattern support for gettaskallfiles
2016-10-15 14:33:12 -07:00
Fred Park
33300c551c
Add pool/job/task-level data ingress support
2016-10-14 15:49:20 -07:00
Fred Park
8101a1407f
Add arbitrary file split support
2016-10-12 15:23:27 -07:00
Fred Park
1bc8e60fe2
Add include/exclude filter support for source path
2016-10-12 09:30:06 -07:00
Fred Park
261984020e
Add rsync transfer methods
...
- Refactor data transfer functions into single/multinode
- Expand docs for data ingress
2016-10-11 10:39:54 -07:00
Fred Park
a4ec217f66
First stage in shipyard modularization
...
- Update configuration docs for new data ingress spec
2016-10-09 15:22:15 -07:00
Fred Park
487223e8fa
Change pool config ssh_docker_tunnel to ssh
2016-10-06 11:03:10 -07:00
Fred Park
08204092be
Add CentOS GlusterFS support
...
- Update recipes
2016-09-15 12:47:43 -07:00
Fred Park
646cff6631
Add TensorFlow-Distributed recipe
...
- Fix SSH user expiry within 1 day
- Fix some README/dockerfile typos
2016-09-13 11:43:28 -07:00
Fred Park
e8d5e7a8a3
Automatically detect nvidia driver version
...
- Fix azure-storage dependencies for non-shipyard docker image setup
- Add no-install-recommends to apt-gets in node prep
2016-09-08 21:06:21 -07:00
Fred Park
b4e6e90f1d
Add FFmpeg GPU recipe
...
- Fix NV-series provisioning
- Fix up various READMEs
- Add maintained by tags in Dockerfiles
- Add missing config flag in jobs json
- Fix non-Docker shipyard azure-storage req
2016-09-08 11:52:21 -07:00
Fred Park
f9dac5bd93
Add GPU documentation
...
- Fix node prep issues with GPU
- Correct node prep finished file location
- Add TensorFlow-GPU recipe
2016-09-02 09:39:35 -07:00
Fred Park
a666d7d9f4
Add explicit inter node comm property for pool
...
- Allow inter node comm to work independently of p2p transfers
2016-08-31 22:20:18 -07:00
Fred Park
7f49641074
First part of the guide/docs
...
- Modify placement of some configuration settings
2016-08-31 15:35:33 -07:00
Fred Park
dad22994bc
Rename sample configs as config templates
...
- Add Changelog file
2016-08-31 09:38:33 -07:00