Commit Graph

146 Commits

Author SHA1 Message Date
Fred Park b3b98162c6 Add support for native mode image update
- Fix custom image + native pool startup for Linux
- pool images update over SSH fix
2018-02-08 14:25:28 -08:00
Fred Park b7a095ef70 Add merge/final task support
- Resolves #149
2018-01-25 14:32:40 -08:00
Fred Park c91a49ecc4 Update pool add error text 2017-11-17 10:14:04 -08:00
Fred Park 03d04eae0f Update VM sizes
- Fix multi-instance coordination command for Docker tasks
2017-11-08 12:30:28 -08:00
Fred Park fc362262da Do not fail on SSH private key filemode check
- Always failing prevents execution on WSL
- Log warning instead
2017-11-08 09:47:35 -08:00
Fred Park 2e8b43df55 Add Azure File mount support for Windows pools
- Fix coordination command of None issue
2017-11-05 10:38:22 -08:00
Fred Park 172d752938 Add RDP support
- Improve command hierarchy
2017-11-03 16:20:50 -07:00
Fred Park 0411de5828 Windows task execution support
- Blobxfer on windows support
- Disable all native updateimages/udi
2017-11-03 16:20:27 -07:00
Fred Park c9443fc91a Initial Windows Server Container support
- pool updateimages command supporting singularity images
- fix aad mfa token cache on python2
2017-11-03 16:20:10 -07:00
hieuhc c12da5c955 Fix list_tasks when there is a failed task (#142) 2017-11-01 20:27:39 +00:00
Fred Park 33c4e57bd0 Modify wait for pool behavior
- Break when allocation times out and pool state is steady
- Unusable recovery should only resize on non-resizing state
- Fix non-Ubuntu custom image temp disk query
- Add CentOS 7.3 packer sample
2017-10-30 10:26:22 -07:00
Fred Park 084ea6f125 Fix user identity on docker exec
- Add doc note about security on multi-instance Docker exec
2017-10-24 20:07:00 -07:00
Fred Park bb03797360 Fix no nodes listed on resizing state
- Update dependencies
2017-10-24 13:08:26 -07:00
Fred Park bd251272fc Add more packer samples
- Ubuntu 16.04 based normal, IB and GPU+IB
2017-10-24 10:27:42 -07:00
Fred Park 1f3cc63dca Update more documentation for Singularity
- Task termination fixes for Singularity
2017-10-23 18:55:13 -07:00
Fred Park 6da607c9b9 Multi-instance/IB support for Singularity tasks
- Make cascade work in Docker container
2017-10-22 13:59:35 -07:00
Fred Park 48172e115e Add initial Singularity task support
- Auto-GPU
- Fix ownership issues with Singularity image pre-load
2017-10-20 23:10:12 -07:00
Fred Park 4e5d5abf6b Add Singularity support into cascade
- Remove Singularity support in native container support pools as it's
  impossible to execute a singularity container in this mode
2017-10-17 18:51:27 -07:00
Fred Park 49a374416f Add unusable node recovery option 2017-10-04 09:25:57 -07:00
Fred Park 796a5e33b4 Combine rjm/tfm to cargo (#125) 2017-10-03 18:24:50 -07:00
Fred Park e783744e00 Container registry logic overhaul
- Remove private registry back to Azure storage blob support (#44)
- Require fully qualified Docker image names (#106)
- Support multiple public/private registries on a single pool (#127)
2017-10-03 18:24:42 -07:00
Fred Park 9c700dfbd5 Fix jrtask and suppress CUDA vars for native
- Doc updates
- Suppress coordination/task commands that are empty
2017-10-03 10:03:20 -07:00
Fred Park f1775c3ad7 Native container job schedule support 2017-10-03 10:03:20 -07:00
Fred Park cccc5fa3b8 output_data to storage blob support for native 2017-10-03 10:03:20 -07:00
Fred Park 0a9689f5a8 Native container support
- Allow pool conversion with native flag
2017-10-03 10:03:20 -07:00
Fred Park 53477a720a Tag for 2.9.5 release
- Add --all-starting for pool delnode
2017-09-24 19:50:14 -07:00
Fred Park 093cfdbc83 Prevent invalid mix of HPC offer and non-RDMA VM
- Fix unusable nodes on allocation exception in pool stats
- Expand network tuning exemptions
2017-09-22 08:27:16 -07:00
Fred Park 5621ec5e3e Fix regression in ssh private key check on Windows 2017-09-21 13:11:26 -07:00
Fred Park 1b0e1c449d Refactor SSH to common path
- Check for SSH private key filemode
- Resolves #116
2017-09-07 08:02:40 -07:00
Fred Park 7a815dff6f Minor updates and fixes 2017-08-14 09:07:07 -07:00
Fred Park 1e4cd777be Fix division by zero in pool stats
- Fix flake8 issues
2017-08-10 15:08:39 -07:00
Fred Park 4573180293 Validate and prompt certain job schedule adds 2017-08-09 08:39:26 -07:00
Fred Park 44a1f14b31 Add monitor_task_completion for recurring jobs 2017-08-09 07:57:56 -07:00
Fred Park 5ae9001716 Add job schedule support to commands
- Resolves #19
2017-08-08 15:02:53 -07:00
Fred Park 9add2444ec Change autogen task id property to complex
- Update job recurrence docs
2017-08-08 08:45:15 -07:00
Fred Park be530e63c0 Job recurrence support 2017-08-07 19:42:09 -07:00
Fred Park 99e72c0c3f Add custom task factory support (#93) 2017-08-07 10:38:08 -07:00
Fred Park 8a396f0e18 Add pool and jobs stats
- Resolves #110
2017-08-04 14:47:16 -07:00
Fred Park c5fa85adcb Add file task factory (#93)
- Split out task factory settings into separate file
- Change uniform to be a, b instead of min, max
- Update blobxfer script for single target ingress to place file
  directly to destination
2017-08-04 11:02:33 -07:00
Fred Park e5ffd492ab Update CNTK CPU infiniband recipe to 2.1 2017-08-03 16:28:23 -07:00
Fred Park 4d09a09a80 Add --all-unused to pool delnode 2017-08-03 10:41:38 -07:00
Fred Park 9eb8fd4c55 Support CentOS-HPC 7.3
- Update misc tensorboard to latest
- Fix term tasks in disable jobs
- Update NVIDIA driver
- Doc updates
2017-08-02 15:12:45 -07:00
Fred Park ed8ca2d225 Add autogen task id setting 2017-07-31 13:40:18 -07:00
Fred Park 4105acc2f8 Add task factory (parameter sweep) support
- Resolves #93
2017-07-28 14:36:42 -07:00
Fred Park e32fc4d93e Add Autopool support
- Resolves #33
- Add --poolid to storage clear and storage del
- jobs del and jobs term now cleanup storage data if autopool is
  detected
2017-07-21 11:10:03 -07:00
Fred Park 7ba85e7496 Add job migration support
- Add enable/disable job support too
- Resolves #108
2017-07-21 11:10:03 -07:00
Fred Park 3b65ba684f Support job priorities
- Resolves #109
2017-07-21 11:10:03 -07:00
Fred Park 82a46a615a Basic Autoscale functionality
- Allow pools to be added with zero target nodes
- Add pool autoscale commands
2017-07-21 11:10:03 -07:00
Fred Park 5291ff1130 Move to blob leasing for download ticketing
- Greatly increase resource file SAS expiry timedelta
- Make concurrent_source_downloads generic, remove non-p2p option
- Update Dockerfiles
- Update to latest azure-storage
2017-07-21 11:10:03 -07:00
Fred Park de45b18a67 Add backoff to cascade docker image pull retries 2017-07-01 01:25:30 -07:00