Commit graph

513 Commits

Author SHA1 Message Date
Fred Park ec986323fb Duplicate volume checks between job and task 2017-05-28 15:18:32 -07:00
Andrei Petrovici 893baef123 Text correction for cert del (#90) 2017-05-28 14:43:49 -07:00
Fred Park 3d5958195c Remove print statements 2017-05-24 13:43:20 -07:00
Fred Park 3a5fb452d5 Various fixes
- Add poolid param for pool del
- Fix vm_count deprecation check on fs actions
- Improve robustness of package index updates
- Prompt for jobs cmi action
- Update to latest dependencies
2017-05-24 09:54:09 -07:00
Fred Park 06fd4f8e62 Tag for 2.7.0rc1 release 2017-05-24 07:43:19 -07:00
Fred Park 4514920b6a Add pool listimages command
- Resolves #60
- Fix some resize/wait issues
2017-05-23 14:14:29 -07:00
Fred Park d80d938063 More inheritable job to task properties
- Add max_wall_time property
- Resolves #69
2017-05-23 09:29:00 -07:00
Fred Park 5a82cc79f8 Add --tty option to ssh commands 2017-05-22 20:03:35 -07:00
Fred Park a623e5b26f Add list tasks poll option
- Resolves #77
- Add deprecation path for multi-instance pool spec vm count
- Fix outdated recipe doc for multi-instance pool spec
- Cache last task id to speed up task collection adds
2017-05-22 19:45:25 -07:00
Fred Park 199ac70e22 Tag for 2.7.0b2 release 2017-05-18 09:18:43 -07:00
Fred Park d2b066bf6d Add tasks via collection
- Resolves #86
2017-05-17 18:48:16 -07:00
Fred Park 644e86ddb6 Prevent glusterfs on compute and max tasks > 1 2017-05-17 15:03:30 -07:00
Fred Park 78fa235e94 Substitute include for remoteresource if possible
- Resolves #88
2017-05-17 12:56:56 -07:00
Fred Park c3a72fa4e3 Allow workdir to be set
- Resolves #87
2017-05-17 09:28:19 -07:00
Fred Park 11c5dc700b Improve pool resize logic with mixed nodes 2017-05-16 09:49:01 -07:00
Fred Park d9304794bc Add deprecation path for vm_count change
- Resolves #84
2017-05-16 09:43:58 -07:00
Fred Park b09a37f22f Add preempted state checking 2017-05-15 08:08:30 -07:00
Fred Park 81b6b2e657 MXNet-GPU recipe vm_count fix 2017-05-12 19:33:52 -07:00
Fred Park fd6c45505e Update recipes for vm_count change 2017-05-12 19:23:29 -07:00
Fred Park 7ed7429a24 Add Low Priority Batch VM support
- Resolves #82
- Resolves #83
2017-05-12 14:42:55 -07:00
Fred Park a17d6b64c9 Update dependencies
- Fix breaking changes in keyvault library
- Fix inverted order for fs cluster ssh and optional command
2017-05-11 09:21:20 -07:00
Fred Park c4b740bb85 Remove xserver-xorg-dev from NC path 2017-05-08 12:29:37 -07:00
Fred Park 70f7317c13 Add --clear-tables option to storage del
- Update limitations doc
- Resolves #80
2017-05-08 09:07:09 -07:00
Paresh Verma 6eee2c120a CNTK single gpu doesn't run because of node count comparison. (#79)
* Bugfix: cntk single gpu is never ran because of node count comparison.

For single node, single gpu scenario, $AZ_BATCH_HOST_LIST has a single
IP, and hence the count of nodes will be 1.

* Updating condition.
2017-05-05 19:02:39 -07:00
Fred Park 1050c5da0e Tag for 2.6.2 release 2017-05-05 08:42:53 -07:00
Fred Park 983a7eed45 Node prep script improvements
- Blacklist nouveau universally on GPU VMs
- Change URL retrieval to requests
- Update requirements to latest
2017-05-05 08:42:19 -07:00
Fred Park 2d53e411e4 Add develop branch Dockerfile for hub
- Allow NVIDIA license agreement to be auto-confirmed with -y
2017-05-04 09:03:09 -07:00
Fred Park b3f959801c Fix docker login for missing images
- Resolves #66
2017-05-02 20:11:24 -07:00
Fred Park 2e62f12729 Fix misc tensorboard default image issue 2017-05-02 13:16:57 -07:00
Fred Park ad423c0e3d Tag for 2.6.1 release
- Optimize some Batch calls
2017-05-01 22:17:43 -07:00
Fred Park 3d2c8cc191 Fix termtasks with disable 2017-05-01 18:47:41 -07:00
Fred Park f9912b7a52 Pool-level resource file support 2017-05-01 10:17:09 -07:00
Fred Park b559ba3fb5 Fix data ingress to single node on pool add 2017-05-01 08:38:45 -07:00
Fred Park aee7c2018b Batch exception handling fixes
- Add more tensorboard log switches for autodetection
2017-04-30 20:08:15 -07:00
Fred Park c8f7521196 Add COMMAND arg for ssh commands 2017-04-30 14:22:13 -07:00
Fred Park 9acdc2e000 Dockerfile updates 2017-04-30 00:08:02 -07:00
Fred Park d638fe10a5 Tensorboard docker image auto-detect
- Fix jobs del --termtasks to disable job first
- Fix jobs listtasks and data listfiles to accept jobid not in jobs
  config
- Perform double port remapping to avoid conflicts in tensorboard
2017-04-29 23:56:53 -07:00
Fred Park fb978a4788 Update TensorFlow recipes to r1.1
- Remove custom build of TF for non-distributed mode
2017-04-28 23:13:04 -07:00
Fred Park 4c29dd22e6 Fix streaming off by one 2017-04-28 12:53:27 -07:00
Fred Park 0394cd9848 Add misc tensorboard command 2017-04-28 12:53:19 -07:00
Fred Park 77e5fecc84 Add some additional checks in nodeprep 2017-04-28 10:55:06 -07:00
Fred Park 6612150352 Catch ssh user add exception in pool add
- Update various docs
2017-04-26 11:09:56 -07:00
Sebastian Brandes Kraaijenzank e026bf05ee Fixed typo in jobs documentation. (#76) 2017-04-26 07:19:25 -07:00
Fred Park e1452fc7c9 Update Azure site extension docs 2017-04-21 19:42:49 -07:00
Fred Park 668acccd43 Fix site extension issues 2017-04-21 19:22:03 -07:00
Fred Park 12216930fe Tag for 2.6.0 release 2017-04-20 13:09:26 -07:00
Fred Park e997441169 Fix regression in data ingress 2017-04-19 08:32:28 -07:00
Fred Park b8d36a065a Refactor pool ssh settings 2017-04-18 18:46:57 -07:00
Fred Park 0161169daa Remove Windows checks for scp/ssh/openssl
- Update docs
- Remove unused vars in nodeprep script
2017-04-18 13:51:05 -07:00
Fred Park 362546686d Add Windows SSH keygen how-to 2017-04-14 23:35:12 -07:00