Fred Park
d3c30d4c10
Update Singularity links
2018-07-09 10:23:04 -07:00
Fred Park
e50a077522
Fix storage cluster provisioning with ne opts
...
- Node exporter options not properly handled when not specified
2018-07-09 08:51:02 -07:00
Fred Park
d659c34272
Fix metadata dump with correct check logic
2018-07-02 20:05:45 -07:00
Fred Park
5597427e4e
Properly terminate on image pull without fallback
2018-07-01 14:34:03 -07:00
Fred Park
cc33117f82
Tag for 3.5.0 release
2018-06-29 13:27:34 -07:00
Fred Park
f81ba6f0a7
New vm platform image support
...
- Support CentOS 7.5
- Support Windows Server semi-annual 1803 with containers
- Drop CentOS 7.3 support
2018-06-29 13:17:22 -07:00
Fred Park
5952fab4a9
Fix non-retry of task add collection
...
- For cases where the slice is too big
- This is a regression from #188
2018-06-29 11:16:30 -07:00
Fred Park
166adfa47a
Add filters on pool nodes list
...
- Allow filtering of start task failed and unusable states
- On pool allocation use this filtering to scope down "bad" nodes
2018-06-29 10:32:00 -07:00
Fred Park
0c3d492f1a
Various fixes and updates
...
- Dump node listing on unusable
- Update Gluster to 4.1 on Ubuntu/Debian/RemoteFS
- Update Python to 3.6.6 in Windows Docker images
- Update dependencies
- Minor doc updates
- Fix appveyor sha256 artifact upload
2018-06-29 09:32:10 -07:00
Fred Park
03aad7dd94
Fix tfm and rjm issues
...
- Regression in task file mover task command generator from Windows
support refactor not properly replacing keyword arg
- Resolves #29
- Unpickling of tasks with dependencies for recurring job manager fails due
to an unintended pickling of an object from yaml that was not represented
as a Python list
- Resolves #221
2018-06-27 20:33:03 -07:00
Fred Park
66e77ac397
Update dependencies
...
- Fixes in scripts for cascade and monitor cert renewal
2018-06-27 15:32:22 -07:00
Fred Park
a36bc0c4c1
Add MADL recipe link from README
...
- Fix line endings and non-line length PEP8 issues in MADL recipe py helper
2018-06-27 12:23:58 -07:00
danyrouh
baf9ce0685
Microsoft Azure Distributed Linear Learner Recipe (#195)
2018-06-27 12:10:22 -07:00
Fred Park
c98cf320ec
Fix various Let's Encrypt non-staging issues
...
- Prod ACME challenges were failing due to improper stage check cleanup
- Fix nginx and compose configuration errors for prod certs
- Migrate Docker install on Ubuntu 18.04 to stable channel
- Update LIS
2018-06-27 11:20:50 -07:00
Fred Park
ea27e9e8bd
Support XFS filesystems in storage clusters
...
- Allow mdadm-based RAID-0 arrays to expand (experimental)
- Greatly expand remote fs guide with configuration/usage explanations
- Fix blocking bug with fs commands
- Resolves #219
2018-06-27 08:57:27 -07:00
Fred Park
3f30ba8d07
Support a fallback registry for system images
...
- Resolves #217
- Add misc mirror-images command
- Pass Singularity version to bootstrap
- Fix GlusterFS on compute provisioning, resolves #220
2018-06-26 12:22:09 -07:00
Fred Park
98da7b25b3
Update Singularity to 2.5.1
2018-06-26 11:15:07 -07:00
Fred Park
534318e2a7
Auto upload Batch node logs on unusable
...
- Resolves #216
- Add generate sas option for diag logs upload command
- Allow multiple poolid for storage clear and del
- Add diagnostics log option to storage clear and del
- Allow storage sas create command to create container and share level
SAS tokens
2018-06-25 13:04:03 -07:00
Fred Park
e33ee4153a
Updates and minor fixes
...
- Rename picket to monitor
- Defer resource group creation until after confirmation
- Add AAD note and action requirement for monitor creation
2018-06-25 07:51:42 -07:00
Fred Park
848208fd6c
Fix node exporter no options
2018-06-22 09:37:33 -07:00
Fred Park
be81e145f8
Emit sha256 digest for cli binaries
2018-06-21 10:20:56 -07:00
Fred Park
3324437f46
Add clarifications to creds doc
...
- Fix typo in settings for token cache enable
- Update program splash
2018-06-21 08:02:23 -07:00
Fred Park
a6097e12d7
Improve credential section checking
2018-06-19 08:53:43 -07:00
Fred Park
9ae0b14e81
Fix regression in keyvault conf loading
...
- Resolves #214
2018-06-19 08:10:00 -07:00
Fred Park
2336a63990
Add GitHub issue templates
...
- Update AppVeyor build
- Update requirements
- Minor doc updates
2018-06-18 10:12:23 -07:00
Fred Park
253dfb6036
Tag for 3.5.0b3 release
2018-06-13 13:35:03 -07:00
Fred Park
694aa35322
Fix azureblob mount check regression
...
- Allow on all platform images
- Fix SSH user not being provisioned on resize if keys are specified
- Resolves #213
2018-06-13 12:42:14 -07:00
Fred Park
a00ecab340
Tag for 3.5.0b2 release
...
- Update Nuget package metadata
2018-06-12 14:00:25 -07:00
Fred Park
7d48249864
Allow docker access with Batch SSH users
...
- Allow Docker daemon access with Batch SSH users with configuration
enablement
- Resolves #206
2018-06-11 14:27:26 -07:00
Fred Park
c2cd8f5e67
Update TensorFlow Docker image ref
2018-06-11 13:37:24 -07:00
Fred Park
4573a5c95e
Support native mode platform images
...
- Auto convert if possible with native enablement
- Update docs
- Resolves #204
2018-06-11 13:32:33 -07:00
Fred Park
ca1e9504a7
Cache container/file share creation calls
...
- Resolves #211
2018-06-11 07:43:15 -07:00
Fred Park
612e2a50e5
Move Prometheus/Grafana config to separate file
...
- Move grafana admin login info to credentials
- Update documentation for Prometheus/Grafana integration
- Resolves #205
2018-06-08 16:42:52 -07:00
Fred Park
449e621a66
Add suspend/restart support for monitoring
2018-06-08 07:47:15 -07:00
Fred Park
cf0797790f
Support max increment of VMs in scenario autoscale
...
- Allow definition of weekdays/workhours
- Resolves #210
2018-06-08 07:20:41 -07:00
Fred Park
e65dd9c196
Add CentOS-HPC 7.4 Support
...
- Fixup CentOS 7.4 GPU support
- Update LIS to 4.2.5
- Update packer scripts
- Resolves #184
2018-06-07 13:31:01 -07:00
Fred Park
520fe45c9d
Install into virtualenv by default
...
- Add ludicrous speed quickstart
- Resolves #200
2018-06-07 11:15:30 -07:00
Fred Park
911bcc8593
Fix blobxfer script regression from shellcheck
2018-06-07 10:50:42 -07:00
Fred Park
a7a513b804
Don't allow docker only on cAdvisor
...
- Breaks Grafana dashboard
2018-06-07 10:50:42 -07:00
Fred Park
9f61db12c3
Autoprovision Grafana Dashboard
...
- Add default dashboard
- Allow arbitrary provisioning of additional dashboards
- Add monitor list command
- Add RemoteFS monitoring support
- Compact cadvisor
2018-06-07 10:50:37 -07:00
Fred Park
b77a147766
Continue Prometheus integration support
...
- Add nginx reverse proxy and Let's Encrypt cert support
- Add Let's Encrypt options
- Add picket into compose
- crontab cert renewal
- Add inbound rule management for temporary ACME challenge on port 80
- Update to node exporter 0.16
- Fixup various issues
2018-06-04 09:04:30 -07:00
Fred Park
ba9dc76f7c
Update gluster to 4.0
2018-06-04 09:03:03 -07:00
Fred Park
d4c6aa99ae
Support CentOS 7.4 GPU
...
- Add LIS installation support
- Update NC driver to 396.26
- Update dependencies
- Resolves #199
2018-06-04 09:03:03 -07:00
Fred Park
95382bf639
Fix appveyor build
2018-06-04 09:03:02 -07:00
Fred Park
b7266d2c96
Fix conf load from keyvault
...
- Improve error messages
- Add missing aad and keyvault decorators to commands
2018-06-04 09:03:02 -07:00
Fred Park
ae86b92be2
Start Prometheus monitoring integration
...
- Refactor package uploader for pool
- Auto install node exporter and cadvisor for prom enabled pools
- Add configuration
- Create monitoring resource
- Start work on picket monitor
2018-06-04 08:56:44 -07:00
Fred Park
32fa9fcaa1
account_key and aad in batch interaction
...
- These options are now mutually exclusive
- Addresses #197
2018-05-17 09:38:13 -07:00
Fred Park
0bf9399061
Fix some post-release issues
...
- Shellcheck-induced regressions in containers
- Split Docker image build into own build env on appveyor
- Update to Python 3.6.5 for cargo Docker image build on Windows
2018-05-02 10:24:38 -07:00
Fred Park
bbcb479e62
Tag for 3.5.0b1 release
2018-05-02 08:16:04 -07:00
Fred Park
5168320335
Improve site extension installation robustness
...
- Fix errorlevel checking
- Allow nuget package on pre-releases
2018-05-02 08:15:59 -07:00