Commit graph

923 commits

Author SHA1 Message Date
Fred Park d3c30d4c10
Update Singularity links 2018-07-09 10:23:04 -07:00
Fred Park e50a077522
Fix storage cluster provisioning with ne opts
- Node exporter options not properly handled when not specified
2018-07-09 08:51:02 -07:00
Fred Park d659c34272
Fix metadata dump with correct check logic 2018-07-02 20:05:45 -07:00
Fred Park 5597427e4e
Properly terminate on image pull without fallback 2018-07-01 14:34:03 -07:00
Fred Park cc33117f82
Tag for 3.5.0 release 2018-06-29 13:27:34 -07:00
Fred Park f81ba6f0a7
New vm platform image support
- Support CentOS 7.5
- Support Windows Server semi-annual 1803 with containers
- Drop CentOS 7.3 support
2018-06-29 13:17:22 -07:00
Fred Park 5952fab4a9
Fix non-retry of task add collection
- For cases where the slice is too big
- This is a regression from #188
2018-06-29 11:16:30 -07:00
Fred Park 166adfa47a
Add filters on pool nodes list
- Allow filtering of start task failed and unusable states
- On pool allocation use this filtering to scope down "bad" nodes
2018-06-29 10:32:00 -07:00
Fred Park 0c3d492f1a
Various fixes and updates
- Dump node listing on unusable
- Update Gluster to 4.1 on Ubuntu/Debian/RemoteFS
- Update Python to 3.6.6 in Windows Docker images
- Update dependencies
- Minor doc updates
- Fix appveyor sha256 artifact upload
2018-06-29 09:32:10 -07:00
Fred Park 03aad7dd94
Fix tfm and rjm issues
- Regression in task file mover task command generator from Windows
  support refactor not properly replacing keyword arg
- Resolves #29
- Unpickling of tasks with dependencies for recurring job manager fails due
  to an unintended pickling of an object from yaml that was not represented
  as a Python list
- Resolves #221
2018-06-27 20:33:03 -07:00
Fred Park 66e77ac397
Update dependencies
- Fixes in scripts for cascade and monitor cert renewal
2018-06-27 15:32:22 -07:00
Fred Park a36bc0c4c1
Add MADL recipe link from README
- Fix line endings and non-line length PEP8 issues in MADL recipe py helper
2018-06-27 12:23:58 -07:00
danyrouh baf9ce0685
Microsoft Azure Distributed Linear Learner Recipe (#195) 2018-06-27 12:10:22 -07:00
Fred Park c98cf320ec
Fix various Let's Encrypt non-staging issues
- Prod ACME challenges were failing due to improper stage check cleanup
- Fix nginx and compose configuration errors for prod certs
- Migrate Docker install on Ubuntu 18.04 to stable channel
- Update LIS
2018-06-27 11:20:50 -07:00
Fred Park ea27e9e8bd
Support XFS filesystems in storage clusters
- Allow mdadm-based RAID-0 arrays to expand (experimental)
- Greatly expand remote fs guide with configuration/usage explanations
- Fix blocking bug with fs commands
- Resolves #219
2018-06-27 08:57:27 -07:00
Fred Park 3f30ba8d07
Support a fallback registry for system images
- Resolves #217
- Add misc mirror-images command
- Pass Singularity version to bootstrap
- Fix GlusterFS on compute provisioning, resolves #220
2018-06-26 12:22:09 -07:00
Fred Park 98da7b25b3
Update Singularity to 2.5.1 2018-06-26 11:15:07 -07:00
Fred Park 534318e2a7
Auto upload Batch node logs on unusable
- Resolves #216
- Add generate sas option for diag logs upload command
- Allow multiple poolid for storage clear and del
- Add diagnostics log option to storage clear and del
- Allow storage sas create command to create container and share level
  SAS tokens
2018-06-25 13:04:03 -07:00
Fred Park e33ee4153a
Updates and minor fixes
- Rename picket to monitor
- Defer resource group creation until after confirmation
- Add AAD note and action requirement for monitor creation
2018-06-25 07:51:42 -07:00
Fred Park 848208fd6c
Fix node exporter no options 2018-06-22 09:37:33 -07:00
Fred Park be81e145f8
Emit sha256 digest for cli binaries 2018-06-21 10:20:56 -07:00
Fred Park 3324437f46
Add clarifications to creds doc
- Fix typo in settings for token cache enable
- Update program splash
2018-06-21 08:02:23 -07:00
Fred Park a6097e12d7
Improve credential section checking 2018-06-19 08:53:43 -07:00
Fred Park 9ae0b14e81
Fix regression in keyvault conf loading
- Resolves #214
2018-06-19 08:10:00 -07:00
Fred Park 2336a63990
Add GitHub issue templates
- Update AppVeyor build
- Update requirements
- Minor doc updates
2018-06-18 10:12:23 -07:00
Fred Park 253dfb6036
Tag for 3.5.0b3 release 2018-06-13 13:35:03 -07:00
Fred Park 694aa35322
Fix azureblob mount check regression
- Allow on all platform images
- Fix SSH user not being provisioned on resize if keys are specified
- Resolves #213
2018-06-13 12:42:14 -07:00
Fred Park a00ecab340
Tag for 3.5.0b2 release
- Update Nuget package metadata
2018-06-12 14:00:25 -07:00
Fred Park 7d48249864
Allow docker access with Batch SSH users
- Allow Docker daemon access with Batch SSH users with configuration
  enablement
- Resolves #206
2018-06-11 14:27:26 -07:00
Fred Park c2cd8f5e67
Update TensorFlow Docker image ref 2018-06-11 13:37:24 -07:00
Fred Park 4573a5c95e
Support native mode platform images
- Auto convert if possible with native enablement
- Update docs
- Resolves #204
2018-06-11 13:32:33 -07:00
Fred Park ca1e9504a7
Cache container/file share creation calls
- Resolves #211
2018-06-11 07:43:15 -07:00
Fred Park 612e2a50e5
Move Prometheus/Grafana config to separate file
- Move grafana admin login info to credentials
- Update documentation for Prometheus/Grafana integration
- Resolves #205
2018-06-08 16:42:52 -07:00
Fred Park 449e621a66
Add suspend/restart support for monitoring 2018-06-08 07:47:15 -07:00
Fred Park cf0797790f
Support max increment of VMs in scenario autoscale
- Allow definition of weekdays/workhours
- Resolves #210
2018-06-08 07:20:41 -07:00
Fred Park e65dd9c196
Add CentOS-HPC 7.4 Support
- Fixup CentOS 7.4 GPU support
- Update LIS to 4.2.5
- Update packer scripts
- Resolves #184
2018-06-07 13:31:01 -07:00
Fred Park 520fe45c9d
Install into virtualenv by default
- Add ludicrous speed quickstart
- Resolves #200
2018-06-07 11:15:30 -07:00
Fred Park 911bcc8593
Fix blobxfer script regression from shellcheck 2018-06-07 10:50:42 -07:00
Fred Park a7a513b804
Don't allow docker only on cAdvisor
- Breaks grafana dashboard
2018-06-07 10:50:42 -07:00
Fred Park 9f61db12c3
Autoprovision Grafana Dashboard
- Add default dashboard
- Allow arbitrary provisioning of additional dashboards
- Add monitor list command
- Add RemoteFS monitoring support
- Compact cadvisor
2018-06-07 10:50:37 -07:00
Fred Park b77a147766
Continue Prometheus integration support
- Add nginx reverse proxy and Let's Encrypt cert support
- Add Let's Encrypt options
- Add picket into compose
- crontab cert renewal
- Add inbound rule management for temporary ACME challenge on port 80
- Update to node exporter 0.16
- Fixup various issues
2018-06-04 09:04:30 -07:00
Fred Park ba9dc76f7c
Update gluster to 4.0 2018-06-04 09:03:03 -07:00
Fred Park d4c6aa99ae
Support CentOS 7.4 GPU
- Add LIS installation support
- Update NC driver to 396.26
- Update dependencies
- Resolves #199
2018-06-04 09:03:03 -07:00
Fred Park 95382bf639
Fix appveyor build 2018-06-04 09:03:02 -07:00
Fred Park b7266d2c96
Fix conf load from keyvault
- Improve error messages
- Add missing aad and keyvault decorators to commands
2018-06-04 09:03:02 -07:00
Fred Park ae86b92be2
Start Prometheus monitoring integration
- Refactor package uploader for pool
- Auto install node exporter and cadvisor for prom enabled pools
- Add configuration
- Create monitoring resource
- Start work on picket monitor
2018-06-04 08:56:44 -07:00
Fred Park 32fa9fcaa1
account_key and aad in batch interaction
- These options are now mutually exclusive
- Addresses #197
2018-05-17 09:38:13 -07:00
Fred Park 0bf9399061
Fix some post-release issues
- Shellcheck-induced regressions in containers
- Split Docker image build into own build env on appveyor
- Update to Python 3.6.5 for cargo Docker image build on Windows
2018-05-02 10:24:38 -07:00
Fred Park bbcb479e62
Tag for 3.5.0b1 release 2018-05-02 08:16:04 -07:00
Fred Park 5168320335
Improve site extension installation robustness
- Fix errorlevel checking
- Allow nuget package on pre-releases
2018-05-02 08:15:59 -07:00