Various updates
- Update docs
- Update azure-batch dependency
- Set Slurm scheduling option defer mode
Parent: 97dac7d5aa
Commit: ec7af5b7c1
@@ -23,8 +23,6 @@ been migrated under the new `additional_node_prep` property as
 - Performance improvements to speed up job submission with large task
 factories or large amount of tasks. Verbosity of task generation progress
 has been increased which can be modified with `-v`.
-- Updated Docker CE to 18.09.2
-- Updated Singularity to 2.6.1
 - Updated blobxfer to 1.7.0 ([#255](https://github.com/Azure/batch-shipyard/issues/255))
 - Updated LIS, NV driver to 410.92 and NC/ND driver to 410.104
 - Updated other dependencies to latest
@@ -35,9 +33,14 @@ supplied parameters ([#249](https://github.com/Azure/batch-shipyard/issues/249))
 - Azure Function extension installation failure ([#260](https://github.com/Azure/batch-shipyard/issues/260))
 - Block job submission on non-active pools ([#251](https://github.com/Azure/batch-shipyard/issues/251))
 - Missing files included in binary distributions ([#258](https://github.com/Azure/batch-shipyard/issues/258))
-- Pools with accelerated networking with fail provisioning sometimes due to
+- Pools with accelerated networking would fail provisioning sometimes due to
 infiniband devices being present for non-RDMA VM sizes
 
+### Security
+- Updated Docker CE to 18.09.2 to address the runc CVE-2019-5736
+- Updated Singularity to 2.6.1 to address the shared mount propagation
+vulnerability CVE-2018-19295
+
 ## [3.6.1] - 2018-12-03
 ### Added
 - `force_enable_task_dependencies` property in jobs configuration to turn

@@ -1,3 +1,3 @@
-azure-batch==6.0.0
+azure-batch==6.0.1
 msrest==0.5.5
 requests==2.21.0

@@ -110,8 +110,9 @@ subnet level, then you will need to create a single `Inbound Security Rule`
 to allow traffic in at the Virtual Network level for Batch compute nodes to
 successfully operate.
 
-Ports `29876` and `29877` must allow `TCP` traffic from the Source
-`Service Tag` of `BatchNodeManagement` to `Any` Destination as shown below:
+Ports `29876` and `29877` must allow `TCP` traffic from the
+`Source service tag` of `BatchNodeManagement` to `Any` Destination as shown
+below:
 
 ![64-byovnet-nsg-inbound-rule.png](https://azurebatchshipyard.blob.core.windows.net/github/64-byovnet-nsg-inbound-rule.png)
 
@@ -137,7 +138,7 @@ restrict outbound network traffic, ensure that you have either the generic
 `Storage` service tag or the correct `Storage.<region>` service tag.
 Additionally, ensure that if you are specifying the destination port that
 you provide sufficient rules to cover all outbound requests over port `443`
-including potentially accesses to other storage regions or any application
+including potential requests to other storage regions or any application
 logic that may use port `443`.
 
 ## Additional Configuration Documentation

@@ -319,5 +319,5 @@ separately. User subscription based Batch accounts share the underlying
 subscription regional core quotas.
 
 ## Sample Usage
-Please see the sample [Slurm recipe](../recipes/Slurm-NFS) for a working
+Please see the sample [Slurm recipe](../recipes/Slurm+NFS) for a working
 example.

@@ -1,4 +1,4 @@
-azure-batch==6.0.0
+azure-batch==6.0.1
 azure-cosmosdb-table==1.0.5
 azure-mgmt-compute==4.4.0
 azure-mgmt-resource==2.1.0

@@ -1,4 +1,4 @@
-azure-batch==6.0.0
+azure-batch==6.0.1
 azure-cosmosdb-table==1.0.5
 azure-mgmt-compute==4.4.0
 azure-mgmt-network==2.5.1

@@ -1,4 +1,4 @@
-# PyTorch-CPU
+# PyTorch-GPU
 This recipe shows how to run [PyTorch](https://pytorch.org/) on GPUs
 using N-series Azure VM instances in an Azure Batch compute pool.
 This sample executes the MNIST example.

@@ -130,7 +130,7 @@ This PyTorch-CPU recipe contains information on how to containerize
 [PyTorch](https://pytorch.org) for use on Azure Batch compute nodes.
 
 #### [PyTorch-GPU](./PyTorch-GPU)
-This Torch-GPU recipe contains information on how to containerize
+This PyTorch-GPU recipe contains information on how to containerize
 [PyTorch](https://pytorch.org) on GPUs for use with N-series Azure VMs.
 
 #### [TensorFlow-CPU](./TensorFlow-CPU)

@@ -1,5 +1,5 @@
 adal==1.2.1
-azure-batch==6.0.0
+azure-batch==6.0.1
 azure-cosmosdb-table==1.0.5
 azure-keyvault==1.1.0
 azure-mgmt-authorization==0.51.1

@@ -1,4 +1,4 @@
-azure-batch==6.0.0
+azure-batch==6.0.1
 azure-cosmosdb-table==1.0.5
 azure-mgmt-storage==3.1.1
 azure-mgmt-resource==2.1.0

@@ -27,7 +27,6 @@ PluginDir=/usr/lib/slurm
 # LOGGING
 SlurmctldDebug=5
 # slurmctld log file has no %h option, so need to log locally
-#SlurmctldLogFile=/var/log/slurm-llnl/slurmctld.log
 SlurmctldLogFile=/var/log/slurm/slurmctld.log
 SlurmdDebug=5
 SlurmdLogFile={SLURM_LOG_PATH}/slurmd-%h.log
@@ -67,6 +66,7 @@ Waittime=0
 
 # SCHEDULING
 SchedulerType=sched/backfill
+SchedulerParameters=defer
 #SchedulerAuth=
 SelectType=select/linear
 #SelectType=select/cons_res
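
The last hunk adds `SchedulerParameters=defer` to the Slurm configuration template. As a minimal sketch, assuming a rendered `slurm.conf` is available as text, the following Python checks whether a given scheduler option is set. The helper name `has_scheduler_param` and the sample text are hypothetical and not part of this commit.

```python
def has_scheduler_param(conf_text: str, option: str) -> bool:
    """Return True if a SchedulerParameters line lists the given option.

    Hypothetical helper for illustration only; not part of the commit.
    """
    for raw in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.strip().lower() == "schedulerparameters":
            # Options are comma separated; some carry a value (name=value).
            names = [opt.split("=", 1)[0].strip() for opt in value.split(",")]
            if option in names:
                return True
    return False


# Sample mirroring the SCHEDULING section shown in the hunk above.
SAMPLE = """\
# SCHEDULING
SchedulerType=sched/backfill
SchedulerParameters=defer
#SchedulerAuth=
SelectType=select/linear
"""

if __name__ == "__main__":
    print(has_scheduler_param(SAMPLE, "defer"))
```

A commented-out `#SchedulerParameters=defer` line is treated as absent, which matches how the commented entries in the template are meant to be read.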