RemoteFS-GlusterFS+BatchPool
This recipe shows how to provision and link a Batch Pool with a GlusterFS storage cluster.
Configuration
Please refer to this set of sample configuration files for this recipe.
Credentials Configuration
The credentials configuration should have `management` Azure Active Directory credentials defined along with a valid storage account. The `management` section can be supplied through environment variables instead if preferred. Additionally, valid `batch` Azure Active Directory credentials should be supplied.
FS Configuration and GlusterFS storage cluster creation
Please follow the explanations presented in the vanilla RemoteFS-GlusterFS recipe. The RemoteFS storage cluster must be created before the Batch pool.
Pool Configuration
Pool configuration follows that of a standard non-linked pool, except that you need a virtual network specification, which must be the same virtual network hosting the RemoteFS file servers. The `virtual_network` property requires:

* `name` is the name of the virtual network to use
* `resource_group` is the resource group name containing the virtual network
* `address_space` is the address space of the virtual network
* `subnet` is the subnet for the compute nodes of this pool
    * `name` is the subnet name
    * `address_prefix` is the set of addresses for the batch nodes. This subnet space must be large enough to accommodate the number of compute nodes being allocated. Do not share subnets between RemoteFS storage clusters and Batch compute nodes.
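As a rough illustration, the `virtual_network` properties listed above might be laid out as in the following sketch. The enclosing `pool_specification` key, the pool id and every value shown are assumptions chosen for illustration. Substitute the names and address ranges of the virtual network that hosts your RemoteFS file servers.

```yaml
# Hypothetical pool configuration fragment; all names and address
# ranges below are placeholders, not values from this recipe.
pool_specification:
  id: mypool                            # placeholder pool id
  virtual_network:
    name: myvnet                        # must be the virtual network hosting the RemoteFS file servers
    resource_group: my-resource-group   # resource group containing the virtual network
    address_space: 10.0.0.0/16          # address space of the virtual network
    subnet:
      name: my-batch-subnet             # dedicated to Batch compute nodes, not shared with the storage cluster
      address_prefix: 10.0.16.0/20      # must be large enough for the number of compute nodes allocated
```

Keeping the Batch subnet separate from the storage cluster subnet, as in this sketch, reflects the requirement not to share subnets between the two.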
Global Configuration
The global configuration is where the RemoteFS storage cluster is linked to the Batch pool, through `shared_data_volumes` in `volumes` under `global_resources`. The `shared_data_volumes` dictionary has a key that is the storage cluster id key in the FS configuration. For this example, this would be `mystoragecluster`. This property contains the following members:

* `volume_driver` should be set to `storage_cluster` to link the RemoteFS storage cluster to our Batch pool
* `container_path` is the container path to map the storage cluster mountpoint
* `mount_options` are any additional mount options to pass and can be empty or unspecified
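The nesting described above can be sketched as the following fragment. The `container_path` value is a placeholder, and writing `mount_options` as an empty list is an assumption about its type. Only the key names and the `mystoragecluster` id come from the text.

```yaml
# Hypothetical global configuration fragment linking the storage
# cluster to the Batch pool.
global_resources:
  volumes:
    shared_data_volumes:
      mystoragecluster:                 # must match the storage cluster id key in the FS configuration
        volume_driver: storage_cluster  # links the RemoteFS storage cluster to the Batch pool
        container_path: /data/gfs       # placeholder path inside the container for the cluster mountpoint
        mount_options: []               # assumed list type; may be left empty or omitted
```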
Batch Shipyard Commands
After you have created your RemoteFS GlusterFS storage cluster via `fs cluster orchestrate`, you can issue `pool add` with the above configuration, which will create a Batch pool and automatically link your GlusterFS storage cluster to it. You can then use data placed on the storage cluster in your containerized workloads.