Commit Graph

23 Commits

Author SHA1 Message Date
Fred Park 3052e98c8b
Add MVAPICH support
- More changes for #287
- Automatically source environment modules if it exists
- Fix some typos
2019-08-12 01:58:39 +00:00
Fred Park b6044b3489
Update GPU support
- Update to Docker CE 19.03.1
- Use "native" Docker/containerd GPU support
- Breaking change in jobs configuration to allow arbitrary configuration
- Update docs
- Resolves #293
2019-08-08 20:36:41 +00:00
Fred Park 4d69c96d79
Merge branch 'sriov-merge' into singularity3 2019-07-23 21:02:52 +00:00
Vincent Labonté cc42916cba Fixes and update of recipes (#290)
* Fix multi-instance tasks that are not an MPI task

* Add setup task script for CNTK-CPU-Infiniband-IntelMPI

* Update CNTK-CPU-Infiniband-IntelMPI recipe

* Add MPI executable path option

* Update CNTK-CPU-OpenMPI recipe

* Change the default MPI executable_path to mpirun

* Modify CNTK-CPU-Infiniband-IntelMPI recipe

* Add setup task script for CNTK-GPU-Infiniband-IntelMPI

* Update CNTK-GPU-Infiniband-IntelMPI recipe

* Add setup task script for CNTK-GPU-OpenMPI

* Add setup task script for NAMD-Infiniband-IntelMPI

* Update NAMD-Infiniband-IntelMPI recipe

* Add setup task script for OpenFOAM-Infiniband-IntelMPI

* Update OpenFOAM-Infiniband-IntelMPI recipe

* Update TensorFlow-GPU Singularity recipe

* Add setup task script for OpenFOAM-TCP-OpenMPI

* Update OpenFOAM-TCP-OpenMPI recipe

* Add support for arbitrary commands with the MPI processes_per_node option

* Fix MPI with native images

* Modify CNTK-CPU-Infiniband-IntelMPI recipe

* Modify CNTK-GPU-Infiniband-IntelMPI recipe

* Modify NAMD-Infiniband-IntelMPI recipe

* Update processes_per_node documentation

* Fix `pool images list` with Singularity images

* Modify OpenFOAM-Infiniband-IntelMPI set up script

* Add check for mpi setting with Windows

* Add auto scratch support with OpenFOAM-Infiniband-IntelMPI recipe

* Modify OpenFOAM-TCP-OpenMPI set up script

* Add auto scratch support with OpenFOAM-TCP-OpenMPI recipe

* Add mpiBench-IntelMPI recipe

* Add mpiBench-MPICH recipe

* Add mpiBench-OpenMPI recipe

* Resolve PR comments

* Resolve PR comments
2019-07-17 18:57:06 -07:00
Fred Park 25fec92273
Support Hc/Hb
- Support RDMA bifurcation
- Update platform docs for CentOS-HPC 7.6
2019-07-15 03:32:04 +00:00
Fred Park 559463cd12
Merge branch 'develop' into sriov-merge 2019-07-09 21:45:31 +00:00
Vincent Labonté 442a22bd28 Improve MPI Interface for Singularity and Docker (#289)
* Add MPI config support for MPICH

* Add MPI config support for Docker containers

* Resolve PR comments

* Make use of the script runner with MPI and Docker

* Minor fixes

* Resolve PR comments
2019-07-09 13:46:12 -07:00
Vincent Labonté e6e60048a7 Improve MPI Interface for Intel MPI and Open MPI with Singularity images (#288)
* Add MPI config support for IntelMPI

* Separate prologue command into user and system

* Add MpiSettings

* Add MPI config support for Open MPI

* Fix MPI config support for IntelMPI

* Workaround for Open MPI btl tcp

* Correct documentation

* Fix non-MPI multi-instance execution

* Resolve PR comments

* Resolve PR comments

* Partially address #287
2019-07-03 12:40:54 -07:00
Fred Park b93f60213d
Support conditional output data
- Resolves #230
2019-06-24 18:03:43 +00:00
Fred Park 7b138e785a
Support user-specified job prep/release tasks
- Host mode only
- Resolves #202
2019-06-24 16:02:30 +00:00
Fred Park 70532fa4ae
Add Genomics recipes
- BLAST and RNASeq pipelines
- Fix adding tasks to an existing job with existing merge tasks
- Add support for force_enable_task_dependencies at the job level
- Fix doc typos
2018-11-29 08:58:01 -08:00
Fred Park 54c78b7c49
Auto scratch support 2018-11-05 11:24:22 -08:00
Fred Park 02c6e110d7
Kata containers support
- Make Singularity runtime install optional
- Add `restrict_default_bind_mounts` option to jobs spec
- Provide a default container runtime option
2018-11-05 11:24:17 -08:00
Fred Park 52628d27cf
Federation support
- Federation proxy lifecycle management
- Federation lifecycle management
- Federation job submission and management
- Mount Azure File share for auto-rotated log persistence
- FIFO within job support
- Constraint matching
- Federations can be created in "unique job id" mode, requiring all
  jobs submitted via `fed jobs add` to be unique across the entire federation
- Supports nearly 15K actions per job (in non-unique job id mode)
- Task dependency rewrite engine for federated jobs
  - Verify dependencies only within task group
  - Uniquely identify task dependencies
- Allow tuning of scheduling behavior options
- Package federation logic on proxy into Docker container
- Full guide/walkthrough for federation feature
- Refactor common code between monitor/fed proxy into resource
- Other doc updates
2018-08-06 09:30:36 -07:00
Fred Park 69235ba2c1
Add default_working_dir option
- Clarify where a container runs by default in the jobs config doc
- Resolves #190
2018-04-25 08:23:17 -07:00
Fred Park 72484b510b
Add product_iterables support for task factory
- Resolves #187
2018-04-20 11:21:34 -07:00
Fred Park 97d9ca09ce
Relax zip iterable type
- Partially addresses #187
2018-04-19 14:34:57 -07:00
Fred Park e1ae45b9d2
Add support for task exit conditions at job-level
- Fix incorrectly overwriting the job action setting
2018-03-16 09:25:35 -07:00
Fred Park 5c0eb71156
Support default task exit conditions
- Add more info for list operations
2018-02-27 08:15:25 -08:00
Fred Park 663548d91e
Minor updates
- Update dependencies
- Check for out of date deps
2018-02-07 14:32:32 -08:00
Fred Park 67aefaf854
Minor schema updates 2018-01-26 15:00:46 -08:00
Fred Park b7a095ef70
Add merge/final task support
- Resolves #149
2018-01-25 14:32:40 -08:00
Fred Park ff20de8d4b
Add YAML Schemas 2018-01-22 10:16:28 -08:00