* Update feature demo examples.
* Add defaulting for `ignoreK8sSuggestedNodes`.
* Fix sort in `getUsablePhysicalCells`.
This commit is contained in:
Yifan Xiong 2020-08-20 15:07:58 +08:00 committed by GitHub
Parent a8b0aa7d5b
Commit df185eecde
No key found matching this signature
GPG key ID: 4AEE18F83AFDEB23
41 changed files with 70 additions and 136 deletions

View File

@ -40,7 +40,6 @@ HiveD supports multiple job **priorities**. Higher-priority jobs can **[preempt]
5. [Priorities](example/feature/README.md#Guaranteed-Job), [Overuse with Low Priority](example/feature/README.md#Opportunistic-Job), and [Inter-](example/feature/README.md#Inter-VC-Preemption)/[Intra-VC Preemption](example/feature/README.md#Intra-VC-Preemption)
6. [Job (Full/Partial) Gang Scheduling/Preemption](example/feature/README.md#Gang-Scheduling)
7. Fault-Tolerance, [Bad Hardware Awareness](example/feature/README.md#Bad-Hardware-Awareness), [Work-Preserving Reconfiguration](example/feature/README.md#Work-Preserving-Reconfiguration)
8. [Leverage K8S Default Scheduler](example/feature/README.md#Leverage-K8S-Default-Scheduler)
## Prerequisite
1. A Kubernetes cluster, v1.14.2 or above, on-cloud or on-premise.

View File

@ -11,7 +11,7 @@ HiveD guarantees **quota safety for all VCs**, in the sense that the requests to
A VC's cells can be described by Hardware Quantity, [Topology](#VC-Safety), [Type](#SKU-Type), [Pinned Cells](#Pinned-Cells), etc. To guarantee safety, HiveD never allows a VC to "invade" other VCs' cells. For example, to guarantee all VCs' topology, one VC's [guaranteed jobs](#Guaranteed-Job) should never cause fragmentation inside other VCs:
Two DGX-2s, two VCs each owns one DGX-2 node. For a traditional scheduler, this will translate into two VCs each owning 16 GPUs. When a user submits 16 1-GPU jobs to VC1, the user in VC2 might not be able to run a 16-GPU job, due to possible fragmentation issue caused by VC1. While HiveD can guarantee each VC always has one entire node available for its dedicated use.
Consider two DGX-2s and two VCs, each owning one DGX-2 node. For a traditional scheduler, this translates into two VCs each owning 16 GPUs. When a user submits 16 1-GPU jobs to vc1, the user in vc2 might not be able to run a 16-GPU job, due to possible fragmentation caused by vc1. HiveD, in contrast, guarantees that each VC always has one entire node available for its dedicated use.
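For illustration, a minimal `virtualClusters` sketch for this two-DGX-2 setup could look like the following (the DGX-2 cell type names are hypothetical and not taken from the example configs):

```yaml
# Hypothetical sketch: each VC is guaranteed one whole DGX-2 node cell,
# so jobs in one VC cannot fragment the node owned by the other VC.
virtualClusters:
  vc1:
    virtualCells:
      - cellType: DGX2-NODE-POOL.DGX2-NODE
        cellNumber: 1
  vc2:
    virtualCells:
      - cellType: DGX2-NODE-POOL.DGX2-NODE
        cellNumber: 1
```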
### Reproduce Steps
1. Use [hived-config-1](file/hived-config-1.yaml).
@ -27,7 +27,7 @@ This is similar to [K8S Taints and Tolerations](https://kubernetes.io/docs/conce
### Reproduce Steps
1. Use [hived-config-8](file/hived-config-8.yaml).
2. Submit job [itc-pin](file/itc-pin.yaml) to VC1, all tasks in task role vc1pinned will be on node 10.151.41.25 (which is pinned), all tasks in task role vc1nopinned will NOT be on node 10.151.41.25.
2. Submit job [itc-pin](file/itc-pin.yaml) to vc1; all tasks in task role vc1pinned will be placed on node 10.151.41.25 (which is pinned), while all tasks in task role vc1nopinned will NOT be placed on node 10.151.41.25. A sketch of the pinned-cell configuration follows the screenshot below.
<img src="file/itc-pin.png" width="900"/>
## SKU Type
@ -68,8 +68,8 @@ This is useful for jobs that cannot perform any useful work, such as making prog
<img src="file/itc-gang4.png" width="900"/>
#### TensorFlow Distributed Training
1. Use [hived-config-1](file/hived-config-1.yaml).
2. Submit job [itc-dtf](file/itc-dtf.yaml) to VC2, it will success.
1. Use [hived-config-2](file/hived-config-2.yaml).
2. Submit job [itc-dtf](file/itc-dtf.yaml) to the default VC; it will succeed. A sketch of the gang-scheduling job `extras` follows the screenshot below.
<img src="file/itc-dtf.png" width="900"/>
## Incremental Scheduling
@ -110,27 +110,28 @@ Within one VC, a high-priority job can preempt low-priority jobs.
### Reproduce Steps
#### Immediate Preemption
1. Use [hived-config-3](file/hived-config-3.yaml).
2. Submit [itc-intra-imd-preempt-test](file/itc-intra-imd-preempt-test.yaml), which requests for 4 M60 GPUs for VC1 with test (0) priority.
3. Submit [itc-intra-imd-preempt-prod](file/itc-intra-imd-preempt-prod.yaml), which also requests for 4 M60 GPUs for VC1 with prod (100) priority. The job will preempt the test job immediately, so the test job is retried and waiting for resource.
2. Submit [itc-intra-imd-preempt-test](file/itc-intra-imd-preempt-test.yaml), which requests 4 M60 GPUs in vc1 with test (0) priority.
3. Submit [itc-intra-imd-preempt-prod](file/itc-intra-imd-preempt-prod.yaml), which also requests 4 M60 GPUs in vc1, but with prod (100) priority. The prod job will preempt the test job immediately, so the test job is retried and waits for resources.
<img src="file/itc-intra-imd-preempt-test.png" width="900"/>
<img src="file/itc-intra-imd-preempt-prod.png" width="900"/>
#### Lazy Preemption
1. Use [hived-config-3](file/hived-config-3.yaml).
2. Submit [itc-intra-lazy-preempt-test](file/itc-intra-lazy-preempt-test.yaml), which requests for 4 K80 GPUs for VC1 with test (0) priority.
3. Submit [itc-intra-lazy-preempt-prod](file/itc-intra-lazy-preempt-prod.yaml), which also requests for 4 K80 GPUs for VC1 with prod (100) priority. The job will just downgrade the test job to be [Opportunistic Job](#Opportunistic-Job), instead of preempting it immediately, because all jobs can still fit into the whole physical cluster.
2. Submit [itc-intra-lazy-preempt-test](file/itc-intra-lazy-preempt-test.yaml), which requests 4 K80 GPUs in vc1 with test (0) priority.
3. Submit [itc-intra-lazy-preempt-prod](file/itc-intra-lazy-preempt-prod.yaml), which also requests 4 K80 GPUs in vc1, but with prod (100) priority. The prod job will just downgrade the test job to an [Opportunistic Job](#Opportunistic-Job) instead of preempting it immediately, because all jobs can still fit into the whole physical cluster.
4. Submit [itc-intra-lazy-preempt-prod2](file/itc-intra-lazy-preempt-prod2.yaml), which requests 3 * 4 K80 GPUs in the default VC with prod (100) priority. This job will preempt the test job immediately, because now not all jobs can fit into the whole physical cluster.
<img src="file/itc-intra-lazy-preempt-test.png" width="900"/>
<img src="file/itc-intra-lazy-preempt-prod.png" width="900"/>
<img src="file/itc-intra-lazy-preempt-prod2.png" width="900"/>
> NOTE: The `lazyPreemptionEnable` option is disabled by default, because an earlier job may be downgraded to a low-priority job and then get preempted by later jobs, which can be confusing.
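For reference, both the priority class and lazy preemption are set in the job's `extras`; here is a rough sketch, assuming `lazyPreemptionEnable` sits alongside `jobPriorityClass` under `hivedScheduler` as in the updated example jobs (values are illustrative):

```yaml
# Sketch: a low-priority job that opts in to lazy preemption.
extras:
  hivedScheduler:
    jobPriorityClass: test        # oppo (-1), test (0), or prod (100)
    lazyPreemptionEnable: true    # disabled by default; see the note above
    taskRoles:
      train:
        skuType: K80
```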
## Inter-VC Preemption
### Description
One VC's [Guaranteed Job](#Guaranteed-Job) can preempt other VCs' [Opportunistic Jobs](#Opportunistic-Job).
### Reproduce Steps
1. Use [hived-config-2](file/hived-config-2.yaml).
2. Submit [itc-inter-preempt-oppo](file/itc-inter-preempt-oppo.yaml), which requests for 2 * 4 K80 GPUs for VC1 with oppo (-1) priority.
1. Use [hived-config-3](file/hived-config-3.yaml).
2. Submit [itc-inter-preempt-oppo](file/itc-inter-preempt-oppo.yaml), which requests 2 * 4 K80 GPUs in vc1 with oppo (-1) priority.
3. Submit [itc-inter-preempt-prod](file/itc-inter-preempt-prod.yaml), which requests 3 * 4 K80 GPUs in the default VC with prod (100) priority. The prod job will preempt the oppo job immediately.
<img src="file/itc-inter-preempt-oppo.png" width="900"/>
<img src="file/itc-inter-preempt-prod.png" width="900"/>
@ -190,20 +191,20 @@ HiveD can be reconfigured without unnecessary user impacts, such as add/update/d
#### VirtualCluster Reconfig - Delete VirtualCluster
1. Use [hived-config-2](file/hived-config-2.yaml).
2. Submit job [itc-reconfig-3](file/itc-reconfig-3.yaml) to default VC. Wait until it is running.
3. Delete the default VC and move its quota to VC1, then becomes [hived-config-5](file/hived-config-5.yaml).
3. Delete the default VC and move its quota to vc1, so the config becomes [hived-config-5](file/hived-config-5.yaml).
4. Use [hived-config-5](file/hived-config-5.yaml), and restart HiveD.
5. The job will still run without any interruption, but it is [lazy preempted](#Lazy-Preemption) by HiveD.
<img src="file/itc-reconfig-3.png" width="900"/>
6. To confirm it is [lazy preempted](#Lazy-Preemption), submit job [itc-reconfig-4](file/itc-reconfig-4.yaml) to VC1 which requests all K80 nodes. The job will immediately preempt [itc-reconfig-3](file/itc-reconfig-3.yaml).
6. To confirm it is [lazy preempted](#Lazy-Preemption), submit job [itc-reconfig-4](file/itc-reconfig-4.yaml), which requests all K80 nodes, to vc1. The job will immediately preempt [itc-reconfig-3](file/itc-reconfig-3.yaml).
<img src="file/itc-reconfig-4.png" width="900"/>
#### VirtualCluster Reconfig - Update VirtualCluster
1. Use [hived-config-2](file/hived-config-2.yaml).
2. Submit job [itc-reconfig-3](file/itc-reconfig-3.yaml) to default VC. Wait until it is running.
3. Move one K80-NODE cell from default VC to VC1, then becomes [hived-config-6](file/hived-config-6.yaml).
3. Move one K80-NODE cell from the default VC to vc1, so the config becomes [hived-config-6](file/hived-config-6.yaml).
4. Use [hived-config-6](file/hived-config-6.yaml), and restart HiveD.
5. The job will still run without any interruption, but it is [lazy preempted](#Lazy-Preemption) by HiveD.
6. To confirm it is [lazy preempted](#Lazy-Preemption), submit job [itc-reconfig-5](file/itc-reconfig-5.yaml) to VC1 which requests all K80 nodes. The job will immediately preempt [itc-reconfig-3](file/itc-reconfig-3.yaml).
6. To confirm it is [lazy preempted](#Lazy-Preemption), submit job [itc-reconfig-5](file/itc-reconfig-5.yaml), which requests all K80 nodes, to vc1. The job will immediately preempt [itc-reconfig-3](file/itc-reconfig-3.yaml).
<img src="file/itc-reconfig-5.png" width="900"/>
## Bad Hardware Awareness
@ -219,17 +220,3 @@ Avoid scheduling pods to bad hardware.
4. Bring back 10.151.41.26 by `sudo systemctl start kubelet`. Wait until this is detected by K8S.
5. The waiting job will start running, without any retries.
<img src="file/itc-badnode50-3.png" width="900"/>
## Leverage K8S Default Scheduler
### Description
You can still leverage almost all scheduling features provided by your existing [K8S Default Scheduler](https://kubernetes.io/docs/concepts/scheduling/kube-scheduler) with HiveD, such as these [Filtering Policies](https://kubernetes.io/docs/concepts/scheduling/kube-scheduler/#filtering).
### Reproduce Steps
#### Leverage [Labels and Selectors](https://kubernetes.io/docs/concepts/overview/working-with-objects/labels)
1. Use [hived-config-2](file/hived-config-2.yaml).
2. Remove PAI worker label for 10.151.41.26 (the only M60 node).
3. Submit job [itc-no-worker-label](file/itc-no-worker-label.yaml), which requests M60 node, it will be waiting without IP associated.
<img src="file/itc-no-worker-label-1.png" width="900"/>
4. Add back PAI worker label for 10.151.41.26.
5. The waiting job will start running, without any retries.
<img src="file/itc-no-worker-label-2.png" width="900"/>

View File

@ -34,11 +34,11 @@ physicalCluster:
- cellAddress: 10.151.41.24
virtualClusters:
VC1:
vc1:
virtualCells:
- cellType: 3-K80-NODE.K80-NODE
cellNumber: 1
VC2:
vc2:
virtualCells:
- cellType: 3-K80-NODE.K80-NODE
cellNumber: 1

View File

@ -45,7 +45,7 @@ physicalCluster:
- cellAddress: 10.151.41.26
virtualClusters:
VC1:
vc1:
virtualCells:
- cellType: K80-NODE-POOL.K80-NODE
cellNumber: 1

View File

@ -45,7 +45,7 @@ physicalCluster:
- cellAddress: 10.151.41.26
virtualClusters:
VC1:
vc1:
virtualCells:
- cellType: K80-NODE-POOL.K80-NODE
cellNumber: 1

View File

@ -42,7 +42,7 @@ physicalCluster:
# - cellAddress: 10.151.41.25
virtualClusters:
VC1:
vc1:
virtualCells:
- cellType: K80-NODE-POOL.K80-NODE
cellNumber: 1

View File

@ -45,7 +45,7 @@ physicalCluster:
- cellAddress: 10.151.41.25
virtualClusters:
VC1:
vc1:
virtualCells:
- cellType: K80-NODE-POOL.K80-NODE
cellNumber: 1

View File

@ -45,7 +45,7 @@ physicalCluster:
- cellAddress: 10.151.41.26
virtualClusters:
VC1:
vc1:
virtualCells:
- cellType: K80-NODE-POOL.K80-NODE
cellNumber: 4

View File

@ -45,7 +45,7 @@ physicalCluster:
- cellAddress: 10.151.41.26
virtualClusters:
VC1:
vc1:
virtualCells:
- cellType: K80-NODE-POOL.K80-NODE
cellNumber: 3

View File

@ -44,7 +44,7 @@ physicalCluster:
- cellAddress: 10.151.41.26
virtualClusters:
VC1:
vc1:
virtualCells:
- cellType: K80-NODE-POOL.K80-NODE
cellNumber: 1

View File

@ -34,13 +34,13 @@ physicalCluster:
- cellAddress: 10.151.41.24
virtualClusters:
VC1:
vc1:
virtualCells:
- cellType: 3-K80-NODE.K80-NODE
cellNumber: 1
pinnedCells:
- pinnedCellId: VC1-K80
VC2:
vc2:
virtualCells:
- cellType: 3-K80-NODE.K80-NODE
cellNumber: 1

View File

@ -11,18 +11,17 @@ taskRoles:
instances: 1
completion:
minFailedInstances: 1
minSucceededInstances: 6
minSucceededInstances: 1
dockerImage: keras_tensorflow_example
resourcePerInstance:
cpu: 4
memoryMB: 8192
gpu: 1
commands:
- nvidia-smi -L
- printenv
- sleep 10000
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
gangAllocation: true
hivedScheduler:
@ -30,4 +29,3 @@ extras:
taskRoles:
train:
skuType: M60
submitFrom: submit-job-v2

View File

@ -20,9 +20,9 @@ taskRoles:
commands:
- nvidia-smi -L
- printenv
- sleep 10000
- sleep 10m
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
gangAllocation: true
hivedScheduler:
@ -30,4 +30,3 @@ extras:
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -100,5 +100,3 @@ deployments:
- echo "Uploading data ..."
defaults:
deployment: tf_example
extras:
submitFrom: submit-job-v2

View File

@ -21,7 +21,7 @@ taskRoles:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
gangAllocation: false
hivedScheduler:
@ -29,4 +29,3 @@ extras:
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -21,7 +21,7 @@ taskRoles:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
gangAllocation: true
hivedScheduler:
@ -29,4 +29,3 @@ extras:
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -11,7 +11,7 @@ taskRoles:
instances: 4
completion:
minFailedInstances: 1
minSucceededInstances: 6
minSucceededInstances: 4
dockerImage: keras_tensorflow_example
resourcePerInstance:
cpu: 4
@ -21,7 +21,7 @@ taskRoles:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
gangAllocation: true
hivedScheduler:
@ -29,4 +29,3 @@ extras:
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -20,13 +20,12 @@ taskRoles:
commands:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
- sleep 10000
- sleep 10m
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:
jobPriorityClass: oppo
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -20,7 +20,7 @@ taskRoles:
commands:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
- sleep 10000
- sleep 10m
defaults:
virtualCluster: default
extras:
@ -29,4 +29,3 @@ extras:
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -20,13 +20,12 @@ taskRoles:
commands:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
- sleep 10000
- sleep 10m
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:
jobPriorityClass: prod
taskRoles:
train:
skuType: M60
submitFrom: submit-job-v2

View File

@ -20,13 +20,12 @@ taskRoles:
commands:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
- sleep 10000
- sleep 10m
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:
jobPriorityClass: test
taskRoles:
train:
skuType: M60
submitFrom: submit-job-v2

View File

@ -20,13 +20,12 @@ taskRoles:
commands:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
- sleep 10000
- sleep 10m
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:
jobPriorityClass: prod
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -20,7 +20,7 @@ taskRoles:
commands:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
- sleep 10000
- sleep 10m
defaults:
virtualCluster: default
extras:
@ -29,4 +29,3 @@ extras:
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -20,13 +20,13 @@ taskRoles:
commands:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
- sleep 10000
- sleep 10m
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:
jobPriorityClass: test
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2
lazyPreemptionEnable: true

View File

@ -20,8 +20,9 @@ taskRoles:
commands:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
- sleep 10m
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
gangAllocation: false
hivedScheduler:
@ -29,4 +30,3 @@ extras:
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -1,5 +1,5 @@
protocolVersion: 2
name: itc-k80-type
name: itc-no-type
type: job
prerequisites:
- protocolVersion: 2
@ -21,9 +21,8 @@ taskRoles:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
gangAllocation: false
hivedScheduler:
jobPriorityClass: prod
submitFrom: submit-job-v2

Binary data
example/feature/file/itc-no-worker-label-1.png

Binary file not shown.


Binary data
example/feature/file/itc-no-worker-label-2.png

Binary file not shown.


View File

@ -1,33 +0,0 @@
protocolVersion: 2
name: itc-no-worker-label
type: job
prerequisites:
- protocolVersion: 2
name: keras_tensorflow_example
type: dockerimage
uri: openpai/pai.example.keras.tensorflow
taskRoles:
train:
instances: 1
completion:
minFailedInstances: 1
minSucceededInstances: 6
dockerImage: keras_tensorflow_example
resourcePerInstance:
cpu: 4
memoryMB: 8192
gpu: 1
commands:
- nvidia-smi -L
- printenv
- sleep 10000
defaults:
virtualCluster: VC1
extras:
gangAllocation: true
hivedScheduler:
jobPriorityClass: prod
taskRoles:
train:
skuType: M60
submitFrom: submit-job-v2

View File

@ -21,7 +21,7 @@ taskRoles:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
gangAllocation: false
hivedScheduler:
@ -29,4 +29,3 @@ extras:
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -35,7 +35,7 @@ taskRoles:
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:

View File

@ -20,12 +20,12 @@ taskRoles:
commands:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
- sleep 10m
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:
jobPriorityClass: test
taskRoles:
train:
skuType: M60
submitFrom: submit-job-v2

View File

@ -21,11 +21,10 @@ taskRoles:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:
jobPriorityClass: test
taskRoles:
train:
skuType: M60
submitFrom: submit-job-v2

View File

@ -20,6 +20,7 @@ taskRoles:
commands:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
- sleep 10m
defaults:
virtualCluster: default
extras:
@ -28,4 +29,3 @@ extras:
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -11,7 +11,7 @@ taskRoles:
instances: 4
completion:
minFailedInstances: 1
minSucceededInstances: 2
minSucceededInstances: 4
dockerImage: keras_tensorflow_example
resourcePerInstance:
cpu: 16
@ -21,11 +21,10 @@ taskRoles:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:
jobPriorityClass: test
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -11,7 +11,7 @@ taskRoles:
instances: 3
completion:
minFailedInstances: 1
minSucceededInstances: 2
minSucceededInstances: 3
dockerImage: keras_tensorflow_example
resourcePerInstance:
cpu: 16
@ -21,11 +21,10 @@ taskRoles:
- rm /usr/local/cuda/lib64/stubs/libcuda.so.1
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:
jobPriorityClass: test
taskRoles:
train:
skuType: K80
submitFrom: submit-job-v2

View File

@ -22,7 +22,7 @@ taskRoles:
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:

View File

@ -22,7 +22,7 @@ taskRoles:
- python mnist_cnn.py
defaults:
virtualCluster: VC1
virtualCluster: vc1
extras:
hivedScheduler:

View File

@ -235,9 +235,9 @@ func getUsablePhysicalCells(
return nil
}
// prioritize the cells with fewer opportunistic pods (to reduce preemption)
sort.SliceStable(candidates, func(i, j int) bool {
return candidates[i].GetUsedLeafCellNumAtPriorities()[opportunisticPriority] <
candidates[j].GetUsedLeafCellNumAtPriorities()[opportunisticPriority]
sort.SliceStable(usableCandidates, func(i, j int) bool {
return usableCandidates[i].GetUsedLeafCellNumAtPriorities()[opportunisticPriority] <
usableCandidates[j].GetUsedLeafCellNumAtPriorities()[opportunisticPriority]
})
return usableCandidates
}

View File

@ -83,7 +83,7 @@ type PodSchedulingSpec struct {
LeafCellNumber int32 `yaml:"leafCellNumber"`
GangReleaseEnable bool `yaml:"gangReleaseEnable"`
LazyPreemptionEnable bool `yaml:"lazyPreemptionEnable"`
IgnoreK8sSuggestedNodes bool `yaml:"ignoreK8sSuggestedNodes"`
IgnoreK8sSuggestedNodes bool `yaml:"ignoreK8sSuggestedNodes" default:"true"`
AffinityGroup *AffinityGroupSpec `yaml:"affinityGroup"`
}
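With the new `default:"true"` tag (and the matching defaulting in `ExtractPodSchedulingSpec` below), a pod scheduling spec that omits the field now ignores K8s suggested nodes. A hedged YAML sketch of such a spec, using only fields visible in this diff (values are illustrative):

```yaml
# ignoreK8sSuggestedNodes is omitted, so it now defaults to true;
# set it to false explicitly if the pod should follow K8s suggested nodes.
leafCellNumber: 1
gangReleaseEnable: false
lazyPreemptionEnable: false
affinityGroup: null
```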

View File

@ -232,7 +232,7 @@ func ExtractPodSchedulingSpec(pod *core.Pod) *si.PodSchedulingSpec {
defer AsBadRequestPanic()
errPfx := fmt.Sprintf("Pod annotation %v: ", si.AnnotationKeyPodSchedulingSpec)
podSchedulingSpec := si.PodSchedulingSpec{}
podSchedulingSpec := si.PodSchedulingSpec{IgnoreK8sSuggestedNodes: true}
annotation := convertOldAnnotation(pod.Annotations[si.AnnotationKeyPodSchedulingSpec])
if annotation == "" {