diff --git a/1-docker/README.md b/1-docker/README.md
index 45efada..70116c4 100644
--- a/1-docker/README.md
+++ b/1-docker/README.md
@@ -131,6 +131,7 @@ The next step is to build our image to be able to run it using docker. For that,
 From the [`./src`](./src) repository, we can build the image with
 
 ```console
+cd src
 docker build -t ${DOCKER_USERNAME}/tf-mnist .
 ```
 > Reminder: the `-t` argument allows to **tag** the image with a specific name.
@@ -215,7 +216,8 @@ ENTRYPOINT ["python", "/app/main.py"]
 Then simply rebuild the image with a new tag (you can use docker or nvidia-docker interchangeably for any command except run):
 
 ```console
-docker build -t ${DOCKER_USERNAME}/tf-mnist:gpu .
+cd src
+docker build -t ${DOCKER_USERNAME}/tf-mnist:gpu -f Dockerfile.gpu .
 ```
 
 Finally run the container with nvidia-docker:
diff --git a/2-kubernetes/README.md b/2-kubernetes/README.md
index cb287e5..22160d8 100644
--- a/2-kubernetes/README.md
+++ b/2-kubernetes/README.md
@@ -140,9 +140,9 @@ kubectl get nodes
 Should yield an output similar to this one:
 ```
 NAME                       STATUS    ROLES     AGE       VERSION
-aks-nodepool1-42640332-0   Ready     agent     1h        v1.9.6
-aks-nodepool1-42640332-1   Ready     agent     1h        v1.9.6
-aks-nodepool1-42640332-2   Ready     agent     1h        v1.9.6
+aks-nodepool1-42640332-0   Ready     agent     1h        v1.11.1
+aks-nodepool1-42640332-1   Ready     agent     1h        v1.11.1
+aks-nodepool1-42640332-2   Ready     agent     1h        v1.11.1
 ```
 
 If you provisioned GPU VM, describing one of the node should indicate the presence of GPU(s) on the node:
diff --git a/6-tfjob/README.md b/6-tfjob/README.md
index b18bb66..ac761bb 100644
--- a/6-tfjob/README.md
+++ b/6-tfjob/README.md
@@ -37,7 +37,7 @@ Before going further, let's take a look at what the `TFJob` object looks like:
 
 | Field | Type| Description |
 |-------|-----|-------------|
-| ReplicaSpecs | `TFReplicaSpec` array | Specification for a set of TensorFlow processes, defined below |
+| TFReplicaSpec | `TFReplicaSpec` array | Specification for a set of TensorFlow processes, defined below |
 
 Let's go deeper:
 
@@ -57,11 +57,13 @@ kind: TFJob
 metadata:
   name: example-tfjob
 spec:
-  replicaSpecs:
-    - template:
+  tfReplicaSpecs:
+    MASTER:
+      replicas: 1
+      template:
         spec:
           containers:
-            - image: wbuchwalter/tf-mnist:gpu
+            - image: /tf-mnist:gpu
               name: tensorflow
               resources:
                 limits:
@@ -87,8 +89,10 @@ kind: TFJob
 metadata:
   name: module6-ex1-gpu
 spec:
-  replicaSpecs:
-    - template:
+  tfReplicaSpecs:
+    MASTER:
+      replicas: 1
+      template:
         spec:
           containers:
             - image: /tf-mnist:gpu # From module 1