Parent: 9c2d3f6c0f
Commit: 8e42cbed2f
@@ -131,6 +131,7 @@ The next step is to build our image to be able to run it using docker. For that,
 From the [`./src`](./src) repository, we can build the image with

 ```console
+cd src
 docker build -t ${DOCKER_USERNAME}/tf-mnist .
 ```
 > Reminder: the `-t` argument allows to **tag** the image with a specific name.
@@ -215,7 +216,8 @@ ENTRYPOINT ["python", "/app/main.py"]
 Then simply rebuild the image with a new tag (you can use docker or nvidia-docker interchangeably for any command except run):

 ```console
-docker build -t ${DOCKER_USERNAME}/tf-mnist:gpu .
+cd src
+docker build -t ${DOCKER_USERNAME}/tf-mnist:gpu -f Dockerfile.gpu .
 ```

 Finally run the container with nvidia-docker:

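Note on the two hunks above: both builds now start with `cd src`, and the GPU build selects `Dockerfile.gpu` with `-f`. A minimal sketch of the two builds as one script, assuming `DOCKER_USERNAME` is exported and the script is run from the repository root. `image_ref` and `build_images` are hypothetical helper names, not part of the repository.

```shell
# Sketch only: image_ref and build_images are hypothetical helpers.

# Print the image reference for an optional tag.
# With no tag, Docker falls back to its default tag, "latest".
image_ref() {
  if [ -z "${DOCKER_USERNAME:-}" ]; then
    echo "DOCKER_USERNAME must be set" >&2
    return 1
  fi
  if [ -n "${1:-}" ]; then
    printf '%s/tf-mnist:%s\n' "$DOCKER_USERNAME" "$1"
  else
    printf '%s/tf-mnist\n' "$DOCKER_USERNAME"
  fi
}

# Build the CPU image from the default Dockerfile, then the GPU image
# from Dockerfile.gpu. The subshell keeps the caller's working directory.
build_images() {
  cpu_ref=$(image_ref) || return 1
  gpu_ref=$(image_ref gpu) || return 1
  (
    cd src || exit 1
    docker build -t "$cpu_ref" . &&
      docker build -t "$gpu_ref" -f Dockerfile.gpu .
  )
}
```

Calling `build_images` after defining both functions runs the same two `docker build` commands shown in the hunks. Resolving the references first means a missing `DOCKER_USERNAME` stops the script before Docker is invoked.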
@@ -140,9 +140,9 @@ kubectl get nodes
 Should yield an output similar to this one:
 ```
 NAME                       STATUS    ROLES     AGE       VERSION
-aks-nodepool1-42640332-0   Ready     agent     1h        v1.9.6
-aks-nodepool1-42640332-1   Ready     agent     1h        v1.9.6
-aks-nodepool1-42640332-2   Ready     agent     1h        v1.9.6
+aks-nodepool1-42640332-0   Ready     agent     1h        v1.11.1
+aks-nodepool1-42640332-1   Ready     agent     1h        v1.11.1
+aks-nodepool1-42640332-2   Ready     agent     1h        v1.11.1
 ```

 If you provisioned GPU VM, describing one of the node should indicate the presence of GPU(s) on the node:

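Note on the hunk above: the sample output now shows every node `Ready` on v1.11.1. A small sketch for checking the same two things on a live cluster, assuming the default `kubectl get nodes` column layout shown in the hunk (STATUS second, VERSION last). `count_not_ready` and `node_versions` are hypothetical helper names.

```shell
# Sketch only: both helpers read `kubectl get nodes` output on stdin.

# Count nodes whose STATUS column is not exactly "Ready"; the header row is skipped.
# A cordoned node reports "Ready,SchedulingDisabled" and is counted as not ready here.
count_not_ready() {
  awk 'NR > 1 && $2 != "Ready" { n++ } END { print n + 0 }'
}

# Print the distinct values of the last column (VERSION), one per line.
node_versions() {
  awk 'NR > 1 { print $NF }' | sort -u
}
```

For example, `kubectl get nodes | count_not_ready` prints `0` when every node is ready, and `kubectl get nodes | node_versions` prints a single line when all nodes run the same version.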
@@ -37,7 +37,7 @@ Before going further, let's take a look at what the `TFJob` object looks like:

 | Field | Type| Description |
 |-------|-----|-------------|
-| ReplicaSpecs | `TFReplicaSpec` array | Specification for a set of TensorFlow processes, defined below |
+| TFReplicaSpec | `TFReplicaSpec` array | Specification for a set of TensorFlow processes, defined below |

 Let's go deeper:

@@ -57,11 +57,13 @@ kind: TFJob
 metadata:
   name: example-tfjob
 spec:
-  replicaSpecs:
-    - template:
+  tfReplicaSpecs:
+    MASTER:
+      replicas: 1
+      template:
         spec:
           containers:
-            - image: wbuchwalter/tf-mnist:gpu
+            - image: <DOCKER_USERNAME>/tf-mnist:gpu
               name: tensorflow
               resources:
                 limits:
@@ -87,8 +89,10 @@ kind: TFJob
 metadata:
   name: module6-ex1-gpu
 spec:
-  replicaSpecs:
-    - template:
+  tfReplicaSpecs:
+    MASTER:
+      replicas: 1
+      template:
         spec:
           containers:
             - image: <DOCKER_USERNAME>/tf-mnist:gpu # From module 1
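Note on the two `TFJob` hunks above: the `replicaSpecs` list is replaced by a `tfReplicaSpecs` map keyed by replica type (`MASTER` here). A minimal sketch for spotting manifests that still use the old key; `uses_old_replica_specs` is a hypothetical helper name, and the check is a plain text match rather than a YAML parse.

```shell
# Sketch only: succeeds (exit 0) if the manifest on stdin still has a
# `replicaSpecs:` key. The pattern allows only whitespace before the key,
# so the new `tfReplicaSpecs:` key does not match.
uses_old_replica_specs() {
  grep -Eq '^[[:space:]]*replicaSpecs:'
}
```

For example, `uses_old_replica_specs < job.yaml && echo "switch to tfReplicaSpecs"`, where `job.yaml` stands for any manifest file you want to check.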