* Update tfReplicaSpecs

* Address comments
Rita Zhang 2018-09-24 13:35:47 -07:00 committed by GitHub
Parent 9c2d3f6c0f
Commit 8e42cbed2f
No known key found for this signature
GPG Key ID: 4AEE18F83AFDEB23
3 changed files with 16 additions and 10 deletions

View File

@@ -131,6 +131,7 @@ The next step is to build our image to be able to run it using docker. For that,
 From the [`./src`](./src) repository, we can build the image with
 ```console
+cd src
 docker build -t ${DOCKER_USERNAME}/tf-mnist .
 ```
 > Reminder: the `-t` argument allows to **tag** the image with a specific name.
@@ -215,7 +216,8 @@ ENTRYPOINT ["python", "/app/main.py"]
 Then simply rebuild the image with a new tag (you can use docker or nvidia-docker interchangeably for any command except run):
 ```console
-docker build -t ${DOCKER_USERNAME}/tf-mnist:gpu .
+cd src
+docker build -t ${DOCKER_USERNAME}/tf-mnist:gpu -f Dockerfile.gpu .
 ```
 Finally run the container with nvidia-docker:

View File

@@ -140,9 +140,9 @@ kubectl get nodes
 Should yield an output similar to this one:
 ```
 NAME STATUS ROLES AGE VERSION
-aks-nodepool1-42640332-0 Ready agent 1h v1.9.6
-aks-nodepool1-42640332-1 Ready agent 1h v1.9.6
-aks-nodepool1-42640332-2 Ready agent 1h v1.9.6
+aks-nodepool1-42640332-0 Ready agent 1h v1.11.1
+aks-nodepool1-42640332-1 Ready agent 1h v1.11.1
+aks-nodepool1-42640332-2 Ready agent 1h v1.11.1
 ```
 If you provisioned GPU VM, describing one of the node should indicate the presence of GPU(s) on the node:

View File

@@ -37,7 +37,7 @@ Before going further, let's take a look at what the `TFJob` object looks like:
 | Field | Type| Description |
 |-------|-----|-------------|
-| ReplicaSpecs | `TFReplicaSpec` array | Specification for a set of TensorFlow processes, defined below |
+| TFReplicaSpec | `TFReplicaSpec` array | Specification for a set of TensorFlow processes, defined below |
 Let's go deeper:
@@ -57,11 +57,13 @@ kind: TFJob
 metadata:
   name: example-tfjob
 spec:
-  replicaSpecs:
-    - template:
+  tfReplicaSpecs:
+    MASTER:
+      replicas: 1
+      template:
         spec:
           containers:
-            - image: wbuchwalter/tf-mnist:gpu
+            - image: <DOCKER_USERNAME>/tf-mnist:gpu
               name: tensorflow
               resources:
                 limits:
@@ -87,8 +89,10 @@ kind: TFJob
 metadata:
   name: module6-ex1-gpu
 spec:
-  replicaSpecs:
-    - template:
+  tfReplicaSpecs:
+    MASTER:
+      replicas: 1
+      template:
         spec:
           containers:
             - image: <DOCKER_USERNAME>/tf-mnist:gpu # From module 1
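
For reference, the manifest fragment this last hunk produces can be sketched as plain YAML without diff markers. This is a reconstruction under assumptions: the nesting is inferred from the keys visible in the hunk, and anything the hunk does not show (for example `apiVersion` or the rest of the container spec) is left out rather than guessed, so the fragment is not a complete manifest on its own.

```yaml
# Sketch only: indentation is inferred from the diff, not copied from the repository.
kind: TFJob
metadata:
  name: module6-ex1-gpu
spec:
  tfReplicaSpecs:
    MASTER:            # the replica type is now a map key instead of a list entry
      replicas: 1
      template:
        spec:
          containers:
            - image: <DOCKER_USERNAME>/tf-mnist:gpu  # From module 1
```

The substantive change is that `replicaSpecs` (a list of entries, each holding a `template`) becomes `tfReplicaSpecs` (a map keyed by replica type, here `MASTER`, with an explicit `replicas` count).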