Update documentation and install prerequisites.

Jin Li 2017-09-18 17:10:10 -07:00
Parent 14cb225e5b
Commit 415d38305e
14 changed files: 83 additions and 19 deletions

11
docs/deployment/Azure/FAQ.md Executable file

@ -0,0 +1,11 @@
# Frequently Asked Questions (FAQ) for Azure Cluster Deployment.
Please refer to [this](../knownissues/Readme.md) for more general deployment issues.
## After setup, I cannot visit the deployed DL Workspace portal.
* Please wait a few minutes after the deployment script runs through to allow the portal container to be pulled and scheduled for execution.
## I can't execute Spark jobs on Azure.
The current default deployment procedure on Azure doesn't deploy HDFS/Spark, so Spark job execution is not available.


@ -2,18 +2,15 @@
This document describes the procedure to deploy a DL workspace cluster on Azure. We are still improving the deployment procedure on Azure. Please contact the authors if you encounter a deployment issue.
Please note that the procedure below doesn't deploy HDFS/Spark on the DLWorkspace cluster on Azure, so Spark job execution is not available on an Azure cluster.
0. Follow [this document](../DevEnvironment/README.md) to set up the dev environment of DLWorkspace. Log in to your Azure subscription on your dev machine via:
```
az login
```
1. Please create a configuration file with a single line giving your cluster name. The cluster name should be unique and contain only lowercase letters or numbers.
```
cluster_name: [your cluster name]
```
1. Please [configure](configure.md) your Azure cluster.
2. Initialize the cluster and generate certificates and keys:
```
@ -39,9 +36,6 @@ cluster_name: [your cluster name]
```
where machine1 is your Azure infrastructure node (you may get the address by running ./deploy.py display).
The script block executes the following commands in sequence (you do NOT need to run the following commands if you have run step 5):
1. Setup basic tools on the Ubuntu image.
```
@ -68,6 +62,6 @@ cluster_name: [your cluster name]
./deploy.py kubernetes start jobmanager restfulapi webportal
```
6. If you run into a deployment issue, please check [here](Deployment_Issue.md) first.
6. If you run into a deployment issue, please check [here](FAQ.md) first.


@ -0,0 +1,53 @@
# Configuration: Azure Cluster
For more customized configuration, please refer to the [Configuration Section](../configuration/Readme.md).
## Azure Cluster specific configuration
We have greatly simplified Azure cluster configuration. At a minimum, you only need to create a config.yaml file under src/ClusterBootstrap containing the cluster name.
### Cluster Name
Cluster name must be unique, and should be specified as:
```
cluster_name: [your cluster name]
```
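As an illustration only, the cluster name could be checked before it is written to config.yaml. The sketch below assumes the rule from the earlier Readme wording in this change (lowercase letters or numbers only). The function name `is_valid_cluster_name` is hypothetical and not part of deploy.py, and uniqueness cannot be checked locally, so it is not covered.

```python
import re

# Assumed rule: non-empty, lowercase ASCII letters and digits only.
CLUSTER_NAME_PATTERN = re.compile(r"[a-z0-9]+")


def is_valid_cluster_name(name):
    """Return True if name is a non-empty string of lowercase letters and digits."""
    # fullmatch requires the whole string to match, so "my-cluster" is rejected.
    return isinstance(name, str) and CLUSTER_NAME_PATTERN.fullmatch(name) is not None
```

A name such as `dlws01` passes this check, while `DLWS` or `my-cluster` does not.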
### Authentication
If you are not building a cluster for Microsoft employee usage, you will also need to configure [Authentication](../authentication/Readme.md).
### Additional configuration.
You may provide or change the specification of the deployed Azure cluster by adding the following information to the config.yaml file:
```
azure_cluster:
  <<your cluster name>>:
    "infra_node_num": 1
    "worker_node_num": 2
    "azure_location": "westus2"
    "infra_vm_size": "Standard_D1_v2"
    "worker_vm_size": "Standard_NC6"
```
* infra_node_num: number of infrastructure nodes for the deployment; should be odd (1, 3 or 5). 3 infrastructure nodes tolerate 1 failure, and 5 infrastructure nodes tolerate 2 failures. However, more infrastructure nodes (and more failure tolerance) will reduce performance of the node.
* worker_node_num: number of worker nodes used for the deployment.
* azure_location:
Please use the following to find all available Azure locations.
```
az account list-locations
```
* infra_vm_size, worker_vm_size: infrastructure and worker VM size.
Usually, a CPU VM will be used for infra_vm_size, and a GPU VM will be used for worker_vm_size. Please use the following to find all available Azure VM sizes.
```
az vm list-sizes --location westus2
```
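The failure-tolerance figures quoted for infra_node_num (3 nodes tolerate 1 failure, 5 tolerate 2) are consistent with a majority-quorum rule, where a group of n nodes survives (n - 1) // 2 failures. The sketch below assumes that rule. The function names are hypothetical and not part of deploy.py.

```python
def tolerated_failures(infra_node_num):
    """Failures a majority-quorum group of this size can survive (assumed rule)."""
    return (infra_node_num - 1) // 2


def check_infra_node_num(infra_node_num):
    """Reject values other than the documented 1, 3 or 5; return tolerated failures."""
    if infra_node_num not in (1, 3, 5):
        raise ValueError("infra_node_num should be 1, 3 or 5, got %r" % (infra_node_num,))
    return tolerated_failures(infra_node_num)
```

Under this rule an even node count adds no extra tolerance (4 nodes still tolerate only 1 failure), which is one reason an odd count is recommended.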


@ -1,6 +1,6 @@
# Build ISO (for USB) or docker image (PXE) for DL workspace deployment.
This document describes the procedure to build the deployment image for DL workspace cluster. After the configuration file is [created](Configuration.md), in directory 'src/ClusterBootstrap', run:
This document describes the procedure to build the deployment image for DL workspace cluster. After the configuration file is [created](configuration/Readme.md), in directory 'src/ClusterBootstrap', run:
```
python deploy.py -y build


@ -2,7 +2,7 @@
0. [Run Once] The installation program needs certain utilities (such as docker and python). The simplest setup is to use the [development docker](../../DevDocker.md) and run the subsequent command in the development docker. Alternatively, you may choose to run install_prerequisites.sh once to install utilities.
1. [Create Configuration file](Configuration.md), and determine important information of the cluster (e.g., cluster name, number of Etcd servers used). Please refer to [Backup/Restore](Backup.md) on instruction to backup/restore cluster configuration.
1. [Create Configuration file](configuration/Readme.md), and determine important information of the cluster (e.g., cluster name, number of Etcd servers used). Please refer to [Backup/Restore](Backup.md) on instruction to backup/restore cluster configuration.
2. Config storage system to be used in the cluster, following instructions in [Storage.md](Storage.md). If you are in Microsoft corp net and just want to test DLWorkspace in **small scale**, you may add the following configurations to config.yaml to use the public NFS server provided by CCS group.


@ -2,7 +2,7 @@
This document describes the procedure to deploy DL workspace cluster via PXE server. **__The procedure will automatically wipe out the system disk and deploy a CoreOS image of DL workspace cluster__** to all machines that are on the same subnet and served by the PXE server. Please proceed with caution.
1. [Create Configuration file](Configuration.md)
1. [Create Configuration file](configuration/Readme.md)
2. [Build a bootable image](Build.md).


@ -1,7 +1,7 @@
# Deployment of DL workspace cluster via USB sticks
This document describes the procedure to build and deploy a small DL workspace cluster via USB sticks. The key procedures are:
1. [Create Configuration file](Configuration.md)
1. [Create Configuration file](configuration/Readme.md)
2. [Build a bootable image](Build.md) (ISO/docker image) :
3. Burn a USB stick (>=1GB):
You could use [Rufus](https://www.ubuntu.com/download/desktop/create-a-usb-stick-on-windows) tool recommended by Ubuntu or many other tools to burn .iso to a USB stick. If Rufus is used, please use the following options:


@ -2,7 +2,7 @@
This document describes the procedure to deploy a DL workspace cluster on an Ubuntu cluster that is on a VLAN, with an initial node that is used as a PXE server to prime the cluster.
1. Please [create Configuration file](Configuration.md) and build [the relevant deployment key](Build.md).
1. Please [create Configuration file](configuration/Readme.md) and build [the relevant deployment key](Build.md).
Please copy config_azure.yaml.template to config.yaml, and fill in the necessary information of the cluster.
You may add the configuration to either config.yaml, or cluster.yaml, with the following entry:


@ -2,7 +2,7 @@
## I don't see the login option for (Gmail, Live.com, etc.)
Please check if you have properly configured [authentication](../authentication/Readme.md), and include the authentication option in the [config.yaml](../Configuration.md) file.
Please check if you have properly configured [authentication](../authentication/Readme.md), and include the authentication option in the [config.yaml](../configuration/Readme.md) file.
## I am the administrator, and when I log in, I got the message "You are not an authorized user to this cluster!"


@ -2,7 +2,7 @@
This document describes the procedure to deploy DL workspace on a previously deployed CoreOS cluster (e.g., certain Philly deployment). This document is still evolving. Please contact the author with any questions.
1. [Create Configuration file](Configuration.md)
1. [Create Configuration file](configuration/Readme.md)
1. Please copy src/ClusterBootstrap/config_philly.yaml.template to src/ClusterBootstrap/config.yaml, and further fill in various
parameters in the config file.
2. Please specify the private key used to access the Philly cluster.


@ -4,7 +4,7 @@ Please follow the following setup to deploy DL workspace on philly.
0. [Run Once] The installation program needs certain utilities (such as docker and python). The simplest setup is to use the [development docker](../../DevDocker.md) and run the subsequent command in the development docker. Alternatively, you may choose to run install_prerequisites.sh once to install utilities.
1. [Create Configuration file](Configuration.md), and determine important information of the cluster (e.g., cluster name, number of Etcd servers used). Please refer to [Backup/Restore](Backup.md) on instruction to backup/restore cluster configuration.
1. [Create Configuration file](../deployment/configuration/Readme.md), and determine important information of the cluster (e.g., cluster name, number of Etcd servers used). Please refer to [Backup/Restore](Backup.md) on instruction to backup/restore cluster configuration.
2. [Build deployment images and various executables](Build.md):
```


@ -51,5 +51,11 @@ sudo apt-key adv --keyserver packages.microsoft.com --recv-keys 417A0893
sudo apt-get install apt-transport-https
sudo apt-get update && sudo apt-get install azure-cli
# Disable the dnsmasq DNS plugin in NetworkManager (only if NetworkManager is installed).
if [ -f /etc/NetworkManager/NetworkManager.conf ]; then
sed "s/^dns=dnsmasq$/#dns=dnsmasq/" /etc/NetworkManager/NetworkManager.conf > /tmp/NetworkManager.conf && sudo mv /tmp/NetworkManager.conf /etc/NetworkManager/NetworkManager.conf
sudo service network-manager restart
fi


@ -367,7 +367,7 @@ if __name__ == '__main__':
Create and manage an Azure VM cluster.
Prerequisite:
* Create azure_cluster_config.yaml according to instruction in docs/deployment/Configuration.md.
* Create config.yaml according to instruction in docs/deployment/azure/configure.md.
Command:
create Create an Azure VM cluster based on the parameters in config file.