Merge pull request #5 from Azure/readme-update

Update README.md
JS 2017-06-20 17:06:54 -04:00 committed by GitHub
Parents 2f324fca9c 0e9da1cfba
Commit 3d753440c8
1 changed file with 19 additions and 4 deletions


@@ -24,17 +24,26 @@ A suite of distributed tools to help engineers scale their work into Azure.
The entire experience of this package is centered around a few commands in the bin folder.
### Create and setup your cluster
First, create your cluster:
```
./bin/spark-cluster-create \
--id <my-cluster-id> \
--size <number of nodes> \
--size-low-pri <number of low priority nodes> \
--vm-size <vm-size> \
---custom-script <path to custom bash script to run on each node> \
+--custom-script <path to custom bash script to run on each node> (optional) \
--wait/--no-wait (optional)
```
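For example, a concrete call might look like the following sketch (the cluster id, node count, and VM size are illustrative values, not defaults):
```
# Create a 3-node cluster and block until it is ready.
# "standard_d2_v2" is just an example Azure VM size.
./bin/spark-cluster-create \
--id my-spark-cluster \
--size 3 \
--vm-size standard_d2_v2 \
--wait
```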
You can also create your cluster with [low-priority](https://docs.microsoft.com/en-us/azure/batch/batch-low-pri-vms) VMs at up to an 80% discount by using **--size-low-pri** instead of **--size**:
```
./bin/spark-cluster-create \
--id <my-cluster-id> \
--size-low-pri <number of low-pri nodes> \
--vm-size <vm-size>
```
When your cluster is ready, create a user for your cluster:
```
./bin/spark-cluster-create-user \
@@ -43,6 +52,8 @@ When your cluster is ready, create a user for your cluster:
--password <password>
```
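As a filled-in sketch (the diff elides the middle of this command, so the --id and --username flags below are assumptions modeled on the other commands; only --password appears above):
```
# NOTE: --id and --username are assumed flags; the diff elides these lines.
./bin/spark-cluster-create-user \
--id my-spark-cluster \
--username spark-user \
--password <password>
```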
### Submit a Spark job
Now you can submit jobs to run against the cluster:
```
./bin/spark-submit \
@@ -52,6 +63,8 @@ Now you can submit jobs to run against the cluster:
--application <path-to-spark-job>
```
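A concrete submission might look like this sketch (the --id flag is an assumption, since the diff elides the lines between the command and --application; the application path is illustrative):
```
# NOTE: --id is an assumed flag; the diff elides the middle of this command.
./bin/spark-submit \
--id my-spark-cluster \
--application ./examples/pi.py
```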
### Interact with your Spark cluster
To view the Spark UI, open an SSH tunnel with the "masterui" option and a local port to map it to:
```
./bin/spark-cluster-ssh \
@@ -66,6 +79,8 @@ Optionally, you can also open up a jupyter notebook with the "jupyter" option to
--jupyter <local-port>
```
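Put together, a tunnel to both UIs might look like this sketch (the --id and --masterui spellings are assumptions; only the "masterui" option name and --jupyter appear in the diff):
```
# NOTE: --id and --masterui are assumed spellings based on the prose above.
./bin/spark-cluster-ssh \
--id my-spark-cluster \
--masterui 8080 \
--jupyter 8888
# The Spark master UI is then at http://localhost:8080 and Jupyter at
# http://localhost:8888 on the local machine.
```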
### Manage your Spark cluster
You can also see your clusters from the CLI:
```
./bin/spark-cluster-list
```
@@ -73,10 +88,10 @@ You can also see your clusters from the CLI:
And get the state of any specified cluster:
```
-./bin/spark-cluster-get <my-cluster-id>
+./bin/spark-cluster-get --id <my-cluster-id>
```
Finally, you can delete any specified cluster:
```
-./bin/spark-cluster-delete <my-cluster-id>
+./bin/spark-cluster-delete --id <my-cluster-id>
```