Update SETUP.md for more databricks info
Parent: 77f2c0defc
Commit: 36bcc70f91
19
SETUP.md
@@ -1,9 +1,9 @@
 # Setup guide

-This document describes how to setup all the dependencies to run the notebooks in this repository in two different environments:
+This document describes how to setup all the dependencies to run the notebooks in this repository in two different platforms:

-* a Linux system (local or an [Azure Data Science Virtual Machine (DSVM)](https://azure.microsoft.com/en-us/services/virtual-machines/data-science-virtual-machines/))
-* [Azure Databricks](https://azure.microsoft.com/en-us/services/databricks/).
+* Linux Machine: Local or [Azure Data Science Virtual Machine (DSVM)](https://azure.microsoft.com/en-us/services/virtual-machines/data-science-virtual-machines/)
+* [Azure Databricks](https://azure.microsoft.com/en-us/services/databricks/)

 ## Table of Contents

@@ -21,18 +21,15 @@ This document describes how to setup all the dependencies to run the notebooks i
 ## Compute environments

-Depending on the type of recommender system and the notebook that needs to be run, there are different computational requirements.
-
-Currently, this repository supports the following environments:
+Depending on the type of recommender system and the notebook that needs to be run, there are different computational requirements. Currently, this repository supports the following environments:

 * Python CPU
 * Python GPU
 * PySpark

-
 ## Setup guide for Local or DSVM

-### Setup Requirements
+### Requirements

 * Machine running Linux, Windows Subsystem for Linux ([WSL](https://docs.microsoft.com/en-us/windows/wsl/about)) or macOS
 * Anaconda with Python version >= 3.6.

@@ -139,11 +136,13 @@ SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true, -Dspark.worker.cleanup.a
 ## Setup guide for Azure Databricks

-### Requirements of Azure Databricks
+### Requirements

-* Runtime version 4.3 (Apache Spark 2.3.1, Scala 2.11)
+* Databricks Runtime version 4.3 (Apache Spark 2.3.1, Scala 2.11) or greater
 * Python 3

+An example of how to create an Azure Databricks workspace and an Apache Spark cluster within the workspace can be found from [here](https://docs.microsoft.com/en-us/azure/azure-databricks/quickstart-create-databricks-workspace-portal). To utilize deep learning models and GPUs, you may setup GPU-enabled cluster. For more details about this topic, please see [Azure Databricks deep learning guide](https://docs.azuredatabricks.net/applications/deep-learning/index.html).
+
 ### Repository installation
 You can setup the repository as a library on Databricks either manually or by running an [installation script](scripts/databricks_install.sh). Both options assume you have access to a provisioned Databricks workspace and cluster and that you have appropriate permissions to install libraries.
