# Setup Guide
This document describes how to set up all the dependencies needed to run the notebooks in this repository.

The recommended environment for running these notebooks is the [Azure Data Science Virtual Machine (DSVM)](https://azure.microsoft.com/en-us/services/virtual-machines/data-science-virtual-machines/). Because a considerable number of the algorithms rely on deep learning, a GPU DSVM is recommended.

For training at scale, operationalization, or hyperparameter tuning, we recommend [Azure ML](https://docs.microsoft.com/en-us/azure/machine-learning/service/).

## Table of Contents

* [Compute Environments](#compute-environments)
* [Setup Guide for Local or DSVM](#setup-guide-for-local-or-dsvm)
  * [Requirements](#requirements)
  * [Dependencies Setup](#dependencies-setup)
  * [Register Conda Environment in DSVM JupyterHub](#register-conda-environment-in-dsvm-jupyterhub)

## Compute Environments
Depending on the type of NLP system and the notebook that needs to be run, there are different computational requirements. Currently, this repository supports **Python CPU** and **Python GPU**.
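
One way to decide between the two environments is to check whether an NVIDIA driver is visible on the machine. The sketch below is a heuristic and not part of this repository: it assumes that a machine with a usable NVIDIA GPU has the `nvidia-smi` tool on its `PATH`, and the function name `detect_environment_flavor` is hypothetical.

```python
import shutil


def detect_environment_flavor() -> str:
    """Return "gpu" if `nvidia-smi` is found on PATH, otherwise "cpu".

    Assumption: the presence of `nvidia-smi` indicates a working NVIDIA
    driver. This is only a heuristic; it does not verify that CUDA works.
    """
    return "gpu" if shutil.which("nvidia-smi") is not None else "cpu"


if __name__ == "__main__":
    flavor = detect_environment_flavor()
    print(f"Suggested environment: Python {flavor.upper()}")
```

If the check reports `cpu` on a machine that you know has a GPU, the driver is probably not installed or not on `PATH`.
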
## Setup Guide for Local or DSVM
### Requirements
* A machine running Linux, macOS, or Windows.
* Anaconda with Python version >= 3.6.
  * Anaconda is pre-installed on the Azure DSVM, so you can run the following steps directly. To set up on your local machine, [Miniconda](https://docs.conda.io/en/latest/miniconda.html) is a quick way to get started.

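
The requirements above can be checked before going further. The following is a minimal sketch using only the standard library; the helper names `meets_min_python` and `conda_available` are hypothetical and are not provided by this repository.

```python
import shutil
import sys

# Minimum Python version stated in the requirements list.
MIN_PYTHON = (3, 6)


def meets_min_python(version=None) -> bool:
    """Return True if `version` (default: the running interpreter) is >= 3.6."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def conda_available() -> bool:
    """Return True if a `conda` executable is found on PATH."""
    return shutil.which("conda") is not None


if __name__ == "__main__":
    print("Python >= 3.6:", meets_min_python())
    print("conda on PATH:", conda_available())
```
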
### Dependencies Setup
We provide a script, [generate_conda_file.py](tools/generate_conda_file.py), that generates a conda environment YAML file. You can use this file to create the target environment, which uses Python 3.6 with all the correct dependencies.

Assuming the repo is cloned as `nlp` on your system, run the following to install **a default (Python CPU) environment**:

    cd nlp
    python tools/generate_conda_file.py
    conda env create -f nlp_cpu.yaml

You can also specify the environment name with the `-n` flag.

Click on the following menu to see how to install the Python GPU environment:

<details>
<summary><strong><em>Python GPU environment</em></strong></summary>

Assuming that you have a GPU machine, run the following to install the Python GPU environment, which by default installs the CPU environment:

    cd nlp
    python tools/generate_conda_file.py --gpu
    conda env create -n nlp_gpu -f nlp_gpu.yaml

</details>

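
The CPU and GPU variants differ only in the `--gpu` flag, the YAML file name, and the optional `-n` environment name. The sketch below makes that relationship explicit by building the two commands as argument lists. The helper `setup_commands` is hypothetical, the file names are the ones used in the commands above, and the block only prints the commands rather than running them.

```python
from typing import List, Optional


def setup_commands(gpu: bool = False, env_name: Optional[str] = None) -> List[List[str]]:
    """Build the two setup commands, meant to be run from the repository root."""
    # Step 1: generate the conda environment YAML file.
    generate = ["python", "tools/generate_conda_file.py"]
    if gpu:
        generate.append("--gpu")

    # Step 2: create the conda environment from the generated file.
    yaml_file = "nlp_gpu.yaml" if gpu else "nlp_cpu.yaml"
    create = ["conda", "env", "create"]
    if env_name:
        create += ["-n", env_name]
    create += ["-f", yaml_file]

    return [generate, create]


if __name__ == "__main__":
    for command in setup_commands(gpu=True, env_name="nlp_gpu"):
        print(" ".join(command))
```

To execute the commands instead of printing them, each list could be passed to `subprocess.run(command, check=True)` from inside the cloned `nlp` directory.
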
### Register Conda Environment in DSVM JupyterHub
You can register the conda environment you created so that it appears as a kernel in Jupyter notebooks. Replace `my_env_name` with the name of your environment:

    conda activate my_env_name
    python -m ipykernel install --user --name my_env_name --display-name "Python (my_env_name)"

If you are using the DSVM, you can [connect to JupyterHub](https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/dsvm-ubuntu-intro#jupyterhub-and-jupyterlab) by browsing to `https://your-vm-ip:8000`.
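
If you register several environments, the kernel registration command shown earlier can be generated from the environment name so that the kernel name and display name stay in sync. This is a minimal sketch: the helper `kernel_install_command` is hypothetical, and it assumes the script is run with the interpreter of the already activated conda environment.

```python
import sys
from typing import List


def kernel_install_command(env_name: str) -> List[str]:
    """Build the `ipykernel install` command for the given environment name."""
    return [
        sys.executable,  # interpreter of the currently active environment
        "-m", "ipykernel", "install",
        "--user",
        "--name", env_name,
        "--display-name", f"Python ({env_name})",
    ]


if __name__ == "__main__":
    print(kernel_install_command("my_env_name")[1:])
```
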