Nikhil Joglekar 2018-12-03 01:20:01 -08:00
Parent 65ac830795
Commit 504c03616b
2 changed files: 80 additions and 84 deletions


@@ -1,14 +1,14 @@
# Setup guide
In this guide we show how to setup all the dependencies to run the notebooks of this repo on an [Azure DSVM](https://azure.microsoft.com/en-us/services/virtual-machines/data-science-virtual-machines/) and on [Azure Databricks](https://azure.microsoft.com/en-us/services/databricks/).
In this guide we show how to set up all the dependencies to run the notebooks of this repo in a local environment, on an [Azure DSVM](https://azure.microsoft.com/en-us/services/virtual-machines/data-science-virtual-machines/), or on [Azure Databricks](https://azure.microsoft.com/en-us/services/databricks/).
<details>
<summary><strong><em>Click here to see the Table of Contents</em></strong></summary>
* [Compute environments](#compute-environments)
* [Setup guide for the DSVM](#setup-guide-for-the-dsvm)
* [Requirements of the DSVM](#requirements-of-the-dsvm)
* [Dependencies setup for the DSVM](#dependencies-setup-for-the-dsvm)
* [Setup guide for Local or DSVM](#setup-guide-for-local-or-dsvm)
* [Setup Requirements](#setup-requirements)
* [Dependencies setup](#dependencies-setup)
* [Register the conda environment in Jupyter notebook](#register-the-conda-environment-in-jupyter-notebook)
* [Tests](#tests)
* [Troubleshooting for the DSVM](#troubleshooting-for-the-dsvm)
@@ -31,16 +31,16 @@ Environments supported to run the notebooks on the DSVM:
Environments supported to run the notebooks on Azure Databricks:
* PySpark
## Setup guide for the DSVM
## Setup guide for Local or DSVM
### Requirements of the DSVM
### Setup Requirements
- [Anaconda Python 3.6](https://conda.io/miniconda.html)
- The Python library dependencies can be found in this [script](scripts/generate_conda_file.sh).
- Machine with Spark (optional for Python environment but mandatory for PySpark environment).
- Machine with GPU (optional but desirable for computing acceleration).
### Dependencies setup for the DSVM
### Dependencies setup
We install the dependencies with Conda. As a prerequisite, we may want to make sure that Conda is up-to-date:
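A minimal sketch of this step (the generated environment file name below is an assumption; use whatever the script actually emits):

```
# Keep conda itself current before creating the environment
conda update conda
# Generate the environment definition with the repo script, then create the
# environment from it (the file name here is an assumption)
./scripts/generate_conda_file.sh
conda env create -f conda_bare.yaml
```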
@@ -119,80 +119,6 @@ We can register our created conda environment to appear as a kernel in the Jupyt
source activate my_env_name
python -m ipykernel install --user --name my_env_name --display-name "Python (my_env_name)"
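As an optional check (a sketch, assuming Jupyter is installed in the same environment), the registered kernels can be listed to confirm the new one appears:

```
# List registered Jupyter kernels; a "Python (my_env_name)" entry should show up
jupyter kernelspec list
```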
### Tests
This project uses unit, smoke and integration tests with Python files and notebooks. For more information, see a [quick introduction to unit, smoke and integration tests](https://miguelgfierro.com/blog/2018/a-beginners-guide-to-python-testing/). Click on the following menus to see more details:
<details>
<summary><strong><em>Unit tests</em></strong></summary>
Unit tests ensure that each class or function behaves as it should. Every time a developer makes a pull request to the staging or master branch, a battery of unit tests is executed. To manually execute the unit tests in the different environments, first **make sure you are in the correct environment**.
For executing the Python unit tests for the utilities:
pytest tests/unit -m "not notebooks and not spark and not gpu"
For executing the Python unit tests for the notebooks:
pytest tests/unit -m "notebooks and not spark and not gpu"
For executing the Python GPU unit tests for the utilities:
pytest tests/unit -m "not notebooks and not spark and gpu"
For executing the Python GPU unit tests for the notebooks:
pytest tests/unit -m "notebooks and not spark and gpu"
For executing the PySpark unit tests for the utilities:
pytest tests/unit -m "not notebooks and spark and not gpu"
For executing the PySpark unit tests for the notebooks:
pytest tests/unit -m "notebooks and spark and not gpu"
</details>
<details>
<summary><strong><em>Smoke tests</em></strong></summary>
Smoke tests make sure that the system works; they are executed every night, just before the integration tests.
For executing the Python smoke tests:
pytest tests/smoke -m "smoke and not spark and not gpu"
For executing the Python GPU smoke tests:
pytest tests/smoke -m "smoke and not spark and gpu"
For executing the PySpark smoke tests:
pytest tests/smoke -m "smoke and spark and not gpu"
</details>
<details>
<summary><strong><em>Integration tests</em></strong></summary>
Integration tests make sure that the program results are acceptable.
For executing the Python integration tests:
pytest tests/integration -m "integration and not spark and not gpu"
For executing the Python GPU integration tests:
pytest tests/integration -m "integration and not spark and gpu"
For executing the PySpark integration tests:
pytest tests/integration -m "integration and spark and not gpu"
</details>
### Troubleshooting for the DSVM
@@ -224,9 +150,6 @@ To make sure it works, you can now create a new notebook and import the utilitie
import reco_utils
```
### Dependencies setup for Azure Databricks
The dependencies have to be installed manually on the cluster; they can be found in [this script](scripts/generate_conda_file.sh).
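As a hedged sketch only, one way to install a single dependency on the cluster is through the Databricks CLI; the cluster ID and package below are placeholders, and the full dependency list comes from the script above:

```
# Placeholder cluster ID and package; repeat for each dependency listed by the script
databricks libraries install --cluster-id 0123-456789-abc123 --pypi-package scikit-surprise
```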
### Troubleshooting for Azure Databricks
* For the [utilities](reco_utils) to work on Databricks, it is important to zip the content correctly. The zip has to be performed from inside the root folder; if you zip the root folder itself, it won't work.
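A minimal sketch of the zipping step described above (the folder and archive names are assumptions):

```
# Run zip from inside the repository root so that reco_utils/ sits at the top
# level of the archive; zipping the parent folder itself will not work
cd Recommenders
zip -r Recommenders.zip .
```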

TESTS.md (new file, 73 lines added)

@@ -0,0 +1,73 @@
# Tests
This project uses unit, smoke and integration tests with Python files and notebooks. For more information, see a [quick introduction to unit, smoke and integration tests](https://miguelgfierro.com/blog/2018/a-beginners-guide-to-python-testing/). To manually execute the unit tests in the different environments, first **make sure you are in the correct environment as described in the [setup](/SETUP.md)**. Click on the following menus to see more details:
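The commands below select tests through custom pytest markers. Purely as an illustrative sketch (the project may register its markers elsewhere, for example in setup.cfg), markers like these could be declared in a conftest.py:

```
# conftest.py -- hypothetical marker registration; names mirror the -m expressions below
def pytest_configure(config):
    config.addinivalue_line("markers", "notebooks: tests that execute the example notebooks")
    config.addinivalue_line("markers", "spark: tests that need a PySpark environment")
    config.addinivalue_line("markers", "gpu: tests that need a GPU")
    config.addinivalue_line("markers", "smoke: quick nightly end-to-end checks")
    config.addinivalue_line("markers", "integration: longer checks on result quality")
```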
<details>
<summary><strong><em>Unit tests</em></strong></summary>
Unit tests ensure that each class or function behaves as it should. Every time a developer makes a pull request to the staging or master branch, a battery of unit tests is executed.
For executing the Python unit tests for the utilities:
pytest tests/unit -m "not notebooks and not spark and not gpu"
For executing the Python unit tests for the notebooks:
pytest tests/unit -m "notebooks and not spark and not gpu"
For executing the Python GPU unit tests for the utilities:
pytest tests/unit -m "not notebooks and not spark and gpu"
For executing the Python GPU unit tests for the notebooks:
pytest tests/unit -m "notebooks and not spark and gpu"
For executing the PySpark unit tests for the utilities:
pytest tests/unit -m "not notebooks and spark and not gpu"
For executing the PySpark unit tests for the notebooks:
pytest tests/unit -m "notebooks and spark and not gpu"
</details>
<details>
<summary><strong><em>Smoke tests</em></strong></summary>
Smoke tests make sure that the system works; they are executed every night, just before the integration tests.
For executing the Python smoke tests:
pytest tests/smoke -m "smoke and not spark and not gpu"
For executing the Python GPU smoke tests:
pytest tests/smoke -m "smoke and not spark and gpu"
For executing the PySpark smoke tests:
pytest tests/smoke -m "smoke and spark and not gpu"
</details>
<details>
<summary><strong><em>Integration tests</em></strong></summary>
Integration tests make sure that the program results are acceptable.
For executing the Python integration tests:
pytest tests/integration -m "integration and not spark and not gpu"
For executing the Python GPU integration tests:
pytest tests/integration -m "integration and not spark and gpu"
For executing the PySpark integration tests:
pytest tests/integration -m "integration and spark and not gpu"
</details>