diff --git a/README.md b/README.md index 00b82e8b..2d26806a 100644 --- a/README.md +++ b/README.md @@ -100,9 +100,18 @@ Further detailed instructions, including setup in Azure, are here: 1. [Debugging and monitoring models](docs/debugging_and_monitoring.md) 1. [Model diagnostics](docs/model_diagnostics.md) 1. [Move a model to a different workspace](docs/move_model.md) -1. [Deployment](docs/deploy_on_aml.md) 1. [Working with FastMRI models](docs/fastmri.md) +## Deployment +We offer a companion set of open-source tools that help integrate trained CT segmentation models with clinical +software systems: +- The [InnerEye-Gateway](https://github.com/microsoft/InnerEye-Gateway) is a Windows service running in a DICOM network +that can route anonymized DICOM images to an inference service. +- The [InnerEye-Inference](https://github.com/microsoft/InnerEye-Inference) component offers a REST API that integrates +with the InnerEye-Gateway to run inference on InnerEye-DeepLearning models. + +Details can be found [here](docs/deploy_on_aml.md). + ![docs/deployment.png](docs/deployment.png) ## More information diff --git a/docs/environment.md b/docs/environment.md index 1c1cc613..be17b507 100644 --- a/docs/environment.md +++ b/docs/environment.md @@ -4,30 +4,14 @@ In order to work with the solution, your OS environment will need [git](https://git-scm.com/) and [git lfs](https://git-lfs.github.com/) installed. Depending on the OS that you are running the installation instructions may vary. Please refer to respective documentation sections on the tools' websites for detailed instructions. -## Using the InnerEye code as a git submodule of your project -You have two options for working with our codebase: -* You can fork the InnerEye-DeepLearning repository, and work off that. -* Or you can create your project that uses the InnerEye-DeepLearning code, and include InnerEye-DeepLearning as a git -submodule. 
- -If you go down the second route, here's the list of files you will need in your project (that's the same as those -given in [this document](building_models.md)) -* `environment.yml`: Conda environment with python, pip, pytorch -* `settings.yml`: A file similar to `InnerEye\settings.yml` containing all your Azure settings -* A folder like `ML` that contains your additional code, and model configurations. -* A file `ML/runner.py` that invokes the InnerEye training runner, but that points the code to your environment and Azure -settings; see the [Building models](building_models.md) instructions for details. - -You then need to add the InnerEye code as a git submodule, in folder `innereye-submodule`: -```shell script -git submodule add https://github.com/microsoft/InnerEye-DeepLearning innereye-submodule -``` -Then configure your Python IDE to consume *both* your repository root *and* the `innereye-submodule` subfolder as inputs. -In Pycharm, you would do that by going to Settings/Project Structure. Mark your repository root as "Source", and -`innereye-submodule` as well. - We recommend using PyCharm or VSCode as the Python editor. +You have two options for working with our codebase: +* You can fork the InnerEye-DeepLearning repository, and work off that. We recommend that because it is easiest to set up. +* Or you can create your project that uses the InnerEye-DeepLearning code, and include InnerEye-DeepLearning as a git +submodule. We only recommend that if you are very comfortable with Python. More details about this option +[are here](innereye_as_submodule.md). + ## Windows Subsystem for Linux Setup When developing on a Windows machine, we recommend using [the Windows Subsystem for Linux, WSL2](https://docs.microsoft.com/en-us/windows/wsl/about). That's because PyTorch has better support for Linux. 
diff --git a/docs/innereye_as_submodule.md b/docs/innereye_as_submodule.md new file mode 100644 index 00000000..9b5c247a --- /dev/null +++ b/docs/innereye_as_submodule.md @@ -0,0 +1,95 @@ +# Using the InnerEye code as a git submodule of your project + +You can use InnerEye as a submodule in your own project. +If you go down that route, here's the list of files you will need in your project (that's the same as those +given in [this document](building_models.md)): +* `environment.yml`: Conda environment with python, pip, pytorch +* `settings.yml`: A file similar to `InnerEye\settings.yml` containing all your Azure settings +* A folder like `ML` that contains your additional code, and model configurations. +* A file like `myrunner.py` that invokes the InnerEye training runner, but that points the code to your environment +and Azure settings; see the [Building models](building_models.md) instructions for details. Please see below for what +`myrunner.py` should look like. + +You then need to add the InnerEye code as a git submodule, in folder `innereye-deeplearning`: +```shell script +git submodule add https://github.com/microsoft/InnerEye-DeepLearning innereye-deeplearning +``` +Then configure your Python IDE to consume *both* your repository root *and* the `innereye-deeplearning` subfolder as inputs. +In PyCharm, you would do that by going to Settings/Project Structure. Mark your repository root as "Source", and +`innereye-deeplearning` as well. + +Example command-line runner that uses the InnerEye runner (called `myrunner.py` above): +```python +import sys +from pathlib import Path + + +# This file here mimics how the InnerEye code would be used as a git submodule. + +# Ensure that this path correctly points to the root folder of your repository. +repository_root = Path(__file__).absolute().parent + + +def add_package_to_sys_path_if_needed() -> None: + """ + Checks if the Python paths in sys.path already contain the /innereye-deeplearning folder. If not, add it. 
+ """ + is_package_in_path = False + innereye_submodule_folder = repository_root / "innereye-deeplearning" + for path_str in sys.path: + path = Path(path_str) + if path == innereye_submodule_folder: + is_package_in_path = True + break + if not is_package_in_path: + print(f"Adding {innereye_submodule_folder} to sys.path") + sys.path.append(str(innereye_submodule_folder)) + + +def main() -> None: + try: + from InnerEye import ML # noqa: F401 + except ImportError: + add_package_to_sys_path_if_needed() + + from InnerEye.ML import runner + print(f"Repository root: {repository_root}") + # Check here that yaml_config_file correctly points to your settings file + runner.run(project_root=repository_root, + yaml_config_file=Path("settings.yml"), + post_cross_validation_hook=None) + + +if __name__ == '__main__': + main() + +``` + +## Adding new models + +1. Set up a directory outside of InnerEye to hold your configs. In your repository root, you could have a folder +`InnerEyeLocal`, parallel to the InnerEye submodule, alongside `settings.yml` and `myrunner.py`. + +The example below creates a new flavour of the Glaucoma model defined in `InnerEye/ML/configs/classification/GlaucomaPublic`. +All that needs to be done is change the dataset. We will do this by subclassing `GlaucomaPublic` in a new config +stored in `InnerEyeLocal/configs`: +1. Create folder `InnerEyeLocal/configs` +1. Create a config file `InnerEyeLocal/configs/MyGlaucomaModel.py` which extends the `GlaucomaPublic` class +like this: +```python +from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic + +class MyGlaucomaModel(GlaucomaPublic): + def __init__(self) -> None: + super().__init__() + self.azure_dataset_id = "name_of_your_dataset_on_azure" +``` +1. In `settings.yml`, set `model_configs_namespace` to `InnerEyeLocal.configs` so this config +is found by the runner. Set `extra_code_directory` to `InnerEyeLocal`.
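The two `settings.yml` entries from the step above can be sketched as a fragment like the following (the field names come from the text; the values assume the folder layout described in this document):
```
model_configs_namespace: InnerEyeLocal.configs
extra_code_directory: InnerEyeLocal
```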
+ +#### Start Training +Run the following to start a job on AzureML: +``` +python myrunner.py --azureml=True --model=MyGlaucomaModel +``` +See [Model Training](building_models.md) for details on training outputs, resuming training, testing models and model ensembles. diff --git a/docs/sample_tasks.md b/docs/sample_tasks.md index 7a6fe2ad..57b7c69b 100644 --- a/docs/sample_tasks.md +++ b/docs/sample_tasks.md @@ -1,7 +1,8 @@ # Sample Tasks -Two sample tasks for the classification and segmentation pipelines. -This document will walk through the steps in [Training Steps](building_models.md), but with specific examples for each task. +This document contains two sample tasks for the classification and segmentation pipelines. + +The document will walk through the steps in [Training Steps](building_models.md), but with specific examples for each task. Before trying tp train these models, you should have followed steps to set up an [environment](environment.md) and [AzureML](setting_up_aml.md) ## Sample classification task: Glaucoma Detection on OCT volumes @@ -9,61 +10,42 @@ Before trying tp train these models, you should have followed steps to set up an This example is based on the paper [A feature agnostic approach for glaucoma detection in OCT volumes](https://arxiv.org/pdf/1807.04855v3.pdf). ### Downloading and preparing the dataset -1. The dataset is available [here](https://zenodo.org/record/1481223#.Xs-ehzPiuM_) [[1]](#1). +The dataset is available [here](https://zenodo.org/record/1481223#.Xs-ehzPiuM_) [[1]](#1). -1. After downloading and extracting the zip file, run the [create_glaucoma_dataset_csv.py](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/InnerEye/Scripts/create_glaucoma_dataset_csv.py) +After downloading and extracting the zip file, run the [create_glaucoma_dataset_csv.py](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/InnerEye/Scripts/create_glaucoma_dataset_csv.py) script on the extracted folder. 
``` python create_dataset_csv.py /path/to/extracted/folder ``` This will convert the dataset to csv form and create a file `dataset.csv`. -1. Upload this folder (with the images and `dataset.csv`) to Azure Blob Storage. For details on creating a storage account, +Finally, upload this folder (with the images and `dataset.csv`) to Azure Blob Storage. For details on creating a storage account, see [Setting up AzureML](setting_up_aml.md#step-4-create-a-storage-account-for-your-datasets). The dataset should go into a container called `datasets`, with a folder name of your choice (`name_of_your_dataset_on_azure` in the description below). -### Setting up training +### Creating the model configuration and starting training -You have two options for running the Glaucoma model: -- You can directly work on a fork of the InnerEye repository. In this case, you need to modify `AZURE_DATASET_ID` -in `GlaucomaPublic.py` to match the dataset upload location, called `name_of_your_dataset_on_azure` above. -If you choose that, you can start training via -``` -python InnerEye/ML/runner.py --model=GlaucomaPublic --azureml=True -``` -- Alternatively, you can create a separate runner and a separate model configuration folder. The steps described -below refer to this route. - -#### Setting up a second runner -1. Set up a directory outside of InnerEye to holds your configs, as in -[Setting Up Training](building_models.md#setting-up-training). After this step, you should have a folder InnerEyeLocal - beside InnerEye with files `settings.yml` and `ML/runner.py`. - -#### Creating the classification model configuration -The full configuration for the Glaucoma model is at `InnerEye/ML/configs/classification/GlaucomaPublic`. -All that needs to be done is change the dataset. We will do this by subclassing GlaucomaPublic in a new config -stored in `InnerEyeLocal/ML` -1. Create folder configs/classification under InnerEyeLocal/ML -1. 
Create a config file called GlaucomaPublicExt.py there which extends the GlaucomaPublic class that looks like +Next, you need to create a configuration file `InnerEye/ML/configs/MyGlaucoma.py` + which extends the GlaucomaPublic class like this: ```python from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic - - -class GlaucomaPublicExt(GlaucomaPublic): +class MyGlaucomaModel(GlaucomaPublic): def __init__(self) -> None: super().__init__() self.azure_dataset_id="name_of_your_dataset_on_azure" ``` -1. In `settings.yml`, set `model_configs_namespace` to `InnerEyeLocal.ML.configs` so this config -is found by the runner. Set `extra_code_directory` to `InnerEyeLocal`. +The value for `self.azure_dataset_id` should match the dataset upload location, called +`name_of_your_dataset_on_azure` above. -#### Start Training -Run the following to start a job on AzureML +Once that config is in place, you can start training in AzureML via ``` -python InnerEyeLocal/ML/runner.py --azureml=True --model=GlaucomaPublicExt +python InnerEye/ML/runner.py --model=MyGlaucomaModel --azureml=True ``` -See [Model Training](building_models.md) for details on training outputs, resuming training, testing models and model ensembles. + +As an alternative to working with a fork of the repository, you can use InnerEye-DeepLearning via a submodule. +Please check [here](innereye_as_submodule.md) for details. + ## Sample segmentation task: Segmentation of Lung CT @@ -71,46 +53,45 @@ This example is based on the [Lung CT Segmentation Challenge 2017](https://wiki. ### Downloading and preparing the dataset -1. The dataset [[3]](#3)[[4]](#4) can be downloaded [here](https://wiki.cancerimagingarchive.net/display/Public/Lung+CT+Segmentation+Challenge+2017#021ca3c9a0724b0d9df784f1699d35e2). -1. The next step is to convert the dataset from DICOM-RT to NIFTI. Before this, place the downloaded dataset in another - parent folder, which we will call `datasets`. 
This file structure is expected by the converison tool. -1. Use the [InnerEye-CreateDataset](https://github.com/microsoft/InnerEye-createdataset) to create a NIFTI dataset - from the downloaded (DICOM) files. +The dataset [[3]](#3)[[4]](#4) can be downloaded [here](https://wiki.cancerimagingarchive.net/display/Public/Lung+CT+Segmentation+Challenge+2017#021ca3c9a0724b0d9df784f1699d35e2). + +You need to convert the dataset from DICOM-RT to NIFTI. Before this, place the downloaded dataset in another + parent folder, which we will call `datasets`. This file structure is expected by the conversion tool. + +Next, use the +[InnerEye-CreateDataset](https://github.com/microsoft/InnerEye-createdataset) command-line tool to create a +NIFTI dataset from the downloaded (DICOM) files. After installing the tool, run ```batch InnerEye.CreateDataset.Runner.exe dataset --datasetRootDirectory= --niftiDatasetDirectory= --dicomDatasetDirectory= --geoNorm 1;1;3 ``` Now, you should have another folder under `datasets` with the converted Nifti files. The `geonorm` tag tells the tool to normalize the voxel sizes during conversion. -1. Upload this folder (with the images and dataset.csv) to Azure Blob Storage. For details on creating a storage account, -see [Setting up AzureML](setting_up_aml.md#step-4-create-a-storage-account-for-your-datasets). - - -### Setting up training -1. Set up a directory outside of InnerEye to holds your configs, as in -[Setting Up Training](building_models.md#setting-up-training). After this step, you should have a folder InnerEyeLocal -beside InnerEye with files settings.yml and ML/runner.py. -### Creating the segmentation model configuration -The full configuration for the Lung model is at InnerEye/ML/configs/segmentation/Lung. -All that needs to be done is change the dataset. We will do this by subclassing Lung in a new config -stored in InnerEyeLocal/ML -1. Create folder configs/segmentation under InnerEyeLocal/ML -1. 
Create a config file called LungExt.py there which extends the GlaucomaPublic class that looks like this: +Finally, upload this folder (with the images and dataset.csv) to Azure Blob Storage. For details on creating a storage account, +see [Setting up AzureML](setting_up_aml.md#step-4-create-a-storage-account-for-your-datasets). All files should go +into a folder in the `datasets` container, for example `my_lung_dataset`. This folder name will need to go into the +`azure_dataset_id` field of the model configuration, see below. + +### Creating the model configuration and starting training +You can then create a new model configuration, based on the template +[Lung.py](../InnerEye/ML/configs/segmentation/Lung.py). To do this, create a file +`InnerEye/ML/configs/segmentation/MyLungModel.py`, where you create a subclass of the template Lung model, and +add the `azure_dataset_id` field (i.e., the name of the folder that contains the uploaded data from above), +so that it looks like: ```python -from InnerEye.ML.configs.segmentation.Lung import Lung - -class LungExt(Lung): +from InnerEye.ML.configs.segmentation.Lung import Lung +class MyLungModel(Lung): def __init__(self) -> None: - super().__init__(azure_dataset_id="name_of_your_dataset_on_azure") -``` -1. In `settings.yml`, set `model_configs_namespace` to `InnerEyeLocal.ML.configs` so this config -is found by the runner. Set `extra_code_directory` to `InnerEyeLocal`. - -### Start Training -Run the following to start a job on AzureML + super().__init__() + self.azure_dataset_id = "my_lung_dataset" ``` -python InnerEyeLocal/ML/runner.py --azureml=True --model=LungExt --train=True +If you are using InnerEye as a submodule, please add this configuration in your private configuration folder, +as described for the Glaucoma model [here](innereye_as_submodule.md). 
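Both sample configs in this document follow the same subclass-and-override pattern. A minimal, dependency-free sketch of just that pattern (the `Lung` class below is a stand-in for the real `InnerEye.ML.configs.segmentation.Lung.Lung` template, so this snippet runs without InnerEye installed):
```python
# Stand-in for the InnerEye Lung template config; only here so the
# override pattern can be shown without the real dependency.
class Lung:
    def __init__(self) -> None:
        # The template does not wire in a dataset; subclasses fill it in.
        self.azure_dataset_id = ""


class MyLungModel(Lung):
    def __init__(self) -> None:
        super().__init__()
        # Point the subclass at the dataset folder uploaded to Azure.
        self.azure_dataset_id = "my_lung_dataset"
```
Everything other than the dataset name is inherited from the template, which is the only change the sample tasks above make.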
+ +You can now run the following command to start a job on AzureML: +``` +python InnerEye/ML/runner.py --azureml=True --model=MyLungModel ``` See [Model Training](building_models.md) for details on training outputs, resuming training, testing models and model ensembles.
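The document does not spell out how `--model=MyLungModel` is resolved to a config class; a rough, hypothetical sketch of such a by-name lookup inside a configs package (an illustration of the idea only, not the actual InnerEye runner code) could look like:
```python
import importlib
import inspect
import pkgutil


def find_model_class(configs_package: str, model_name: str) -> type:
    """Search all modules of the given configs package for a class
    whose name matches the requested model name."""
    package = importlib.import_module(configs_package)
    for info in pkgutil.walk_packages(package.__path__, prefix=package.__name__ + "."):
        module = importlib.import_module(info.name)
        for name, obj in inspect.getmembers(module, inspect.isclass):
            if name == model_name:
                return obj
    raise ValueError(f"No class named {model_name} under {configs_package}")
```
With `model_configs_namespace` set as described above, a call like `find_model_class("InnerEyeLocal.configs", "MyLungModel")` would return the config class.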