* 📝Move docs folder to sphinx-docs

* Trigger build for new URL

* 📝 Fix build to include README + CHANGELOG

* 📝 Add back in link fixing

* 🐛 Fix docs links

* 🚨 📝 Fix markdown linting

* 📝 Change relative links to GitHub ones permanently

* 📝 Replace more relative paths

* 📝 Switch to symlinks

* 📝 Replace README in toctree

* 📝 Update README

* 🐛 Attempt to fix images not rendering

* 🐛 Fix broken links

* Remove IDE settings from gitignore

* Move docs to `docs/` and add Makefile back

* 🙈 Update gitignore

* ♻️ Resolve review comments and change theme

* 📝 🔀 Rebase + markdown linting

* 🔥 Remove build files (again)

* 🙈 Remove pipeline-breaking symlink

* Add furo to sphinx dependencies

* 📌 Move sphinx deps to environment.yml + lock

* 📝 Improve doc folder structure

* Return to copying instead of symlink

* 📝 Update indexing and titles

* 📝 Address review comments
This commit is contained in:
Peter Hessey 2022-08-04 09:15:19 +01:00 committed by GitHub
Parent 4e12cec106
Commit c1b363e158
No key found matching this signature
GPG key ID: 4AEE18F83AFDEB23
49 changed files: 437 additions and 392 deletions

6
.gitignore vendored
View file

@ -84,8 +84,10 @@ instance/
.scrapy
# Sphinx documentation
sphinx-docs/build/
sphinx-docs/source/md/
docs/build/
docs/source/md/CHANGELOG.md
docs/source/md/README.md
docs/source/md/LICENSE
# PyBuilder
target/

View file

@ -9,11 +9,7 @@ build:
python: miniconda3-4.7
sphinx:
configuration: sphinx-docs/source/conf.py
python:
install:
- requirements: sphinx-docs/requirements.txt
configuration: docs/source/conf.py
conda:
environment: environment.yml

View file

@ -181,7 +181,7 @@ institution id and series id columns are missing.
- ([#441](https://github.com/microsoft/InnerEye-DeepLearning/pull/441)) Add script to move models from one AzureML workspace to another: `python InnerEye/Scripts/move_model.py`
- ([#417](https://github.com/microsoft/InnerEye-DeepLearning/pull/417)) Added a generic way of adding PyTorch Lightning
models to the toolbox. It is now possible to train almost any Lightning model with the InnerEye toolbox in AzureML,
with only minimum code changes required. See [the MD documentation](docs/bring_your_own_model.md) for details.
with only minimum code changes required. See [the MD documentation](docs/source/md/bring_your_own_model.md) for details.
- ([#430](https://github.com/microsoft/InnerEye-DeepLearning/pull/430)) Update conversion to 1.0.1 InnerEye-DICOM-RT to
add: manufacturer, SoftwareVersions, Interpreter and ROIInterpretedTypes.
- ([#385](https://github.com/microsoft/InnerEye-DeepLearning/pull/385)) Add the ability to train a model on multiple
@ -354,7 +354,7 @@ console for easier diagnostics.
#### Fixed
- When registering a model, it now has a consistent folder structured, described [here](docs/deploy_on_aml.md). This
- When registering a model, it now has a consistent folder structure, described [here](docs/source/md/deploy_on_aml.md). This
folder structure is present irrespective of using InnerEye as a submodule or not. In particular, exactly 1 Conda
environment will be contained in the model.

View file

@ -2,53 +2,26 @@
[![Build Status](https://innereye.visualstudio.com/InnerEye/_apis/build/status/InnerEye-DeepLearning/InnerEye-DeepLearning-PR?branchName=main)](https://innereye.visualstudio.com/InnerEye/_build?definitionId=112&branchName=main)
## Overview
InnerEye-DeepLearning (IE-DL) is a toolbox for easily training deep learning models on 3D medical images. Simple to run both locally and in the cloud with [AzureML](https://docs.microsoft.com/en-gb/azure/machine-learning/), it allows users to train and run inference on the following:
This is a deep learning toolbox to train models on medical images (or more generally, 3D images).
It integrates seamlessly with cloud computing in Azure.
- Segmentation models.
- Classification and regression models.
- Any PyTorch Lightning model, via a [bring-your-own-model setup](docs/source/md/bring_your_own_model.md).
On the modelling side, this toolbox supports
In addition, this toolbox supports:
- Segmentation models
- Classification and regression models
- Adding cloud support to any PyTorch Lightning model, via a [bring-your-own-model setup](docs/bring_your_own_model.md)
On the user side, this toolbox focusses on enabling machine learning teams to achieve more. It is cloud-first, and
relies on [Azure Machine Learning Services (AzureML)](https://docs.microsoft.com/en-gb/azure/machine-learning/) for execution,
bookkeeping, and visualization. Taken together, this gives:
- **Traceability**: AzureML keeps a full record of all experiments that were executed, including a snapshot of
the code. Tags are added to the experiments automatically, that can later help filter and find old experiments.
- **Transparency**: All team members have access to each other's experiments and results.
- **Reproducibility**: Two model training runs using the same code and data will result in exactly the same metrics. All
sources of randomness like multithreading are controlled for.
- **Cost reduction**: Using AzureML, all compute (virtual machines, VMs) is requested at the time of starting the
training job, and freed up at the end. Idle VMs will not incur costs. In addition, Azure low priority
nodes can be used to further reduce costs (up to 80% cheaper).
- **Scale out**: Large numbers of VMs can be requested easily to cope with a burst in jobs.
Despite the cloud focus, all training and model testing works just as well on local compute, which is important for
model prototyping, debugging, and in cases where the cloud can't be used. In particular, if you already have GPU
machines available, you will be able to utilize them with the InnerEye toolbox.
In addition, our toolbox supports:
- Cross-validation using AzureML's built-in support, where the models for
individual folds are trained in parallel. This is particularly important for the long-running training jobs
often seen with medical images.
- Hyperparameter tuning using
[Hyperdrive](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters).
- Cross-validation using AzureML, where the models for individual folds are trained in parallel. This is particularly important for the long-running training jobs often seen with medical images.
- Hyperparameter tuning using [Hyperdrive](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters).
- Building ensemble models.
- Easy creation of new models via a configuration-based approach, and inheritance from an existing
architecture.
- Easy creation of new models via a configuration-based approach, and inheritance from an existing architecture.
Once training in AzureML is done, the models can be deployed from within AzureML.
## Documentation
For all documentation, including setup guides and APIs, please refer to the [IE-DL Read the Docs site](https://innereye-deeplearning.readthedocs.io/#).
## Quick Setup
This quick setup assumes you are using a machine running Ubuntu with Git, Git LFS, Conda and Python 3.7+ installed. Please refer to the [setup guide](docs/environment.md) for more detailed instructions on getting InnerEye set up with other operating systems and installing the above prerequisites.
### Instructions
This quick setup assumes you are using a machine running Ubuntu with Git, Git LFS, Conda and Python 3.7+ installed. Please refer to the [setup guide](docs/source/md/environment.md) for more detailed instructions on getting InnerEye set up with other operating systems and installing the above prerequisites.
1. Clone the InnerEye-DeepLearning repo by running the following command:
@ -73,25 +46,7 @@ If the above runs with no errors: Congratulations! You have successfully built y
If it fails, please check the
[troubleshooting page on the Wiki](https://github.com/microsoft/InnerEye-DeepLearning/wiki/Issues-with-code-setup-and-the-HelloWorld-model).
## Other Documentation
Further detailed instructions, including setup in Azure, are here:
1. [Setting up your environment](docs/environment.md)
1. [Setting up Azure Machine Learning](docs/setting_up_aml.md)
1. [Training a simple segmentation model in Azure ML](docs/hello_world_model.md)
1. [Creating a dataset](docs/creating_dataset.md)
1. [Building models in Azure ML](docs/building_models.md)
1. [Sample Segmentation and Classification tasks](docs/sample_tasks.md)
1. [Debugging and monitoring models](docs/debugging_and_monitoring.md)
1. [Model diagnostics](docs/model_diagnostics.md)
1. [Move a model to a different workspace](docs/move_model.md)
1. [Working with FastMRI models](docs/fastmri.md)
1. [Active label cleaning and noise robust learning toolbox](https://github.com/microsoft/InnerEye-DeepLearning/blob/1606729c7a16e1bfeb269694314212b6e2737939/InnerEye-DataQuality/README.md)
1. [Using InnerEye as a git submodule](docs/innereye_as_submodule.md)
1. [Evaluating pre-trained models](docs/hippocampus_model.md)
## Deployment
## Full InnerEye Deployment
We offer a companion set of open-sourced tools that help to integrate trained CT segmentation models with clinical
software systems:
@ -99,20 +54,23 @@ software systems:
- The [InnerEye-Gateway](https://github.com/microsoft/InnerEye-Gateway) is a Windows service running in a DICOM network,
that can route anonymized DICOM images to an inference service.
- The [InnerEye-Inference](https://github.com/microsoft/InnerEye-Inference) component offers a REST API that integrates
with the InnnEye-Gateway, to run inference on InnerEye-DeepLearning models.
with the InnerEye-Gateway, to run inference on InnerEye-DeepLearning models.
Details can be found [here](docs/deploy_on_aml.md).
Details can be found [here](docs/source/md/deploy_on_aml.md).
![docs/deployment.png](docs/deployment.png)
![docs/deployment.png](docs/source/images/deployment.png)
## More information
## Benefits of InnerEye-DeepLearning
1. [Project InnerEye](https://www.microsoft.com/en-us/research/project/medical-image-analysis/)
1. [Releases](docs/releases.md)
1. [Changelog](CHANGELOG.md)
1. [Testing](docs/testing.md)
1. [How to do pull requests](docs/pull_requests.md)
1. [Contributing](docs/contributing.md)
In combination with the power of AzureML, InnerEye provides the following benefits:
- **Traceability**: AzureML keeps a full record of all experiments that were executed, including a snapshot of the code. Tags are added to the experiments automatically, that can later help filter and find old experiments.
- **Transparency**: All team members have access to each other's experiments and results.
- **Reproducibility**: Two model training runs using the same code and data will result in exactly the same metrics. All sources of randomness are controlled for.
- **Cost reduction**: Using AzureML, all compute resources (virtual machines, VMs) are requested at the time of starting the training job and freed up at the end. Idle VMs will not incur costs. Azure low priority nodes can be used to further reduce costs (up to 80% cheaper).
- **Scalability**: Large numbers of VMs can be requested easily to cope with a burst in jobs.
Despite the cloud focus, InnerEye is designed to be able to run locally too, which is important for model prototyping, debugging, and in cases where the cloud can't be used. Therefore, if you already have GPU machines available, you will be able to utilize them with the InnerEye toolbox.
## Licensing

View file

@ -10,8 +10,8 @@ dependencies:
- blas=1.0=mkl
- blosc=1.21.0=h4ff587b_1
- bzip2=1.0.8=h7b6447c_0
- ca-certificates=2022.4.26=h06a4308_0
- certifi=2022.5.18.1=py38h06a4308_0
- ca-certificates=2022.07.19=h06a4308_0
- certifi=2022.6.15=py38h06a4308_0
- cudatoolkit=11.3.1=h2bc3f7f_2
- ffmpeg=4.2.2=h20bf706_0
- freetype=2.11.0=h70c0345_0
@ -42,10 +42,10 @@ dependencies:
- mkl-service=2.4.0=py38h7f8727e_0
- mkl_fft=1.3.1=py38hd3c417c_0
- mkl_random=1.2.2=py38h51133e4_0
- ncurses=6.3=h7f8727e_2
- ncurses=6.3=h5eee18b_3
- nettle=3.7.3=hbbd107a_1
- openh264=2.1.1=h4ff587b_0
- openssl=1.1.1o=h7f8727e_0
- openssl=1.1.1q=h7f8727e_0
- pip=20.1.1=py38_1
- python=3.8.3
- python-blosc=1.7.0=py38h7b6447c_0
@ -53,7 +53,7 @@ dependencies:
- pytorch-mutex=1.0=cuda
- readline=8.1.2=h7f8727e_1
- setuptools=61.2.0=py38h06a4308_0
- sqlite=3.38.3=hc218d9a_0
- sqlite=3.39.0=h5082296_0
- tk=8.6.12=h1ccaba5_0
- torchvision=0.11.1=py38_cu113
- typing_extensions=4.1.1=pyh06a4308_0
@ -63,19 +63,20 @@ dependencies:
- zlib=1.2.12=h7f8727e_2
- zstd=1.5.2=ha4553b6_0
- pip:
- absl-py==1.1.0
- absl-py==1.2.0
- adal==1.2.7
- aiohttp==3.8.1
- aiosignal==1.2.0
- alembic==1.8.0
- alabaster==0.7.12
- alembic==1.8.1
- ansiwrap==0.8.4
- applicationinsights==0.11.10
- argon2-cffi==21.3.0
- argon2-cffi-bindings==21.2.0
- async-timeout==4.0.2
- attrs==21.4.0
- attrs==22.1.0
- azure-common==1.1.28
- azure-core==1.24.1
- azure-core==1.24.2
- azure-graphrbac==0.61.1
- azure-identity==1.7.0
- azure-mgmt-authorization==0.61.0
@ -102,39 +103,42 @@ dependencies:
- azureml-train-automl-client==1.36.0
- azureml-train-core==1.36.0
- azureml-train-restclients-hyperdrive==1.36.0
- babel==2.10.3
- backports-tempfile==1.0
- backports-weakref==1.0.post1
- beautifulsoup4==4.11.1
- black==22.3.0
- bleach==5.0.0
- black==22.6.0
- bleach==5.0.1
- cachetools==4.2.4
- cffi==1.15.0
- charset-normalizer==2.0.12
- cffi==1.15.1
- charset-normalizer==2.1.0
- click==8.1.3
- cloudpickle==1.6.0
- colorama==0.4.5
- commonmark==0.9.1
- conda-merge==0.1.5
- contextlib2==21.6.0
- coverage==6.4.1
- coverage==6.4.2
- cryptography==3.3.2
- cycler==0.11.0
- databricks-cli==0.17.0
- dataclasses-json==0.5.2
- debugpy==1.6.0
- debugpy==1.6.2
- defusedxml==0.7.1
- deprecated==1.2.13
- distro==1.7.0
- docker==4.3.1
- docutils==0.17.1
- dotnetcore2==2.1.23
- entrypoints==0.4
- execnet==1.9.0
- fastjsonschema==2.15.3
- fastjsonschema==2.16.1
- fastmri==0.2.0
- flake8==3.8.3
- flask==2.1.2
- frozenlist==1.3.0
- fsspec==2022.5.0
- flask==2.2.0
- frozenlist==1.3.1
- fsspec==2022.7.1
- furo==2022.6.21
- fusepy==3.0.1
- future==0.18.2
- gitdb==4.0.9
@ -143,22 +147,23 @@ dependencies:
- google-auth-oauthlib==0.4.6
- gputil==1.4.0
- greenlet==1.1.2
- grpcio==1.46.3
- grpcio==1.47.0
- gunicorn==20.1.0
- h5py==2.10.0
- hi-ml==0.2.2
- hi-ml-azure==0.2.2
- humanize==4.2.0
- humanize==4.2.3
- idna==3.3
- imageio==2.15.0
- importlib-metadata==4.11.4
- importlib-resources==5.8.0
- imagesize==1.4.1
- importlib-metadata==4.12.0
- importlib-resources==5.9.0
- iniconfig==1.1.1
- innereye-dicom-rt==1.0.3
- ipykernel==6.15.0
- ipykernel==6.15.1
- ipython==7.31.1
- ipython-genutils==0.2.0
- ipywidgets==7.7.0
- ipywidgets==7.7.1
- isodate==0.6.1
- itsdangerous==2.1.2
- jeepney==0.8.0
@ -166,26 +171,26 @@ dependencies:
- jmespath==0.10.0
- joblib==0.16.0
- jsonpickle==2.2.0
- jsonschema==4.6.0
- jsonschema==4.9.1
- jupyter==1.0.0
- jupyter-client==6.1.5
- jupyter-console==6.4.3
- jupyter-core==4.10.0
- jupyter-console==6.4.4
- jupyter-core==4.11.1
- jupyterlab-pygments==0.2.2
- jupyterlab-widgets==1.1.0
- kiwisolver==1.4.3
- jupyterlab-widgets==1.1.1
- kiwisolver==1.4.4
- lightning-bolts==0.4.0
- llvmlite==0.34.0
- mako==1.2.0
- markdown==3.3.7
- mako==1.2.1
- markdown==3.4.1
- markupsafe==2.1.1
- marshmallow==3.16.0
- marshmallow==3.17.0
- marshmallow-enum==1.5.1
- matplotlib==3.3.0
- mccabe==0.6.1
- mistune==0.8.4
- mlflow==1.23.1
- mlflow-skinny==1.26.1
- mlflow-skinny==1.27.0
- monai==0.6.0
- more-itertools==8.13.0
- msal==1.18.0
@ -195,12 +200,12 @@ dependencies:
- multidict==6.0.2
- mypy==0.910
- mypy-extensions==0.4.3
- nbclient==0.6.4
- nbclient==0.6.6
- nbconvert==6.5.0
- nbformat==5.4.0
- ndg-httpsclient==0.5.1
- nest-asyncio==1.5.5
- networkx==2.8.4
- networkx==2.8.5
- nibabel==4.0.1
- notebook==6.4.12
- numba==0.51.2
@ -215,11 +220,12 @@ dependencies:
- pathspec==0.9.0
- pexpect==4.8.0
- pillow==9.0.0
- pkgutil-resolve-name==1.3.10
- platformdirs==2.5.2
- pluggy==0.13.1
- portalocker==2.4.0
- portalocker==2.5.1
- prometheus-client==0.14.1
- prometheus-flask-exporter==0.20.2
- prometheus-flask-exporter==0.20.3
- protobuf==3.20.1
- psutil==5.7.2
- ptyprocess==0.7.0
@ -251,11 +257,11 @@ dependencies:
- qtconsole==5.3.1
- qtpy==2.1.0
- querystring-parser==1.2.4
- requests==2.28.0
- requests==2.28.1
- requests-oauthlib==1.3.1
- rich==10.13.0
- rpdb==0.1.6
- rsa==4.8
- rsa==4.9
- ruamel-yaml==0.16.12
- ruamel-yaml-clib==0.2.6
- runstats==1.8.0
@ -268,8 +274,18 @@ dependencies:
- simpleitk==1.2.4
- six==1.15.0
- smmap==5.0.0
- snowballstemmer==2.2.0
- soupsieve==2.3.2.post1
- sqlalchemy==1.4.37
- sphinx==5.0.2
- sphinx-basic-ng==0.0.1a12
- sphinx-rtd-theme==1.0.0
- sphinxcontrib-applehelp==1.0.2
- sphinxcontrib-devhelp==1.0.2
- sphinxcontrib-htmlhelp==2.0.0
- sphinxcontrib-jsmath==1.0.1
- sphinxcontrib-qthelp==1.0.3
- sphinxcontrib-serializinghtml==1.1.5
- sqlalchemy==1.4.39
- sqlparse==0.4.2
- stopit==1.1.2
- stringcase==1.2.0
@ -281,22 +297,22 @@ dependencies:
- terminado==0.15.0
- textwrap3==0.9.2
- threadpoolctl==3.1.0
- tifffile==2022.5.4
- tifffile==2022.7.31
- tinycss2==1.1.1
- toml==0.10.2
- tomli==2.0.1
- torchio==0.18.74
- torchmetrics==0.6.0
- tornado==6.1
- tornado==6.2
- tqdm==4.64.0
- typing-inspect==0.7.1
- umap-learn==0.5.2
- urllib3==1.26.7
- webencodings==0.5.1
- websocket-client==1.3.3
- werkzeug==2.1.2
- widgetsnbextension==3.6.0
- werkzeug==2.2.1
- widgetsnbextension==3.6.1
- wrapt==1.14.1
- yacs==0.1.8
- yarl==1.7.2
- zipp==3.8.0
- yarl==1.8.1
- zipp==3.8.1

View file

@ -15,11 +15,11 @@ env_name="${name_arr[1]}"
# clear old conda envs, create new one
export CONDA_ALWAYS_YES="true"
conda env remove --name ${env_name::-1}
conda env remove --name ${env_name}
conda env create --file primary_deps.yml
# export new environment to environment.yml
conda env export -n ${env_name::-1} | grep -v "prefix:" > environment.yml
conda env export -n ${env_name} | grep -v "prefix:" > environment.yml
unset CONDA_ALWAYS_YES
# remove python version hash (technically not locked, so still potential for problems here if python secondary deps change)

View file

@ -14,12 +14,7 @@ help:
.PHONY: help Makefile
# Do some preprocessing, including copying over md files to the source directory so sphinx can find them,
# and changing references to codefiles in md files to urls.
preprocess:
python preprocess.py
# Catch-all target: route all unknown targets to Sphinx using the new
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
%: Makefile preprocess
%: Makefile
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)

12
docs/README.md Normal file
View file

@ -0,0 +1,12 @@
# Building docs for InnerEye-DeepLearning
1. First, make sure you have all the packages necessary for InnerEye.
2. Install pip dependencies from docs/requirements.txt:
```shell
pip install -r requirements.txt
```
3. Run `make html` from the `docs` folder. This will create html files under docs/build/html.
4. From the `docs/build/html` folder, run `python -m http.server 8080` to host the docs locally.
5. From your browser, navigate to `http://localhost:8080` to view the documentation.

View file

@ -10,8 +10,6 @@ if "%SPHINXBUILD%" == "" (
set SOURCEDIR=source
set BUILDDIR=build
if "%1" == "" goto help
%SPHINXBUILD% >NUL 2>NUL
if errorlevel 9009 (
echo.
@ -21,13 +19,11 @@ if errorlevel 9009 (
echo.may add the Sphinx directory to PATH.
echo.
echo.If you don't have Sphinx installed, grab it from
echo.http://sphinx-doc.org/
echo.https://www.sphinx-doc.org/
exit /b 1
)
REM Do some preprocessing, including copying over md files to the source directory so sphinx can find them,
REM and changing references to codefiles in md files to urls.
python preprocess.py
if "%1" == "" goto help
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
goto end

2
docs/requirements.txt Normal file
View file

@ -0,0 +1,2 @@
- recommonmark
- readthedocs-sphinx-ext

View file

@ -16,6 +16,7 @@
# documentation root, make it absolute.
#
import sys
import shutil
from pathlib import Path
repo_dir = Path(__file__).absolute().parents[2]
sys.path.insert(0, str(repo_dir))
@ -62,7 +63,7 @@ exclude_patterns = [] # type: ignore
# The theme to use for HTML and HTML Help pages. See the documentation for
# a list of builtin themes.
#
html_theme = 'sphinx_rtd_theme'
html_theme = 'furo'
# Add any paths that contain custom static files (such as style sheets) here,
# relative to this directory. They are copied after the builtin static files,
@ -84,3 +85,29 @@ autodoc_default_options = {
'members': True,
'undoc-members': True,
}
# -- Copy markdown files to source directory --------------------------------
def replace_in_file(filepath: Path, original_str: str, replace_str: str) -> None:
"""
Replace all occurrences of the original_str with replace_str in the file provided.
"""
text = filepath.read_text()
text = text.replace(original_str, replace_str)
filepath.write_text(text)
sphinx_root = Path(__file__).absolute().parent
docs_path = Path(sphinx_root / "md")
repository_root = sphinx_root.parent.parent
# Copy files that are in the head of the repository
files_to_copy = ["CHANGELOG.md", "README.md"]
for file_to_copy in files_to_copy:
copy_path = docs_path / file_to_copy
source_path = repository_root / file_to_copy
shutil.copy(source_path, copy_path)
replace_in_file(copy_path, "docs/source/md/", "")
replace_in_file(copy_path, "/LICENSE", "https://github.com/microsoft/InnerEye-DeepLearning/blob/main/LICENSE")
replace_in_file(copy_path, "docs/source/images/", "../images/")

View file

51
docs/source/index.rst Normal file
View file

@ -0,0 +1,51 @@
.. InnerEye documentation master file, created by
sphinx-quickstart on Sun Jun 28 18:04:34 2020.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
InnerEye-DeepLearning Documentation
===================================
.. toctree::
:maxdepth: 1
:caption: Overview and user guides
md/README.md
md/environment.md
md/WSL.md
md/hello_world_model.md
md/setting_up_aml.md
md/creating_dataset.md
md/building_models.md
md/sample_tasks.md
md/bring_your_own_model.md
md/debugging_and_monitoring.md
md/model_diagnostics.md
md/move_model.md
md/hippocampus_model.md
.. toctree::
:maxdepth: 1
:caption: Further reading for contributors
md/pull_requests.md
md/testing.md
md/contributing.md
md/deploy_on_aml.md
md/fastmri.md
md/innereye_as_submodule.md
md/releases.md
md/self_supervised_models.md
md/CHANGELOG.md
.. toctree::
:caption: API documentation (🚧 Work In Progress 🚧)
rst/api/index
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`

View file

@ -1,4 +1,4 @@
# How to use the Windows Subsystem for Linux (WSL2) for development
# Windows Subsystem for Linux (WSL2)
We are aware of two issues with running our toolbox on Windows:
@ -14,8 +14,8 @@ Subsystem for Linux (WSL2) or a plain Ubuntu Linux box.
If you are running a Windows box with a GPU, please follow the documentation
[here](https://docs.microsoft.com/en-us/windows/win32/direct3d12/gpu-cuda-in-wsl) to access the GPU from within WSL2.
You can also find a video walkthrough of WSL2+CUDA installation
here: https://channel9.msdn.com/Shows/Tabs-vs-Spaces/GPU-Accelerated-Machine-Learning-with-WSL-2
There is also a video walkthrough of WSL2+CUDA installation:
[GPU Accelerated Machine Learning with WSL 2](https://channel9.msdn.com/Shows/Tabs-vs-Spaces/GPU-Accelerated-Machine-Learning-with-WSL-2).
## Install WSL2
@ -27,7 +27,8 @@ To use the commandline setup, please first install
Optionally, restart your machine.
In PowerShell as Administrator type:
```
```shell
wsl --install
```
@ -38,7 +39,7 @@ installed, ensure that your distribution is running on top of WSL2 by executing
`wsl --list --verbose`
If all is good, the output should look like this:
```
```shell
$> wsl --list -v
NAME STATE VERSION
* Ubuntu-20.04 Running 2
@ -63,17 +64,17 @@ Start the Windows Terminal app, create an Ubuntu tab. In the shell, run the foll
- Create conda environment: `conda env create --file environment.yml`
- Clean your pyc files (in case you have some left from Windows):
```
```shell
find * -name '*.pyc' | xargs -d'\n' rm
```
## Configure PyCharm
- https://www.jetbrains.com/help/pycharm/using-wsl-as-a-remote-interpreter.html
- [Instructions for using WSL as a remote interpreter](https://www.jetbrains.com/help/pycharm/using-wsl-as-a-remote-interpreter.html)
- You might need to reset all your firewall settings to make the debugger work with PyCharm. This can be done with these
PowerShell commands (as Administrator):
```
```shell
$myIp = (Ubuntu2004 run "cat /etc/resolv.conf | grep nameserver | cut -d' ' -f2")
New-NetFirewallRule -DisplayName "WSL" -Direction Inbound -LocalAddress $myIp -Action Allow
```
@ -86,4 +87,4 @@ New-NetFirewallRule -DisplayName "WSL" -Direction Inbound -LocalAddress $myIp -
## Configure VSCode
- https://code.visualstudio.com/docs/remote/wsl
- [Instructions for configuring WSL in VSCode](https://code.visualstudio.com/docs/remote/wsl)

View file

@ -2,6 +2,7 @@
The InnerEye toolbox is capable of training any PyTorch Lighting (PL) model inside of AzureML, making
use of all the usual InnerEye toolbox features:
- Working with different model in the same codebase, and selecting one by name
- Distributed training in AzureML
- Logging via AzureML's native capabilities
@ -9,13 +10,14 @@ use of all the usual InnerEye toolbox features:
- Supply commandline overrides for model configuration elements, to quickly queue many jobs
This can be used by
- Defining a special container class, that encapsulates the PyTorch Lighting model to train, and the data that should
be used for training and testing.
- Adding essential trainer parameters like number of epochs to that container.
- Invoking the InnerEye runner and providing the name of the container class, like this:
`python InnerEye/ML/runner.py --model=MyContainer`. To train in AzureML, just add a `--azureml` flag.
There is a fully working example [HelloContainer](../InnerEye/ML/configs/other/HelloContainer.py), that implements
There is a fully working example [HelloContainer](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/configs/other/HelloContainer.py), that implements
a simple 1-dimensional regression model from data stored in a CSV file. You can run that
from the command line by `python InnerEye/ML/runner.py --model=HelloContainer`.
@ -23,6 +25,7 @@ from the command line by `python InnerEye/ML/runner.py --model=HelloContainer`.
In order to use these capabilities, you need to implement a class deriving from `LightningContainer`. This class
encapsulates everything that is needed for training with PyTorch Lightning:
- The `create_model` method needs to return a subclass of `LightningModule`, that has
all the usual PyTorch Lightning methods required for training, like the `training_step` and `forward` methods. This
object needs to adhere to additional constraints, see below.
@ -42,13 +45,15 @@ model configuration classes reside in folder `My/Own/configs` from the repositor
If you are doing cross validation you need to ensure that the `LightningDataModule` returned by your container's
`get_data_module` method:
- Needs to take into account the number of cross validation splits, and the cross validation split index when
preparing the data.
- Needs to log val/Loss in its `validation_step` method.
You can find a working example of handling cross validation in the
[HelloContainer](../InnerEye/ML/configs/other/HelloContainer.py) class.
[HelloContainer](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/configs/other/HelloContainer.py) class.
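For orientation, here is a minimal sketch of a cross-validation-aware data module. The field names `number_of_cross_validation_splits` and `cross_validation_split_index` are used illustratively to mirror the description above; the HelloContainer class linked above is the authoritative example. The `val/Loss` metric mentioned above is logged from the model's `validation_step`, e.g. `self.log("val/Loss", loss)`.
```python
import numpy as np
from pytorch_lightning import LightningDataModule
from torch.utils.data import DataLoader, Dataset, Subset


class MyCrossValDataModule(LightningDataModule):
    """Illustrative data module that splits one dataset according to the cross validation settings."""

    def __init__(self, dataset: Dataset, number_of_cross_validation_splits: int = 1,
                 cross_validation_split_index: int = 0, batch_size: int = 16) -> None:
        super().__init__()
        self.dataset = dataset
        self.splits = number_of_cross_validation_splits
        self.split_index = cross_validation_split_index
        self.batch_size = batch_size

    def setup(self, stage=None) -> None:
        indices = np.arange(len(self.dataset))
        if self.splits > 1:
            # Deterministically assign each sample to a fold, and hold out the current fold for validation.
            fold_of_sample = indices % self.splits
            self.val_indices = indices[fold_of_sample == self.split_index]
            self.train_indices = indices[fold_of_sample != self.split_index]
        else:
            # No cross validation: fall back to a simple 90/10 train/validation split.
            cut = int(0.9 * len(indices))
            self.train_indices, self.val_indices = indices[:cut], indices[cut:]

    def train_dataloader(self) -> DataLoader:
        return DataLoader(Subset(self.dataset, self.train_indices.tolist()), batch_size=self.batch_size)

    def val_dataloader(self) -> DataLoader:
        return DataLoader(Subset(self.dataset, self.val_indices.tolist()), batch_size=self.batch_size)
```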
*Example*:
```python
from pathlib import Path
from torch.utils.data import DataLoader
@ -99,6 +104,7 @@ class MyContainer(LightningContainer):
```
Where does the data for training come from?
- When training a model on a local box or VM, the data is read from the `local_dataset` folder that you define in the
container.
- When training a model in AzureML, the code searches for a folder called `folder_name_in_azure_blob_storage` in
@ -112,6 +118,7 @@ via PyTorch Lightning's [built-in test functionality](https://pytorch-lightning.
See below for an alternative way of running the evaluation on the test set.
### Data loaders
The example above creates `DataLoader` objects from a dataset. When creating those, you need to specify a batch size
(how many samples from your dataset will go into one minibatch), and a number of worker processes. Note that, by
default, data loading will happen in the main process, meaning that your GPU will sit idle while the CPU reads data
@ -133,11 +140,13 @@ to the current working directory are later uploaded to Azure blob storage at the
will also be later available via the AzureML UI.
### Trainer arguments
All arguments that control the PyTorch Lightning `Trainer` object are defined in the class `TrainerParams`. A
`LightningContainer` object inherits from this class. The most essential one is the `num_epochs` field, which controls
the `max_epochs` argument of the `Trainer`.
Usage example:
```python
from pytorch_lightning import LightningModule, LightningDataModule
from InnerEye.ML.lightning_container import LightningContainer
@ -154,10 +163,12 @@ class MyContainer(LightningContainer):
```
For further details how the `TrainerParams` are used, refer to the `create_lightning_trainer` method in
[InnerEye/ML/model_training.py](../InnerEye/ML/model_training.py)
[InnerEye/ML/model_training.py](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/model_training.py)
### Optimizer and LR scheduler arguments
There are two possible ways of choosing the optimizer and LR scheduler:
- The Lightning model returned by `create_model` can define its own `configure_optimizers` method, with the same
signature as `LightningModule.configure_optimizers`. This is the typical way of configuring it for Lightning models (see the sketch below).
- Alternatively, the model can inherit from `LightningModuleWithOptimizer`. This class implements a
@ -166,6 +177,7 @@ available from the command line, and you can, for example, start a new run with
supplying the additional commandline flag `--l_rate=1e-2`.
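As a sketch of the first of these two options (class name and hyperparameter values below are only an example), the Lightning model can define `configure_optimizers` directly:
```python
import torch
from pytorch_lightning import LightningModule


class MyLightningModel(LightningModule):
    def __init__(self, learning_rate: float = 1e-3) -> None:
        super().__init__()
        self.layer = torch.nn.Linear(1, 1)
        self.learning_rate = learning_rate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layer(x)

    def training_step(self, batch, batch_idx):
        x, y = batch
        return torch.nn.functional.mse_loss(self(x), y)

    def configure_optimizers(self):
        # Standard PyTorch Lightning hook: return the optimizer (and optionally an LR scheduler).
        optimizer = torch.optim.Adam(self.parameters(), lr=self.learning_rate)
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
        return [optimizer], [scheduler]
```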
### Evaluating the trained model
The InnerEye toolbox provides two possible routes of implementing that:
You can either use PyTorch Lightning's built-in capabilities, via the `test_step` method. If the model that is
@ -176,7 +188,8 @@ by the `test_dataloader` of the `LightningDataModule` that is used for training/
Alternatively, the model can implement the methods defined in `InnerEyeInference`. In this case, the methods will be
call in this order:
```
```python
model.on_inference_start()
for dataset_split in [Train, Val, Test]
model.on_inference_epoch_start(dataset_split, is_ensemble_model=False)
@ -190,6 +203,7 @@ model.on_inference_end()
## Overriding properties on the commandline
You can define hyperparameters that affect data and/or model, as in the following code snippet:
```python
import param
from pytorch_lightning import LightningModule
@ -201,12 +215,14 @@ class DummyContainerWithParameters(LightningContainer):
return MyLightningModel(self.num_layers)
...
```
All parameters added in this form will be automatically accessible from the commandline, there is no need to define
a separate argument parser: When starting training, you can add a flag like `--num_layers=7`.
## Examples
### Setting only the required fields
```python
from pytorch_lightning import LightningModule, LightningDataModule
from InnerEye.ML.lightning_container import LightningContainer

View file

@ -36,7 +36,7 @@ if __name__ == '__main__':
## Creating the model configuration
You will find a variety of model configurations [here](/InnerEye/ML/configs/segmentation). Those not ending
You will find a variety of model configurations [here](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/configs/segmentation). Those not ending
in `Base.py` reference open-sourced data and can be used as they are. Those ending in `Base.py`
are partially specified, and can be used by having other model configurations inherit from them and supply the missing
parameter values: a dataset ID at least, and optionally other values. For example, a `Prostate` model might inherit
@ -54,7 +54,7 @@ class Prostate(ProstateBase):
azure_dataset_id="name-of-your-AML-dataset-with-prostate-data")
```
The allowed parameters and their meanings are defined in [`SegmentationModelBase`](/InnerEye/ML/config.py).
The allowed parameters and their meanings are defined in [`SegmentationModelBase`](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/config.py).
The class name must be the same as the basename of the file containing it, so `Prostate.py` must contain `Prostate`.
In `settings.yml`, set `model_configs_namespace` to `InnerEyeLocal.ML.configs` so this config
is found by the runner.
@ -253,7 +253,7 @@ and the generated posteriors are passed to the usual model testing downstream pi
### Interpreting results
Once your HyperDrive AzureML runs are completed, you can visualize the results by running the
[`plot_cross_validation.py`](/InnerEye/ML/visualizers/plot_cross_validation.py) script locally:
[`plot_cross_validation.py`](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/visualizers/plot_cross_validation.py) script locally:
```shell
python InnerEye/ML/visualizers/plot_cross_validation.py --run_recovery_id ... --epoch ...
@ -266,8 +266,8 @@ find them in the production portal, and run statistical tests to compute the sig
across the splits and with respect to other runs that you specify. This is done for you during
the run itself (see below), but you can use the script post hoc to compare arbitrary runs
with each other. Details of the tests can be found
in [`wilcoxon_signed_rank_test.py`](/InnerEye/Common/Statistics/wilcoxon_signed_rank_test.py)
and [`mann_whitney_test.py`](/InnerEye/Common/Statistics/mann_whitney_test.py).
in [`wilcoxon_signed_rank_test.py`](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/Common/Statistics/wilcoxon_signed_rank_test.py)
and [`mann_whitney_test.py`](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/Common/Statistics/mann_whitney_test.py).
## Where are my outputs and models?
@ -314,7 +314,7 @@ the `metrics.csv` files of the current run and the comparison run(s).
between the current run and any specified baselines (earlier runs) to compare with. Each paragraph of that file compares two models and
indicates, for each structure, when the Dice scores for the second model are significantly better
or worse than the first. For full details, see the
[source code](../InnerEye/Common/Statistics/wilcoxon_signed_rank_test.py).
[source code](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/Common/Statistics/wilcoxon_signed_rank_test.py).
* A directory `scatterplots`, containing a `png` file for every pairing of the current model
with one of the baselines. Each one is named `AAA_vs_BBB.png`, where `AAA` and `BBB` are the run IDs
of the two models. Each plot shows the Dice scores on the test set for the models.

View file

View file

@ -202,7 +202,7 @@ The parameters `subject_column`, `channel_column`, `image_file_column` and `labe
what columns in the csv contain the subject identifiers, channel names, image file paths and labels.
NOTE: If any of the `*_column` parameters are not specified, InnerEye will look for these entries under the default column names
if default names exist. See the CSV headers in [csv_util.py](/InnerEye/ML/utils/csv_util.py) for all the defaults.
if default names exist. See the CSV headers in [csv_util.py](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/utils/csv_util.py) for all the defaults.
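For illustration only, the snippet below writes a tiny dataset CSV with explicitly named columns; the column names and file paths are invented for this sketch, and the actual default headers are the ones defined in `csv_util.py`.
```python
import pandas as pd

# Invented column names and paths, matching the *_column overrides listed in the comment below.
df = pd.DataFrame({
    "subjectID": [1, 1, 2, 2],
    "channelName": ["image", "label", "image", "label"],
    "imagePath": ["subj1/scan.nii.gz", "", "subj2/scan.nii.gz", ""],
    "outcome": ["", "True", "", "False"],
})
df.to_csv("dataset.csv", index=False)

# A model config would then point at these columns, for example:
#   subject_column="subjectID", channel_column="channelName",
#   image_file_column="imagePath", label_column="outcome"
```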
### Using channels in dataset.csv

View file

@ -1,8 +1,8 @@
# Debugging and Monitoring Jobs
# Debugging and Monitoring
### Using TensorBoard to monitor AzureML jobs
## Using TensorBoard to monitor AzureML jobs
* **Existing jobs**: execute [`InnerEye/Azure/tensorboard_monitor.py`](/InnerEye/Azure/tensorboard_monitor.py)
* **Existing jobs**: execute [`InnerEye/Azure/tensorboard_monitor.py`](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/Azure/tensorboard_monitor.py)
with either an experiment id `--experiment_name` or a list of run ids `--run_ids job1,job2,job3`.
If an experiment id is provided then all of the runs in that experiment will be monitored. Additionally, you can
filter runs by the run's status, setting the `--filters Running,Completed` parameter to a subset of
@ -15,25 +15,27 @@ arguments with your jobs to monitor.
* **New jobs**: when queuing a new AzureML job, pass `--tensorboard`, which will automatically start a new TensorBoard
session, monitoring the newly queued job.
### Resource Monitor
## Resource Monitor
GPU and CPU usage can be monitored throughout the execution of a run (local and AML) by setting the monitoring interval
for the resource monitor eg: `--monitoring_interval_seconds=5`. This will spawn a separate process at the start of the
run which will log both GPU and CPU utilization and memory consumption. These metrics will be written to AzureML as
well as a separate TensorBoard logs file under `Diagnostics`.
### Debugging setup on local machine
## Debugging setup on local machine
For full debugging of any non-trivial model, you will need a GPU. Some basic debugging can also be carried out on
standard Linux or Windows machines.
The main entry point into the code is [`InnerEye/ML/runner.py`](/InnerEye/ML/runner.py). The code takes its
The main entry point into the code is [`InnerEye/ML/runner.py`](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/runner.py). The code takes its
configuration elements from commandline arguments and a settings file,
[`InnerEye/settings.yml`](/InnerEye/settings.yml).
[`InnerEye/settings.yml`](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/settings.yml).
A password for the (optional) Azure Service
Principal is read from `InnerEyeTestVariables.txt` in the repository root directory. The file
is expected to contain a line of the form
```
```text
APPLICATION_KEY=<app key for your AML workspace>
```
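Purely as an illustration of that file format (the toolbox has its own secrets handling, so this is not its actual reader), a `KEY=value` line like the above could be parsed as follows:
```python
from pathlib import Path


def read_variable(name: str, variables_file: Path = Path("InnerEyeTestVariables.txt")) -> str:
    # Return the value of the first line of the form NAME=value in the given file.
    for line in variables_file.read_text().splitlines():
        if line.startswith(f"{name}="):
            return line.split("=", 1)[1].strip()
    raise KeyError(f"{name} not found in {variables_file}")


application_key = read_variable("APPLICATION_KEY")
```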
@ -44,10 +46,11 @@ To quickly access both runner scripts for local debugging, we created template P
"Template: Azure runner" and "Template: ML runner". If you want to execute the runners on your machine, then
create a copy of the template run configuration, and change the arguments to suit your needs.
### Shorten training run time for debugging
## Shorten training run time for debugging
Here are a few hints how you can reduce the complexity of training if you need to debug an issue. In most cases,
you should then be able to rely on a CPU machine.
* Reduce the number of feature channels in your model. If you run a UNet, for example, you can set
`feature_channels = [1]` in your model definition file.
* Train only for a single epoch. You can set `--num_epochs=1` via the commandline or the `more_switches` variable
@ -59,15 +62,13 @@ and test images, you can provide a comma-separated list, e.g. `--restrict_subjec
With the above settings, you should be able to get a model training run to complete on a CPU machine in a few minutes.
### Verify your changes using a simplified fast model
## Verify your changes using a simplified fast model
If you made any changes to the code that submits experiments (either `azure_runner.py` or `runner.py` or code
imported by those), validate them using a model training run in Azure. You can queue a model training run for the
simplified `BasicModel2Epochs` model.
# Debugging on an AzureML node
## Debugging on an AzureML node
It is sometimes possible to get a Python debugging (pdb) session on the main process for a model
training run on an AzureML compute cluster, for example if a run produces unexpected output,
@ -82,21 +83,27 @@ after the cluster is created. The steps are as follows.
supply the password chosen when the cluster was created.
* Type "bash" for a nicer command shell (optional).
* Identify the main python process with a command such as
```shell
ps aux | grep 'python.*runner.py' | egrep -wv 'bash|grep'
```
You may need to vary this if it does not yield exactly one line of output.
* Note the process identifier (the value in the PID column, generally the second one).
* Issue the commands
```shell
kill -TRAP nnnn
nc 127.0.0.1 4444
```
where `nnnn` is the process identifier. If the python process is in a state where it can
accept the connection, the "nc" command will print a prompt from which you can issue pdb
commands.
Notes:
* The last step (kill and nc) can be successfully issued at most once for a given process.
Thus if you might want a colleague to carry out the debugging, think carefully before
issuing these commands yourself.

View file

@ -1,8 +1,9 @@
# Model Deployment
![deployment.png](deployment.png)
![deployment.png](../images/deployment.png)
InnerEye segmentation models using a single DICOM series as input and producing DICOM-RT can be integrated with DICOM networks using:
- [InnerEye-Gateway](https://github.com/microsoft/InnerEye-Gateway): a Windows service that provides DICOM AETs to run InnerEye-DeepLearning models
- [InnerEye-Inference](https://github.com/microsoft/InnerEye-Inference): a REST API for the InnerEye-Gateway to run inference on InnerEye-DeepLearning models
@ -26,7 +27,8 @@ you will see an entry for "Registered models", that will take you to the model t
In AzureML, navigate to the "Models" section, then locate the model that has just been registered. In the "Artifacts"
tab, you can inspect the files that have been registered. This will have a structure like this:
```
```text
final_model/
├──score.py
├──environment.yml
@ -55,8 +57,10 @@ have a second folder with code that you would like to deploy alongside the Inner
the model comes out of an ensemble training run.
## Ensemble models
Ensemble models built from different cross validation runs will be registered with the same file structure. The only
differences are
- The top-level folder is called `final_ensemble_model`.
- There will be more checkpoints stored in the model itself, one checkpoint for each cross validation fold.
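If you prefer to inspect those registered files locally rather than in the AzureML UI, the model can be downloaded with the AzureML SDK. A minimal sketch, where the workspace configuration and the registered model name are assumptions you need to replace with your own:
```python
from azureml.core import Model, Workspace

# Assumes a config.json for your AzureML workspace is available locally.
ws = Workspace.from_config()
# Replace "MyModel" with the name under which your model was registered.
model = Model(workspace=ws, name="MyModel")
local_dir = model.download(target_dir="downloaded_model", exist_ok=True)
print(f"Model files downloaded to {local_dir}")
```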

View file

@ -1,10 +1,10 @@
# Set up InnerEye-DeepLearning
# Setup
## Operating System
We recommend using our toolbox with [Ubuntu 20.04 LTS](https://releases.ubuntu.com/20.04/). Most core InnerEye functionality will be stable on other operating systems, but PyTorch's full feature set is only available on Linux. All jobs in AzureML, both training and inference, run from an Ubuntu 20.04 Docker image. This means that using Ubuntu 20.04 locally allows for maximum reproducibility between your local and AzureML environments.
For Windows users, Ubuntu can be set up with [Windows Subsystem for Linux (WSL)](https://docs.microsoft.com/en-us/windows/wsl/install). Please refer to the [InneryEye WSL docs](docs/WSL.md) for more detailed instructions on getting WSL set up.
For Windows users, Ubuntu can be set up with [Windows Subsystem for Linux (WSL)](https://docs.microsoft.com/en-us/windows/wsl/install). Please refer to the [InnerEye WSL docs](WSL.md) for more detailed instructions on getting WSL set up.
MacOS users can access an Ubuntu OS through [VirtualBox](https://www.virtualbox.org/wiki/Downloads).

View file

@ -9,21 +9,26 @@ a single GPU box. With the help of the InnerEye toolbox and distributed training
can be reduced dramatically.
In order to work with the challenge data in Azure, you will need to
- Register for the challenge
- Have the InnerEye toolbox set up with Azure as described [here](setting_up_aml.md)
- Download and prepare the challenge data, or use the script that we provide here to bulk download directly from
AWS into Azure blob storage.
## Registering for the challenge
In order to download the dataset, you need to register [here](https://fastmri.org/dataset/).
You will shortly receive an email with links to the dataset. In that email, there are two sections containing
scripts to download the data, like this:
```
To download Knee MRI files, we recommend using curl with recovery mode turned on:
```shell
curl -C "https://....amazonaws.com/knee_singlecoil_train.tar.gz?AWSAccessKeyId=...Expires=1610309839" --output knee_singlecoil_train.tar.gz"
...
```
There are two sections of that kind, one for the knee data and one for the brain data. Copy and paste *all* the lines
with `curl` commands into a text file, for example called `curl.txt`. In total, there should be 10 lines with `curl`
commands for the knee data, and 7 for the brain data (including the SHA256 file).
@ -32,6 +37,7 @@ commands for the knee data, and 7 for the brain data (including the SHA256 file)
We are providing a script that will bulk download all files in the FastMRI dataset from AWS to Azure blob storage.
To start that script, you need
- The file that contains all the `curl` commands to download the data (see above). The downloading script will
extract all the AWS access tokens from the `curl` commands.
- The connection string to the Azure storage account that stores your dataset.
@ -48,6 +54,7 @@ and the connection string as commandline arguments, enclosed in quotes:
`python InnerEye/Scripts/prepare_fastmri.py --curl curl.txt --connection_string "<your_connection_string>"` --location westeurope
This script will
- Authenticate against Azure either using the Service Principal credentials that you set up in Step 3 of the
[AzureML setup](setting_up_aml.md), or your own credentials. To use the latter, you need to be logged in via the Azure
command line interface (CLI), available [here](https://docs.microsoft.com/en-us/cli/azure/) for all platforms.
@ -61,6 +68,7 @@ Alternatively, find the Data Factory "fastmri-copy-data" in your Azure portal, a
drill down into all running pipelines.
Once the script is complete, you will have the following datasets in Azure blob storage:
- `knee_singlecoil`, `knee_multicoil`, and `brain_multicoil` with all files unpacked
- `knee_singlecoil_compressed`, `knee_multicoil_compressed`, and `brain_multicoil_compressed` with the `.tar` and
`.tar.gz` files as downloaded. NOTE: The raw challenge data files all have a `.tar.gz` extension, even though some
@ -69,13 +77,12 @@ with their corrected extension.
- The DICOM files are stored in the folders `knee_DICOMs` and `brain_DICOMs` (uncompressed) and
`knee_DICOMs_compressed` and `brain_DICOMs_compressed` (as `.tar` files)
### Troubleshooting the data downloading
If you see a runtime error saying "The subscription is not registered to use namespace 'Microsoft.DataFactory'", then
follow the steps described [here](https://stackoverflow.com/a/48419951/5979993), to enable DataFactory for your
subscription.
## Running a FastMri model with InnerEye
The Azure Data Factory that downloaded the data has put it into the storage account you supplied on the commandline.
@ -84,12 +91,14 @@ Hence, after the downloading completes, you are ready to use the InnerEye toolbo
the FastMRI data.
There are 2 example models already coded up in the InnerEye toolbox, defined in
[fastmri_varnet.py](../InnerEye/ML/configs/other/fastmri_varnet.py): `KneeMulticoil` and
[fastmri_varnet.py](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/configs/other/fastmri_varnet.py): `KneeMulticoil` and
`BrainMulticoil`. As with all InnerEye models, you can start a training run by specifying the name of the class
that defines the model, like this:
```shell
python InnerEye/ML/runner.py --model KneeMulticoil --azureml --num_nodes=4
```
This will start an AzureML job with 4 nodes training at the same time. Depending on how you set up your compute
cluster, this will use a different number of GPUs: For example, if your cluster uses ND24 virtual machines, where
each VM has 4 Tesla P40 cards, training will use a total of 16 GPUs.
@ -114,7 +123,6 @@ For that, just add the `--use_dataset_mount` flag to the commandline. This may i
the storage account cannot provide the data quick enough - however, we have not observed a drop in GPU utilization even
when training on 8 nodes in parallel. For more details around dataset mounting please refer to the next section.
## Performance considerations for BrainMulticoil
Training a FastMri model on the `brain_multicoil` dataset is particularly challenging because the dataset is larger.
@ -123,39 +131,45 @@ Downloading the dataset can - depending on the types of nodes - already make the
The InnerEye toolbox has a way of working around that problem, by reading the dataset on-the-fly from the network,
rather than downloading it at the start of the job. You can trigger this behaviour by supplying an additional
commandline argument `--use_dataset_mount`, for example:
```shell
python InnerEye/ML/runner.py --model BrainMulticoil --azureml --num_nodes=4 --use_dataset_mount
```
With this flag, the InnerEye training script will start immediately, without downloading data beforehand.
However, the fastMRI data module generates a cache file before training, and to build that, it needs to traverse the
full dataset. This will lead to a long (1-2 hours) startup time before starting the first epoch, while it is
creating this cache file. This can be avoided by copying the cache file from a previous run into the dataset folder.
More specifically, you need to follow these steps:
* Start a training job, training for only 1 epoch, like
- Start a training job, training for only 1 epoch, like
```shell
python InnerEye/ML/runner.py --model BrainMulticoil --azureml --use_dataset_mount --num_epochs=1
```
* Wait until the job starts has finished creating the cache file - the job will print out a message
- Wait until the job has finished creating the cache file - the job will print out a message
"Saving dataset cache to dataset_cache.pkl", visible in the log file `azureml-logs/70_driver_log.txt`, about 1-2 hours
after start. At that point, you can cancel the job.
* In the "Outputs + logs" section of the AzureML job, you will now see a file `outputs/dataset_cache.pkl` that has
- In the "Outputs + logs" section of the AzureML job, you will now see a file `outputs/dataset_cache.pkl` that has
been produced by the job. Download that file.
* Upload the file `dataset_cache.pkl` to the storage account that holds the fastMRI datasets, in the `brain_multicoil`
- Upload the file `dataset_cache.pkl` to the storage account that holds the fastMRI datasets, in the `brain_multicoil`
folder that was previously created by the Azure Data Factory. You can do that via the Azure Portal or Azure Storage
Explorer. Via the Azure Portal, you can search for the storage account that holds your data, then select
"Data storage: Containers" in the left hand navigation. You should see a folder named `datasets`, and inside of that
`brain_multicoil`. Once in that folder, press the "Upload" button at the top and select the `dataset_cache.pkl` file.
* Start the training job again, this time you can start multi-node training right away, like this:
- Start the training job again, this time you can start multi-node training right away, like this:
```shell
python InnerEye/ML/runner.py --model BrainMulticoil --azureml --use_dataset_mount --num_nodes=8
```
This job should pick up the existing cache file, and output a message like "Copying a pre-computed dataset cache
file ..."
The same trick can of course be applied to other models as well (`KneeMulticoil`).
# Running on a GPU machine
## Running on a GPU machine
You can of course run the InnerEye fastMRI models on a reasonably large machine with a GPU for development and
debugging purposes. Before running, we recommend to download the datasets using a tool
@ -163,30 +177,36 @@ like [azcopy](http://aka.ms/azcopy) into a folder, for example the `datasets` fo
To use `azcopy`, you will need the access key to the storage account that holds your data - it's the same storage
account that was used when creating the Data Factory that downloaded the data.
- To get that, navigate to the [Azure Portal](https://portal.azure.com), and search for the storage account
that you created to hold your datasets (Step 4 in [AzureML setup](setting_up_aml.md)).
- On the left hand navigation, there is a section "Access Keys". Select that and copy out one of the two keys (_not_
- On the left hand navigation, there is a section "Access Keys". Select that and copy out one of the two keys (*not*
the connection strings). The key is a base64 encoded string, it should not contain any special characters apart from
`+`, `/`, `.` and `=`
Then run this script in the repository root folder:
```shell
mkdir datasets
azcopy --source-key <storage_account_key> --source https://<your_storage_acount>.blob.core.windows.net/datasets/brain_multicoil --destination datasets/brain_multicoil --recursive
```
Replace `brain_multicoil` with any of the other datasets names if needed.
If you follow these suggested folder structures, there is no further change necessary to the models. You can then
run, for example, the `BrainMulticoil` model by dropping the `--azureml` flag like this:
```shell
python InnerEye/ML/runner.py --model BrainMulticoil
```
The code will recognize that an Azure dataset named `brain_multicoil` is already present in the `datasets` folder,
and skip the download.
If you choose to download the dataset to a different folder, for example `/foo/brain_multicoil`, you will need to
make a small adjustment to the model in [fastmri_varnet.py](../InnerEye/ML/configs/other/fastmri_varnet.py),
make a small adjustment to the model in [fastmri_varnet.py](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/configs/other/fastmri_varnet.py),
and add the `local_dataset` argument like this:
```python
class BrainMulticoil(FastMri):
def __init__(self) -> None:

View file

@ -1,6 +1,6 @@
# Training a Hello World segmentation model
In the configs folder, you will find a config file called [HelloWorld.py](../InnerEye/ML/configs/segmentation/HelloWorld.py)
In the configs folder, you will find a config file called [HelloWorld.py](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/configs/segmentation/HelloWorld.py)
We have created this file to demonstrate how to:
1. Subclass SegmentationModelBase which is the base config for all segmentation model configs

View file

@ -1,4 +1,3 @@
# Trained model for hippocampal segmentation
## Purpose
@ -13,11 +12,11 @@ Please note that this model is intended for research purposes only. You are resp
## Usage
The following instructions assume you have completed the preceding setup steps in the [InnerEye README](https://github.com/microsoft/InnerEye-DeepLearning/), in particular, [Setting up Azure Machine Learning](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/docs/setting_up_aml.md).
The following instructions assume you have completed the preceding setup steps in the [InnerEye README](https://github.com/microsoft/InnerEye-DeepLearning/), in particular, [Setting up Azure Machine Learning](setting_up_aml.md).
### Create an Azure ML Dataset
To evaluate this model on your own data, you will first need to register an [Azure ML Dataset](https://docs.microsoft.com/en-us/azure/machine-learning/v1/how-to-create-register-datasets). You can follow the instructions in the InnerEye repo for [creating datasets](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/docs/creating_dataset.md) in order to do this.
To evaluate this model on your own data, you will first need to register an [Azure ML Dataset](https://docs.microsoft.com/en-us/azure/machine-learning/v1/how-to-create-register-datasets). You can follow the instructions for [creating datasets](creating_dataset.md) in order to do this.
## Downloading the model
@ -25,7 +24,7 @@ The saved weights from the trained Hippocampus model can be downloaded along wit
### Registering a model in Azure ML
To evaluate the model in Azure ML, you must first [register an Azure ML Model](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#remarks). To register the Hippocampus model in your AML Workspace, unpack the source code downloaded in the previous step and follow InnerEye's [instructions to upload models to Azure ML](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/docs/move_model.md).
To evaluate the model in Azure ML, you must first [register an Azure ML Model](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#remarks). To register the Hippocampus model in your AML Workspace, unpack the source code downloaded in the previous step and follow InnerEye's [instructions to upload models to Azure ML](move_model.md).
Run the following from a folder that contains both the `ENVIRONMENT/` and `MODEL/` folders (these exist inside the downloaded model files):
@ -45,7 +44,7 @@ python InnerEye/Scripts/move_model.py \
### Evaluating the model
You can evaluate the model either in Azure ML or locally using the downloaded checkpoint files. These 2 scenarios are described in more detail, along with instructions in [testing an existing model](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/docs/building_models.md#testing-an-existing-model).
You can evaluate the model either in Azure ML or locally using the downloaded checkpoint files. These 2 scenarios are described in more detail, along with instructions in [testing an existing model](building_models.md#testing-an-existing-model).
For example, to evaluate the model on your Dataset in Azure ML, run the following from within the directory `*/MODEL/final_ensemble_model/`
@ -74,9 +73,9 @@ To deploy this model, see the instructions in the [InnerEye README](https://gith
---
# Hippocampal Segmentation Model Card
## Hippocampal Segmentation Model Card
## Model details
### Model details
- Organisation: Biomedical Imaging Team at Microsoft Research, Cambridge UK.
- Model date: 5th July 2022.
@ -86,33 +85,33 @@ To deploy this model, see the instructions in the [InnerEye README](https://gith
- License: The model is released under MIT license as described [here](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/LICENSE).
- Contact: innereyeinfo@microsoft.com.
## Limitations
### Limitations
This model has been trained on a subset of the ADNI dataset. There have been various phases of ADNI spanning different time periods; in this Model Card we refer to the original, or ADNI 1, study. This dataset comprises scans and metadata from patients between the ages of 55-90 from 57 different sites across the US and Canada [source](https://adni.loni.usc.edu/study-design/#background-container). A major limitation of this model is therefore its ability to generalise to patients outside of this demographic. Another limitation is that the MRI protocol for ADNI 1 (which was collected between 2004-2009) focused on imaging on 1.5T scanners [source](https://adni.loni.usc.edu/methods/mri-tool/mri-analysis/). Modern scanners may have higher field strengths, and therefore different levels of contrast, which could lead to performance that differs from the results we report.
The results of this model have not been validated by clinical experts. We expect the user to evaluate the result
## Intended Uses
### Intended Uses
This model is for research purposes only. It is intended to be used for the task of segmenting hippocampi from brain MRI scans. Any other task is out of scope for this model.
## About the data
### About the data
The model was trained on 998 pairs of MRI + segmentation. It was further validated on 127 pairs of images and tested on 125 pairs. A further 317 pairs were retained as a held-out test set for the final evaluation of the model, which is what we report performance on.
All of this data comes from the Alzheimer's Disease Neuroimaging Initiative study [link to website](https://adni.loni.usc.edu/). The data is publicly available, but requires signing a Data Use Agreement before access is granted.
## About the ground-truth segmentations
### About the ground-truth segmentations
The segmentations were also downloaded from the ADNI dataset. They were created semi-automatically using software from [Medtronic Surgical Navigation Technologies](https://www.medtronic.com/us-en/healthcare-professionals/products/neurological/surgical-navigation-systems.html). Further information is available on the [ADNI website](https://adni.loni.usc.edu/).
## Metrics
### Metrics
Note that due to the ADNI Data Use Agreement we are only able to share aggregate-level metrics from our evaluation. Evaluation is performed on a held-out test set of 252 MRI + segmentation pairs from the ADNI dataset.
Dice score for the Left and Right hippocampus respectively on the held-out test set:
![hippocampus_metrics_boxplot.png](hippocampus_metrics_boxplot.png)
![hippocampus_metrics_boxplot.png](../images/hippocampus_metrics_boxplot.png)
| Structure | count | DiceNumeric_mean | DiceNumeric_std | DiceNumeric_min | DiceNumeric_max | HausdorffDistance_mm_mean | HausdorffDistance_mm_std | HausdorffDistance_mm_min | HausdorffDistance_mm_max | MeanDistance_mm_mean | MeanDistance_mm_std | MeanDistance_mm_min | MeanDistance_mm_max |
|---------------|---------|------------------|-----------------|-----------------|-----------------|---------------------------|--------------------------|--------------------------|--------------------------|----------------------|---------------------|---------------------|---------------------|
| hippocampus_L | 252 | 0.918 | 0.022 | 0.819 | 0.953 | 2.206 | 0.812 | 1.206 | 6.964 | 0.168 | 0.054 | 0.096 | 0.399 |


@ -3,6 +3,7 @@
You can use InnerEye as a submodule in your own project.
If you go down that route, here's the list of files you will need in your project (that's the same as those
given in [this document](building_models.md))
* `environment.yml`: Conda environment with python, pip, pytorch
* `settings.yml`: A file similar to `InnerEye\settings.yml` containing all your Azure settings
* A folder like `ML` that contains your additional code, and model configurations.
@ -11,14 +12,17 @@ and Azure settings; see the [Building models](building_models.md) instructions f
`myrunner.py` should look like.
You then need to add the InnerEye code as a git submodule, in folder `innereye-deeplearning`:
```shell
git submodule add https://github.com/microsoft/InnerEye-DeepLearning innereye-deeplearning
```
Then configure your Python IDE to consume *both* your repository root *and* the `innereye-deeplearning` subfolder as inputs.
In Pycharm, you would do that by going to Settings/Project Structure. Mark your repository root as "Source", and
`innereye-deeplearning` as well.
Example commandline runner that uses the InnerEye runner (called `myrunner.py` above):
```python
import sys
from pathlib import Path
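# The remainder of the example runner is elided by the following diff hunk. The lines below
# are a hedged sketch of the idea only, not the verbatim file: make the "innereye-deeplearning"
# submodule importable, so that the (elided) code further down can import and call the
# InnerEye runner.
innereye_submodule = Path(__file__).absolute().parent / "innereye-deeplearning"
if str(innereye_submodule) not in sys.path:
    sys.path.append(str(innereye_submodule))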
@ -73,9 +77,11 @@ if __name__ == '__main__':
The example below creates a new flavour of the Glaucoma model in `InnerEye/ML/configs/classification/GlaucomaPublic`.
All that needs to be done is change the dataset. We will do this by subclassing GlaucomaPublic in a new config
stored in `InnerEyeLocal/configs`.
1. Create folder `InnerEyeLocal/configs`
1. Create a config file `InnerEyeLocal/configs/GlaucomaPublicExt.py` which extends the `GlaucomaPublic` class
like this:
```python
from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic
@ -84,12 +90,16 @@ class MyGlaucomaModel(GlaucomaPublic):
super().__init__()
self.azure_dataset_id="name_of_your_dataset_on_azure"
```
1. In `settings.yml`, set `model_configs_namespace` to `InnerEyeLocal.configs` so this config
is found by the runner. Set `extra_code_directory` to `InnerEyeLocal`.
#### Start Training
### Start Training
Run the following to start a job on AzureML:
```
```shell
python myrunner.py --azureml --model=MyGlaucomaModel
```
See [Model Training](building_models.md) for details on training outputs, resuming training, testing models and model ensembles.


@ -22,4 +22,4 @@ scan. Dark red indicates voxels that are sampled very often, transparent red ind
infrequently.
Example thumbnail when viewed in the AzureML UI:
![](screenshot_azureml_patch_sampling.png)
![Azure ML patch sampling screenshot](../images/screenshot_azureml_patch_sampling.png)


@ -1,4 +1,4 @@
## Suggested Workflow for Pull Requests
# Suggested Workflow for Pull Requests
* Pull Requests (PRs) should implement one change, and hence be small. If in doubt, err on the side of making the PR
too small, rather than too big. It reduces the chance for you as the author to overlook issues. Small PRs are easier


@ -9,4 +9,4 @@ If your code relies on specific functions inside the InnerEye code base, you sho
The current InnerEye codebase is not published as a Python package, and hence does not have implicit version numbers.
We are applying tagging instead, with increases corresponding to what otherwise would be major/minor versions.
Please refer to the [Changelog](../CHANGELOG.md) for an overview of recent changes.
Please refer to the [Changelog](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/CHANGELOG.md) for an overview of recent changes.


@ -3,21 +3,23 @@
This document contains two sample tasks for the classification and segmentation pipelines.
The document will walk through the steps in [Training Steps](building_models.md), but with specific examples for each task.
Before trying to train these models, you should have followed steps to set up an [environment](environment.md) and [AzureML](setting_up_aml.md)
Before trying to train these models, you should have followed steps to set up an [environment](environment.md) and [AzureML](setting_up_aml.md).
## Sample classification task: Glaucoma Detection on OCT volumes
This example is based on the paper [A feature agnostic approach for glaucoma detection in OCT volumes](https://arxiv.org/pdf/1807.04855v3.pdf).
### Downloading and preparing the dataset
### Downloading and preparing the glaucoma dataset
The dataset is available [here](https://zenodo.org/record/1481223#.Xs-ehzPiuM_) <sup>[[1]](#1)</sup>.
After downloading and extracting the zip file, run the [create_glaucoma_dataset_csv.py](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/InnerEye/Scripts/create_glaucoma_dataset_csv.py)
script on the extracted folder.
```
```shell
python create_glaucoma_dataset_csv.py /path/to/extracted/folder
```
This will convert the dataset to csv form and create a file `dataset.csv`.
Finally, upload this folder (with the images and `dataset.csv`) to Azure Blob Storage. For details on creating a storage account,
@ -25,10 +27,11 @@ see [Setting up AzureML](setting_up_aml.md#step-4-create-a-storage-account-for-y
into a container called `datasets`, with a folder name of your choice (`name_of_your_dataset_on_azure` in the
description below).
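If you prefer to script the upload, a minimal sketch using the `azure-storage-blob` package (already pinned in the environment) could look like the following; the account URL, credential, local folder and dataset folder name are placeholders that you need to replace with your own values:

```python
from pathlib import Path

from azure.storage.blob import BlobServiceClient

# Placeholders - substitute your storage account, credential and folders.
service = BlobServiceClient(account_url="https://<storage_account>.blob.core.windows.net",
                            credential="<account key or SAS token>")
container = service.get_container_client("datasets")
local_folder = Path("/path/to/extracted/folder")
for file in local_folder.rglob("*"):
    if file.is_file():
        blob_name = f"name_of_your_dataset_on_azure/{file.relative_to(local_folder).as_posix()}"
        with file.open("rb") as data:
            container.upload_blob(name=blob_name, data=data, overwrite=True)
```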
### Creating the model configuration and starting training
### Creating the glaucoma model configuration and starting training
Next, you need to create a configuration file `InnerEye/ML/configs/MyGlaucoma.py`
which extends the GlaucomaPublic class like this:
```python
from InnerEye.ML.configs.classification.GlaucomaPublic import GlaucomaPublic
class MyGlaucomaModel(GlaucomaPublic):
@ -36,25 +39,26 @@ class MyGlaucomaModel(GlaucomaPublic):
super().__init__()
self.azure_dataset_id="name_of_your_dataset_on_azure"
```
The value for `self.azure_dataset_id` should match the dataset upload location, called
`name_of_your_dataset_on_azure` above.
Once that config is in place, you can start training in AzureML via
```
```shell
python InnerEye/ML/runner.py --model=MyGlaucomaModel --azureml
```
As an alternative to working with a fork of the repository, you can use InnerEye-DeepLearning via a submodule.
Please check [here](innereye_as_submodule.md) for details.
## Sample segmentation task: Segmentation of Lung CT
This example is based on the [Lung CT Segmentation Challenge 2017](https://wiki.cancerimagingarchive.net/display/Public/Lung+CT+Segmentation+Challenge+2017) <sup>[[2]](#2)</sup>.
### Downloading and preparing the dataset
### Downloading and preparing the lung dataset
The dataset <sup>[[3]](#3)[[4]](#4)</sup> can be downloaded [here](https://wiki.cancerimagingarchive.net/display/Public/Lung+CT+Segmentation+Challenge+2017#021ca3c9a0724b0d9df784f1699d35e2).
The dataset <sup>[[3]](#3)[[4]](#4)</sup> can be downloaded [here](https://wiki.cancerimagingarchive.net/display/Public/Lung+CT+Segmentation+Challenge+2017#021ca3c9a0724b0d9df784f1699d35e2).
You need to convert the dataset from DICOM-RT to NIFTI. Before this, place the downloaded dataset in another
parent folder, which we will call `datasets`. This file structure is expected by the conversion tool.
@ -63,9 +67,11 @@ Next, use the
[InnerEye-CreateDataset](https://github.com/microsoft/InnerEye-createdataset) commandline tools to create a
NIFTI dataset from the downloaded (DICOM) files.
After installing the tool, run
```batch
InnerEye.CreateDataset.Runner.exe dataset --datasetRootDirectory=<path to the 'datasets' folder> --niftiDatasetDirectory=<output folder name for converted dataset> --dicomDatasetDirectory=<name of downloaded folder inside 'datasets'> --geoNorm 1;1;3
```
Now, you should have another folder under `datasets` with the converted Nifti files.
The `geoNorm` flag tells the tool to normalize the voxel sizes during conversion.
@ -74,13 +80,14 @@ see [Setting up AzureML](setting_up_aml.md#step-4-create-a-storage-account-for-y
into a folder in the `datasets` container, for example `my_lung_dataset`. This folder name will need to go into the
`azure_dataset_id` field of the model configuration, see below.
### Creating the model configuration and starting training
### Creating the lung model configuration and starting training
You can then create a new model configuration, based on the template
[Lung.py](../InnerEye/ML/configs/segmentation/Lung.py). To do this, create a file
[Lung.py](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/ML/configs/segmentation/Lung.py). To do this, create a file
`InnerEye/ML/configs/segmentation/MyLungModel.py`, where you create a subclass of the template Lung model, and
add the `azure_dataset_id` field (i.e., the name of the folder that contains the uploaded data from above),
so that it looks like:
```python
from InnerEye.ML.configs.segmentation.Lung import Lung
class MyLungModel(Lung):
@ -88,19 +95,22 @@ class MyLungModel(Lung):
super().__init__()
self.azure_dataset_id = "my_lung_dataset"
```
If you are using InnerEye as a submodule, please add this configuration in your private configuration folder,
as described for the Glaucoma model [here](innereye_as_submodule.md).
You can now run the following command to start a job on AzureML:
```
```shell
python InnerEye/ML/runner.py --azureml --model=MyLungModel
```
See [Model Training](building_models.md) for details on training outputs, resuming training, testing models and model ensembles.
### References
<a id="1">[1]</a>
Ishikawa, Hiroshi. (2018). OCT volumes for glaucoma detection (Version 1.0.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.1481223
Ishikawa, Hiroshi. (2018). OCT volumes for glaucoma detection (Version 1.0.0) [Data set]. Zenodo. <http://doi.org/10.5281/zenodo.1481223>
<a id="2">[2]</a>
Yang, J. , Veeraraghavan, H. , Armato, S. G., Farahani, K. , Kirby, J. S., Kalpathy-Kramer, J. , van Elmpt, W. , Dekker, A. , Han, X. , Feng, X. , Aljabar, P. , Oliveira, B. , van der Heyden, B. , Zamdborg, L. , Lam, D. , Gooding, M. and Sharp, G. C. (2018),
@ -108,7 +118,7 @@ Autosegmentation for thoracic radiation treatment planning: A grand challenge at
<a id="3">[3]</a>
Yang, Jinzhong; Sharp, Greg; Veeraraghavan, Harini ; van Elmpt, Wouter ; Dekker, Andre; Lustberg, Tim; Gooding, Mark. (2017).
Data from Lung CT Segmentation Challenge. The Cancer Imaging Archive. http://doi.org/10.7937/K9/TCIA.2017.3r3fvz08
Data from Lung CT Segmentation Challenge. The Cancer Imaging Archive. <http://doi.org/10.7937/K9/TCIA.2017.3r3fvz08>
<a id="4">[4]</a>
Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F.


@ -5,7 +5,7 @@ folder allows you to train self-supervised models using
[SimCLR](http://proceedings.mlr.press/v119/chen20j/chen20j.pdf) or
[BYOL](https://proceedings.neurips.cc/paper/2020/file/f3ada80d5c4ee70142b17b8192b2958e-Paper.pdf). This code runs as a "
bring-your-own-model" self-contained module (
see [docs/bring_your_own_model.md](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/docs/bring_your_own_model.md))
see the [bring-your-own-model instructions](bring_your_own_model.md)
.
Here, we provide implementations for four datasets to get you kickstarted with self-supervised models:
@ -18,7 +18,7 @@ Here, we provide implementations for four datasets to get you kickstarted with s
[NIH Chest-Xray](https://www.kaggle.com/nih-chest-xrays/data) (112k Chest-Xray scans) or
[CheXpert](https://stanfordmlgroup.github.io/competitions/chexpert/) (228k scans).
### Multi-dataset support
## Multi-dataset support
During self-supervised training, a separate linear classifier is trained on top of learnt image embeddings. In this way,
users can continuously monitor the representativeness of learnt image embeddings for a given downstream classification
@ -37,7 +37,7 @@ Here we described how to quickly start a training job with our ready made config
To kick off training of SimCLR and BYOL models on CIFAR10, simply run
```
```shell
python ML/runner.py --model=CIFAR10BYOL
python ML/runner.py --model=CIFAR10SimCLR
```
@ -48,7 +48,7 @@ For this dataset, it will automatically take care of downloading the dataset to
#### Step 0: Get the data
#### If you run on your local machine:
#### If you run on your local machine
Prior to training a model on this dataset, you will need to download it from Kaggle to your machine:
@ -82,11 +82,11 @@ the dataset location fields:
Example to train an SSL model with BYOL on the NIH dataset and monitor the embedding quality on the Kaggle RSNA
Pneumonia Challenge classification task:
```
```shell
python ML/runner.py --model=NIH_RSNA_BYOL
```
## Configuring your own SSL models:
## Configuring your own SSL models
### About SSLContainer configuration
@ -120,11 +120,12 @@ with the following available arguments:
* `num_epochs`: number of epochs to train for.
In case you wish to first test your model locally, here are some optional arguments that can be useful (a short sketch follows below):
* `local_dataset`: path to a local dataset; if passed, the Azure dataset will be ignored
* `is_debug_model`: if True, only the first batch of each epoch is run
* `drop_last`: if False (True by default), the last batch is kept even if it is incomplete
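As a concrete illustration, a hedged sketch of a container tweaked for a quick local debug run might look like this (the import path is an assumption - locate `SSLContainer` in your checkout - and the class name and dataset path are hypothetical):

```python
from pathlib import Path

# Assumed import path - adjust to wherever SSLContainer lives in your checkout.
from InnerEye.ML.SSL.lightning_containers.ssl_container import SSLContainer


class MyLocalDebugSSL(SSLContainer):
    def __init__(self) -> None:
        super().__init__()
        self.local_dataset = Path("/tmp/my_local_dataset")  # use local data, ignore the Azure dataset
        self.is_debug_model = True   # only run the first batch of each epoch
        self.drop_last = False       # keep the last batch even if it is incomplete
```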
### Creating your own datamodules:
### Creating your own datamodules
To use this code with your own data, you will need to:
@ -156,16 +157,16 @@ the ``cxr_linear_head_augmentations.yaml`` config defines the augmentations to u
WARNING: this file will be ignored for CIFAR examples where we use the default pl-bolts augmentations.
## Finetuning a linear head on top of a pretrained SSL model.
## Finetuning a linear head on top of a pretrained SSL model
Alongside the modules to train your SSL models, we also provide exemplary modules that allow you to build a
classifier on top of a pretrained SSL model. The base class for these modules is `SSLClassifierContainer`. It builds on
top of `SSLContainer` with additional command line arguments that allow you to specify where to find the checkpoint
for your pretrained model. For this, you have two options:
- If you are running locally, you can provide the local path to your pretrained model checkpoint
* If you are running locally, you can provide the local path to your pretrained model checkpoint
via `--local_weights_path`.
- If your are running on AML, use the `pretraining_run_recovery_id` field. Providing this field, will mean that AML will
* If you are running on AML, use the `pretraining_run_recovery_id` field. Providing this field means that AML will
automatically download the checkpoints to the current node and pick the latest checkpoint to build the classifier on
top. Beware not to confuse `pretraining_run_recovery_id` with `run_recovery_id`, as the latter is used to continue training on
the same model (which is not the case here).
@ -179,7 +180,7 @@ argument. By default, this is set to True.
We provide an example of such a classifier container for CIFAR named `SSLClassifierCIFAR`. To launch a finetuning run
for this model on CIFAR10, just run
```
```shell
python ML/runner.py --model=SSLClassifierCIFAR --pretraining_run_recovery_id={THE_ID_TO_YOUR_SSL_TRAINING_JOB}
```
@ -188,12 +189,12 @@ python ML/runner.py --model=SSLClassifierCIFAR --pretraining_run_recovery_id={TH
Similarly, we provide a class, `CXRImageClassifier`, that allows you to start a finetuning job for a CXR model. By
default, this will launch a finetuning job on the RSNA Pneumonia dataset. To start the run:
```
```shell
python ML/runner.py --model=CXRImageClassifier --pretraining_run_recovery_id={THE_ID_TO_YOUR_SSL_TRAINING_JOB}
```
or for a local run
```
```shell
python ML/runner.py --model=CXRImageClassifier --local_weights_path={LOCAL_PATH_TO_YOUR_SSL_CHECKPOINT}
```


@ -1,4 +1,4 @@
# How to setup Azure Machine Learning for InnerEye
# AzureML Setup
Our preferred way to use AzureML is using the [AzureTRE](https://microsoft.github.io/AzureTRE/)
@ -12,7 +12,7 @@ In short, you will need to:
* Optional: Register your application to create a Service Principal Object.
* Optional: Set up a storage account to store your datasets. You may already have such a storage account, or you may
want to re-use the storage account that is created with the AzureML workspace - in both cases, you can skip this step.
* Update your [settings.yml](/InnerEye/settings.yml) file and KeyVault with your own credentials.
* Update your [settings.yml](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/settings.yml) file and KeyVault with your own credentials.
Once you're done with these steps, you will be ready for the next steps described in [Creating a dataset](https://github.com/microsoft/InnerEye-createdataset),
[Building models in Azure ML](building_models.md) and
@ -43,7 +43,7 @@ need to be kept inside of the UK)
You can invoke the deployment also by going to [Azure](https://ms.portal.azure.com/#create/Microsoft.Template),
selecting "Build your own template", and in the editor upload the
[json template file](/azure-pipelines/azure_deployment_template.json) included in the repository.
[json template file](https://github.com/microsoft/InnerEye-DeepLearning/blob/main/azure-pipelines/azure_deployment_template.json) included in the repository.
### Step 1: Create an AzureML workspace
@ -179,7 +179,7 @@ create a container called "datasets".
### Step 6: Update the variables in `settings.yml`
The [settings.yml](../InnerEye/settings.yml) file is used to store your Azure setup. In order to be able to
The [settings.yml](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/settings.yml) file is used to store your Azure setup. In order to be able to
train your model you will need to update this file using the settings for your Azure subscription.
1. You will first need to retrieve your `tenant_id`. You can find your tenant id by navigating to
@ -188,7 +188,7 @@ resource. Copy and paste the GUID to the `tenant_id` field of the `.yml` file. M
[here](https://docs.microsoft.com/en-us/azure/active-directory/develop/quickstart-create-new-tenant).
1. You then need to retrieve your subscription id. In the search bar look for `Subscriptions`. Then in the subscriptions list,
look for the subscription you are using for your workspace. Copy the value of the `Subscription ID` in the corresponding
field of [settings.yml](../InnerEye/settings.yml).
field of [settings.yml](https://github.com/microsoft/InnerEye-DeepLearning/tree/main/InnerEye/settings.yml).
1. Copy the application ID of your Service Principal that you retrieved earlier (cf. Step 3) to the `application_id` field.
If you did not set up a Service Principal, fill that field with an empty string or leave it out altogether.
1. Update the `resource_group:` field with your resource group name (created in Step 1).
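To sanity-check the edited file, a small hedged snippet can read it back with `ruamel.yaml` (already part of the environment); this assumes the Azure fields live under the top-level `variables` section of `settings.yml`:

```python
from pathlib import Path

from ruamel.yaml import YAML

# Assumption: settings.yml keeps its Azure settings under a top-level "variables" section.
settings = YAML(typ="safe").load(Path("InnerEye/settings.yml").read_text())
variables = settings.get("variables", settings)
for field in ("tenant_id", "subscription_id", "application_id", "resource_group"):
    print(f"{field} = {variables.get(field, '<missing>')}")
```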


@ -1,4 +1,4 @@
## Pytest and testing on CPU and GPU machines
# Pytest and testing on CPU and GPU machines
All non-trivial proposed changes to the code base should be accompanied by tests.


@ -10,8 +10,8 @@ dependencies:
- blas=1.0=mkl
- blosc=1.21.0=h4ff587b_1
- bzip2=1.0.8=h7b6447c_0
- ca-certificates=2022.4.26=h06a4308_0
- certifi=2022.5.18.1=py38h06a4308_0
- ca-certificates=2022.07.19=h06a4308_0
- certifi=2022.6.15=py38h06a4308_0
- cudatoolkit=11.3.1=h2bc3f7f_2
- ffmpeg=4.2.2=h20bf706_0
- freetype=2.11.0=h70c0345_0
@ -42,10 +42,10 @@ dependencies:
- mkl-service=2.4.0=py38h7f8727e_0
- mkl_fft=1.3.1=py38hd3c417c_0
- mkl_random=1.2.2=py38h51133e4_0
- ncurses=6.3=h7f8727e_2
- ncurses=6.3=h5eee18b_3
- nettle=3.7.3=hbbd107a_1
- openh264=2.1.1=h4ff587b_0
- openssl=1.1.1o=h7f8727e_0
- openssl=1.1.1q=h7f8727e_0
- pip=20.1.1=py38_1
- python=3.8.3
- python-blosc=1.7.0=py38h7b6447c_0
@ -53,7 +53,7 @@ dependencies:
- pytorch-mutex=1.0=cuda
- readline=8.1.2=h7f8727e_1
- setuptools=61.2.0=py38h06a4308_0
- sqlite=3.38.3=hc218d9a_0
- sqlite=3.39.0=h5082296_0
- tk=8.6.12=h1ccaba5_0
- torchvision=0.11.1=py38_cu113
- typing_extensions=4.1.1=pyh06a4308_0
@ -63,19 +63,20 @@ dependencies:
- zlib=1.2.12=h7f8727e_2
- zstd=1.5.2=ha4553b6_0
- pip:
- absl-py==1.1.0
- absl-py==1.2.0
- adal==1.2.7
- aiohttp==3.8.1
- aiosignal==1.2.0
- alembic==1.8.0
- alabaster==0.7.12
- alembic==1.8.1
- ansiwrap==0.8.4
- applicationinsights==0.11.10
- argon2-cffi==21.3.0
- argon2-cffi-bindings==21.2.0
- async-timeout==4.0.2
- attrs==21.4.0
- attrs==22.1.0
- azure-common==1.1.28
- azure-core==1.24.1
- azure-core==1.24.2
- azure-graphrbac==0.61.1
- azure-identity==1.7.0
- azure-mgmt-authorization==0.61.0
@ -102,39 +103,42 @@ dependencies:
- azureml-train-automl-client==1.36.0
- azureml-train-core==1.36.0
- azureml-train-restclients-hyperdrive==1.36.0
- babel==2.10.3
- backports-tempfile==1.0
- backports-weakref==1.0.post1
- beautifulsoup4==4.11.1
- black==22.3.0
- bleach==5.0.0
- black==22.6.0
- bleach==5.0.1
- cachetools==4.2.4
- cffi==1.15.0
- charset-normalizer==2.0.12
- cffi==1.15.1
- charset-normalizer==2.1.0
- click==8.1.3
- cloudpickle==1.6.0
- colorama==0.4.5
- commonmark==0.9.1
- conda-merge==0.1.5
- contextlib2==21.6.0
- coverage==6.4.1
- coverage==6.4.2
- cryptography==3.3.2
- cycler==0.11.0
- databricks-cli==0.17.0
- dataclasses-json==0.5.2
- debugpy==1.6.0
- debugpy==1.6.2
- defusedxml==0.7.1
- deprecated==1.2.13
- distro==1.7.0
- docker==4.3.1
- docutils==0.17.1
- dotnetcore2==2.1.23
- entrypoints==0.4
- execnet==1.9.0
- fastjsonschema==2.15.3
- fastjsonschema==2.16.1
- fastmri==0.2.0
- flake8==3.8.3
- flask==2.1.2
- frozenlist==1.3.0
- fsspec==2022.5.0
- flask==2.2.0
- frozenlist==1.3.1
- fsspec==2022.7.1
- furo==2022.6.21
- fusepy==3.0.1
- future==0.18.2
- gitdb==4.0.9
@ -143,22 +147,23 @@ dependencies:
- google-auth-oauthlib==0.4.6
- gputil==1.4.0
- greenlet==1.1.2
- grpcio==1.46.3
- grpcio==1.47.0
- gunicorn==20.1.0
- h5py==2.10.0
- hi-ml==0.2.2
- hi-ml-azure==0.2.2
- humanize==4.2.0
- humanize==4.2.3
- idna==3.3
- imageio==2.15.0
- importlib-metadata==4.11.4
- importlib-resources==5.8.0
- imagesize==1.4.1
- importlib-metadata==4.12.0
- importlib-resources==5.9.0
- iniconfig==1.1.1
- innereye-dicom-rt==1.0.3
- ipykernel==6.15.0
- ipykernel==6.15.1
- ipython==7.31.1
- ipython-genutils==0.2.0
- ipywidgets==7.7.0
- ipywidgets==7.7.1
- isodate==0.6.1
- itsdangerous==2.1.2
- jeepney==0.8.0
@ -166,26 +171,26 @@ dependencies:
- jmespath==0.10.0
- joblib==0.16.0
- jsonpickle==2.2.0
- jsonschema==4.6.0
- jsonschema==4.9.1
- jupyter==1.0.0
- jupyter-client==6.1.5
- jupyter-console==6.4.3
- jupyter-core==4.10.0
- jupyter-console==6.4.4
- jupyter-core==4.11.1
- jupyterlab-pygments==0.2.2
- jupyterlab-widgets==1.1.0
- kiwisolver==1.4.3
- jupyterlab-widgets==1.1.1
- kiwisolver==1.4.4
- lightning-bolts==0.4.0
- llvmlite==0.34.0
- mako==1.2.0
- markdown==3.3.7
- mako==1.2.1
- markdown==3.4.1
- markupsafe==2.1.1
- marshmallow==3.16.0
- marshmallow==3.17.0
- marshmallow-enum==1.5.1
- matplotlib==3.3.0
- mccabe==0.6.1
- mistune==0.8.4
- mlflow==1.23.1
- mlflow-skinny==1.26.1
- mlflow-skinny==1.27.0
- monai==0.6.0
- more-itertools==8.13.0
- msal==1.18.0
@ -195,12 +200,12 @@ dependencies:
- multidict==6.0.2
- mypy==0.910
- mypy-extensions==0.4.3
- nbclient==0.6.4
- nbclient==0.6.6
- nbconvert==6.5.0
- nbformat==5.4.0
- ndg-httpsclient==0.5.1
- nest-asyncio==1.5.5
- networkx==2.8.4
- networkx==2.8.5
- nibabel==4.0.1
- notebook==6.4.12
- numba==0.51.2
@ -215,11 +220,12 @@ dependencies:
- pathspec==0.9.0
- pexpect==4.8.0
- pillow==9.0.0
- pkgutil-resolve-name==1.3.10
- platformdirs==2.5.2
- pluggy==0.13.1
- portalocker==2.4.0
- portalocker==2.5.1
- prometheus-client==0.14.1
- prometheus-flask-exporter==0.20.2
- prometheus-flask-exporter==0.20.3
- protobuf==3.20.1
- psutil==5.7.2
- ptyprocess==0.7.0
@ -251,11 +257,11 @@ dependencies:
- qtconsole==5.3.1
- qtpy==2.1.0
- querystring-parser==1.2.4
- requests==2.28.0
- requests==2.28.1
- requests-oauthlib==1.3.1
- rich==10.13.0
- rpdb==0.1.6
- rsa==4.8
- rsa==4.9
- ruamel-yaml==0.16.12
- ruamel-yaml-clib==0.2.6
- runstats==1.8.0
@ -268,8 +274,18 @@ dependencies:
- simpleitk==1.2.4
- six==1.15.0
- smmap==5.0.0
- snowballstemmer==2.2.0
- soupsieve==2.3.2.post1
- sqlalchemy==1.4.37
- sphinx==5.0.2
- sphinx-basic-ng==0.0.1a12
- sphinx-rtd-theme==1.0.0
- sphinxcontrib-applehelp==1.0.2
- sphinxcontrib-devhelp==1.0.2
- sphinxcontrib-htmlhelp==2.0.0
- sphinxcontrib-jsmath==1.0.1
- sphinxcontrib-qthelp==1.0.3
- sphinxcontrib-serializinghtml==1.1.5
- sqlalchemy==1.4.39
- sqlparse==0.4.2
- stopit==1.1.2
- stringcase==1.2.0
@ -281,22 +297,22 @@ dependencies:
- terminado==0.15.0
- textwrap3==0.9.2
- threadpoolctl==3.1.0
- tifffile==2022.5.4
- tifffile==2022.7.31
- tinycss2==1.1.1
- toml==0.10.2
- tomli==2.0.1
- torchio==0.18.74
- torchmetrics==0.6.0
- tornado==6.1
- tornado==6.2
- tqdm==4.64.0
- typing-inspect==0.7.1
- umap-learn==0.5.2
- urllib3==1.26.7
- webencodings==0.5.1
- websocket-client==1.3.3
- werkzeug==2.1.2
- widgetsnbextension==3.6.0
- werkzeug==2.2.1
- widgetsnbextension==3.6.1
- wrapt==1.14.1
- yacs==0.1.8
- yarl==1.7.2
- zipp==3.8.0
- yarl==1.8.1
- zipp==3.8.1


@ -11,8 +11,8 @@ dependencies:
- python-blosc=1.7.0
- torchvision=0.11.1
- pip:
- azure-mgmt-resource==12.1.0
- azure-mgmt-datafactory==1.1.0
- azure-mgmt-resource==12.1.0
- azure-storage-blob==12.6.0
- azureml-mlflow==1.36.0
- azureml-sdk==1.36.0
@ -24,25 +24,27 @@ dependencies:
- docker==4.3.1
- fastmri==0.2.0
- flake8==3.8.3
- furo==2022.6.21
- gitpython==3.1.7
- gputil==1.4.0
- h5py==2.10.0
- hi-ml==0.2.2
- ipython==7.31.1
- hi-ml-azure==0.2.2
- imageio==2.15.0
- InnerEye-DICOM-RT==1.0.3
- ipython==7.31.1
- joblib==0.16.0
- jupyter==1.0.0
- jupyter-client==6.1.5
- jupyter==1.0.0
- lightning-bolts==0.4.0
- matplotlib==3.3.0
- mlflow==1.23.1
- monai==0.6.0
- mypy==0.910
- mypy-extensions==0.4.3
- mypy==0.910
- numba==0.51.2
- numba==0.51.2
- numpy==1.19.1
- numba==0.51.2
- opencv-python-headless==4.5.1.48
- pandas==1.1.0
- papermill==2.2.2
@ -53,10 +55,10 @@ dependencies:
- pydicom==2.0.0
- pyflakes==2.2.0
- PyJWT==1.7.1
- pytest==6.0.1
- pytest-cov==2.10.1
- pytest-forked==1.3.0
- pytest-xdist==1.34.0
- pytest==6.0.1
- pytorch-lightning==1.5.5
- rich==10.13.0
- rpdb==0.1.6
@ -68,6 +70,8 @@ dependencies:
- seaborn==0.10.1
- simpleitk==1.2.4
- six==1.15.0
- sphinx-rtd-theme==1.0.0
- sphinx==5.0.2
- stopit==1.1.2
- tabulate==0.8.7
- tensorboard==2.3.0


@ -1,8 +0,0 @@
### Building docs for InnerEye-DeepLearning
1. First, make sure you have all the packages necessary for InnerEye.
1. Install pip dependencies from sphinx-docs/requirements.txt.
```
pip install -r requirements.txt
```
1. Run `make html` from the folder sphinx-docs. This will create html files under sphinx-docs/build/html.


@ -1,37 +0,0 @@
# ------------------------------------------------------------------------------------------
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License (MIT). See LICENSE in the repo root for license information.
# ------------------------------------------------------------------------------------------
import shutil
from pathlib import Path
def replace_in_file(filepath: Path, original_str: str, replace_str: str) -> None:
"""
Replace all occurences of the original_str with replace_str in the file provided.
"""
text = filepath.read_text()
text = text.replace(original_str, replace_str)
filepath.write_text(text)
if __name__ == '__main__':
sphinx_root = Path(__file__).absolute().parent
repository_root = sphinx_root.parent
markdown_root = sphinx_root / "source" / "md"
repository_url = "https://github.com/microsoft/InnerEye-DeepLearning"
# Create directories source/md and source/md/docs where files will be copied to
if markdown_root.exists():
shutil.rmtree(markdown_root)
markdown_root.mkdir()
# copy README.md and doc files
shutil.copy(repository_root / "README.md", markdown_root)
shutil.copy(repository_root / "CHANGELOG.md", markdown_root)
shutil.copytree(repository_root / "docs", markdown_root / "docs")
# replace links to files in repository with urls
md_files = markdown_root.rglob("*.md")
for filepath in md_files:
replace_in_file(filepath, "](/", f"]({repository_url}/blob/main/")


@ -1,3 +0,0 @@
sphinx==5.0.2
sphinx-rtd-theme==1.0.0
recommonmark==0.7.1


@ -1,50 +0,0 @@
.. InnerEye documentation master file, created by
sphinx-quickstart on Sun Jun 28 18:04:34 2020.
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
InnerEye-DeepLearning Documentation
===================================
.. toctree::
:maxdepth: 1
md/README.md
md/docs/WSL.md
md/docs/environment.md
md/docs/setting_up_aml.md
md/docs/creating_dataset.md
md/docs/building_models.md
md/docs/sample_tasks.md
md/docs/debugging_and_monitoring.md
.. toctree::
:maxdepth: 1
:caption: Further reading for contributors
md/docs/pull_requests.md
md/docs/testing.md
md/docs/contributing.md
md/docs/hello_world_model.md
md/docs/deploy_on_aml.md
md/docs/bring_your_own_model.md
md/docs/fastmri.md
md/docs/innereye_as_submodule.md
md/docs/model_diagnostics.md
md/docs/move_model.md
md/docs/releases.md
md/docs/self_supervised_models.md
md/CHANGELOG.md
.. toctree::
:caption: API documentation
rst/api/index
Indices and tables
==================
* :ref:`genindex`
* :ref:`modindex`