Changelog

Early versions of this toolbox used a manually created changelog. As of March 2022, we have switched to using GitHub's auto-generated changelog. To view the changelog for a particular release, go to the Releases page. Each release contains a link for "Full Changelog".

Changelog for Versions before March 2022

Upcoming

Added

  • (#709) Update hi-ml submodule.
  • (#667) Automatically and linearly scale the learning rate of the SSL encoder to the number of GPUs.
  • (#689) Show default argument values in help message.
  • (#671) Remove sequence models and unused variables. Simplify README.
  • (#693) Improve instructions for HelloWorld model in AzureML.
  • (#678) Add function to get log level name and use it for logging.
  • (#666) Replace RadIO with TorchIO for patch-based inference.
  • (#643) Test for recovery of SSL job. Tracks learning rate and train loss.
  • (#594) When supplying a "--tag" argument, the AzureML jobs use that value as the display name, to more easily distinguish runs.
  • (#640) Cancel AzureML jobs from previous runs of the PR build in the same branch to reduce AML load
  • (#577) Commandline switch monitor_gpu to monitor GPU utilization via Lightning's GpuStatsMonitor, switch monitor_loading to check batch loading times via BatchTimeCallback, and pl_profiler to turn on the Lightning profiler (simple, advanced, or pytorch)
  • (#544) Add documentation for segmentation model evaluation.
  • (#637) Add option to encode in chunks and to load pre-cached dataset in CPU or GPU in the histo pipeline.
  • (#465) Adding ability to run segmentation inference module on test data with partial ground truth files. (Also #522.)
  • (#502) More flags for fine control of when to run inference.
  • (#492) Adding capability for regression tests for test jobs that run in AzureML.
  • (#509) Run inference on registered models (single and ensemble) using the parameter model_id.
  • (#554) Added a parameter pretraining_dataset_id to NIH_COVID_BYOL to specify the name of the SSL training dataset.
  • (#560) Added pre-commit hooks.
  • (#619) Add DeepMIL PANDA
  • (#559) Adding the accompanying code for the "Active label cleaning: Improving dataset quality under resource constraints" paper.
  • (#589) Add LightningContainer.update_azure_config() hook to enable overriding AzureConfig parameters from a container (e.g. experiment_name, cluster, num_nodes).
  • (#617) Commandline flag pl_check_val_every_n_epoch to control how often validation is happening
  • (#618) Using Azure Pipeline Cache to avoid re-building conda environment repeatedly
  • (#603) Add histopathology module
  • (#614) Checkpoint downloading falls back to looking into AzureML if no checkpoints on disk
  • (#613) Add additional tests for histopathology datasets
  • (#616) Add more histopathology configs and tests
  • (#621) Add WSI preprocessing functions and enable tiling more generic slide datasets
  • (#634) Add WSI heatmaps and thumbnails to standard test outputs
  • (#635) Add tile selection and binary label for online evaluation of PANDA SSL
  • (#647) Add class-wise accuracy logging and confusion matrix to DeepMIL
  • (#653) Add dropout to DeepMIL and fix feature extractor setup.
  • (#650) Enable fine-tuning in DeepMIL using PANDA as the classification task.
  • (#656) Add subsampling transform and support for MIL mean pooling.
  • (#679) Add FP and TN slides/tiles to DeepMIL outputs and extend outputs to multi-class problems.
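
The linear learning-rate scaling mentioned in #667 above can be illustrated with a minimal sketch. The helper name `scale_learning_rate` and its signature are hypothetical and are not taken from the InnerEye codebase; the sketch only shows the general rule of multiplying a single-GPU learning rate by the number of GPUs.

```python
def scale_learning_rate(base_lr: float, num_gpus: int) -> float:
    """Linearly scale a single-GPU learning rate to `num_gpus` devices.

    Hypothetical helper for illustration only; the name and signature are
    assumptions, not the toolbox's actual API.
    """
    if num_gpus < 1:
        # Assumption: CPU-only runs keep the base learning rate unchanged.
        return base_lr
    return base_lr * num_gpus


if __name__ == "__main__":
    # A base learning rate of 0.5 on 4 GPUs is scaled by a factor of 4.
    print(scale_learning_rate(0.5, 4))
```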

Changed

  • (#677) Update TorchIO version to include the recent bug fix related to patch-based inference.
  • (#666) Replace RadIO with TorchIO for patch-based inference.
  • (#659) Update cudatoolkit version from 11.1 to 11.3.
  • (#588) Replace SciPy with PIL.PngImagePlugin.PngImageFile to load png files.
  • (#585) Switching to PyTorch 1.10.0 and torchvision 0.11.1
  • (#576) The console output is no longer written to stdout.txt because AzureML handles that better now
  • (#531) Updated PL to 1.3.8, torchmetrics and pl-bolts and changed relevant metrics and SSL code API.
  • (#555) Make the SSLContainer compatible with new datasets
  • (#533) Better defaults for inference on ensemble children.
  • (#536) Inference will not run on the validation set by default, this can be turned on via the --inference_on_val_set flag.
  • (#548) Many Azure-related functions have been moved out of the toolbox, into the separate hi-ml Python package.
  • (#502) Renamed command line option 'perform_training_set_inference' to 'inference_on_train_set'. Replaced command line option 'perform_validation_and_test_set_inference' with the pair of options 'inference_on_val_set' and 'inference_on_test_set'.
  • (#496) All plots are now saved as PNG, rather than JPG.
  • (#497) Reducing the size of the code snapshot that gets uploaded to AzureML, by skipping all test folders.
  • (#509) Parameter extra_downloaded_run_id has been renamed to pretraining_run_checkpoints.
  • (#526) Updated Covid config to use a multiclass formulation. Moved functions create_metric_computers and compute_and_log_metrics from ScalarLightning to ScalarModelBase.
  • (#554) Updated report in CovidModel. Set parameters in the config to run inference on both the validation and test sets by default.
  • (#584) SSL models write the optimizer state for the linear head to the checkpoint now.
  • (#594) PyTorch is now non-deterministic by default. Upgrade to AzureML-SDK 1.36
  • (#566) Update hi-ml dependency to hi-ml-azure.
  • (#591) Upgrade PyTorch Lightning to 1.5.0
  • (#572) Updated to new version of hi-ml package
  • (#623) Save checkpoints in SSLOnlineEvaluator without DDP wrapper code
  • (#617) Provide an easier way for LightningContainers to add callbacks.
  • (#596) Add cudatoolkit=11.1 specification to environment.yml.
  • (#615) Minor changes to checkpoint download from AzureML.
  • (#605) Make build jobs deterministic for regression testing.
  • (#633) Model training now only writes one recovery checkpoint, rather than multiple ones. Frequency is controlled by autosave_every_n_val_epochs.
  • (#632) Nifti test data is no longer stored in Git LFS

Fixed

  • (#701) Fix 3D images expected to be 4D for intensity normalization.
  • (#704) Add submodules to sys.path to fix autodoc's warning.
  • (#699) Fix Sphinx warnings.
  • (#682) Ensure the shape of input patches is compatible with model constraints.
  • (#681) Pad model outputs if they are smaller than the inputs.
  • (#683) Fix missing separator error in docs Makefile.
  • (#659) Fix caching and checkpointing for TCGA CRCk dataset.
  • (#649) Fix for the _convert_to_tensor_if_necessary method so that PIL.Image as well as np.array get converted to torch.Tensor.
  • (#606) Bug fix: registered models do not include the hi-ml submodule
  • (#646) Workaround for bug in PL: CombinedLoader cannot be used for training data when using DDP
  • (#593) Bug fix for hi-ml 0.1.11 issue (#130): empty mount point is turned into ".", which fails the AML job
  • (#587) Bug fix for regression in AzureML's handling of environments: upgrade to hi-ml 0.1.11
  • (#625) Update PandaDeepMIL to enable the use of an SSL pre-trained checkpoint, and update the hi-ml commit.
  • (#537) Print warning if inference is disabled but comparison requested.
  • (#567) fix pillow version.
  • (#546) Environment and hello_world_model documentation updated
  • (#525) Enable --store_dataset_sample
  • (#495) Fix model comparison.
  • (#547) The parameter pl_find_unused_parameters was no longer used to initialize the DDP Plugin.
  • (#482) Check bool parameter is either true or false.
  • (#475) Bug in AML SDK meant that we could not train any large models anymore because data loaders ran out of memory.
  • (#472) Correct model path for moving ensemble models.
  • (#494) Fix an issue where multi-node jobs for LightningContainer models can get stuck at test set inference.
  • (#498) Workaround for the problem that downloading multiple large checkpoints can time out.
  • (#515) Workaround for occasional issues with dataset mounting and running matplotlib on some machines. Re-instantiated a disabled test.
  • (#509) Fix issue where model checkpoints were not loaded in inference-only runs when using lightning containers.
  • (#553) Fix incomplete test data module setup in Lightning inference.
  • (#557) Fix issue where learning rate was not set correctly in the SimCLR module
  • (#622) Fix issue with multi-GPU jobs on a VM: each process tries to create a folder structure
  • (#558) Fix issue with the CovidModel config where model weights from a finetuning run were incompatible with the model architecture created for non-finetuning runs.
  • (#604) Fix issue where runs on a VM would download the dataset even when a local dataset is provided.
  • (#628) SSL SimCLR using the wrong LR schedule when running on multiple nodes
  • (#638) SimCLR cosine LR scheduler was using wrong length information when used with long linear head datasets
  • (#612) SSL online evaluator was not doing distributed training
  • (#652) Run pytest build on Windows after Linux agent version upgrade
  • (#655) Run pytest on Linux again, but with Ubuntu 20.04
  • (#674) Fix DeepMIL metrics bug whereby hard labels were used instead of probabilities.

Removed

  • (#692) Replace InnerEye-DataQuality with a link to commit.
  • (#577) Removing the monitoring of batch loading time, use the BatchTimeCallback from hi-ml instead
  • (#542) Removed Windows test leg from build pipeline.
  • (#509) Parameters local_weights_path and weights_url can no longer be used to initialize a training run, only inference runs.
  • (#526) Removed get_posthoc_label_transform in class ScalarModelBase. Instead, functions get_loss_function and compute_and_log_metrics in ScalarModelBase can be implemented to compute the loss and metrics in a task-specific manner.
  • (#554) Removed cryptography from list of invalid packages in test_invalid_python_packages as it is already present as a dependency in our conda environment.
  • (#596) Removed obsolete TrainGlaucomaCV from PR build.
  • (#604) Removed all code that downloads datasets, this is now all handled by hi-ml

Deprecated

  • (#633) Model fields recovery_checkpoint_save_interval and recovery_checkpoints_save_last_k have been retired. Recovery checkpoint handling is now controlled by autosave_every_n_val_epochs.

0.3 (2021-06-01)

Added

  • (#483) Allow cross validation with 'bring your own' Lightning models (without ensemble building).
  • (#489) Remove portal query for outliers.
  • (#488) Better handling of missing seriesId in segmentation cross validation reports.
  • (#454) Checking that labels are mutually exclusive.
  • (#447) Added a sanity check to ensure there are no missing channels or missing files. If channels are missing in the CSV file, or filenames associated with channels are incorrect, the pipeline exits with an error report before running training or inference.
  • (#446) Guarding save_outlier so that it works when institution id and series id columns are missing.
  • (#441) Add script to move models from one AzureML workspace to another: python InnerEye/Scripts/move_model.py
  • (#417) Added a generic way of adding PyTorch Lightning models to the toolbox. It is now possible to train almost any Lightning model with the InnerEye toolbox in AzureML, with only minimum code changes required. See the MD documentation for details.
  • (#430) Update conversion to 1.0.1 InnerEye-DICOM-RT to add: manufacturer, SoftwareVersions, Interpreter and ROIInterpretedTypes.
  • (#385) Add the ability to train a model on multiple nodes in AzureML. Example: Add --num_nodes=2 to the commandline arguments to train on 2 nodes.
  • (#366) and (#407) add new parameters to the score.py script of use_dicom and result_zip_dicom_name. If use_dicom==True then the input file should be a zip of a DICOM series. This will be unzipped and converted to Nifti format before processing. The result will then be converted to a DICOM-RT file, zipped and stored as result_zip_dicom_name.
  • (#416) Add a GitHub action that checks if CHANGELOG.md has been modified.
  • (#412) Dataset files can now have arbitrary names, and are no longer restricted to be called dataset.csv, via the config field dataset_csv. This makes it possible to have a single set of image files in a folder, but multiple datasets derived from it.
  • (#391) Support for multilabel classification tasks. Multilabel models can be trained by adding the parameter class_names to the config for classification models. class_names should contain the name of each label class in the dataset, and the order of names should match the order of class label indices in dataset.csv. dataset.csv supports multiple labels (indices corresponding to class_names) per subject in the label column. Multiple labels should be encoded as a string with labels separated by a |, for example "0|2|4". Note that this PR does not add support for multiclass models, where the labels are mutually exclusive.
  • (#425) The number of layers in a Unet is no longer fixed at 4, but can be set via the config field num_downsampling_paths. A lower number of layers may be useful for decreasing memory requirements, or for working with smaller images. (The minimum image size in any dimension when using a network of n layers is 2**n.)
  • (#426) Flake8, mypy, and testing the HelloWorld model now run in a GitHub action, no longer in Azure Pipelines.
  • (#405) Cross-validation runs for classification models now also generate a report notebook summarising the metrics from the individual splits. Also includes minor formatting improvements for standard classification reports.
  • (#438) Add links and small docs to InnerEye-Gateway and InnerEye-Inference
  • (#439) Enable automatic job recovery from the last recovery checkpoint in case of job pre-emption on AML. Users can now keep more than one recovery checkpoint.
  • (#442) Enable defining custom scalar losses (ScalarLoss.CustomClassification and CustomRegression), prediction targets (ScalarModelBase.target_names), and reporting (ModelConfigBase.generate_custom_report()) in scalar configs, providing more flexibility for defining model configs with custom behaviour while leveraging the existing InnerEye workflows.
  • (#444) Added setup scripts and documentation to work with the FastMRI challenge datasets.
  • (#444) Git-related information is now printed to the console for easier diagnostics.
  • (#445) Adding test coverage for the HelloContainer model with multiple GPUs
  • (#450) Adds the metric "Accuracy at threshold 0.5" to the classification report (classification_crossval_report.ipynb).
  • (#451) Write a file model_outputs.csv with columns subject, prediction_target, label, model_output and cross_validation_split_index. This file is not written out for sequence models.
  • (#440) Added support for training of self-supervised models (BYOL and SimCLR) based on the bring-your-own-model framework. Provides example configurations for training SSL models on CIFAR10/100 datasets as well as on chest X-ray datasets such as the NIH Chest-Xray or RSNA Pneumonia Detection Challenge datasets. See SSL doc for more details.
  • (#455) All models trained on AzureML are registered. The codepath previously allowed only segmentation models (subclasses of SegmentationModelBase) to be registered. Models are registered after a training run or if the only_register_model flag is set. Models may be legacy InnerEye config-based models or may be defined using the LightningContainer class. Additionally, the TrainHelloWorldAndHelloContainer job in the PR build has been split into two jobs, TrainHelloWorld and TrainHelloContainer. A pytest marker after_training_hello_container has been added to run tests after training is finished in the TrainHelloContainer job.
  • (#456) Adding configs to train Covid detection models.
  • (#463) Add arguments dirs_recursive and dirs_non_recursive to mypy_runner.py to let users specify a list of directories to run mypy on.
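
The multilabel encoding described in #391 above (class indices separated by `|`, for example "0|2|4") can be decoded into a multi-hot vector as sketched below. This is an illustration only: the function `labels_to_multi_hot` and the example class names are hypothetical, and the treatment of an empty label string as "no labels" is an assumption rather than documented toolbox behaviour.

```python
from typing import List


def labels_to_multi_hot(label: str, class_names: List[str]) -> List[int]:
    """Convert a label string such as "0|2|4" into a multi-hot vector.

    Hypothetical helper: each index in `label` refers to a position in
    `class_names`, mirroring the ordering requirement described in #391.
    """
    vector = [0] * len(class_names)
    for token in label.split("|"):
        token = token.strip()
        if not token:
            # Assumption: an empty label string means "no positive labels".
            continue
        index = int(token)
        if not 0 <= index < len(class_names):
            raise ValueError(f"Label index {index} has no entry in class_names")
        vector[index] = 1
    return vector


if __name__ == "__main__":
    # Hypothetical class names, in the same order as the label indices.
    names = ["class_a", "class_b", "class_c", "class_d", "class_e"]
    print(labels_to_multi_hot("0|2|4", names))  # [1, 0, 1, 0, 1]
```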

Changed

  • (#385) Starting an AzureML run now uses the ScriptRunConfig object, rather than the deprecated Estimator object.
  • (#385) When registering a model, the name of the Python execution environment is added as a tag. This tag is read when running inference, and the execution environment is re-used.
  • (#411) Upgraded to PyTorch 1.8.0, PyTorch-Lightning 1.1.8 and AzureML SDK 1.23.0
  • (#432) Upgraded to PyTorch-Lightning 1.2.7. Add end-to-end test for classification cross-validation. WARNING: upgrade PL version causes hanging of multi-node training.
  • (#437) Upgrade to PyTorch-Lightning 1.2.8.
  • (#439) Recovery checkpoints are now named recovery_epoch=x.ckpt instead of recovery.ckpt or recovery-v0.ckpt.
  • (#451) Change the signature for function generate_custom_report in ModelConfigBase to take only the path to the reports folder and a ModelProcessing object.
  • (#444) The method before_training_on_rank_zero of the LightningContainer class has been renamed to before_training_on_global_rank_zero. The order in which the hooks are called has been changed.
  • (#458) Simplified and generalized the handling of data augmentations for classification models. The pipelining logic is now handled by an ImageTransformPipeline class that takes as input a list of transforms to chain together. This pipeline takes care of applying transforms to 3D or 2D images. The user can choose to apply the same transformation to all channels (for example, RGB) or a different transformation to each channel (for example, if each channel represents a different modality or time point). The pipeline can now work directly with out-of-the-box torchvision transforms (as long as they support [..., C, H, W] inputs). This removes the need for nearly all of our custom augmentation functions. The conversion from a pipeline of image transformations to ScalarItemAugmentation now happens under the hood, so the user does not need to call this wrapper for each config class. In models derived from ScalarModelConfig, users can override get_image_transform (resp. get_segmentation_transform) to change which augmentations are applied to the image inputs (resp. segmentation inputs). These two functions replace the old get_image_sample_transforms method. See docs/building_models.md for more information on augmentations.
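
The recovery checkpoint naming scheme from #439 above (recovery_epoch=x.ckpt) makes it straightforward to pick the most recent checkpoint from a folder listing. The sketch below is illustrative only: `latest_recovery_checkpoint` is a hypothetical helper, and it assumes that x is a non-negative integer epoch number.

```python
import re
from typing import Iterable, Optional

# Matches names such as "recovery_epoch=12.ckpt"; assumes an integer epoch.
RECOVERY_PATTERN = re.compile(r"^recovery_epoch=(\d+)\.ckpt$")


def latest_recovery_checkpoint(file_names: Iterable[str]) -> Optional[str]:
    """Return the recovery checkpoint with the highest epoch, or None.

    Hypothetical helper for illustration; files that do not follow the
    recovery_epoch=x.ckpt pattern (e.g. the older recovery.ckpt) are ignored.
    """
    best_epoch = -1
    best_name = None
    for name in file_names:
        match = RECOVERY_PATTERN.match(name)
        if match and int(match.group(1)) > best_epoch:
            best_epoch = int(match.group(1))
            best_name = name
    return best_name


if __name__ == "__main__":
    files = ["recovery_epoch=3.ckpt", "recovery_epoch=10.ckpt", "recovery.ckpt"]
    print(latest_recovery_checkpoint(files))  # recovery_epoch=10.ckpt
```

Comparing the parsed epoch as an integer, rather than sorting file names as strings, avoids ranking epoch 3 above epoch 10.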

Fixed

  • (#422) Documentation - clarified setting_up_aml.md datastore creation instructions and fixed small typos in hello_world_model.md
  • (#432) Fixed cross-validation for classification models. Fixed multi-gpu metrics aggregation. Add end-to-end test for classification cross-validation. Add fix to bug in ddp setting when running multi-node with 1 gpu per node.
  • (#435) If parameter model in AzureConfig is not set, display an error message and terminate the run.
  • (#437) Fixed multi-node DDP bug in PL v1.2.8. Re-add end-to-end test for multi-node.
  • (#445) Fixed a bug when running inference for container models on machines with >1 GPU

Removed

  • (#439) Deprecated start_epoch config argument.
  • (#450) Delete unused classification_report.ipynb.
  • (#455) Removed the AzureRunner conda environment. The full InnerEye conda environment is needed to submit a training job to AzureML.
  • (#458) Getting rid of all the unused code for RandAugment & Co. The user has now instead complete freedom to specify the set of augmentations to use.
  • (#468) Removed the KneeSinglecoil example model

Deprecated

0.2 (2021-01-29)

Added

  • (#323) There are new model configuration fields (and hence, commandline options), in particular for controlling PyTorch Lightning (PL) training:
    • max_num_gpus controls how many GPUs are used at most for training (default: all GPUs, value -1).
    • pl_num_sanity_val_steps controls the PL trainer flag num_sanity_val_steps
    • pl_deterministic controls the PL trainer flags benchmark and deterministic
    • generate_report controls if a HTML report will be written (default: True)
    • recovery_checkpoint_save_interval determines how often a checkpoint for training recovery is saved.
  • (#336) New extensions of SegmentationModelBase: HeadAndNeckBase and ProstateBase. Use these classes to build your own Head&Neck or Prostate models, by just providing a list of foreground classes.
  • (#363) Grouped dataset splits and k-fold cross-validation. This allows, for example, training on datasets with multiple images per subject without leaking data from the same subject across train/test/validation sets or cross-validation folds. To use this functionality, simply provide the name of the CSV grouping column (group_column) when creating the DatasetSplits object in your model config's get_model_train_test_dataset_splits() method. See the InnerEye.ML.utils.split_dataset.DatasetSplits class for details.
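
The idea behind grouped splits in #363 above is that all rows sharing a group value (for example, all images of one subject) end up on the same side of a split. The sketch below illustrates that idea in plain Python. It does not use or reproduce the DatasetSplits API; the function `grouped_split`, its parameters, and the example column names are hypothetical.

```python
import random
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Row = Dict[str, str]


def grouped_split(rows: Sequence[Row], group_column: str,
                  test_fraction: float, seed: int = 0) -> Tuple[List[Row], List[Row]]:
    """Split rows into train and test so that no group spans both sets.

    Hypothetical illustration of group-wise splitting, not the toolbox's
    implementation.
    """
    # Collect all rows belonging to each group.
    groups: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        groups[row[group_column]].append(row)
    # Shuffle group identifiers (not rows) with a fixed seed for repeatability.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)
    # Assign whole groups to the test set; at least one group is held out.
    num_test = max(1, round(len(group_ids) * test_fraction))
    test_ids = set(group_ids[:num_test])
    train = [r for g in group_ids if g not in test_ids for r in groups[g]]
    test = [r for g in group_ids if g in test_ids for r in groups[g]]
    return train, test


if __name__ == "__main__":
    # Hypothetical dataset: two images per subject, grouped by "subject".
    data = [{"image": f"img_{s}_{i}", "subject": f"s{s}"}
            for s in range(4) for i in range(2)]
    train_rows, test_rows = grouped_split(data, "subject", test_fraction=0.25)
    print(len(train_rows), len(test_rows))
```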

Changed

  • (#323) The codebase has undergone a massive refactoring, to use PyTorch Lightning as the foundation for all training. As a consequence of that:
    • Training is now using Distributed Data Parallel with synchronized batchnorm. The number of GPUs to use can be controlled by a new commandline argument max_num_gpus.
    • Several classes, like ModelTrainingSteps*, have been removed completely.
    • The final model is now always the one that is written at the end of all training epochs.
    • The old code that provided options to run full image inference at multiple epochs (i.e., multiple checkpoints) has been removed, alongside the respective commandline options save_start_epoch, save_step_epochs, epochs_to_test, test_diff_epochs, test_step_epochs, test_start_epoch
    • The commandline option register_model_only_for_epoch is now called only_register_model, and is boolean.
    • All metrics are written to AzureML and Tensorboard in a unified format. A training Dice score for 'bladder' would previously be called Train_Dice/bladder, now it is train/Dice/bladder.
    • Due to a different checkpoint format, it is no longer possible to use checkpoints written by the previous version of the code.
  • The arguments of the score.py script changed: data_root -> data_folder, it no longer assumes a fixed data subfolder. project_root -> model_root, test_image_channels -> image_files.
  • By default, the visualization of patch sampling for segmentation models will run on only 1 image (down from 5). This is because patch sampling is expensive to compute, taking 1min per large CT scan.
  • (#336) Renamed HeadAndNeckBase to HeadAndNeckPaper, and ProstateBase to ProstatePaper.
  • (#427) Move dicom loading function from SimpleITK to pydicom. Loading time improved by 30x.
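
The unified metric naming introduced in #323 above is given by a single example: Train_Dice/bladder became train/Dice/bladder. The sketch below generalizes from that one example and should be read as an assumption about the pattern, not as the toolbox's actual renaming logic; `to_unified_metric_name` is a hypothetical helper that may be useful when comparing metrics logged by older and newer runs.

```python
def to_unified_metric_name(old_name: str) -> str:
    """Map an old-style metric name to the unified format.

    Hypothetical helper, inferred from the single documented example
    "Train_Dice/bladder" -> "train/Dice/bladder": the part before the first
    underscore is lower-cased and joined to the rest with a slash.
    """
    prefix, separator, rest = old_name.partition("_")
    if not separator:
        # No underscore: leave the name unchanged.
        return old_name
    return f"{prefix.lower()}/{rest}"


if __name__ == "__main__":
    print(to_unified_metric_name("Train_Dice/bladder"))  # train/Dice/bladder
```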

Fixed

  • When registering a model, it now has a consistent folder structure, described here. This folder structure is present irrespective of using InnerEye as a submodule or not. In particular, exactly 1 Conda environment will be contained in the model.

Removed

  • The commandline options to control which checkpoint is saved, and which is used for inference, have been removed: save_start_epoch, save_step_epochs, epochs_to_test, test_diff_epochs, test_step_epochs, test_start_epoch
  • Removed blobxfer completely. When downloading a dataset from Azure, we now use AzureML dataset downloading tools. Please remove the following fields from your settings.yml file: 'datasets_storage_account' and 'datasets_container'.
  • Removed ProstatePaperBase.
  • Removed ability to perform sub-fold cross validation. The parameters number_of_cross_validation_splits_per_fold and cross_validation_sub_fold_split_index have been removed from ScalarModelBase.

Deprecated

0.1 (2020-11-13)

  • This is the baseline release.