Existing metrics for binary classification models use binary cross entropy, computed from the posteriors. That computation is not numerically safe when casting to 16-bit precision. This PR addresses that: metrics can now be computed either from the logits or from the posteriors.
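To illustrate why this matters, a minimal PyTorch sketch (not InnerEye's metric code) of posterior-based BCE overflowing in float16:

```python
import torch
import torch.nn.functional as F

# An extreme but realistic logit: a confidently wrong prediction.
logits = torch.tensor([12.0], dtype=torch.float16)
labels = torch.tensor([0.0], dtype=torch.float16)

# From posteriors: the sigmoid saturates to exactly 1.0 in float16, so the
# cross entropy term -log(1 - p) becomes infinite.
posteriors = torch.sigmoid(logits)
print(-torch.log(1 - posteriors))  # tensor([inf], dtype=torch.float16)

# From logits: binary_cross_entropy_with_logits folds the sigmoid into the
# loss (log-sum-exp trick) and stays finite for the same prediction.
print(F.binary_cross_entropy_with_logits(logits.float(), labels.float()))  # ~12.0
```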
Add a `use-dicom` command line option to inference. If set, scoring expects the input to be a zip of a DICOM series, and the output will be a zip of a DICOM-RT file.
- Fix a problem where GPU utilization metrics were not correctly visible in AzureML
- Log GPU utilization as a table, so that we don't have too many metrics
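A sketch of what table-based logging looks like with the AzureML SDK (the table and column names here are hypothetical, not the ones the PR uses):

```python
from azureml.core import Run

run = Run.get_context()
# Logging one table instead of one metric per GPU and timestamp keeps the
# run's metric count low.
run.log_table("GPU utilization", {
    "gpu_id": [0, 1],
    "utilization_percent": [87.5, 91.2],
    "memory_used_mb": [10240.0, 11008.0],
})
```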
- Fix coverage reporting complaining about the HTML output folder
- Exclude the Tests* folders from the report, so that the overall coverage figures make more sense
* Fix vacuous test for at-least-one in dataset split
The expression `all([len(x[mode]) >= 1] for mode in x.keys())` always evaluates to `True`, because each item the generator yields is a single-element list, and `bool([False]) == True`.
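A minimal sketch of the bug and the fix, where `x` stands for the dict that maps each mode to its list of subjects:

```python
x = {"train": [], "val": ["subject1"], "test": ["subject2"]}

# Buggy check: the generator yields one single-element list per mode, and any
# non-empty list is truthy, so all(...) is True even though train is empty.
assert all([len(x[mode]) >= 1] for mode in x.keys())

# Fixed check: yield the booleans themselves; this correctly detects the
# empty train split.
assert not all(len(x[mode]) >= 1 for mode in x.keys())
```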
* Fix post-init validation of pairwise split intersections
Previously, it erroneously checked that the three-way intersection of train,
test, and val was empty, whereas the correct check is that each pairwise
intersection is empty: train-test, train-val, and test-val.
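The corrected check can be sketched with `itertools.combinations` (the split contents are made up):

```python
from itertools import combinations

splits = {"train": {"a", "b"}, "val": {"c"}, "test": {"a"}}

# The three-way intersection is empty even though "a" leaks between train
# and test, so the old check passes on this invalid split.
assert not set.intersection(*splits.values())

# The correct validation: every pair of splits must be disjoint.
for (name1, ids1), (name2, ids2) in combinations(splits.items(), 2):
    if ids1 & ids2:
        print(f"Subjects {ids1 & ids2} appear in both {name1} and {name2}")
```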
* Add group_column and validation logic
* Add method to split dataset by arbitrary key column
* Delegate DatasetSplits.from_subject_ids to _from_split_keys
* Add DatasetSplits.from_groups convenience method
* Add grouping logic to DatasetSplits.from_proportions
* Implement grouped k-fold cross-validation
This employs scikit-learn's GroupKFold class (see the sketch after this list).
* Add tests for grouped splits and grouped k-fold crossval
* Update changelog
* Fix mypy warnings
* Document that restricted and by-institution splits don't support grouping
* Add validation of groups in test data
* Move itertools.combinations import to top of file
* Add grouped splits usage instructions to changelog
* Move itertools.combinations import to top of test file
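To picture the grouped k-fold behaviour mentioned above, a small scikit-learn sketch (the subject IDs and data are made up; this uses GroupKFold directly, not the DatasetSplits API):

```python
import numpy as np
from sklearn.model_selection import GroupKFold

# Hypothetical data: 8 scans from 4 subjects. The group label keeps all scans
# of one subject inside the same fold, so no subject leaks across folds.
X = np.arange(8).reshape(-1, 1)
y = np.array([0, 1, 0, 1, 0, 1, 0, 1])
groups = np.array(["s1", "s1", "s2", "s2", "s3", "s3", "s4", "s4"])

for fold, (train_idx, val_idx) in enumerate(GroupKFold(n_splits=2).split(X, y, groups)):
    assert not set(groups[train_idx]) & set(groups[val_idx])  # groups never overlap
    print(f"fold {fold}: train groups {sorted(set(groups[train_idx]))}, "
          f"val groups {sorted(set(groups[val_idx]))}")
```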
Removes the ability to perform sub-fold cross validation. Removes parameters `number_of_cross_validation_splits_per_fold`
and `cross_validation_sub_fold_split_index` from ScalarModelBase.
* Renamed HeadAndNeckBase to HeadAndNeckPaper, and ProstateBase to ProstatePaper.
* Added new extensions of SegmentationModelBase, named HeadAndNeckBase and ProstateBase.
* Deleted the unused ProstatePaperBase.py.
* Added validation for SliceExclusionRule and SummedProbabilityRule.
* Remove blobxfer
* Update CHANGELOG.md
* Remove configs that are not required
* Remove from environment.yml
* Fix numba issue
* Improve CHANGELOG.md
* Fix tests
* Test
* Fix test
* Fix download
* Create a separate model folder
* Fix tests
* Improve the HD check
* Tests
* Invert logic
* Register the model on the parent run (see the sketch after this list)
* Documentation
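The parent-run registration can be sketched with the AzureML SDK (the model name and path are hypothetical):

```python
from azureml.core import Run

run = Run.get_context()
# In a cross-validation job the child runs should not each register a model;
# registering on the parent run yields a single model spanning all children.
target_run = run.parent if run.parent is not None else run
model = target_run.register_model(model_name="segmentation_model",
                                  model_path="outputs/final_model")
print(model.name, model.version)
```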
- Make file structure consistent across normal training and training when InnerEye is a submodule
- Add test coverage for the file structure of registered models
- Add documentation describing what the registered model structure looks like
- If multiple Conda files are used in an InnerEye run, they are merged into one environment file for deployment (a simplified sketch of the merge follows this list). The complicated merge inside of `run_scoring` could in principle be deprecated, but it is left in place in case it is needed for legacy models.
- Add test coverage for `submit_for_inference`: the previous test used a hardcoded legacy model, meaning that any change to the model structure could have broken the script
- The test for `submit_for_inference` is no longer submitted from the big AzureML run, shortening the runtime of that part of the PR build. Instead, it is triggered after the `TrainViaSubmodule` part of the build. The corresponding AzureML experiment is no longer `model_inference`, but the same experiment as all other AzureML runs.
- The test for `submit_for_inference` was previously running on the expensive `training-nd24` cluster, now on the cheaper `nc12`.
- `submit_for_inference` now correctly uses the `score.py` file that is inside of the model, rather than copying it from the repository root.
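A simplified sketch of the Conda merge, assuming `pyyaml` and plain top-level `channels`/`dependencies` lists (the real merge must also handle pip sections and version conflicts):

```python
import yaml

def merge_conda_files(paths, output_path):
    """Merge the channels and dependencies of several Conda env files into one."""
    merged = {"name": "deployment", "channels": [], "dependencies": []}
    for path in paths:
        with open(path) as f:
            env = yaml.safe_load(f)
        for channel in env.get("channels", []):
            if channel not in merged["channels"]:
                merged["channels"].append(channel)
        for dep in env.get("dependencies", []):
            if dep not in merged["dependencies"]:
                merged["dependencies"].append(dep)
    with open(output_path, "w") as f:
        yaml.safe_dump(merged, f)
```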
- Adds a parameter `weights_url` to DeepLearningConfig to download model weights from a URL.
- Adds a parameter `local_weights_path` to DeepLearningConfig to initialize model weights from a local checkpoint. This can also be used to perform inference on a checkpoint from a local training run (a rough sketch of both options follows below).
- Refactors all checkpoint logic, including recovering from run_recovery into a class CheckpointHandler
- Adds a parameter `epochs_to_test` to DeepLearningConfig which can be used to specify a list of epochs to test in a training/inference run.
- Deprecates DeepLearningConfig parameters `test_diff_epochs`, `test_step_epochs` and `test_start_epoch`.
Closes #178. Closes #297.
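A rough sketch of how a checkpoint handler might resolve these two options (the function name is hypothetical; the real CheckpointHandler also covers run recovery):

```python
from pathlib import Path
from urllib.request import urlretrieve

import torch

def resolve_initial_weights(weights_url: str = "",
                            local_weights_path: str = "",
                            download_dir: Path = Path("checkpoints")):
    """Return a state dict, from a local checkpoint file or a downloaded one."""
    if local_weights_path:
        checkpoint_file = Path(local_weights_path)
    elif weights_url:
        download_dir.mkdir(parents=True, exist_ok=True)
        checkpoint_file = download_dir / Path(weights_url).name
        if not checkpoint_file.exists():
            urlretrieve(weights_url, checkpoint_file)  # download once, then reuse
    else:
        raise ValueError("Either weights_url or local_weights_path must be set")
    return torch.load(checkpoint_file, map_location="cpu")
```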
- Rename the `TestOutputDirectories` class, because pytest collects any class whose name starts with `Test` and expects it to contain tests
- Switch fields to using `Path`, rather than `str`