* Fix shell lexer name
* Update CHANGELOG
* Fix CHANGELOG
* Fix "html_static_path entry '_static' does not exist"
* Clean up preprocess script
* Fix link to InnerEye-DataQuality
* Use shutil.copy to copy files
* Remove extra info from CHANGELOG
* Fix broken link to LICENSE
* Fix lexer name for YAML
* Remove colons from headers
* Fix InnerEye module not being found
* Fix DeepMIL metrics input bug
* Add first version of metrics tests
* Update submodule
* Add test for DeepMIL metrics inputs
* Clean up and update submodule
* Update changelog
* Upgrade mlflow due to Component Governance warning
* Replace RadIO with TorchIO
* Ensure patches are float32 for forward pass
* Update changelog
* Ignore some types to fix mypy errors
* Remove APEX from conda environment in docs example
Co-authored-by: Javier <jaalvare@microsoft.com>
* Add subsampling transform
* Add option to allow_missing_keys for Subsampled
* Add dropout param to BaseMIL
* Add docstring and tests for Subsampled
* Update changelog
* Update to hi-ml with mean pooling
* Enable mean pooling in DeepMIL
* Add/refactor mean pooling tests
* Update changelog
* Update to latest hi-ml with mean pooling
While we updated DeepMIL for the Panda dataset to work with the latest changes, we did not update DeepMIL for the TCGA CRCK dataset.
This PR updates how the encoded tiles are cached and how the checkpoints of the DeepMIL model are saved and loaded.
No additional tests are required since these are the same functions that we use for the Panda dataset. For all of them a test already exists.
Lastly, the PR updates the cudatoolkit version. Anton and I found that the cudatoolkit version was the root cause of all our problems with DDP.
This PR contains two changes necessary to run DeepSMILE on a large dataset when using the InnerEye SSL checkpoint (or any other high-dimensional encoder):
1. An option to encode in chunks (this prevents OOM errors when performing the encoding).
2. An option to load the cached encoded dataset on the CPU (this prevents OOM errors when loading from the cache).
It also changes how PNG images are loaded throughout the histo pipeline to make loading faster (see https://hi-ml.readthedocs.io/en/latest/loading_images.html)
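The chunked encoding option mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `iter_chunks`, `encode_in_chunks`, and `toy_encoder` are hypothetical names, and plain Python lists stand in for tile tensors so the example runs without a GPU or any deep learning library.

```python
from typing import Callable, Iterator, List, Sequence


def iter_chunks(items: Sequence[float], chunk_size: int) -> Iterator[Sequence[float]]:
    """Yield consecutive slices of at most `chunk_size` items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


def encode_in_chunks(
    tiles: Sequence[float],
    encode_batch: Callable[[Sequence[float]], List[float]],
    chunk_size: int,
) -> List[float]:
    """Encode `tiles` one chunk at a time, so that only `chunk_size` inputs
    are passed to the encoder in a single call (hypothetical helper)."""
    encoded: List[float] = []
    for chunk in iter_chunks(tiles, chunk_size):
        encoded.extend(encode_batch(chunk))
    return encoded


def toy_encoder(batch: Sequence[float]) -> List[float]:
    """Stand-in for a real encoder forward pass (assumption for illustration)."""
    return [2.0 * x for x in batch]


tiles = [float(i) for i in range(10)]
features = encode_in_chunks(tiles, toy_encoder, chunk_size=4)
chunk_lengths = [len(chunk) for chunk in iter_chunks(tiles, 4)]
```

The result is the same as encoding everything in one call, but peak memory during encoding is bounded by the chunk size rather than by the number of tiles in a bag.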
AzureML jobs from failed previous PR builds do not get cancelled, consuming excessive resources. The build now kills all queued and running jobs before starting new ones.
The SimCLR cosine LR scheduler was getting the wrong length information when the linear head dataset was longer than the encoder dataset. Also removed lots of obsolete pytest.skipif.is_windows() annotations.
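To illustrate why the length passed to a cosine schedule matters, here is a small sketch. It assumes, for illustration only, that the schedule should span the number of encoder batches; `cosine_lr` and the batch counts are hypothetical and are not taken from the InnerEye code.

```python
import math


def cosine_lr(step: int, total_steps: int, base_lr: float, min_lr: float = 0.0) -> float:
    """Cosine-annealed learning rate at `step` for a schedule of `total_steps`."""
    progress = min(step, total_steps) / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical numbers: the linear head dataset is longer than the encoder dataset.
encoder_batches = 100
linear_head_batches = 250

# Learning rate at the last encoder step when the schedule length is correct...
lr_correct_length = cosine_lr(encoder_batches, encoder_batches, base_lr=1.0)
# ...and when the schedule is (wrongly) stretched over the longer dataset.
lr_wrong_length = cosine_lr(encoder_batches, linear_head_batches, base_lr=1.0)
```

With the overestimated length, the learning rate has only decayed part of the way by the time training ends, so the schedule never reaches its minimum.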
* heatmap and thumbnail output, deepmil module for panda
* deepmil module panda subclass
* heatmap of selected tiles with correct location
* scaled and shifted rectangle coordinates
* slide dataset at container level
* Instantiate dataset class in basemil
* dataset paths for AML
* heatmap utils selected
* fix mypy and flake8 errors
* mypy error resolved
* address PR comments
* flake8 errors resolved
* PR comments addressed
* add test for plots
* PR comments more
* test for heatmap
* tests for heatmap
* add file comparison plot tests
* reverting test_dict values for other test with hardcoded values
* PR comments Valentina
* PR comments
* change colormap to reds
* remove redundant for loop
Co-authored-by: t-hsharma <t-hsharma@microsoft.com>
Autosave checkpoints every epoch by default, to a fixed file name. Retire the "top k" recovery checkpoint notion, because it was tied to specific models that needed more than one checkpoint.
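The idea of autosaving to a fixed file name can be sketched as below. This is an assumption-laden illustration, not the repository's implementation: `AUTOSAVE_NAME`, `autosave_checkpoint`, and `load_autosave` are hypothetical, and a JSON dict stands in for real model state.

```python
import json
import os
import tempfile
from pathlib import Path

AUTOSAVE_NAME = "autosave.json"  # hypothetical fixed file name


def autosave_checkpoint(folder: Path, state: dict) -> Path:
    """Write `state` to a fixed file name, replacing the previous autosave."""
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / AUTOSAVE_NAME
    # Write to a temporary file first, then rename, so that an interrupted
    # job never leaves a half-written checkpoint under the fixed name.
    fd, tmp_name = tempfile.mkstemp(dir=folder, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_name, target)
    return target


def load_autosave(folder: Path) -> dict:
    """Load the single recovery checkpoint from `folder`."""
    with (folder / AUTOSAVE_NAME).open() as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    ckpt_dir = Path(tmp) / "checkpoints"
    for epoch in range(3):  # save once per epoch
        autosave_checkpoint(ckpt_dir, {"epoch": epoch})
    saved_files = sorted(p.name for p in ckpt_dir.iterdir())
    last_state = load_autosave(ckpt_dir)
```

Because every save overwrites the same file, recovery only ever needs to look for one checkpoint, which is the simplification that makes the "top k" bookkeeping unnecessary.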
* Add basic dataset and environment changes
* Add loading/preproc utils
* Back-up PANDA tiling scripts
* Refactor and generalise tiling scripts
* Remove Azure scripts
* Add test WSI file
* Add preprocessing tests
* Update changelog
* Add Linux condition for cuCIM in environment.yml
* Use PANDA instead of TCGA-PRAD in test
* Leave TcgaPradDataset as an example
* Fix skipped InnerEye dataset tests
* Create and test mock slides dataset
* Remove Tests/ML/datasets from pytest discovery
- Callbacks presently need to be specified via trainer kwargs, which is cumbersome. Introduce a callbacks method.
- Add a flag to control how often validation happens
Workaround for a temporary issue with low-priority preemption: checkpoint files are not available on disk upon job restart. The workaround tries to download them from AML instead.