Autosave checkpoints by default after every epoch, to a fixed file name. Retire the "top k" recovery checkpoint notion, which was tied to specific models that needed more than one checkpoint.
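A minimal sketch of the autosave idea, not the code from this change: each epoch overwrites one checkpoint file with a fixed name, so recovery always finds the latest state in the same place. The file name `autosave.ckpt`, the function `autosave_checkpoint`, and the use of `pickle` are assumptions made for illustration only.

```python
import os
import pickle
import tempfile
from pathlib import Path

# Hypothetical fixed file name, chosen for this sketch only.
AUTOSAVE_FILE_NAME = "autosave.ckpt"


def autosave_checkpoint(state: dict, folder: Path, epoch: int, every_n_epochs: int = 1) -> bool:
    """Overwrite the single autosave file if this (0-based) epoch is due.

    Returns True if a checkpoint was written, False otherwise.
    """
    if every_n_epochs <= 0 or (epoch + 1) % every_n_epochs != 0:
        return False
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / AUTOSAVE_FILE_NAME
    # Write to a temporary file in the same folder, then rename it over the
    # target, so an interrupted job does not leave a half-written checkpoint.
    fd, tmp_name = tempfile.mkstemp(dir=folder, suffix=".tmp")
    with os.fdopen(fd, "wb") as f:
        pickle.dump({"epoch": epoch, **state}, f)
    os.replace(tmp_name, target)
    return True
```

Because the file name never changes, there is no "top k" bookkeeping: the folder holds exactly one recovery checkpoint at any time.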
* update pl
* fix one test
* our fix not needed anymore
* fix yet another test
* add new torchmetrics
* fix checkpoints
* fix some test
* fix one test more
* attempt to fix test
* Update byol code to match new pl bolts
* needed to update
* back to how it was
* update
* update
* changelog
* update regression metrics
* skip test on wind
* flake8
* forgot to update this
* mypy
* remove comment
* try to see if problem comes from sync dist flag
* few fixes
* Update expected number of subjects
* correct more
* flake8
Co-authored-by: Anton Schwaighofer <antonsc@microsoft.com>
- The regression test code in #492 missed out files in AzureML, because the wrong function was called in the main workflow.
- Cleaner folder structure for regression test results
- Enable regression tests on text and binary files that are either produced by the job or uploaded to the run context
- Adding a large set of these regression test files to all models in PR builds
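The file-based regression tests described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this change: `compare_files`, `compare_folders`, the message strings, and the list of text suffixes are all hypothetical. Text files are compared line by line (so line-ending differences are ignored), binary files byte by byte.

```python
from pathlib import Path
from typing import List, Sequence


def compare_files(expected: Path, actual: Path, is_text: bool) -> str:
    """Return an empty string if the files match, else a short description."""
    if not actual.is_file():
        return "missing"
    if is_text:
        # Compare by lines so that \n versus \r\n does not cause a failure.
        expected_lines = expected.read_text(encoding="utf-8").splitlines()
        actual_lines = actual.read_text(encoding="utf-8").splitlines()
        return "" if expected_lines == actual_lines else "contents mismatch"
    return "" if expected.read_bytes() == actual.read_bytes() else "contents mismatch"


def compare_folders(expected_folder: Path, actual_folder: Path,
                    text_suffixes: Sequence[str] = (".txt", ".csv", ".json")) -> List[str]:
    """Check every file under expected_folder against the same relative path
    in actual_folder. Returns one message per discrepancy."""
    messages = []
    for expected in sorted(expected_folder.rglob("*")):
        if expected.is_dir():
            continue
        relative = expected.relative_to(expected_folder)
        is_text = expected.suffix in text_suffixes
        message = compare_files(expected, actual_folder / relative, is_text)
        if message:
            messages.append(f"{relative}: {message}")
    return messages
```

In this sketch, `actual_folder` would hold either the files written by the job or the files downloaded from the run context; an empty result list means the regression test passes.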