final public release PR (#125)
* Merged PR 42: Python package structure
* Merged PR 50: Röth-Tarantola generative model for velocities
  - Created Python package structure for generative models for velocities
  - Implemented the [Röth-Tarantola model](https://doi.org/10.1029/93JB01563)
* Merged PR 51: Isotropic AWE forward modelling using Devito
  - Implemented forward modelling for the isotropic acoustic wave equation using [Devito](https://www.devitoproject.org/)
* Merged PR 52: PRNG seed
  - Exposed the PRNG seed in generative models for velocities
* Merged PR 53: Docs update
  - Updated LICENSE
  - Added Microsoft Open Source Code of Conduct
  - Added Contributing section to README
* Merged PR 54: CLI for velocity generators
* Merged PR 69: CLI subpackage using Click
  - Reimplemented the CLI as a subpackage using Click
* Merged PR 70: VS Code settings
* Merged PR 73: CLI for forward modelling
* Merged PR 76: Unit fixes
  - Changed velocities to use km/s instead of m/s
  - Fixed the CLI interface
* Merged PR 78: Forward modelling CLI fix
* Merged PR 85: Version 0.1.0
* Merging work on salt dataset
* Adds computer vision to dependencies
* Updates dependencies
* Update
* Updates the environment files
* Updates readme and envs
* Initial running version of dutchf3
* INFRA: added structure templates.
* VOXEL: initial rough code push - need to clean up before PRing.
* Working version
* Working version before refactor
* adding cgmanifest to staging
* adding a yml file with CG build task
* added prelim NOTICE file
* quick minor fixes in README
* 3D SEG: first commit for PR.
* 3D SEG: removed data files to avoid redistribution.
* Merged PR 126: updated notice file with previously excluded components
* Updates
* 3D SEG: restyled batch file, moving onto others.
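The Röth-Tarantola generator merged in PR 50 produces layered 1-D velocity profiles, with the PRNG seed exposed as in PR 52. A minimal sketch of the idea, assuming layer velocities are sampled so they tend to increase with depth; the function name, parameters, and sampling scheme below are illustrative assumptions, not the package's actual API (velocities in km/s, per PR 76):

```python
import numpy as np

def generate_velocities(n_layers=8, depth=32, v_min=1.5, v_max=4.0, seed=None):
    """Sketch of a Röth-Tarantola-style layered velocity profile (km/s).

    Layer velocities are drawn so that they tend to increase with depth:
    layer i samples uniformly from a window whose centre slides linearly
    from v_min to v_max. Illustrative only, not the repo's actual code.
    """
    rng = np.random.RandomState(seed)  # exposed seed, as in PR 52
    centres = np.linspace(v_min, v_max, n_layers)
    half_window = (v_max - v_min) / (2 * n_layers)
    layer_v = rng.uniform(centres - half_window, centres + half_window)
    # Expand the per-layer velocities to a per-depth-sample profile.
    boundaries = np.linspace(0, depth, n_layers + 1).astype(int)
    profile = np.empty(depth)
    for i in range(n_layers):
        profile[boundaries[i]:boundaries[i + 1]] = layer_v[i]
    return profile

profile = generate_velocities(seed=42)
```

Fixing the seed makes runs reproducible, which is the point of exposing it through the generator CLI.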
* Working HRNet
* 3D SEG: finished going through Waldeland code
* Updates test scripts and makes it take processing arguments
* minor update
* Fixing imports
* Refactoring the experiments
* Removing .vscode
* Updates gitignore
* Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py
  This PR includes the following changes:
  - added README instructions for running f3dutch experiments
  - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic.
  - ran the black formatter on the file, which created all the formatting changes (sorry!)
* Merged PR 204: Adds loaders to deepseismic from cv_lib
* Merged PR 209: changes to section loaders in data.py
  Changes in this PR will affect patch scripts as well. The following changes are required in patch scripts:
  - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate functions to load section and patch loaders.
  - SectionLoader now swaps the H and W dims. When loading test data in patch scripts, this line can be removed (and tested) from test.py: h, w = img.shape[-2], img.shape[-1] # height and width
* Merged PR 210: BENCHMARKS: added placeholder for benchmarks.
* Merged PR 211: Fixes issues left over from changes to data.py
* Merged PR 220: Adds Horovod and fixes
  - Adds Horovod training script
  - Updates dependencies in the Horovod docker file
  - Removes hard-coding of a path in data.py
* Merged PR 222: Moves cv_lib into repo and updates setup instructions
* Merged PR 236: Cleaned up dutchf3 data loaders
  @<Mathew Salvaris>, @<Ilia Karmanov>, @<Max Kaznady>, please check whether this PR will affect your experiments. The main change is in the initialization of the sections/patches attributes of the loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader; similarly for test loaders. This will affect your code if you access these attributes, e.g. if you have something like this in your experiments:
  ```
  train_set = TrainPatchLoader(…)
  patches = train_set.patches[train_set.split]
  ```
  or
  ```
  train_set = TrainSectionLoader(…)
  sections = train_set.sections[train_set.split]
  ```
* Updates the repo with preliminary results for 2D segmentation
* Merged PR 248: Experiment: section-based Alaudah training/testing
  This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment.
* Merged PR 253: Waldeland based voxel loaders and TextureNet model
  Related work items: #16357
* Merged PR 290: A demo notebook on local train/eval on F3 data set
  Notebook and associated files, plus a minor change in the patch_deconvnet_skip.py model file.
  Related work items: #17432
* Merged PR 312: moved dutchf3_section to experiments/interpretation
  Related work items: #17683
* Merged PR 309: minor change to README to reflect the changes in the prepare_data script
  Related work items: #17681
* Merged PR 315: Removing voxel exp
  Related work items: #17702
* Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo.
  Realized there was one bug in the code, and the rest of the functions did not work with the versions of the libraries listed in the conda yaml file. Also updated the download script.
  Related work items: #18264
* Merged PR 405: minor mods to notebook, more documentation
  A very small PR - just a few more lines of documentation in the notebook, to improve clarity.
  Related work items: #17432
* Merged PR 368: Adds penobscot
  Adds for penobscot:
  - Dataset reader
  - Training script
  - Testing script
  - Section depth augmentation
  - Patch depth augmentation
  - Inline visualisation for Tensorboard
  Related work items: #14560, #17697, #17699, #17700
* Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators
  Related work items: #16362
* Merged PR 452: decouple docker image creation from azureml
  Removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb. All other changes are due to trivial reruns.
  Related work items: #18346
* Merged PR 512: Pre-commit hooks for formatting and style checking
  Opening this PR to start the discussion. I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting we are using black, and for style checking flake8. The following files are added:
  - .pre-commit-config.yaml - defines the git hooks to be installed
  - .flake8 - settings for the flake8 linter
  - pyproject.toml - settings for the black formatter
  The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors.
  Some questions to start the discussion:
  - Do you want to change any of the default settings in the dotenv files, like the line lengths or the error messages we exclude or include?
  - Do we want to have a requirements-dev.txt file for contributors? This setup uses the pre-commit package; I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file.
  - Once you have the hooks installed, they will only affect the files you commit in the future. A big chunk of our codebase does not conform to the formatting/style settings, so we will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant-looking PR :) Any thoughts on how we should approach this? Thanks!
  Related work items: #18350
* Merged PR 513: 3D training script for Waldeland's model with Ignite
  Related work items: #16356
* Merged PR 565: Demo notebook updated with 3D graph
  Changes: 1) updated the demo notebook with the 3D visualization; 2) formatting changes due to the new black/flake8 git hook.
  Related work items: #17432
* Merged PR 569: Minor PR: change to pre-commit configuration files
  Related work items: #18350
* Merged PR 586: Purging unused files and experiments
  Related work items: #20499
* Merged PR 601: Fixes to penobscot experiments
  A few changes:
  - Instructions in README on how to download and process the Penobscot and F3 2D data sets
  - moved the prepare_data scripts to the scripts/ directory
  - fixed a weird issue with a class method in the Penobscot data loader
  - fixed a bug in the section loader (_add_extra_channel in the section loader was not necessary and was causing an issue)
  - removed config files that were not tested or working in Penobscot experiments
  - modified default.py so it works if train.py is run without a config file
  Related work items: #20694
* Merged PR 605: added common metrics to Waldeland model in Ignite
  Related work items: #19550
* added cela copyright headers to all non-empty .py files (#3)
* switched to ACR instead of docker hub (#4)
* sdk.v1.0.69, plus switched to ACR push.
  ACR pull coming next
* full acr use, push and pull, and use in Estimator
* temp fix for docker image bug
* fixed the az acr login --username and --password issue
* full switch to ACR for docker image storage
* Vapaunic/metrics (#1)
  - added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script
  - minor wording fix
  - enabled splitting dataset into sections, rather than only patches
  - merged duplicate if/else blocks
  - refactored prepare_data.py
  - added scripts for section train/test
  - section train/test works for single-channel input
  - train and test scripts for section-based training/testing
  - removing experiments from deep_seismic, following the new structure
  - section train/test scripts
  - Add cv_lib to repo and updates instructions
  - Removes data.py and updates readme
  - Updates requirements
  - renamed train/test scripts
  - train/test works on alaudah section experiments, a few minor bugs left
  - cleaning up loaders
  - training/testing for sections works
  - minor changes
  - reverting changes on dutchf3/local/default.py file
  - added config file
  - sync with new experiment structure
  - added a logging handler for array metrics
  - first draft of metrics based on the ignite confusion matrix
  - metrics now based on ignite.metrics
  - modified patch train.py with new metrics
  - modified metrics with ignore_index
  - Merged PR 341: Tests for cv_lib/metrics
    This PR is dependent on the tests created in the previous branch !333; that's why the PR merges tests into the vapaunic/metrics branch (so the changed files below only include the diff between these two branches). However, I can change this once vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top-level /tests for integration testing. Let me know if you have any comments on the tests or the structure. As agreed, I'm using pytest.
    Related work items: #16955
  - merged tests into this branch
  - moved prepare data under scripts
  - removed untested model configs
  - fixed weird bug in penobscot data loader
  - penobscot experiments working for hrnet, seresnet, no depth and patch depth
  - removed a section loader bug in the penobscot loader
  - fixed bugs in my previous 'fix'
  - removed redundant _open_mask from subclasses
  - Removed redundant extract_metric_from
  - formatting changes in metrics
  - modified penobscot experiment to use new local metrics
  - modified section experiment to pass device to metrics
  - moved metrics out of dutchf3, modified distributed to work with the new metrics
  - fixed other experiments after new metrics
  - removed apex metrics from distributed train.py
  - added ignite-based metrics to dutch voxel experiment
  - removed apex metrics
  - modified penobscot test script to use new metrics
  - pytorch-ignite pre-release with new metrics until stable is available
  - removed cell output from the F3 notebook
  - deleted .vscode
  - modified metric import in test_metrics.py
  - separated metrics out as a module
* BUILD: added build setup files. (#5)
* Update main_build.yml for Azure Pipelines
* BUILD: added build status badges (#6)
* Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7)
  - Finished version of numpy data loader
  - Working training script for demo
  - Adds the new metrics
  - Fixes docstrings and adds header
  - Removing extra setup.py
* Log config file now experiment specific (#8)
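The numpy dataset loader added in (#7) can be sketched roughly as below; the class name, constructor arguments, and slicing scheme are illustrative assumptions, not the repo's actual implementation:

```python
import numpy as np

class NumpyDataset:
    """Minimal sketch of a dataset backed by numpy arrays (illustrative only).

    Wraps a seismic volume and its label volume and yields (section, label)
    pairs, mirroring the torch Dataset protocol (__len__/__getitem__)
    without depending on torch.
    """

    def __init__(self, images, labels):
        assert images.shape == labels.shape, "image/label volumes must align"
        self.images = images
        self.labels = labels

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Add a channel axis so downstream 2-D conv models accept the slice.
        return self.images[idx][np.newaxis, ...], self.labels[idx]

# usage: slice a toy 3-D volume into 2-D sections
volume = np.random.rand(10, 64, 64).astype(np.float32)
labels = np.zeros_like(volume, dtype=np.int64)
ds = NumpyDataset(volume, labels)
img, lbl = ds[0]
```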
* Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes it take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * minor wording fix * enabled splitting dataset into sections, rather than only patches * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * merged duplicate ifelse blocks * refactored prepare_data.py * refactored prepare_data.py * added scripts for section train test * added scripts for section train test * section train/test works for single channel input * section train/test works for single channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran black formatter on the file, which created all the formatting changes (sorry!) * Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section based training/testing * train and test script for section based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate function to load section and patch loaders. - SectionLoader now swaps H and W dims. 
When loading test data in patch, this line can be removed (and tested) from test.py h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new struct * removing experiments from deep_seismic, following the new struct * Merged PR 220: Adds Horovod and fixes Add Horovod training script Updates dependencies in Horovod docker file Removes hard coding of path in data.py * section train/test scripts * section train/test scripts * Add cv_lib to repo and updates instructions * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Removes data.py and updates readme * Updates requirements * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes. E.g. 
if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training testing for sections works * training testing for sections works * minor changes * minor changes * reverting changes on dutchf3/local/default.py file * reverting changes on dutchf3/local/default.py file * added config file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * sync with new experiment structure * added a logging handler for array metrics * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * metrics now based on ignite.metrics * modified patch train.py with new metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo. 
Realized there was one bug in the code and the rest of the functions did not work with the different versions of libraries which we have listed in the conda yaml file. Also updated the download script. Related work items: #18264 * modified metrics with ignore_index * modified metrics with ignore_index * Merged PR 405: minor mods to notebook, more documentation A very small PR - Just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432 * Merged PR 368: Adds penobscot Adds for penobscot - Dataset reader - Training script - Testing script - Section depth augmentation - Patch depth augmentation - Iinline visualisation for Tensorboard Related work items: #14560, #17697, #17699, #17700 * Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Related work items: #16362 * Merged PR 452: decouple docker image creation from azureml removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb All other changes are due to trivial reruns Related work items: #18346 * Merged PR 512: Pre-commit hooks for formatting and style checking Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting, we are using black, and style checking flake8. The following files are added: - .pre-commit-config.yaml - defines git hooks to be installed - .flake8 - settings for flake8 linter - pyproject.toml - settings for black formatter The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors. 
Some questions to start the discussion: - Do you want to change any of the default settings in these files - like the line lengths, or the error messages we exclude or include? - Do we want to have a requirements-dev.txt file for contributors? This setup uses the pre-commit package; I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file. - Once you have the hooks installed, they will only affect the files you commit in the future. A big chunk of our codebase does not conform to the formatting/style settings, so we will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant-looking PR :) Any thoughts on how we should approach this? Thanks! Related work items: #18350 * Merged PR 513: 3D training script for Waldeland's model with Ignite Related work items: #16356 * Merged PR 565: Demo notebook updated with 3D graph Changes: 1) Updated demo notebook with the 3D visualization 2) Formatting changes due to the new black/flake8 git hook Related work items: #17432 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into the vapaunic/metrics branch (so the changed files below only include the diff between these two branches); I can change this once vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top-level /tests for integration testing. Let me know if you have any comments on these tests, or the structure. As agreed, I'm using pytest. Related work items: #16955 * merged tests into this branch * Merged PR 569: Minor PR: change to pre-commit configuration files Related work items: #18350 * Merged PR 586: Purging unused files and experiments Related work items: #20499 * moved prepare data under scripts * removed untested model configs * fixed weird bug in penobscot data loader * penobscot experiments working for hrnet, seresnet, no depth and patch depth * removed a section loader bug in the penobscot loader * fixed bugs in my previous 'fix' * removed redundant _open_mask from subclasses * Merged PR 601: Fixes to penobscot experiments A few changes: - Instructions in README on how to download and process the Penobscot and F3 2D data sets - moved prepare_data scripts to the scripts/ directory - fixed a weird issue with a class method in the Penobscot data loader - fixed a bug in the section loader (_add_extra_channel was not necessary and was causing an issue) - removed config files that were not tested or working in Penobscot experiments - modified default.py so it works if train.py is run without a config file Related work items: #20694 * Merged PR 605: added common metrics to Waldeland model in Ignite Related work items: #19550 * Removed redundant extract_metric_from * formatting changes in metrics * modified penobscot experiment to use new local metrics * modified section experiment to pass device to metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * fixed other experiments after new metrics * removed apex metrics from distributed train.py * added ignite-based metrics to dutch voxel experiment * removed apex metrics * modified penobscot test script to use new metrics * pytorch-ignite pre-release with new metrics until stable is available * removed cell output from the F3 notebook * deleted .vscode * modified metric import in test_metrics.py * separated metrics out as a module * relative logger file path, modified section experiment * removed the REPO_PATH from init * created util logging function, and moved logging file to each experiment * modified demo experiment * modified penobscot experiment * modified dutchf3_voxel experiment * no logging in voxel2pixel * modified dutchf3 patch local experiment * modified patch distributed experiment * modified interpretation notebook * minor changes to comments * DOC: forking disclaimer and new build names.
(#9) * Updating README.md with introduction material (#10) * Update README with introduction to DeepSeismic Add intro material for DeepSeismic * Adding logo file * Adding image to readme * Update README.md * Updates the 3D visualisation to use itkwidgets (#11) * Updates notebook to use itkwidgets for interactive visualisation * Adds jupytext to pre-commit (#12) * Add jupytext * Adds demo notebook for HRNet (#13) * Adding TF 2.0 to allow for tensorboard vis in notebooks * Modifies hrnet config for notebook * Add HRNet notebook for demo * Updates HRNet notebook and tidies F3 * removed my username references (#15) * moving 3D models into contrib folder (#16) * Weetok (#17) * Update it to include sections for imaging * Update README.md * Update README.md * added pytest to environment, and pytest job to the main build (#18) * Update main_build.yml for Azure Pipelines * minor stylistic changes (#19) * Update main_build.yml for Azure Pipelines Added template for integration tests for scripts and experiments Added setup and env Increased job timeout Added complete set of tests * BUILD: placeholder for Azure pipelines for notebooks build. BUILD: added notebooks job placeholders. BUILD: added github badges for notebook builds * CLEANUP: moved non-release items to contrib (#20) * Updates HRNet notebook 🚀 (#25) * Modifies pre-commit hook to modify output * Modifies the HRNet notebook to use the Penobscot dataset Adds parameters to limit iterations Adds parameters meta tag for papermill * Fixing merge peculiarities * Updates environment.yaml (#21) * Pins main libraries Adds cudatoolkit version based on issues faced during workshop * removing files * Updates Readme (#22) * Adds model instructions to readme * Update README.md (#24) I have collected points to all of our BP repos into this central place. We are trying to create links between everything to draw people from one to the other. Can we please add a pointer here to the readme? I have spoken with Max and will be adding Deep Seismic there once you have gone public. * CONTRIB: cleanup for imaging. (#28) * Create Unit Test Build.yml (#29) Adding Unit Test Build. * Update README.md * Update README.md * azureml sdk 1.0.74; fixed a few issues around ACR access; added nb 030 for scalability testing * TESTS: added notebook integration tests. (#65) * TEST: typo in env name * Addressing a number of minor issues with README and broken links (#67) * Update main_build.yml for Azure Pipelines * BUILD: added build status badges (#6) * Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7) * Finished version of numpy data loader * Working training script for demo * Adds the new metrics * Fixes docstrings and adds header * Removing extra setup.py * Log config file now experiment specific (#8)
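The numpy dataset loader and demo pipeline from (#7) are only named above; as a rough, hypothetical sketch of the idea (class and parameter names here are illustrative, not the repo's actual API), a patch-based reader over a 3D numpy volume might look like:

```python
import numpy as np

# Hypothetical sketch of a patch dataset over a numpy seismic volume:
# yields (image_patch, label_patch) pairs by tiling each section.
class NumpyPatchDataset:
    def __init__(self, volume, labels, patch_size):
        assert volume.shape == labels.shape
        self.volume, self.labels, self.ps = volume, labels, patch_size

    def __len__(self):
        # number of non-overlapping patches across all sections
        d, h, w = self.volume.shape
        return d * (h // self.ps) * (w // self.ps)

    def __getitem__(self, idx):
        d, h, w = self.volume.shape
        per_section = (h // self.ps) * (w // self.ps)
        sec, rem = divmod(idx, per_section)
        row, col = divmod(rem, w // self.ps)
        sl = np.s_[sec,
                   row * self.ps:(row + 1) * self.ps,
                   col * self.ps:(col + 1) * self.ps]
        return self.volume[sl], self.labels[sl]

# toy usage: a 2-section, 4x4 volume tiled into 2x2 patches
vol = np.zeros((2, 4, 4), dtype=np.float32)
lab = np.ones((2, 4, 4), dtype=np.int64)
ds = NumpyPatchDataset(vol, lab, patch_size=2)
patch, mask = ds[0]
```

A class with `__len__`/`__getitem__` like this plugs directly into a standard PyTorch `DataLoader`, which is presumably the point of such a reader.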
* added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * refactored prepare_data.py * added scripts for section train test * section train/test works for single channel input * train and test script for section based training/testing * removing experiments from deep_seismic, following the new struct * section train/test scripts * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Updates requirements * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check whether this PR will affect your experiments. The main change is in the initialization of the sections/patches attributes of the loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader; similarly for test loaders. This will affect your code if you access these attributes, e.g. if you have something like this in your experiments:

```
train_set = TrainPatchLoader(…)
patches = train_set.patches[train_set.split]
```

or

```
train_set = TrainSectionLoader(…)
sections = train_set.sections[train_set.split]
```

* training testing for sections works * minor changes * reverting changes on dutchf3/local/default.py file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo.
* fix for segyviewer and mkdir splits in README + broken link in F3 notebook * issue edits to README * download complete message * Added Yacs info to README.md (#69)
* Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes it take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * minor wording fix * enabled splitting dataset into sections, rather than only patches * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * merged duplicate ifelse blocks * refactored prepare_data.py * refactored prepare_data.py * added scripts for section train test * added scripts for section train test * section train/test works for single channel input * section train/test works for single channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran black formatter on the file, which created all the formatting changes (sorry!) * Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section based training/testing * train and test script for section based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate function to load section and patch loaders. - SectionLoader now swaps H and W dims. 
When loading test data in patch, this line can be removed (and tested) from test.py h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new struct * removing experiments from deep_seismic, following the new struct * Merged PR 220: Adds Horovod and fixes Add Horovod training script Updates dependencies in Horovod docker file Removes hard coding of path in data.py * section train/test scripts * section train/test scripts * Add cv_lib to repo and updates instructions * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Removes data.py and updates readme * Updates requirements * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes. E.g. 
if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training testing for sections works * training testing for sections works * minor changes * minor changes * reverting changes on dutchf3/local/default.py file * reverting changes on dutchf3/local/default.py file * added config file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * sync with new experiment structure * added a logging handler for array metrics * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * metrics now based on ignite.metrics * modified patch train.py with new metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo. 
Realized there was one bug in the code and the rest of the functions did not work with the different versions of libraries which we have listed in the conda yaml file. Also updated the download script. Related work items: #18264 * modified metrics with ignore_index * modified metrics with ignore_index * Merged PR 405: minor mods to notebook, more documentation A very small PR - Just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432 * Merged PR 368: Adds penobscot Adds for penobscot - Dataset reader - Training script - Testing script - Section depth augmentation - Patch depth augmentation - Iinline visualisation for Tensorboard Related work items: #14560, #17697, #17699, #17700 * Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Related work items: #16362 * Merged PR 452: decouple docker image creation from azureml removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb All other changes are due to trivial reruns Related work items: #18346 * Merged PR 512: Pre-commit hooks for formatting and style checking Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting, we are using black, and style checking flake8. The following files are added: - .pre-commit-config.yaml - defines git hooks to be installed - .flake8 - settings for flake8 linter - pyproject.toml - settings for black formatter The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors. 
Some questions to start the discussion: - Do you want to change any of the default settings in the dotenv files - like the line lengths, error messages we exclude or include, or anything like that. - Do we want to have a requirements-dev.txt file for contributors? This setup uses pre-commit package, I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file. - Once you have the hooks installed, it will only affect the files you are committing in the future. A big chunk of our codebase does not conform to the formatting/style settings. We will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant looking PR :) Any thoughts on how we should approach this? Thanks! Related work items: #18350 * Merged PR 513: 3D training script for Waldeland's model with Ignite Related work items: #16356 * Merged PR 565: Demo notebook updated with 3D graph Changes: 1) Updated demo notebook with the 3D visualization 2) Formatting changes due to new black/flake8 git hook Related work items: #17432 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into vapaunic/metrics branch (so the changed files below only include the diff between these two branches. However, I can change this once the vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. 
That's why the PR is to merge tests into vapaunic/metrics branch (so the changed files below only include the diff between these two branches. However, I can change this once the vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955 * merged tests into this branch * merged tests into this branch * Merged PR 569: Minor PR: change to pre-commit configuration files Related work items: #18350 * Merged PR 586: Purging unused files and experiments Purging unused files and experiments Related work items: #20499 * moved prepare data under scripts * moved prepare data under scripts * removed untested model configs * removed untested model configs * fixed weird bug in penobscot data loader * fixed weird bug in penobscot data loader * penobscot experiments working for hrnet, seresnet, no depth and patch depth * penobscot experiments working for hrnet, seresnet, no depth and patch depth * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * fixed bugs in my previous 'fix' * fixed bugs in my previous 'fix' * removed redundant _open_mask from subclasses * removed redundant _open_mask from subclasses * Merged PR 601: Fixes to penobscot experiments A few changes: - Instructions in README on how to download and process Penobscot and F3 2D data sets - moved prepare_data scripts to the scripts/ directory - fixed a weird issue with a class method in Penobscot data loader - fixed a bug in section loader (_add_extra_channel in section loader was not necessary and was causing an issue) - removed config files that were not tested or 
working in Penobscot experiments
  - modified default.py so it works if train.py is run without a config file

  Related work items: #20694
* Merged PR 605: added common metrics to the Waldeland model in Ignite. Related work items: #19550
* Removed redundant extract_metric_from
* formatting changes in metrics
* modified penobscot experiment to use new local metrics
* modified section experiment to pass device to metrics
* moved metrics out of dutchf3, modified distributed to work with the new metrics
* fixed other experiments after new metrics
* removed apex metrics from distributed train.py
* added ignite-based metrics to dutch voxel experiment
* removed apex metrics
* modified penobscot test script to use new metrics
* pytorch-ignite pre-release with new metrics until stable is available
* removed cell output from the F3 notebook
* deleted .vscode
* modified metric import in test_metrics.py
* separated metrics out as a module
* relative logger file path, modified section experiment
* removed the REPO_PATH from init
* created util logging function, and moved logging file to each experiment
* modified demo experiment
* modified penobscot experiment
* modified dutchf3_voxel experiment
* no logging in voxel2pixel
* modified dutchf3 patch local experiment
* modified patch distributed experiment
* modified interpretation notebook
* minor changes to comments
* DOC: forking disclaimer and new build names. (#9)
* Updating README.md with introduction material (#10)
* Update README with introduction to DeepSeismic: add intro material for DeepSeismic
* Adding logo file
* Adding image to readme
* Update README.md
* Updates the 3D visualisation to use itkwidgets (#11)
* Updates notebook to use itkwidgets for interactive visualisation
* Adds jupytext to pre-commit (#12)
* Add jupytext
* Adds demo notebook for HRNet (#13)
* Adding TF 2.0 to allow for tensorboard vis in notebooks
* Modifies hrnet config for notebook
* Add HRNet notebook for demo
* Updates HRNet notebook and tidies F3
* removed my username references (#15)
* moving 3D models into contrib folder (#16)
* Weetok (#17)
* Update it to include sections for imaging
* Update README.md
* added info on yacs files
* MODEL.PRETRAINED key missing in default.py (#70)
* Update main_build.yml for Azure Pipelines
* BUILD: added build status badges (#6)
* Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7)
* Finished version of numpy data loader
* Working training script for demo
* Adds the new metrics
* Fixes docstrings and adds header
* Removing extra setup.py
* Log config file now experiment specific (#8)
* Merging work on salt dataset
* Adds computer vision to dependencies
* Updates dependencies
* Update
* Updates the environment files
* Updates readme and envs
* Initial running version of dutchf3
* INFRA: added structure templates.
* VOXEL: initial rough code push - need to clean up before PRing.
* Working version
* Working version before refactor
* quick minor fixes in README
* 3D SEG: first commit for PR.
* 3D SEG: removed data files to avoid redistribution.
* Updates
* 3D SEG: restyled batch file, moving onto others.
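The numpy dataloader listed above (#7) follows the usual map-style dataset pattern: index into parallel image/label arrays via `__len__`/`__getitem__`. A minimal, hypothetical sketch (class and attribute names are illustrative only, not the repo's actual API):

```python
import numpy as np

# Hypothetical sketch of a map-style dataset over numpy arrays, in the
# spirit of the numpy dataloader added in (#7). Names are illustrative.
class NumpyDataset:
    def __init__(self, images, labels):
        # Parallel arrays: one label volume per image section.
        assert len(images) == len(labels), "images and labels must align"
        self.images = images
        self.labels = labels

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Return one (image, label) pair, as a torch-style Dataset would.
        return self.images[idx], self.labels[idx]

# Example: 10 single-channel 8x8 "sections" with per-pixel labels.
ds = NumpyDataset(np.zeros((10, 1, 8, 8), dtype="float32"),
                  np.zeros((10, 8, 8), dtype="int64"))
```

A class shaped like this can be handed directly to a PyTorch `DataLoader`, which is what makes the demo pipeline for such a dataset straightforward.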
* Working HRNet
* 3D SEG: finished going through Waldeland code
* Updates test scripts and makes them take processing arguments
* minor update
* Fixing imports
* Refactoring the experiments
* Removing .vscode
* Updates gitignore
* added instructions for running f3dutch experiments, and fixed some issues in the prepare_data.py script
* minor wording fix
* enabled splitting dataset into sections, rather than only patches
* merged duplicate if/else blocks
* refactored prepare_data.py
* added scripts for section train/test
* section train/test works for single-channel input
* Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py. This PR includes the following changes:
  - added README instructions for running f3dutch experiments
  - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues; there are no changes to the patch-based splitting logic
  - ran the black formatter on the file, which created all the formatting changes (sorry!)
* Merged PR 204: Adds loaders to deepseismic from cv_lib
* train and test script for section-based training/testing
* Merged PR 209: changes to section loaders in data.py. Changes in this PR will affect patch scripts as well. The following changes are required in patch scripts:
  - get_train_loader() in train.py should be changed to get_patch_loader(); I created separate functions to load section and patch loaders
  - SectionLoader now swaps the H and W dims; when loading test data in patch mode, this line can be removed (and tested) from test.py: h, w = img.shape[-2], img.shape[-1]  # height and width
* Merged PR 210: BENCHMARKS: added placeholder for benchmarks.
* Merged PR 211: Fixes issues left over from changes to data.py
* removing experiments from deep_seismic, following the new struct
* Merged PR 220: Adds Horovod and fixes:
  - Add Horovod training script
  - Updates dependencies in Horovod docker file
  - Removes hard-coding of path in data.py
* section train/test scripts
* Add cv_lib to repo and updates instructions
* Removes data.py and updates readme
* Updates requirements
* Merged PR 222: Moves cv_lib into repo and updates setup instructions
* renamed train/test scripts
* train test works on alaudah section experiments, a few minor bugs left
* cleaning up loaders
* Merged PR 236: Cleaned up dutchf3 data loaders. @<Mathew Salvaris>, @<Ilia Karmanov>, @<Max Kaznady>, please check whether this PR will affect your experiments. The main change is the initialization of the sections/patches attributes of the loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader; similarly for test loaders. This will affect your code if you access these attributes, e.g.
if you have something like this in your experiments:

```
train_set = TrainPatchLoader(…)
patches = train_set.patches[train_set.split]
```

or

```
train_set = TrainSectionLoader(…)
sections = train_set.sections[train_set.split]
```

* training testing for sections works
* minor changes
* reverting changes on dutchf3/local/default.py file
* added config file
* Updates the repo with preliminary results for 2D segmentation
* Merged PR 248: Experiment: section-based Alaudah training/testing. This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment.
* Merged PR 253: Waldeland-based voxel loaders and TextureNet model. Related work items: #16357
* Merged PR 290: A demo notebook on local train/eval on the F3 data set. Notebook and associated files, plus a minor change in the patch_deconvnet_skip.py model file. Related work items: #17432
* Merged PR 312: moved dutchf3_section to experiments/interpretation. Related work items: #17683
* Merged PR 309: minor change to README to reflect the changes in the prepare_data script. Related work items: #17681
* Merged PR 315: Removing voxel exp. Related work items: #17702
* sync with new experiment structure
* added a logging handler for array metrics
* first draft of metrics based on the ignite confusion matrix
* metrics now based on ignite.metrics
* modified patch train.py with new metrics
* Merged PR 361: VOXEL: fixes to the original voxel2pixel code to make it work with the rest of the repo. Realized there was one bug in the code, and the rest of the functions did not work with the versions of the libraries we have listed in the conda yaml file. Also updated the download script. Related work items: #18264
* modified metrics with ignore_index
* Merged PR 405: minor mods to notebook, more documentation. A very small PR - just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432
* Merged PR 368: Adds penobscot. Adds for penobscot:
  - Dataset reader
  - Training script
  - Testing script
  - Section depth augmentation
  - Patch depth augmentation
  - Inline visualisation for Tensorboard

  Related work items: #14560, #17697, #17699, #17700
* Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators. Related work items: #16362
* Merged PR 452: decouple docker image creation from azureml. Removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb. All other changes are due to trivial reruns. Related work items: #18346
* Merged PR 512: Pre-commit hooks for formatting and style checking. Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting we are using black, and for style checking flake8. The following files are added:
  - .pre-commit-config.yaml - defines the git hooks to be installed
  - .flake8 - settings for the flake8 linter
  - pyproject.toml - settings for the black formatter

  The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors.
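For illustration, a representative .pre-commit-config.yaml of the kind this PR adds might look like the following sketch (the revision pins shown are hypothetical; the actual file in the repo is authoritative):

```yaml
repos:
  - repo: https://github.com/psf/black      # formatting
    rev: 19.3b0                             # hypothetical pin
    hooks:
      - id: black
  - repo: https://github.com/PyCQA/flake8   # style checking
    rev: 3.7.9                              # hypothetical pin
    hooks:
      - id: flake8
```

Installing the hooks locally is typically `pip install pre-commit` followed by `pre-commit install`; `pre-commit run --all-files` applies them to the whole codebase retrospectively.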
* added MODEL.PRETRAINED key to default.py
* Update README.md (#59)
* Update README.md (#58)
* MINOR: addressing broken F3 download link (#73)
Some questions to start the discussion: - Do you want to change any of the default settings in the dotenv files - like the line lengths, error messages we exclude or include, or anything like that. - Do we want to have a requirements-dev.txt file for contributors? This setup uses pre-commit package, I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file. - Once you have the hooks installed, it will only affect the files you are committing in the future. A big chunk of our codebase does not conform to the formatting/style settings. We will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant looking PR :) Any thoughts on how we should approach this? Thanks! Related work items: #18350 * Merged PR 513: 3D training script for Waldeland's model with Ignite Related work items: #16356 * Merged PR 565: Demo notebook updated with 3D graph Changes: 1) Updated demo notebook with the 3D visualization 2) Formatting changes due to new black/flake8 git hook Related work items: #17432 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into vapaunic/metrics branch (so the changed files below only include the diff between these two branches. However, I can change this once the vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. 
That's why the PR is to merge tests into vapaunic/metrics branch (so the changed files below only include the diff between these two branches. However, I can change this once the vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955 * merged tests into this branch * merged tests into this branch * Merged PR 569: Minor PR: change to pre-commit configuration files Related work items: #18350 * Merged PR 586: Purging unused files and experiments Purging unused files and experiments Related work items: #20499 * moved prepare data under scripts * moved prepare data under scripts * removed untested model configs * removed untested model configs * fixed weird bug in penobscot data loader * fixed weird bug in penobscot data loader * penobscot experiments working for hrnet, seresnet, no depth and patch depth * penobscot experiments working for hrnet, seresnet, no depth and patch depth * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * fixed bugs in my previous 'fix' * fixed bugs in my previous 'fix' * removed redundant _open_mask from subclasses * removed redundant _open_mask from subclasses * Merged PR 601: Fixes to penobscot experiments A few changes: - Instructions in README on how to download and process Penobscot and F3 2D data sets - moved prepare_data scripts to the scripts/ directory - fixed a weird issue with a class method in Penobscot data loader - fixed a bug in section loader (_add_extra_channel in section loader was not necessary and was causing an issue) - removed config files that were not tested or 
working in Penobscot experiments - modified default.py so it's working if train.py ran without a config file Related work items: #20694 * Merged PR 605: added common metrics to Waldeland model in Ignite Related work items: #19550 * Removed redundant extract_metric_from * Removed redundant extract_metric_from * formatting changes in metrics * formatting changes in metrics * modified penobscot experiment to use new local metrics * modified penobscot experiment to use new local metrics * modified section experimen to pass device to metrics * modified section experimen to pass device to metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * fixed other experiments after new metrics * fixed other experiments after new metrics * removed apex metrics from distributed train.py * removed apex metrics from distributed train.py * added ignite-based metrics to dutch voxel experiment * added ignite-based metrics to dutch voxel experiment * removed apex metrics * removed apex metrics * modified penobscot test script to use new metrics * pytorch-ignite pre-release with new metrics until stable available * removed cell output from the F3 notebook * deleted .vscode * modified metric import in test_metrics.py * separated metrics out as a module * relative logger file path, modified section experiment * removed the REPO_PATH from init * created util logging function, and moved logging file to each experiment * modified demo experiment * modified penobscot experiment * modified dutchf3_voxel experiment * no logging in voxel2pixel * modified dutchf3 patch local experiment * modified patch distributed experiment * modified interpretation notebook * minor changes to comments * DOC: forking dislaimer and new build names. 
(#9) * Updating README.md with introduction material (#10) * Update README with introduction to DeepSeismic Add intro material for DeepSeismic * Adding logo file * Adding image to readme * Update README.md * Updates the 3D visualisation to use itkwidgets (#11) * Updates notebook to use itkwidgets for interactive visualisation * Adds jupytext to pre-commit (#12) * Add jupytext * Adds demo notebook for HRNet (#13) * Adding TF 2.0 to allow for tensorboard vis in notebooks * Modifies hrnet config for notebook * Add HRNet notebook for demo * Updates HRNet notebook and tidies F3 * removed my username references (#15) * moving 3D models into contrib folder (#16) * Weetok (#17) * Update it to include sections for imaging * Update README.md * Update README.md * fixed link for F3 download * MINOR: python version fix to 3.6.7 (#72) * Adding system requirements in README (#74) * Update main_build.yml for Azure Pipelines * Update main_build.yml for Azure Pipelines * BUILD: added build status badges (#6) * Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7) * Finished version of numpy data loader * Working training script for demo * Adds the new metrics * Fixes docstrings and adds header * Removing extra setup.py * Log config file now experiment specific (#8) * Merging work on salt dataset * Adds computer vision to dependencies * Updates dependecies * Update * Updates the environemnt files * Updates readme and envs * Initial running version of dutchf3 * INFRA: added structure templates. * VOXEL: initial rough code push - need to clean up before PRing. * Working version * Working version before refactor * quick minor fixes in README * 3D SEG: first commit for PR. * 3D SEG: removed data files to avoid redistribution. * Updates * 3D SEG: restyled batch file, moving onto others. 
* Working HRNet
* 3D SEG: finished going through Waldeland code
* Updates test scripts and makes it take processing arguments
* minor update
* Fixing imports
* Refactoring the experiments
* Removing .vscode
* Updates gitignore
* added instructions for running f3dutch experiments, and fixed some issues in the prepare_data.py script
* minor wording fix
* enabled splitting dataset into sections, rather than only patches
* merged duplicate ifelse blocks
* refactored prepare_data.py
* added scripts for section train test
* section train/test works for single channel input
* Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py. This PR includes the following changes:
  - added README instructions for running f3dutch experiments
  - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues; there are no changes to the patch-based splitting logic
  - ran the black formatter on the file, which created all the formatting changes (sorry!)
* Merged PR 204: Adds loaders to deepseismic from cv_lib
* train and test script for section-based training/testing
* Merged PR 209: changes to section loaders in data.py. Changes in this PR will affect patch scripts as well. The following changes are required in patch scripts:
  - get_train_loader() in train.py should be changed to get_patch_loader(); I created separate functions to load section and patch loaders
  - SectionLoader now swaps H and W dims.
When loading test data in patch scripts, this line can be removed (and tested) from test.py: `h, w = img.shape[-2], img.shape[-1]  # height and width`
* Merged PR 210: BENCHMARKS: added placeholder for benchmarks.
* Merged PR 211: Fixes issues left over from changes to data.py
* removing experiments from deep_seismic, following the new structure
* Merged PR 220: Adds Horovod and fixes. Adds a Horovod training script, updates dependencies in the Horovod docker file, and removes hard-coding of a path in data.py
* section train/test scripts
* Add cv_lib to repo and updates instructions
* Removes data.py and updates readme
* Updates requirements
* Merged PR 222: Moves cv_lib into repo and updates setup instructions
* renamed train/test scripts
* train test works on alaudah section experiments, a few minor bugs left
* cleaning up loaders
* Merged PR 236: Cleaned up dutchf3 data loaders. @<Mathew Salvaris>, @<Ilia Karmanov>, @<Max Kaznady>, please check whether this PR will affect your experiments. The main change is in the initialization of the sections/patches attributes of the loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader; similarly for test loaders. This will affect your code if you access these attributes. E.g.
if you have something like this in your experiments:

```
train_set = TrainPatchLoader(…)
patches = train_set.patches[train_set.split]
```

or

```
train_set = TrainSectionLoader(…)
sections = train_set.sections[train_set.split]
```

* training testing for sections works
* minor changes
* reverting changes on dutchf3/local/default.py file
* added config file
* Updates the repo with preliminary results for 2D segmentation
* Merged PR 248: Experiment: section-based Alaudah training/testing. This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment.
* Merged PR 253: Waldeland based voxel loaders and TextureNet model. Related work items: #16357
* Merged PR 290: A demo notebook on local train/eval on F3 data set. Notebook and associated files, plus a minor change in the patch_deconvnet_skip.py model file. Related work items: #17432
* Merged PR 312: moved dutchf3_section to experiments/interpretation. Related work items: #17683
* Merged PR 309: minor change to README to reflect the changes in the prepare_data script. Related work items: #17681
* Merged PR 315: Removing voxel exp. Related work items: #17702
* sync with new experiment structure
* added a logging handler for array metrics
* first draft of metrics based on the ignite confusion matrix
* metrics now based on ignite.metrics
* modified patch train.py with new metrics
* Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo.
Realized there was one bug in the code, and the rest of the functions did not work with the library versions listed in the conda yaml file. Also updated the download script. Related work items: #18264
* modified metrics with ignore_index
* Merged PR 405: minor mods to notebook, more documentation. A very small PR - just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432
* Merged PR 368: Adds penobscot. Adds for penobscot:
  - Dataset reader
  - Training script
  - Testing script
  - Section depth augmentation
  - Patch depth augmentation
  - Inline visualisation for Tensorboard
  Related work items: #14560, #17697, #17699, #17700
* Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators. Related work items: #16362
* Merged PR 452: decouple docker image creation from azureml. Removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb. All other changes are due to trivial reruns. Related work items: #18346
* Merged PR 512: Pre-commit hooks for formatting and style checking. Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting we are using black, and for style checking flake8. The following files are added:
  - .pre-commit-config.yaml - defines the git hooks to be installed
  - .flake8 - settings for the flake8 linter
  - pyproject.toml - settings for the black formatter
  The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors.
Some questions to start the discussion:
  - Do we want to change any of the default settings in the dotenv files, like the line lengths or the error messages we exclude or include?
  - Do we want to have a requirements-dev.txt file for contributors? This setup uses the pre-commit package; I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file.
  - Once you have the hooks installed, they will only affect the files you commit in the future. A big chunk of our codebase does not conform to the formatting/style settings, so we will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant-looking PR :) Any thoughts on how we should approach this? Thanks!
  Related work items: #18350
* Merged PR 513: 3D training script for Waldeland's model with Ignite. Related work items: #16356
* Merged PR 565: Demo notebook updated with 3D graph. Changes: 1) updated the demo notebook with the 3D visualization; 2) formatting changes due to the new black/flake8 git hook. Related work items: #17432
* Merged PR 341: Tests for cv_lib/metrics. This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into the vapaunic/metrics branch (so the changed files below only include the diff between these two branches). However, I can change this once vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top-level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955
* merged tests into this branch
* Merged PR 569: Minor PR: change to pre-commit configuration files. Related work items: #18350
* Merged PR 586: Purging unused files and experiments. Related work items: #20499
* moved prepare data under scripts
* removed untested model configs
* fixed weird bug in penobscot data loader
* penobscot experiments working for hrnet, seresnet, no depth and patch depth
* removed a section loader bug in the penobscot loader
* fixed bugs in my previous 'fix'
* removed redundant _open_mask from subclasses
* Merged PR 601: Fixes to penobscot experiments. A few changes:
  - Instructions in README on how to download and process the Penobscot and F3 2D data sets
  - moved the prepare_data scripts to the scripts/ directory
  - fixed a weird issue with a class method in the Penobscot data loader
  - fixed a bug in the section loader (_add_extra_channel in the section loader was not necessary and was causing an issue)
  - removed config files that were not tested or working in Penobscot experiments
  - modified default.py so it works if train.py is run without a config file
  Related work items: #20694
(#9) * Updating README.md with introduction material (#10) * Update README with introduction to DeepSeismic Add intro material for DeepSeismic * Adding logo file * Adding image to readme * Update README.md * Updates the 3D visualisation to use itkwidgets (#11) * Updates notebook to use itkwidgets for interactive visualisation * Adds jupytext to pre-commit (#12) * Add jupytext * Adds demo notebook for HRNet (#13) * Adding TF 2.0 to allow for tensorboard vis in notebooks * Modifies hrnet config for notebook * Add HRNet notebook for demo * Updates HRNet notebook and tidies F3 * removed my username references (#15) * moving 3D models into contrib folder (#16) * Weetok (#17) * Update it to include sections for imaging * Update README.md * Update README.md * added system requirements to readme * merge upstream into my fork (#1) * MINOR: addressing broken F3 download link (#73) * Update main_build.yml for Azure Pipelines * Update main_build.yml for Azure Pipelines * BUILD: added build status badges (#6) * Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7) * Finished version of numpy data loader * Working training script for demo * Adds the new metrics * Fixes docstrings and adds header * Removing extra setup.py * Log config file now experiment specific (#8) * Merging work on salt dataset * Adds computer vision to dependencies * Updates dependecies * Update * Updates the environemnt files * Updates readme and envs * Initial running version of dutchf3 * INFRA: added structure templates. * VOXEL: initial rough code push - need to clean up before PRing. * Working version * Working version before refactor * quick minor fixes in README * 3D SEG: first commit for PR. * 3D SEG: removed data files to avoid redistribution. * Updates * 3D SEG: restyled batch file, moving onto others. 
* Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes it take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * minor wording fix * enabled splitting dataset into sections, rather than only patches * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * merged duplicate ifelse blocks * refactored prepare_data.py * refactored prepare_data.py * added scripts for section train test * added scripts for section train test * section train/test works for single channel input * section train/test works for single channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran black formatter on the file, which created all the formatting changes (sorry!) * Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section based training/testing * train and test script for section based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate function to load section and patch loaders. - SectionLoader now swaps H and W dims. 
When loading test data in patch, this line can be removed (and tested) from test.py h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new struct * removing experiments from deep_seismic, following the new struct * Merged PR 220: Adds Horovod and fixes Add Horovod training script Updates dependencies in Horovod docker file Removes hard coding of path in data.py * section train/test scripts * section train/test scripts * Add cv_lib to repo and updates instructions * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Removes data.py and updates readme * Updates requirements * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes. E.g. 
if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training testing for sections works * training testing for sections works * minor changes * minor changes * reverting changes on dutchf3/local/default.py file * reverting changes on dutchf3/local/default.py file * added config file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * sync with new experiment structure * added a logging handler for array metrics * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * metrics now based on ignite.metrics * modified patch train.py with new metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo. 
Realized there was one bug in the code and the rest of the functions did not work with the different versions of libraries which we have listed in the conda yaml file. Also updated the download script. Related work items: #18264 * modified metrics with ignore_index * modified metrics with ignore_index * Merged PR 405: minor mods to notebook, more documentation A very small PR - Just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432 * Merged PR 368: Adds penobscot Adds for penobscot - Dataset reader - Training script - Testing script - Section depth augmentation - Patch depth augmentation - Iinline visualisation for Tensorboard Related work items: #14560, #17697, #17699, #17700 * Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Related work items: #16362 * Merged PR 452: decouple docker image creation from azureml removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb All other changes are due to trivial reruns Related work items: #18346 * Merged PR 512: Pre-commit hooks for formatting and style checking Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting, we are using black, and style checking flake8. The following files are added: - .pre-commit-config.yaml - defines git hooks to be installed - .flake8 - settings for flake8 linter - pyproject.toml - settings for black formatter The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors. 
Some questions to start the discussion: - Do you want to change any of the default settings in the dotenv files - like the line lengths, error messages we exclude or include, or anything like that. - Do we want to have a requirements-dev.txt file for contributors? This setup uses pre-commit package, I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file. - Once you have the hooks installed, it will only affect the files you are committing in the future. A big chunk of our codebase does not conform to the formatting/style settings. We will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant looking PR :) Any thoughts on how we should approach this? Thanks! Related work items: #18350 * Merged PR 513: 3D training script for Waldeland's model with Ignite Related work items: #16356 * Merged PR 565: Demo notebook updated with 3D graph Changes: 1) Updated demo notebook with the 3D visualization 2) Formatting changes due to new black/flake8 git hook Related work items: #17432 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into vapaunic/metrics branch (so the changed files below only include the diff between these two branches. However, I can change this once the vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. 
That's why the PR is to merge tests into vapaunic/metrics branch (so the changed files below only include the diff between these two branches. However, I can change this once the vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955 * merged tests into this branch * merged tests into this branch * Merged PR 569: Minor PR: change to pre-commit configuration files Related work items: #18350 * Merged PR 586: Purging unused files and experiments Purging unused files and experiments Related work items: #20499 * moved prepare data under scripts * moved prepare data under scripts * removed untested model configs * removed untested model configs * fixed weird bug in penobscot data loader * fixed weird bug in penobscot data loader * penobscot experiments working for hrnet, seresnet, no depth and patch depth * penobscot experiments working for hrnet, seresnet, no depth and patch depth * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * fixed bugs in my previous 'fix' * fixed bugs in my previous 'fix' * removed redundant _open_mask from subclasses * removed redundant _open_mask from subclasses * Merged PR 601: Fixes to penobscot experiments A few changes: - Instructions in README on how to download and process Penobscot and F3 2D data sets - moved prepare_data scripts to the scripts/ directory - fixed a weird issue with a class method in Penobscot data loader - fixed a bug in section loader (_add_extra_channel in section loader was not necessary and was causing an issue) - removed config files that were not tested or 
working in Penobscot experiments - modified default.py so it's working if train.py ran without a config file Related work items: #20694 * Merged PR 605: added common metrics to Waldeland model in Ignite Related work items: #19550 * Removed redundant extract_metric_from * Removed redundant extract_metric_from * formatting changes in metrics * formatting changes in metrics * modified penobscot experiment to use new local metrics * modified penobscot experiment to use new local metrics * modified section experimen to pass device to metrics * modified section experimen to pass device to metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * fixed other experiments after new metrics * fixed other experiments after new metrics * removed apex metrics from distributed train.py * removed apex metrics from distributed train.py * added ignite-based metrics to dutch voxel experiment * added ignite-based metrics to dutch voxel experiment * removed apex metrics * removed apex metrics * modified penobscot test script to use new metrics * pytorch-ignite pre-release with new metrics until stable available * removed cell output from the F3 notebook * deleted .vscode * modified metric import in test_metrics.py * separated metrics out as a module * relative logger file path, modified section experiment * removed the REPO_PATH from init * created util logging function, and moved logging file to each experiment * modified demo experiment * modified penobscot experiment * modified dutchf3_voxel experiment * no logging in voxel2pixel * modified dutchf3 patch local experiment * modified patch distributed experiment * modified interpretation notebook * minor changes to comments * DOC: forking dislaimer and new build names. 
(#9) * Updating README.md with introduction material (#10) * Update README with introduction to DeepSeismic Add intro material for DeepSeismic * Adding logo file * Adding image to readme * Update README.md * Updates the 3D visualisation to use itkwidgets (#11) * Updates notebook to use itkwidgets for interactive visualisation * Adds jupytext to pre-commit (#12) * Add jupytext * Adds demo notebook for HRNet (#13) * Adding TF 2.0 to allow for tensorboard vis in notebooks * Modifies hrnet config for notebook * Add HRNet notebook for demo * Updates HRNet notebook and tidies F3 * removed my username references (#15) * moving 3D models into contrib folder (#16) * Weetok (#17) * Update it to include sections for imaging * Update README.md * Update README.md * fixed link for F3 download * MINOR: python version fix to 3.6.7 (#72) * Adding system requirements in README (#74) * Update main_build.yml for Azure Pipelines * Update main_build.yml for Azure Pipelines * BUILD: added build status badges (#6) * Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7) * Finished version of numpy data loader * Working training script for demo * Adds the new metrics * Fixes docstrings and adds header * Removing extra setup.py * Log config file now experiment specific (#8) * Merging work on salt dataset * Adds computer vision to dependencies * Updates dependecies * Update * Updates the environemnt files * Updates readme and envs * Initial running version of dutchf3 * INFRA: added structure templates. * VOXEL: initial rough code push - need to clean up before PRing. * Working version * Working version before refactor * quick minor fixes in README * 3D SEG: first commit for PR. * 3D SEG: removed data files to avoid redistribution. * Updates * 3D SEG: restyled batch file, moving onto others. 
* Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes it take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * minor wording fix * enabled splitting dataset into sections, rather than only patches * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * merged duplicate ifelse blocks * refactored prepare_data.py * refactored prepare_data.py * added scripts for section train test * added scripts for section train test * section train/test works for single channel input * section train/test works for single channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran black formatter on the file, which created all the formatting changes (sorry!) * Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section based training/testing * train and test script for section based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate function to load section and patch loaders. - SectionLoader now swaps H and W dims. 
When loading test data in patch, this line can be removed (and tested) from test.py h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new struct * removing experiments from deep_seismic, following the new struct * Merged PR 220: Adds Horovod and fixes Add Horovod training script Updates dependencies in Horovod docker file Removes hard coding of path in data.py * section train/test scripts * section train/test scripts * Add cv_lib to repo and updates instructions * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Removes data.py and updates readme * Updates requirements * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes. E.g. 
if you have something like this in your experiments:

```
train_set = TrainPatchLoader(…)
patches = train_set.patches[train_set.split]
```

or

```
train_set = TrainSectionLoader(…)
sections = train_set.sections[train_set.split]
```

* training/testing for sections works
* minor changes
* reverting changes to the dutchf3/local/default.py file
* added config file
* Updates the repo with preliminary results for 2D segmentation
* Merged PR 248: Experiment: section-based Alaudah training/testing. This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment.
* Merged PR 253: Waldeland-based voxel loaders and TextureNet model. Related work items: #16357
* Merged PR 290: A demo notebook on local train/eval on the F3 data set: notebook and associated files, plus a minor change in the patch_deconvnet_skip.py model file. Related work items: #17432
* Merged PR 312: moved dutchf3_section to experiments/interpretation. Related work items: #17683
* Merged PR 309: minor change to README to reflect the changes in the prepare_data script. Related work items: #17681
* Merged PR 315: Removing voxel exp. Related work items: #17702
* sync with new experiment structure
* added a logging handler for array metrics
* first draft of metrics based on the ignite confusion matrix
* metrics now based on ignite.metrics
* modified patch train.py with new metrics
* Merged PR 361: VOXEL: fixes to the original voxel2pixel code to make it work with the rest of the repo.
Realized there was one bug in the code, and the rest of the functions did not work with the library versions we have listed in the conda YAML file. Also updated the download script. Related work items: #18264
* modified metrics with ignore_index
* Merged PR 405: minor mods to notebook, more documentation. A very small PR: just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432
* Merged PR 368: Adds penobscot. Adds for penobscot:
  - Dataset reader
  - Training script
  - Testing script
  - Section depth augmentation
  - Patch depth augmentation
  - Inline visualisation for Tensorboard
  Related work items: #14560, #17697, #17699, #17700
* Merged PR 407: Azure ML SDK version 1.0.65; running devito in AzureML Estimators. Related work items: #16362
* Merged PR 452: decouple docker image creation from azureml. Removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb; all other changes are due to trivial reruns. Related work items: #18346
* Merged PR 512: Pre-commit hooks for formatting and style checking. Opening this PR to start the discussion: I added the required dotfiles and instructions for setting up pre-commit hooks for formatting and style checking. For formatting we are using black, and for style checking flake8. The following files are added:
  - .pre-commit-config.yaml - defines git hooks to be installed
  - .flake8 - settings for the flake8 linter
  - pyproject.toml - settings for the black formatter
  The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors.
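To make the setup concrete, here is a minimal sketch of what such a .pre-commit-config.yaml could look like; the hook repository revisions below are illustrative placeholders, not the pins committed in this PR:

```yaml
repos:
  - repo: https://github.com/psf/black
    rev: 19.10b0   # illustrative pin
    hooks:
      - id: black
  - repo: https://gitlab.com/pycqa/flake8
    rev: 3.7.9     # illustrative pin
    hooks:
      - id: flake8
```

After `pip install pre-commit` and `pre-commit install`, both hooks run against the staged files on every commit.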
Some questions to start the discussion:
- Do you want to change any of the default settings in these config files, like the line lengths or the error messages we exclude or include?
- Do we want to have a requirements-dev.txt file for contributors? This setup uses the pre-commit package; I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file.
- Once you have the hooks installed, they will only affect the files you commit in the future. A big chunk of our codebase does not conform to the formatting/style settings, so we will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant-looking PR :) Any thoughts on how we should approach this?
Thanks! Related work items: #18350
* Merged PR 513: 3D training script for Waldeland's model with Ignite. Related work items: #16356
* Merged PR 565: Demo notebook updated with 3D graph. Changes: 1) updated the demo notebook with the 3D visualization; 2) formatting changes due to the new black/flake8 git hook. Related work items: #17432
* Merged PR 341: Tests for cv_lib/metrics. This PR is dependent on the tests created in the previous branch !333; that's why the PR is to merge tests into the vapaunic/metrics branch (so the changed files below only include the diff between these two branches). However, I can change this once vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top-level /tests for integration testing. Let me know if you have any comments on these tests, or the structure. As agreed, I'm using pytest. Related work items: #16955
* merged tests into this branch
* Merged PR 569: Minor PR: change to pre-commit configuration files. Related work items: #18350
* Merged PR 586: Purging unused files and experiments. Related work items: #20499
* moved prepare data under scripts
* removed untested model configs
* fixed weird bug in penobscot data loader
* penobscot experiments working for hrnet, seresnet, no depth and patch depth
* removed a section loader bug in the penobscot loader
* fixed bugs in my previous 'fix'
* removed redundant _open_mask from subclasses
* Merged PR 601: Fixes to penobscot experiments. A few changes:
  - instructions in README on how to download and process the Penobscot and F3 2D data sets
  - moved prepare_data scripts to the scripts/ directory
  - fixed a weird issue with a class method in the Penobscot data loader
  - fixed a bug in the section loader (_add_extra_channel in the section loader was not necessary and was causing an issue)
  - removed config files that were not tested or working in Penobscot experiments
  - modified default.py so it works if train.py runs without a config file
  Related work items: #20694
* Merged PR 605: added common metrics to Waldeland model in Ignite. Related work items: #19550
* Removed redundant extract_metric_from
* formatting changes in metrics
* modified penobscot experiment to use new local metrics
* modified section experiment to pass device to metrics
* moved metrics out of dutchf3, modified distributed to work with the new metrics
* fixed other experiments after new metrics
* removed apex metrics from distributed train.py
* added ignite-based metrics to dutch voxel experiment
* removed apex metrics
* modified penobscot test script to use new metrics
* pytorch-ignite pre-release with new metrics until stable is available
* removed cell output from the F3 notebook
* deleted .vscode
* modified metric import in test_metrics.py
* separated metrics out as a module
* relative logger file path, modified section experiment
* removed the REPO_PATH from init
* created util logging function, and moved the logging file to each experiment
* modified demo experiment
* modified penobscot experiment
* modified dutchf3_voxel experiment
* no logging in voxel2pixel
* modified dutchf3 patch local experiment
* modified patch distributed experiment
* modified interpretation notebook
* minor changes to comments
* DOC: forking disclaimer and new build names
(#9)
* Updating README.md with introduction material (#10)
* Update README with introduction to DeepSeismic: add intro material for DeepSeismic
* Adding logo file
* Adding image to readme
* Update README.md
* Updates the 3D visualisation to use itkwidgets (#11)
* Updates notebook to use itkwidgets for interactive visualisation
* Adds jupytext to pre-commit (#12)
* Add jupytext
* Adds demo notebook for HRNet (#13)
* Adding TF 2.0 to allow for tensorboard vis in notebooks
* Modifies hrnet config for notebook
* Add HRNet notebook for demo
* Updates HRNet notebook and tidies F3
* removed my username references (#15)
* moving 3D models into contrib folder (#16)
* Weetok (#17)
* Update it to include sections for imaging
* Update README.md
* added system requirements to readme
* Adds premium storage (#79): adds premium storage method
* update test.py for section-based approach to use command line arguments (#76)
* added README documentation per bug bash feedback (#78)
* sdk 1.0.76; tested conda env vs docker image; extended readme
* removed reference to imaging
* minor md formatting
* https://github.com/microsoft/DeepSeismic/issues/71 (#80)
* azureml sdk 1.0.74; fixed a few issues around ACR access; added nb 030 for scalability testing
* merge upstream into my fork (#1)
* MINOR: addressing broken F3 download link (#73)
* Update main_build.yml for Azure Pipelines
* BUILD: added build status badges (#6)
* Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7)
* Finished version of numpy data loader
* Working training script for demo
* Adds the new metrics
* Fixes docstrings and adds header
* Removing extra setup.py
* Log config file now experiment specific (#8)
Updates the environemnt files * Updates readme and envs * Initial running version of dutchf3 * INFRA: added structure templates. * VOXEL: initial rough code push - need to clean up before PRing. * Working version * Working version before refactor * quick minor fixes in README * 3D SEG: first commit for PR. * 3D SEG: removed data files to avoid redistribution. * Updates * 3D SEG: restyled batch file, moving onto others. * Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes it take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * minor wording fix * enabled splitting dataset into sections, rather than only patches * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * merged duplicate ifelse blocks * refactored prepare_data.py * refactored prepare_data.py * added scripts for section train test * added scripts for section train test * section train/test works for single channel input * section train/test works for single channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran black formatter on the file, which created all the formatting changes (sorry!) 
* Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section based training/testing * train and test script for section based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate function to load section and patch loaders. - SectionLoader now swaps H and W dims. When loading test data in patch, this line can be removed (and tested) from test.py h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new struct * removing experiments from deep_seismic, following the new struct * Merged PR 220: Adds Horovod and fixes Add Horovod training script Updates dependencies in Horovod docker file Removes hard coding of path in data.py * section train/test scripts * section train/test scripts * Add cv_lib to repo and updates instructions * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Removes data.py and updates readme * Updates requirements * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of sections/patches attributes of loaders. 
Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes. E.g. if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training testing for sections works * training testing for sections works * minor changes * minor changes * reverting changes on dutchf3/local/default.py file * reverting changes on dutchf3/local/default.py file * added config file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. 
Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * sync with new experiment structure * added a logging handler for array metrics * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * metrics now based on ignite.metrics * modified patch train.py with new metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo. Realized there was one bug in the code and the rest of the functions did not work with the different versions of libraries which we have listed in the conda yaml file. Also updated the download script. Related work items: #18264 * modified metrics with ignore_index * modified metrics with ignore_index * Merged PR 405: minor mods to notebook, more documentation A very small PR - Just a few more lines of documentation in the notebook, to improve clarity. 
Related work items: #17432 * Merged PR 368: Adds penobscot Adds for penobscot - Dataset reader - Training script - Testing script - Section depth augmentation - Patch depth augmentation - Iinline visualisation for Tensorboard Related work items: #14560, #17697, #17699, #17700 * Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Related work items: #16362 * Merged PR 452: decouple docker image creation from azureml removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb All other changes are due to trivial reruns Related work items: #18346 * Merged PR 512: Pre-commit hooks for formatting and style checking Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting, we are using black, and style checking flake8. The following files are added: - .pre-commit-config.yaml - defines git hooks to be installed - .flake8 - settings for flake8 linter - pyproject.toml - settings for black formatter The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors. Some questions to start the discussion: - Do you want to change any of the default settings in the dotenv files - like the line lengths, error messages we exclude or include, or anything like that. - Do we want to have a requirements-dev.txt file for contributors? This setup uses pre-commit package, I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file. 
- Once you have the hooks installed, it will only affect the files you are committing in the future. A big chunk of our codebase does not conform to the formatting/style settings. We will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant looking PR :) Any thoughts on how we should approach this? Thanks! Related work items: #18350 * Merged PR 513: 3D training script for Waldeland's model with Ignite Related work items: #16356 * Merged PR 565: Demo notebook updated with 3D graph Changes: 1) Updated demo notebook with the 3D visualization 2) Formatting changes due to new black/flake8 git hook Related work items: #17432 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into vapaunic/metrics branch (so the changed files below only include the diff between these two branches. However, I can change this once the vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into vapaunic/metrics branch (so the changed files below only include the diff between these two branches. However, I can change this once the vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. 
Related work items: #16955 * merged tests into this branch * merged tests into this branch * Merged PR 569: Minor PR: change to pre-commit configuration files Related work items: #18350 * Merged PR 586: Purging unused files and experiments Purging unused files and experiments Related work items: #20499 * moved prepare data under scripts * moved prepare data under scripts * removed untested model configs * removed untested model configs * fixed weird bug in penobscot data loader * fixed weird bug in penobscot data loader * penobscot experiments working for hrnet, seresnet, no depth and patch depth * penobscot experiments working for hrnet, seresnet, no depth and patch depth * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * removed a section loader bug in the penobscot loader * fixed bugs in my previous 'fix' * fixed bugs in my previous 'fix' * removed redundant _open_mask from subclasses * removed redundant _open_mask from subclasses * Merged PR 601: Fixes to penobscot experiments A few changes: - Instructions in README on how to download and process Penobscot and F3 2D data sets - moved prepare_data scripts to the scripts/ directory - fixed a weird issue with a class method in Penobscot data loader - fixed a bug in section loader (_add_extra_channel in section loader was not necessary and was causing an issue) - removed config files that were not tested or working in Penobscot experiments - modified default.py so it's working if train.py ran without a config file Related work items: #20694 * Merged PR 605: added common metrics to Waldeland model in Ignite Related work items: #19550 * Removed redundant extract_metric_from * Removed redundant extract_metric_from * formatting changes in metrics * formatting changes in metrics * modified penobscot experiment to use new local metrics * modified penobscot experiment to use new local metrics * modified 
section experimen to pass device to metrics * modified section experimen to pass device to metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * fixed other experiments after new metrics * fixed other experiments after new metrics * removed apex metrics from distributed train.py * removed apex metrics from distributed train.py * added ignite-based metrics to dutch voxel experiment * added ignite-based metrics to dutch voxel experiment * removed apex metrics * removed apex metrics * modified penobscot test script to use new metrics * pytorch-ignite pre-release with new metrics until stable available * removed cell output from the F3 notebook * deleted .vscode * modified metric import in test_metrics.py * separated metrics out as a module * relative logger file path, modified section experiment * removed the REPO_PATH from init * created util logging function, and moved logging file to each experiment * modified demo experiment * modified penobscot experiment * modified dutchf3_voxel experiment * no logging in voxel2pixel * modified dutchf3 patch local experiment * modified patch distributed experiment * modified interpretation notebook * minor changes to comments * DOC: forking dislaimer and new build names. 
(#9) * Updating README.md with introduction material (#10) * Update README with introduction to DeepSeismic Add intro material for DeepSeismic * Adding logo file * Adding image to readme * Update README.md * Updates the 3D visualisation to use itkwidgets (#11) * Updates notebook to use itkwidgets for interactive visualisation * Adds jupytext to pre-commit (#12) * Add jupytext * Adds demo notebook for HRNet (#13) * Adding TF 2.0 to allow for tensorboard vis in notebooks * Modifies hrnet config for notebook * Add HRNet notebook for demo * Updates HRNet notebook and tidies F3 * removed my username references (#15) * moving 3D models into contrib folder (#16) * Weetok (#17) * Update it to include sections for imaging * Update README.md * Update README.md * fixed link for F3 download * MINOR: python version fix to 3.6.7 (#72) * Adding system requirements in README (#74) * Update main_build.yml for Azure Pipelines * Update main_build.yml for Azure Pipelines * BUILD: added build status badges (#6) * Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7) * Finished version of numpy data loader * Working training script for demo * Adds the new metrics * Fixes docstrings and adds header * Removing extra setup.py * Log config file now experiment specific (#8) * Merging work on salt dataset * Adds computer vision to dependencies * Updates dependecies * Update * Updates the environemnt files * Updates readme and envs * Initial running version of dutchf3 * INFRA: added structure templates. * VOXEL: initial rough code push - need to clean up before PRing. * Working version * Working version before refactor * quick minor fixes in README * 3D SEG: first commit for PR. * 3D SEG: removed data files to avoid redistribution. * Updates * 3D SEG: restyled batch file, moving onto others. 
* Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes it take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * minor wording fix * enabled splitting dataset into sections, rather than only patches * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * merged duplicate ifelse blocks * refactored prepare_data.py * refactored prepare_data.py * added scripts for section train test * added scripts for section train test * section train/test works for single channel input * section train/test works for single channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran black formatter on the file, which created all the formatting changes (sorry!) * Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section based training/testing * train and test script for section based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate function to load section and patch loaders. - SectionLoader now swaps H and W dims. 
When loading test data in patch, this line can be removed (and tested) from test.py h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new struct * removing experiments from deep_seismic, following the new struct * Merged PR 220: Adds Horovod and fixes Add Horovod training script Updates dependencies in Horovod docker file Removes hard coding of path in data.py * section train/test scripts * section train/test scripts * Add cv_lib to repo and updates instructions * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Removes data.py and updates readme * Updates requirements * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes. E.g. 
if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training testing for sections works * training testing for sections works * minor changes * minor changes * reverting changes on dutchf3/local/default.py file * reverting changes on dutchf3/local/default.py file * added config file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * sync with new experiment structure * added a logging handler for array metrics * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * metrics now based on ignite.metrics * modified patch train.py with new metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo. 
Realized there was one bug in the code, and the rest of the functions did not work with the versions of the libraries listed in the conda yaml file. Also updated the download script. Related work items: #18264
* modified metrics with ignore_index
* Merged PR 405: minor mods to notebook, more documentation A very small PR - just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432
* Merged PR 368: Adds penobscot Adds for penobscot: - Dataset reader - Training script - Testing script - Section depth augmentation - Patch depth augmentation - Inline visualisation for Tensorboard Related work items: #14560, #17697, #17699, #17700
* Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Related work items: #16362
* Merged PR 452: decouple docker image creation from azureml Removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb. All other changes are due to trivial reruns. Related work items: #18346
* Merged PR 512: Pre-commit hooks for formatting and style checking Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting we are using black, and for style checking flake8. The following files are added: - .pre-commit-config.yaml - defines the git hooks to be installed - .flake8 - settings for the flake8 linter - pyproject.toml - settings for the black formatter The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors. Some questions to start the discussion: - Do you want to change any of the default settings in the dotenv files, like the line lengths or the error messages we exclude or include? - Do we want to have a requirements-dev.txt file for contributors? This setup uses the pre-commit package; I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file. - Once you have the hooks installed, they will only affect the files you commit in the future. A big chunk of our codebase does not conform to the formatting/style settings, so we will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant-looking PR :) Any thoughts on how we should approach this? Thanks! Related work items: #18350
* Merged PR 513: 3D training script for Waldeland's model with Ignite Related work items: #16356
* Merged PR 565: Demo notebook updated with 3D graph Changes: 1) Updated demo notebook with the 3D visualization 2) Formatting changes due to new black/flake8 git hook Related work items: #17432
* Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into the vapaunic/metrics branch (so the changed files below only include the diff between these two branches). However, I can change this once vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and a top-level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955
* merged tests into this branch
* Merged PR 569: Minor PR: change to pre-commit configuration files Related work items: #18350
* Merged PR 586: Purging unused files and experiments Related work items: #20499
* moved prepare data under scripts
* removed untested model configs
* fixed weird bug in penobscot data loader
* penobscot experiments working for hrnet, seresnet, no depth and patch depth
* removed a section loader bug in the penobscot loader
* fixed bugs in my previous 'fix'
* removed redundant _open_mask from subclasses
* Merged PR 601: Fixes to penobscot experiments A few changes: - Instructions in README on how to download and process the Penobscot and F3 2D data sets - moved prepare_data scripts to the scripts/ directory - fixed a weird issue with a class method in the Penobscot data loader - fixed a bug in the section loader (_add_extra_channel in the section loader was not necessary and was causing an issue) - removed config files that were not tested or
working in Penobscot experiments - modified default.py so it works if train.py is run without a config file Related work items: #20694
* Merged PR 605: added common metrics to Waldeland model in Ignite Related work items: #19550
* Removed redundant extract_metric_from
* formatting changes in metrics
* modified penobscot experiment to use new local metrics
* modified section experiment to pass device to metrics
* moved metrics out of dutchf3, modified distributed to work with the new metrics
* fixed other experiments after new metrics
* removed apex metrics from distributed train.py
* added ignite-based metrics to dutch voxel experiment
* removed apex metrics
* modified penobscot test script to use new metrics
* pytorch-ignite pre-release with new metrics until stable available
* removed cell output from the F3 notebook
* deleted .vscode
* modified metric import in test_metrics.py
* separated metrics out as a module
* relative logger file path, modified section experiment
* removed the REPO_PATH from init
* created util logging function, and moved logging file to each experiment
* modified demo experiment
* modified penobscot experiment
* modified dutchf3_voxel experiment
* no logging in voxel2pixel
* modified dutchf3 patch local experiment
* modified patch distributed experiment
* modified interpretation notebook
* minor changes to comments
* DOC: forking disclaimer and new build names.
(#9)
* Updating README.md with introduction material (#10) * Update README with introduction to DeepSeismic Add intro material for DeepSeismic * Adding logo file * Adding image to readme * Update README.md
* Updates the 3D visualisation to use itkwidgets (#11) * Updates notebook to use itkwidgets for interactive visualisation
* Adds jupytext to pre-commit (#12) * Add jupytext
* Adds demo notebook for HRNet (#13) * Adding TF 2.0 to allow for tensorboard vis in notebooks * Modifies hrnet config for notebook * Add HRNet notebook for demo * Updates HRNet notebook and tidies F3
* removed my username references (#15)
* moving 3D models into contrib folder (#16)
* Weetok (#17) * Update it to include sections for imaging * Update README.md
* added system requirements to readme
* sdk 1.0.76; tested conda env vs docker image; extended readme
* removed reference to imaging
* minor md formatting
* addressing multiple issues from first bug bash (#81) * added README documentation per bug bash feedback * DOC: added HRNET download info to README * added hrnet download script and tested it * added legal headers to a few scripts * changed /data to ~data in the main README * added Troubleshooting section to the README
* Dciborow/build bug (#68) * Update unit_test_steps.yml * Update environment.yml * Update setup_step.yml * Update unit_test_steps.yml * Update setup_step.yml
* Adds AzureML libraries (#82) * Adds azure dependencies * Adds AzureML components
* Fixes download script (#84) * Fixes download script * Updates readme
* clarify which DSVM we want to use - Ubuntu GPU-enabled VM, preferably NC12 - Issue #83
* Add Troubleshooting section for DSVM warnings #89
* Add Troubleshooting section for DSVM warnings, plus typo #89
* modified hrnet notebook, addressing bug bash issues (#95)
* Update environment.yml (#93) * Update environment.yml
* tested both yml conda env and docker; updated conda yml to have docker sdk
* NVIDIA Tesla K80 (or V100 GPU for NCv2 series) - per Vanja's comment
* notebook integration tests complete (#106) * added README documentation per bug bash feedback * HRNet notebook works with tests now * removed debug material from the notebook * corrected duplicate build names * conda init fix * changed setup deps * fixed F3 notebook - merge conflict and pytorch bug * main and notebook builds have functional setup now
* Mat/test (#105) * added README documentation per bug bash feedback * Modifies scripts to run for only a few iterations when in debug/test mode * Updates training scripts and build * Making names unique * Fixes conda issue * HRNet notebook works with tests now * removed debug material from the notebook * corrected duplicate build names * conda init fix * Adds docstrings to training script * Testing something out * test * adds seresnet * Modifies to work outside of git env * Fixes typo in DATASET * reducing steps * fixes the argument * Altering batch size to fit k80 * reducing batch size further * fixes distributed * adds missing import * Adds further tests * updates * Fixes section script * testing everything once through * Final run for badge * changed setup deps, fixed F3 notebook
* Adds missing tests (#111) * added missing tests * Adding fixes for test * reinstating all tests
* Maxkaz/issues (#110) * added README documentation per bug bash feedback * added missing tests * closing out multiple post bug bash issues with single PR * Addressed comments * minor change
* Adds Readme information to experiments (#112) * Adds readmes to experiments * Updates instructions based on feedback * Update README.md
* BugBash2 Issue #83 and #89: clarify which DSVM we want to use - Ubuntu GPU-enabled VM, preferably NC12 (#88)
* azureml sdk 1.0.74; fixed a few issues around ACR access; added nb 030 for scalability testing
* merge upstream into my fork (#1) * MINOR: addressing broken F3 download link (#73) * Update main_build.yml for Azure Pipelines * BUILD: added build status badges (#6) * Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7) * Finished version of numpy data loader * Working training script for demo * Adds the new metrics * Fixes docstrings and adds header * Removing extra setup.py * Log config file now experiment specific (#8)
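The fork history above mentions "Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7)". As a rough illustration of what a numpy-backed loader can look like, here is a minimal sketch; the class name, attributes, and shapes are hypothetical and not the repo's actual API:

```python
import numpy as np

class NumpyDataset:
    """Hypothetical sketch: wraps a 3D seismic volume stored as a numpy
    array and yields (section, mask) pairs, one per inline section.
    Names and shapes are illustrative, not the repo's actual API."""

    def __init__(self, volume, labels):
        # volume and labels are (inlines, height, width) arrays
        assert volume.shape == labels.shape
        self.volume = volume
        self.labels = labels

    def __len__(self):
        # one sample per inline section
        return self.volume.shape[0]

    def __getitem__(self, idx):
        # return the (image, mask) pair for section idx
        return self.volume[idx], self.labels[idx]

# toy data: 10 sections of 100x100
volume = np.random.rand(10, 100, 100).astype(np.float32)
labels = np.zeros(volume.shape, dtype=np.int64)
ds = NumpyDataset(volume, labels)
img, mask = ds[0]
```

An object like this plugs directly into a `torch.utils.data.DataLoader`, which only requires `__len__` and `__getitem__`.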
* fixed link for F3 download
* MINOR: python version fix to 3.6.7 (#72)
* Adding system requirements in README (#74)
* Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes it take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * minor wording fix * enabled splitting dataset into sections, rather than only patches * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * merged duplicate ifelse blocks * refactored prepare_data.py * refactored prepare_data.py * added scripts for section train test * added scripts for section train test * section train/test works for single channel input * section train/test works for single channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran black formatter on the file, which created all the formatting changes (sorry!) * Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section based training/testing * train and test script for section based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate function to load section and patch loaders. - SectionLoader now swaps H and W dims. 
When loading test data in patch, this line can be removed (and tested) from test.py h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new struct * removing experiments from deep_seismic, following the new struct * Merged PR 220: Adds Horovod and fixes Add Horovod training script Updates dependencies in Horovod docker file Removes hard coding of path in data.py * section train/test scripts * section train/test scripts * Add cv_lib to repo and updates instructions * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Removes data.py and updates readme * Updates requirements * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes. E.g. 
if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training testing for sections works * training testing for sections works * minor changes * minor changes * reverting changes on dutchf3/local/default.py file * reverting changes on dutchf3/local/default.py file * added config file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * sync with new experiment structure * added a logging handler for array metrics * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * metrics now based on ignite.metrics * modified patch train.py with new metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo. 
Realized there was one bug in the code, and that the rest of the functions did not work with the versions of the libraries listed in the conda yaml file. Also updated the download script. Related work items: #18264
* modified metrics with ignore_index
* Merged PR 405: minor mods to notebook, more documentation. A very small PR, just a few more lines of documentation in the notebook to improve clarity. Related work items: #17432
* Merged PR 368: Adds for penobscot:
  - Dataset reader
  - Training script
  - Testing script
  - Section depth augmentation
  - Patch depth augmentation
  - Inline visualisation for Tensorboard
  Related work items: #14560, #17697, #17699, #17700
* Merged PR 407: Azure ML SDK Version 1.0.65; running devito in AzureML Estimators. Related work items: #16362
* Merged PR 452: decouple docker image creation from azureml. Removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb; all other changes are due to trivial reruns. Related work items: #18346
* Merged PR 512: Pre-commit hooks for formatting and style checking. Opening this PR to start the discussion. I added the required dotfiles and instructions for setting up pre-commit hooks for formatting and style checking. For formatting we are using black, and for style checking, flake8. The following files are added:
  - .pre-commit-config.yaml - defines the git hooks to be installed
  - .flake8 - settings for the flake8 linter
  - pyproject.toml - settings for the black formatter
  The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors.
Some questions to start the discussion:
- Do you want to change any of the default settings in these config files - the line lengths, the error codes we exclude or include, or anything like that?
- Do we want to have a requirements-dev.txt file for contributors? This setup uses the pre-commit package; I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file.
- Once you have the hooks installed, they will only affect the files you commit in the future. A big chunk of our codebase does not conform to the formatting/style settings, so we will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant-looking PR :) Any thoughts on how we should approach this?
Thanks! Related work items: #18350
* Merged PR 513: 3D training script for Waldeland's model with Ignite. Related work items: #16356
* Merged PR 565: Demo notebook updated with 3D graph. Changes: 1) updated the demo notebook with the 3D visualization; 2) formatting changes due to the new black/flake8 git hook. Related work items: #17432
* Merged PR 341: Tests for cv_lib/metrics. This PR is dependent on the tests created in the previous branch !333; that's why the PR is to merge tests into the vapaunic/metrics branch (so the changed files below only include the diff between these two branches). However, I can change this once vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top-level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955
* merged tests into this branch
* Merged PR 569: Minor PR: change to pre-commit configuration files. Related work items: #18350
* Merged PR 586: Purging unused files and experiments. Related work items: #20499
* moved prepare data under scripts
* removed untested model configs
* fixed weird bug in penobscot data loader
* penobscot experiments working for hrnet, seresnet, no depth and patch depth
* removed a section loader bug in the penobscot loader
* fixed bugs in my previous 'fix'
* removed redundant _open_mask from subclasses
* Merged PR 601: Fixes to penobscot experiments. A few changes:
  - Instructions in README on how to download and process the Penobscot and F3 2D data sets
  - moved prepare_data scripts to the scripts/ directory
  - fixed a weird issue with a class method in the Penobscot data loader
  - fixed a bug in the section loader (_add_extra_channel was not necessary and was causing an issue)
  - removed config files that were not tested or
working in Penobscot experiments
  - modified default.py so it works if train.py is run without a config file
  Related work items: #20694
* Merged PR 605: added common metrics to Waldeland model in Ignite. Related work items: #19550
* Removed redundant extract_metric_from
* formatting changes in metrics
* modified penobscot experiment to use new local metrics
* modified section experiment to pass device to metrics
* moved metrics out of dutchf3, modified distributed to work with the new metrics
* fixed other experiments after new metrics
* removed apex metrics from distributed train.py
* added ignite-based metrics to dutch voxel experiment
* removed apex metrics
* modified penobscot test script to use new metrics
* pytorch-ignite pre-release with new metrics until stable available
* removed cell output from the F3 notebook
* deleted .vscode
* modified metric import in test_metrics.py
* separated metrics out as a module
* relative logger file path, modified section experiment
* removed the REPO_PATH from init
* created util logging function, and moved logging file to each experiment
* modified demo experiment
* modified penobscot experiment
* modified dutchf3_voxel experiment
* no logging in voxel2pixel
* modified dutchf3 patch local experiment
* modified patch distributed experiment
* modified interpretation notebook
* minor changes to comments
* DOC: forking disclaimer and new build names.
(#9)
* Updating README.md with introduction material (#10)
* Update README with introduction to DeepSeismic: add intro material for DeepSeismic
* Adding logo file
* Adding image to readme
* Update README.md
* Updates the 3D visualisation to use itkwidgets (#11)
* Updates notebook to use itkwidgets for interactive visualisation
* Adds jupytext to pre-commit (#12)
* Add jupytext
* Adds demo notebook for HRNet (#13)
* Adding TF 2.0 to allow for tensorboard vis in notebooks
* Modifies hrnet config for notebook
* Add HRNet notebook for demo
* Updates HRNet notebook and tidies F3
* removed my username references (#15)
* moving 3D models into contrib folder (#16)
* Weetok (#17)
* Update it to include sections for imaging
* Update README.md
* added system requirements to readme
* sdk 1.0.76; tested conda env vs docker image; extended readme
* removed reference to imaging
* minor md formatting
* clarify which DSVM we want to use - Ubuntu GPU-enabled VM, preferably NC12 - Issue #83
* Add Troubleshooting section for DSVM warnings #89
* Add Troubleshooting section for DSVM warnings, plus typo #89
* tested both yml conda env and docker; updated conda yml to have docker sdk
* tested both yml conda env and docker; updated conda yml to have docker sdk; added NVIDIA Tesla K80 (or V100 GPU for NCv2 series) - per Vanja's comment
* Update README.md
* BugBash2 Issue #83 and #89: clarify which DSVM we want to use - Ubuntu GPU-enabled VM, preferably NC12 (#88) (#2)
* azureml sdk 1.0.74; fixed a few issues around ACR access; added nb 030 for scalability testing
* merge upstream into my fork (#1)
* MINOR: addressing broken F3 download link (#73)
* Update main_build.yml for Azure Pipelines
* BUILD: added build status badges (#6)
* Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7)
* Finished version of numpy data loader
* Working training script for demo
* Adds the new metrics
* Fixes docstrings and adds header
* Removing extra setup.py
* Log config file now experiment specific (#8)
* added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script
* minor wording fix
* enabled splitting dataset into sections, rather than only patches
* merged duplicate ifelse blocks
* refactored prepare_data.py
* added scripts for section train test
* section train/test works for single channel input
* train and test script for section based training/testing
* fixed link for F3 download
* MINOR: python version fix to 3.6.7 (#72)
* Adding system requirements in README (#74)
working in Penobscot experiments - modified default.py so it's working if train.py ran without a config file Related work items: #20694 * Merged PR 605: added common metrics to Waldeland model in Ignite Related work items: #19550 * Removed redundant extract_metric_from * Removed redundant extract_metric_from * formatting changes in metrics * formatting changes in metrics * modified penobscot experiment to use new local metrics * modified penobscot experiment to use new local metrics * modified section experimen to pass device to metrics * modified section experimen to pass device to metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * fixed other experiments after new metrics * fixed other experiments after new metrics * removed apex metrics from distributed train.py * removed apex metrics from distributed train.py * added ignite-based metrics to dutch voxel experiment * added ignite-based metrics to dutch voxel experiment * removed apex metrics * removed apex metrics * modified penobscot test script to use new metrics * pytorch-ignite pre-release with new metrics until stable available * removed cell output from the F3 notebook * deleted .vscode * modified metric import in test_metrics.py * separated metrics out as a module * relative logger file path, modified section experiment * removed the REPO_PATH from init * created util logging function, and moved logging file to each experiment * modified demo experiment * modified penobscot experiment * modified dutchf3_voxel experiment * no logging in voxel2pixel * modified dutchf3 patch local experiment * modified patch distributed experiment * modified interpretation notebook * minor changes to comments * DOC: forking dislaimer and new build names. 
(#9) * Updating README.md with introduction material (#10) * Update README with introduction to DeepSeismic Add intro material for DeepSeismic * Adding logo file * Adding image to readme * Update README.md * Updates the 3D visualisation to use itkwidgets (#11) * Updates notebook to use itkwidgets for interactive visualisation * Adds jupytext to pre-commit (#12) * Add jupytext * Adds demo notebook for HRNet (#13) * Adding TF 2.0 to allow for tensorboard vis in notebooks * Modifies hrnet config for notebook * Add HRNet notebook for demo * Updates HRNet notebook and tidies F3 * removed my username references (#15) * moving 3D models into contrib folder (#16) * Weetok (#17) * Update it to include sections for imaging * Update README.md * Update README.md * added system requirements to readme * sdk 1.0.76; tested conda env vs docker image; extented readme * removed reference to imaging * minor md formatting * minor md formatting * clarify which DSVM we want to use - Ubuntu GPU-enabled VM, preferably NC12 - Issue #83 * Add Troubleshooting section for DSVM warnings #89 * Add Troubleshooting section for DSVM warnings, plus typo #89 * tested both yml conda env and docker; udated conda yml to have docker sdk * tested both yml conda env and docker; udated conda yml to have docker sdk; added * NVIDIA Tesla K80 (or V100 GPU for NCv2 series) - per Vanja's comment * Update README.md * BugBash2 Issue #83 and #89: clarify which DSVM we want to use - Ubuntu GPU-enabled VM, preferably NC12 (#88) (#3) * azureml sdk 1.0.74; foxed a few issues around ACR access; added nb 030 for scalability testing * azureml sdk 1.0.74; foxed a few issues around ACR access; added nb 030 for scalability testing * merge upstream into my fork (#1) * MINOR: addressing broken F3 download link (#73) * Update main_build.yml for Azure Pipelines * Update main_build.yml for Azure Pipelines * BUILD: added build status badges (#6) * Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7) 
* Finished version of numpy data loader
* Working training script for demo
* Adds the new metrics
* Fixes docstrings and adds header
* Removing extra setup.py
* Log config file now experiment specific (#8)
* added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script
* minor wording fix
* enabled splitting dataset into sections, rather than only patches
* merged duplicate ifelse blocks
* refactored prepare_data.py
* added scripts for section train test
* section train/test works for single channel input
* train and test script for section based training/testing
* removing experiments from deep_seismic, following the new struct
* section train/test scripts
* Add cv_lib to repo and updates instructions
* Removes data.py and updates readme
* Updates requirements
* renamed train/test scripts
* train test works on alaudah section experiments, a few minor bugs left
* cleaning up loaders
(#9) * Updating README.md with introduction material (#10) * Update README with introduction to DeepSeismic Add intro material for DeepSeismic * Adding logo file * Adding image to readme * Update README.md * Updates the 3D visualisation to use itkwidgets (#11) * Updates notebook to use itkwidgets for interactive visualisation * Adds jupytext to pre-commit (#12) * Add jupytext * Adds demo notebook for HRNet (#13) * Adding TF 2.0 to allow for tensorboard vis in notebooks * Modifies hrnet config for notebook * Add HRNet notebook for demo * Updates HRNet notebook and tidies F3 * removed my username references (#15) * moving 3D models into contrib folder (#16) * Weetok (#17) * Update it to include sections for imaging * Update README.md * Update README.md * fixed link for F3 download * MINOR: python version fix to 3.6.7 (#72) * Adding system requirements in README (#74) * Update main_build.yml for Azure Pipelines * Update main_build.yml for Azure Pipelines * BUILD: added build status badges (#6) * Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7) * Finished version of numpy data loader * Working training script for demo * Adds the new metrics * Fixes docstrings and adds header * Removing extra setup.py * Log config file now experiment specific (#8) * Merging work on salt dataset * Adds computer vision to dependencies * Updates dependecies * Update * Updates the environemnt files * Updates readme and envs * Initial running version of dutchf3 * INFRA: added structure templates. * VOXEL: initial rough code push - need to clean up before PRing. * Working version * Working version before refactor * quick minor fixes in README * 3D SEG: first commit for PR. * 3D SEG: removed data files to avoid redistribution. * Updates * 3D SEG: restyled batch file, moving onto others. 
* Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes it take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * minor wording fix * enabled splitting dataset into sections, rather than only patches * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * merged duplicate ifelse blocks * refactored prepare_data.py * refactored prepare_data.py * added scripts for section train test * added scripts for section train test * section train/test works for single channel input * section train/test works for single channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran black formatter on the file, which created all the formatting changes (sorry!) * Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section based training/testing * train and test script for section based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate function to load section and patch loaders. - SectionLoader now swaps H and W dims. 
When loading test data in patch, this line can be removed (and tested) from test.py h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new struct * removing experiments from deep_seismic, following the new struct * Merged PR 220: Adds Horovod and fixes Add Horovod training script Updates dependencies in Horovod docker file Removes hard coding of path in data.py * section train/test scripts * section train/test scripts * Add cv_lib to repo and updates instructions * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Removes data.py and updates readme * Updates requirements * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes. E.g. 
if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training testing for sections works * training testing for sections works * minor changes * minor changes * reverting changes on dutchf3/local/default.py file * reverting changes on dutchf3/local/default.py file * added config file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * sync with new experiment structure * added a logging handler for array metrics * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * metrics now based on ignite.metrics * modified patch train.py with new metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo. 
Realized there was one bug in the code and the rest of the functions did not work with the different versions of libraries which we have listed in the conda yaml file. Also updated the download script. Related work items: #18264 * modified metrics with ignore_index * modified metrics with ignore_index * Merged PR 405: minor mods to notebook, more documentation A very small PR - Just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432 * Merged PR 368: Adds penobscot Adds for penobscot - Dataset reader - Training script - Testing script - Section depth augmentation - Patch depth augmentation - Iinline visualisation for Tensorboard Related work items: #14560, #17697, #17699, #17700 * Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Related work items: #16362 * Merged PR 452: decouple docker image creation from azureml removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb All other changes are due to trivial reruns Related work items: #18346 * Merged PR 512: Pre-commit hooks for formatting and style checking Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting, we are using black, and style checking flake8. The following files are added: - .pre-commit-config.yaml - defines git hooks to be installed - .flake8 - settings for flake8 linter - pyproject.toml - settings for black formatter The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors. 
Some questions to start the discussion: - Do you want to change any of the default settings in the dotenv files, like the line lengths or the error messages we exclude or include? - Do we want to have a requirements-dev.txt file for contributors? This setup uses the pre-commit package; I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file. - Once you have the hooks installed, they will only affect the files you commit in the future. A big chunk of our codebase does not conform to the formatting/style settings, so we will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant-looking PR :) Any thoughts on how we should approach this? Thanks! Related work items: #18350 * Merged PR 513: 3D training script for Waldeland's model with Ignite Related work items: #16356 * Merged PR 565: Demo notebook updated with 3D graph Changes: 1) Updated demo notebook with the 3D visualization 2) Formatting changes due to the new black/flake8 git hook Related work items: #17432 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into the vapaunic/metrics branch (so the changed files below only include the diff between these two branches); I can change this once vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top-level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955 * merged tests into this branch * Merged PR 569: Minor PR: change to pre-commit configuration files Related work items: #18350 * Merged PR 586: Purging unused files and experiments Related work items: #20499 * moved prepare data under scripts * removed untested model configs * fixed weird bug in penobscot data loader * penobscot experiments working for hrnet, seresnet, no depth and patch depth * removed a section loader bug in the penobscot loader * fixed bugs in my previous 'fix' * removed redundant _open_mask from subclasses * Merged PR 601: Fixes to penobscot experiments A few changes: - Instructions in README on how to download and process the Penobscot and F3 2D data sets - moved prepare_data scripts to the scripts/ directory - fixed a weird issue with a class method in the Penobscot data loader - fixed a bug in the section loader (_add_extra_channel was not necessary and was causing an issue) - removed config files that were not tested or
working in Penobscot experiments - modified default.py so it works if train.py is run without a config file Related work items: #20694 * Merged PR 605: added common metrics to Waldeland model in Ignite Related work items: #19550 * Removed redundant extract_metric_from * formatting changes in metrics * modified penobscot experiment to use new local metrics * modified section experiment to pass device to metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * fixed other experiments after new metrics * removed apex metrics from distributed train.py * added ignite-based metrics to dutch voxel experiment * removed apex metrics * modified penobscot test script to use new metrics * pytorch-ignite pre-release with new metrics until stable is available * removed cell output from the F3 notebook * deleted .vscode * modified metric import in test_metrics.py * separated metrics out as a module * relative logger file path, modified section experiment * removed the REPO_PATH from init * created util logging function, and moved logging file to each experiment * modified demo experiment * modified penobscot experiment * modified dutchf3_voxel experiment * no logging in voxel2pixel * modified dutchf3 patch local experiment * modified patch distributed experiment * modified interpretation notebook * minor changes to comments * DOC: forking disclaimer and new build names. (#9) * Updating README.md with introduction material (#10) * Update README with introduction to DeepSeismic Add intro material for DeepSeismic * Adding logo file * Adding image to readme * Update README.md * Updates the 3D visualisation to use itkwidgets (#11) * Updates notebook to use itkwidgets for interactive visualisation * Adds jupytext to pre-commit (#12) * Add jupytext * Adds demo notebook for HRNet (#13) * Adding TF 2.0 to allow for tensorboard vis in notebooks * Modifies hrnet config for notebook * Add HRNet notebook for demo * Updates HRNet notebook and tidies F3 * removed my username references (#15) * moving 3D models into contrib folder (#16) * Weetok (#17) * Update it to include sections for imaging * Update README.md * added system requirements to readme * sdk 1.0.76; tested conda env vs docker image; extended readme * removed reference to imaging * minor md formatting * clarify which DSVM we want to use - Ubuntu GPU-enabled VM, preferably NC12 - Issue #83 * Add Troubleshooting section for DSVM warnings #89 * Add Troubleshooting section for DSVM warnings, plus typo #89 * tested both yml conda env and docker; updated conda yml to have the docker sdk * added NVIDIA Tesla K80 (or V100 GPU for NCv2 series) - per Vanja's comment * Update README.md * Remove related projects on AI Labs * Added a reference to Azure Machine Learning (#115) Added a reference to Azure Machine Learning to show how folks can get started with using it * Update README.md * update fork from upstream (#4) * fixed merge conflict resolution in LICENSE * BugBash2 Issue #83 and #89: clarify which DSVM we want to use - Ubuntu GPU-enabled VM, preferably NC12 (#88) * azureml sdk 1.0.74; fixed a few issues around ACR access; added nb 030 for scalability testing *
merge upstream into my fork (#1) * MINOR: addressing broken F3 download link (#73) * Update main_build.yml for Azure Pipelines * BUILD: added build status badges (#6) * Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7) * Finished version of numpy data loader * Working training script for demo * Adds the new metrics * Fixes docstrings and adds header * Removing extra setup.py * Log config file now experiment specific (#8) * Merging work on salt dataset * Adds computer vision to dependencies * Updates dependencies * Update * Updates the environment files * Updates readme and envs * Initial running version of dutchf3 * INFRA: added structure templates. * VOXEL: initial rough code push - need to clean up before PRing. * Working version * Working version before refactor * quick minor fixes in README * 3D SEG: first commit for PR. * 3D SEG: removed data files to avoid redistribution. * Updates * 3D SEG: restyled batch file, moving onto others.
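Several commits in this log ("Adds the new metrics", "modified metrics with ignore_index") refer to segmentation metrics that skip unlabelled pixels. A minimal pure-Python sketch of that idea; the repo's metrics are actually built on ignite.metrics, so this hypothetical function only illustrates the ignore_index behaviour:

```python
def pixel_accuracy(pred, target, ignore_index=255):
    """Fraction of correctly predicted pixels, skipping every position
    whose target label equals ignore_index (e.g. unlabelled pixels)."""
    kept = [(p, t) for p, t in zip(pred, target) if t != ignore_index]
    if not kept:                      # nothing labelled: define as 0.0
        return 0.0
    return sum(p == t for p, t in kept) / len(kept)

# The last pixel is unlabelled (255), so only 4 positions are scored,
# of which 3 match:
acc = pixel_accuracy([1, 2, 3, 0, 7], [1, 2, 9, 0, 255])  # 0.75
```

The same masking applies to IoU-style metrics: rows/columns for the ignored label are simply left out of the confusion matrix before the score is computed.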
* Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes them take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in the prepare_data.py script * minor wording fix * enabled splitting the dataset into sections, rather than only patches * merged duplicate if/else blocks * refactored prepare_data.py * added scripts for section train/test * section train/test works for single-channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran the black formatter on the file, which created all the formatting changes (sorry!) * Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section-based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate functions to load section and patch loaders. - SectionLoader now swaps the H and W dims.
When loading test data in patch scripts, this line can be removed (and tested) from test.py: h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new structure * Merged PR 220: Adds Horovod and fixes Adds Horovod training script Updates dependencies in the Horovod docker file Removes hard-coded path in data.py * section train/test scripts * Adds cv_lib to repo and updates instructions * Removes data.py and updates readme * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * train/test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris>, @<Ilia Karmanov>, @<Max Kaznady>, please check whether this PR will affect your experiments. The main change is in the initialization of the sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader; similarly for test loaders. This will affect your code if you access these attributes. E.g.
if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training/testing for sections works * minor changes * reverting changes on the dutchf3/local/default.py file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on the F3 data set Notebook and associated files + minor change in the patch_deconvnet_skip.py model file. Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in the prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo.
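The attribute change PR 236 describes can be sketched with hypothetical stand-in loaders; the real `TrainPatchLoader` takes more arguments, and this only illustrates how the `patches` attribute changed:

```python
# Sketch of the PR 236 loader cleanup (hypothetical minimal classes).
# Before: every loader was assigned the dict of all splits.
# After: a loader only holds the patches belonging to its own split.

ALL_SPLITS = {"train": ["p0", "p1"], "val": ["p2"]}

class OldTrainPatchLoader:
    def __init__(self, split="train"):
        self.split = split
        self.patches = ALL_SPLITS          # all splits, unnecessarily

class TrainPatchLoader:                    # post-PR-236 behaviour
    def __init__(self, split="train"):
        self.split = split
        self.patches = ALL_SPLITS[split]   # only this loader's split

old_set = OldTrainPatchLoader()
new_set = TrainPatchLoader()
# Code that used to index by split now reads the attribute directly:
assert old_set.patches[old_set.split] == new_set.patches
```

So snippets like `train_set.patches[train_set.split]` above would presumably become just `train_set.patches` after this PR.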
* fixed link for F3 download * MINOR: python version fix to 3.6.7 (#72) * Adding system requirements in README (#74)
* Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes it take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * minor wording fix * enabled splitting dataset into sections, rather than only patches * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * merged duplicate ifelse blocks * refactored prepare_data.py * refactored prepare_data.py * added scripts for section train test * added scripts for section train test * section train/test works for single channel input * section train/test works for single channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran black formatter on the file, which created all the formatting changes (sorry!) * Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section based training/testing * train and test script for section based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created separate function to load section and patch loaders. - SectionLoader now swaps H and W dims. 
When loading test data in patch, this line can be removed (and tested) from test.py h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new struct * removing experiments from deep_seismic, following the new struct * Merged PR 220: Adds Horovod and fixes Add Horovod training script Updates dependencies in Horovod docker file Removes hard coding of path in data.py * section train/test scripts * section train/test scripts * Add cv_lib to repo and updates instructions * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Removes data.py and updates readme * Updates requirements * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes. E.g. 
if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training testing for sections works * training testing for sections works * minor changes * minor changes * reverting changes on dutchf3/local/default.py file * reverting changes on dutchf3/local/default.py file * added config file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * sync with new experiment structure * added a logging handler for array metrics * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * metrics now based on ignite.metrics * modified patch train.py with new metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo. 
Realized there was one bug in the code and the rest of the functions did not work with the different versions of libraries which we have listed in the conda yaml file. Also updated the download script. Related work items: #18264 * modified metrics with ignore_index * modified metrics with ignore_index * Merged PR 405: minor mods to notebook, more documentation A very small PR - Just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432 * Merged PR 368: Adds penobscot Adds for penobscot - Dataset reader - Training script - Testing script - Section depth augmentation - Patch depth augmentation - Iinline visualisation for Tensorboard Related work items: #14560, #17697, #17699, #17700 * Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Related work items: #16362 * Merged PR 452: decouple docker image creation from azureml removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb All other changes are due to trivial reruns Related work items: #18346 * Merged PR 512: Pre-commit hooks for formatting and style checking Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting, we are using black, and style checking flake8. The following files are added: - .pre-commit-config.yaml - defines git hooks to be installed - .flake8 - settings for flake8 linter - pyproject.toml - settings for black formatter The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors. 
Some questions to start the discussion:
- Do you want to change any of the default settings in the dotenv files, like the line lengths or the error messages we exclude or include?
- Do we want to have a requirements-dev.txt file for contributors? This setup uses the pre-commit package; I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file.
- Once you have the hooks installed, they will only affect the files you commit in the future. A big chunk of our codebase does not conform to the formatting/style settings, so we will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant-looking PR :) Any thoughts on how we should approach this? Thanks!
Related work items: #18350
* Merged PR 513: 3D training script for Waldeland's model with Ignite Related work items: #16356
* Merged PR 565: Demo notebook updated with 3D graph Changes: 1) Updated demo notebook with the 3D visualization 2) Formatting changes due to the new black/flake8 git hook Related work items: #17432
* Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into the vapaunic/metrics branch (so the changed files below only include the diff between these two branches). However, I can change this once vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top-level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955
* merged tests into this branch
* Merged PR 569: Minor PR: change to pre-commit configuration files Related work items: #18350
* Merged PR 586: Purging unused files and experiments Related work items: #20499
* moved prepare data under scripts
* removed untested model configs
* fixed weird bug in penobscot data loader
* penobscot experiments working for hrnet, seresnet, no depth and patch depth
* removed a section loader bug in the penobscot loader
* fixed bugs in my previous 'fix'
* removed redundant _open_mask from subclasses
* Merged PR 601: Fixes to penobscot experiments A few changes: - Instructions in README on how to download and process the Penobscot and F3 2D data sets - moved prepare_data scripts to the scripts/ directory - fixed a weird issue with a class method in the Penobscot data loader - fixed a bug in the section loader (_add_extra_channel was not necessary and was causing an issue) - removed config files that were not tested or working in Penobscot experiments - modified default.py so it works if train.py is run without a config file Related work items: #20694
* Merged PR 605: added common metrics to Waldeland model in Ignite Related work items: #19550
* Removed redundant extract_metric_from
* formatting changes in metrics
* modified penobscot experiment to use new local metrics
* modified section experiment to pass device to metrics
* moved metrics out of dutchf3, modified distributed to work with the new metrics
* fixed other experiments after new metrics
* removed apex metrics from distributed train.py
* added ignite-based metrics to dutch voxel experiment
* removed apex metrics
* modified penobscot test script to use new metrics
* pytorch-ignite pre-release with new metrics until stable release is available
* removed cell output from the F3 notebook
* deleted .vscode
* modified metric import in test_metrics.py
* separated metrics out as a module
* relative logger file path, modified section experiment
* removed the REPO_PATH from init
* created util logging function, and moved logging file to each experiment
* modified demo experiment
* modified penobscot experiment
* modified dutchf3_voxel experiment
* no logging in voxel2pixel
* modified dutchf3 patch local experiment
* modified patch distributed experiment
* modified interpretation notebook
* minor changes to comments
* DOC: forking disclaimer and new build names.
(#9)
* Updating README.md with introduction material (#10)
* Update README with introduction to DeepSeismic Add intro material for DeepSeismic
* Adding logo file
* Adding image to readme
* Update README.md
* Updates the 3D visualisation to use itkwidgets (#11)
* Updates notebook to use itkwidgets for interactive visualisation
* Adds jupytext to pre-commit (#12)
* Add jupytext
* Adds demo notebook for HRNet (#13)
* Adding TF 2.0 to allow for tensorboard vis in notebooks
* Modifies hrnet config for notebook
* Add HRNet notebook for demo
* Updates HRNet notebook and tidies F3
* removed my username references (#15)
* moving 3D models into contrib folder (#16)
* Weetok (#17)
* Update it to include sections for imaging
* Update README.md
* added system requirements to readme
* sdk 1.0.76; tested conda env vs docker image; extended readme
* removed reference to imaging
* minor md formatting
* clarify which DSVM we want to use - Ubuntu GPU-enabled VM, preferably NC12 - Issue #83
* Add Troubleshooting section for DSVM warnings, plus typo fix #89
* tested both yml conda env and docker; updated conda yml to have docker sdk; added NVIDIA Tesla K80 (or V100 GPU for NCv2 series) per Vanja's comment
* Update README.md
* Remove related projects on AI Labs
* Added a reference to Azure machine learning (#115) to show how folks can get started with using Azure Machine Learning
* Update README.md
* Update AUTHORS.md (#117)
* Update AUTHORS.md (#118)
* pre-release items (#119)
* added README documentation per bug bash feedback
* added missing tests
* closing out multiple post bug bash issues with single PR
* new badges in README
* cleared notebook output
* notebooks links
* fixed bad merge
* forked branch name is misleading.
(#116)
* azureml sdk 1.0.74; fixed a few issues around ACR access; added nb 030 for scalability testing
* merge upstream into my fork (#1)
* MINOR: addressing broken F3 download link (#73)
* Update main_build.yml for Azure Pipelines
* BUILD: added build status badges (#6)
* Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7)
* Finished version of numpy data loader
* Working training script for demo
* Adds the new metrics
* Fixes docstrings and adds header
* Removing extra setup.py
* Log config file now experiment specific (#8)
* added instructions for running f3dutch experiments, and fixed some issues in the prepare_data.py script
* minor wording fix
* enabled splitting the dataset into sections, rather than only patches
* merged duplicate if/else blocks
* refactored prepare_data.py
* added scripts for section train/test
* section train/test works for single-channel input
* train and test scripts for section-based training/testing
* removing experiments from deep_seismic, following the new structure
* section train/test scripts
* Add cv_lib to repo and update setup instructions
* Removes data.py and updates readme
* Updates requirements
* renamed train/test scripts
* train/test works on alaudah section experiments, a few minor bugs left
* cleaning up loaders
* Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check whether this PR will affect your experiments. The main change is in the initialization of the sections/patches attributes of the loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader; similar for test loaders. This will affect your code if you access these attributes. E.g.
if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ```
* training/testing for sections works
* minor changes
* reverting changes on dutchf3/local/default.py file
* added config file
* Updates the repo with preliminary results for 2D segmentation
* Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment.
* Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357
* Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. Related work items: #17432
* Merged PR 312: moved dutchf3_section to experiments/interpretation Related work items: #17683
* Merged PR 309: minor change to README to reflect the changes in the prepare_data script Related work items: #17681
* Merged PR 315: Removing voxel exp Related work items: #17702
* sync with new experiment structure
* added a logging handler for array metrics
* first draft of metrics based on the ignite confusion matrix
* metrics now based on ignite.metrics
* modified patch train.py with new metrics
* Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo.
Realized there was one bug in the code, and the rest of the functions did not work with the library versions listed in the conda yaml file. Also updated the download script. Related work items: #18264
* modified metrics with ignore_index
* Merged PR 405: minor mods to notebook, more documentation A very small PR - just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432
* Merged PR 368: Adds penobscot Adds for penobscot: - Dataset reader - Training script - Testing script - Section depth augmentation - Patch depth augmentation - Inline visualisation for Tensorboard Related work items: #14560, #17697, #17699, #17700
* Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Related work items: #16362
* Merged PR 452: decouple docker image creation from azureml Removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb All other changes are due to trivial reruns Related work items: #18346
* Merged PR 512: Pre-commit hooks for formatting and style checking Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting we are using black, and for style checking flake8. The following files are added: - .pre-commit-config.yaml - defines git hooks to be installed - .flake8 - settings for the flake8 linter - pyproject.toml - settings for the black formatter The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors.
* fixed link for F3 download
* MINOR: python version fix to 3.6.7 (#72)
* Adding system requirements in README (#74)
* BugBash2 Issue #83 and #89: clarify which DSVM we want to use - Ubuntu GPU-enabled VM, preferably NC12 (#88) (#2)
* Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7)
* Finished version of numpy data loader * Working training script for demo * Adds the new metrics * Fixes docstrings and adds header * Removing extra setup.py * Log config file now experiment specific (#8) * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * refactored prepare_data.py * added scripts for section train test * section train/test works for single channel input * train and test script for section based training/testing * removing experiments from deep_seismic, following the new struct * section train/test scripts * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Updates requirements * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of the sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes, e.g. if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training testing for sections works * minor changes * reverting changes on dutchf3/local/default.py file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on the F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. 
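The PR 236 change above can be sketched in a few lines. `TrainPatchLoader`, `patches`, and `split` are the names used in the snippet above; the constructor body and the `ALL_SPLITS` stand-in are assumptions for illustration, not the repo's actual implementation:

```python
# Hypothetical sketch of the PR 236 loader clean-up: the loader keeps only
# the patches belonging to its own split, instead of a dict of all splits.

ALL_SPLITS = {  # stand-in for the split files read from disk
    "train": ["patch_0", "patch_1"],
    "val": ["patch_2"],
}

class TrainPatchLoader:
    def __init__(self, split="train"):
        self.split = split
        # Before PR 236: self.patches held every split (all train/val data).
        # After PR 236: only the patches for this loader's own split.
        self.patches = ALL_SPLITS[split]

train_set = TrainPatchLoader(split="train")
# The old access pattern train_set.patches[train_set.split] no longer applies;
# the attribute is already restricted to the loader's split:
print(train_set.patches)  # ['patch_0', 'patch_1']
```

With this shape, indexing `patches` by split (as in the snippet above) would fail, which is why the PR flags those access patterns as affected code.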
Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo. Realized there was one bug in the code, and the rest of the functions did not work with the library versions we have listed in the conda yaml file. Also updated the download script. Related work items: #18264 * modified metrics with ignore_index * Merged PR 405: minor mods to notebook, more documentation A very small PR - just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432 * Merged PR 368: Adds penobscot Adds for penobscot - Dataset reader - Training script - Testing script - Section depth augmentation - Patch depth augmentation - Inline visualisation for Tensorboard Related work items: #14560, #17697, #17699, #17700 * Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Related work items: #16362 * Merged PR 452: decouple docker image creation from azureml removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb All other changes are due to trivial reruns Related work items: #18346 * Merged PR 512: Pre-commit hooks for formatting and style checking Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting we are using black, and for style checking flake8. The following files are added: - .pre-commit-config.yaml - defines the git hooks to be installed - .flake8 - settings for the flake8 linter - pyproject.toml - settings for the black formatter The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors. Some questions to start the discussion: - Do you want to change any of the default settings in the dotenv files, like the line lengths or the error messages we exclude or include? - Do we want to have a requirements-dev.txt file for contributors? This setup uses the pre-commit package; I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file. - Once you have the hooks installed, they will only affect the files you are committing in the future. A big chunk of our codebase does not conform to the formatting/style settings, so we will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant-looking PR :) Any thoughts on how we should approach this? Thanks! Related work items: #18350 * Merged PR 513: 3D training script for Waldeland's model with Ignite Related work items: #16356 * Merged PR 565: Demo notebook updated with 3D graph Changes: 1) Updated demo notebook with the 3D visualization 2) Formatting changes due to new black/flake8 git hook Related work items: #17432 * fixed link for F3 download * MINOR: python version fix to 3.6.7 (#72) * Adding system requirements in README (#74) * Minor fix: broken links in README (#120) * fully-run notebooks links and fixed contrib voxel 
models (#123) * added README documentation per bug bash feedback * added missing tests * - added notebook links - made sure original voxel2pixel code runs * update ignite port of texturenet * resolved merge conflict * formatting change * Adds reproduction instructions to readme (#122) * Update main_build.yml for Azure Pipelines * BUILD: added build status badges (#6) * Adds dataloader for numpy datasets as well as demo pipeline for such a dataset (#7) * Finished version of numpy data loader * Working training script for demo * Adds the new metrics * Fixes docstrings and adds header * Removing extra setup.py * Log config file now experiment specific (#8) * Merging work on salt dataset * Adds computer vision to dependencies * Updates dependencies * Update * Updates the environment files * Updates readme and envs * Initial running version of dutchf3 * INFRA: added structure templates. * VOXEL: initial rough code push - need to clean up before PRing. * Working version * Working version before refactor * quick minor fixes in README * 3D SEG: first commit for PR. * 3D SEG: removed data files to avoid redistribution. * Updates * 3D SEG: restyled batch file, moving onto others.
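PR #7 above adds a dataloader for numpy datasets plus a demo pipeline. A hypothetical minimal sketch of such a loader (names and the non-overlapping slicing policy are illustrative, not the repo's actual implementation):

```python
import numpy as np

# Hypothetical sketch of a numpy-backed patch dataset: index a 2D seismic
# section and its label mask as a flat list of non-overlapping patches.
class NumpyPatchDataset:
    def __init__(self, volume, labels, patch_size):
        assert volume.shape == labels.shape
        self.volume, self.labels, self.ps = volume, labels, patch_size
        h, w = volume.shape
        # top-left corners of non-overlapping patches
        self.coords = [(i, j)
                       for i in range(0, h - patch_size + 1, patch_size)
                       for j in range(0, w - patch_size + 1, patch_size)]

    def __len__(self):
        return len(self.coords)

    def __getitem__(self, idx):
        i, j = self.coords[idx]
        window = (slice(i, i + self.ps), slice(j, j + self.ps))
        return self.volume[window], self.labels[window]

seismic = np.arange(16, dtype=np.float32).reshape(4, 4)
mask = (seismic > 7).astype(np.int64)
ds = NumpyPatchDataset(seismic, mask, patch_size=2)
print(len(ds))  # 4 non-overlapping 2x2 patches
```

A class with `__len__` and `__getitem__` like this plugs directly into a `torch.utils.data.DataLoader`, which is the usual way such a numpy dataset feeds a training pipeline.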
* Working HRNet * 3D SEG: finished going through Waldeland code * Updates test scripts and makes it take processing arguments * minor update * Fixing imports * Refactoring the experiments * Removing .vscode * Updates gitignore * added instructions for running f3dutch experiments, and fixed some issues in prepare_data.py script * minor wording fix * enabled splitting dataset into sections, rather than only patches * merged duplicate ifelse blocks * refactored prepare_data.py * added scripts for section train test * section train/test works for single channel input * Merged PR 174: F3 Dutch README, and fixed issues in prepare_data.py This PR includes the following changes: - added README instructions for running f3dutch experiments - prepare_dataset.py didn't work for creating section-based splits, so I fixed a few issues. There are no changes to the patch-based splitting logic. - ran black formatter on the file, which created all the formatting changes (sorry!) * Merged PR 204: Adds loaders to deepseismic from cv_lib * train and test script for section based training/testing * Merged PR 209: changes to section loaders in data.py Changes in this PR will affect patch scripts as well. The following are required changes in patch scripts: - get_train_loader() in train.py should be changed to get_patch_loader(). I created a separate function to load section and patch loaders. - SectionLoader now swaps H and W dims.
When loading test data in patch, this line can be removed (and tested) from test.py: h, w = img.shape[-2], img.shape[-1] # height and width * Merged PR 210: BENCHMARKS: added placeholder for benchmarks. * Merged PR 211: Fixes issues left over from changes to data.py * removing experiments from deep_seismic, following the new struct * Merged PR 220: Adds Horovod and fixes Add Horovod training script Updates dependencies in Horovod docker file Removes hard coding of path in data.py * section train/test scripts * Add cv_lib to repo and updates instructions * Removes data.py and updates readme * Updates requirements * Merged PR 222: Moves cv_lib into repo and updates setup instructions * renamed train/test scripts * train test works on alaudah section experiments, a few minor bugs left * cleaning up loaders * Merged PR 236: Cleaned up dutchf3 data loaders @<Mathew Salvaris> , @<Ilia Karmanov> , @<Max Kaznady> , please check out if this PR will affect your experiments. The main change is with the initialization of sections/patches attributes of loaders. Previously, we were unnecessarily assigning all train/val splits to train loaders, rather than only those belonging to the given split for that loader. Similar for test loaders. This will affect your code if you access these attributes. E.g.
if you have something like this in your experiments: ``` train_set = TrainPatchLoader(…) patches = train_set.patches[train_set.split] ``` or ``` train_set = TrainSectionLoader(…) sections = train_set.sections[train_set.split] ``` * training testing for sections works * minor changes * reverting changes on dutchf3/local/default.py file * added config file * Updates the repo with preliminary results for 2D segmentation * Merged PR 248: Experiment: section-based Alaudah training/testing This PR includes the section-based experiments on dutchf3 to replicate Alaudah's work. No changes were introduced to the code outside this experiment. * Merged PR 253: Waldeland based voxel loaders and TextureNet model Related work items: #16357 * Merged PR 290: A demo notebook on local train/eval on F3 data set Notebook and associated files + minor change in a patch_deconvnet_skip.py model file. Related work items: #17432 * Merged PR 312: moved dutchf3_section to experiments/interpretation Related work items: #17683 * Merged PR 309: minor change to README to reflect the changes in prepare_data script Related work items: #17681 * Merged PR 315: Removing voxel exp Related work items: #17702 * sync with new experiment structure * added a logging handler for array metrics * first draft of metrics based on the ignite confusion matrix * metrics now based on ignite.metrics * modified patch train.py with new metrics * Merged PR 361: VOXEL: fixes to original voxel2pixel code to make it work with the rest of the repo.
Realized there was one bug in the code, and the rest of the functions did not work with the different versions of libraries which we have listed in the conda yaml file. Also updated the download script. Related work items: #18264 * modified metrics with ignore_index * Merged PR 405: minor mods to notebook, more documentation A very small PR - just a few more lines of documentation in the notebook, to improve clarity. Related work items: #17432 * Merged PR 368: Adds penobscot Adds for penobscot - Dataset reader - Training script - Testing script - Section depth augmentation - Patch depth augmentation - Inline visualisation for Tensorboard Related work items: #14560, #17697, #17699, #17700 * Merged PR 407: Azure ML SDK Version: 1.0.65; running devito in AzureML Estimators Related work items: #16362 * Merged PR 452: decouple docker image creation from azureml removed all azureml dependencies from 010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb All other changes are due to trivial reruns Related work items: #18346 * Merged PR 512: Pre-commit hooks for formatting and style checking Opening this PR to start the discussion - I added the required dotenv files and instructions for setting up pre-commit hooks for formatting and style checking. For formatting we are using black, and for style checking flake8. The following files are added: - .pre-commit-config.yaml - defines git hooks to be installed - .flake8 - settings for flake8 linter - pyproject.toml - settings for black formatter The last two files define the formatting and linting style we want to enforce on the repo. All of us would set up the pre-commit hooks locally, so regardless of what formatting/linting settings we have in our local editors, the settings specified by the git hooks would still be enforced prior to the commit, to ensure consistency among contributors.
Some questions to start the discussion: - Do you want to change any of the default settings in the dotenv files - like the line lengths, error messages we exclude or include, or anything like that? - Do we want to have a requirements-dev.txt file for contributors? This setup uses the pre-commit package; I didn't include it in the environment.yaml file, but instead instructed the user to install it in the CONTRIBUTING.MD file. - Once you have the hooks installed, they will only affect the files you are committing in the future. A big chunk of our codebase does not conform to the formatting/style settings. We will have to run the hooks on the codebase retrospectively. I'm happy to do that, but it will create many changes and a significant looking PR :) Any thoughts on how we should approach this? Thanks! Related work items: #18350 * Merged PR 513: 3D training script for Waldeland's model with Ignite Related work items: #16356 * Merged PR 565: Demo notebook updated with 3D graph Changes: 1) Updated demo notebook with the 3D visualization 2) Formatting changes due to new black/flake8 git hook Related work items: #17432 * Merged PR 341: Tests for cv_lib/metrics This PR is dependent on the tests created in the previous branch !333. That's why the PR is to merge tests into the vapaunic/metrics branch (so the changed files below only include the diff between these two branches). However, I can change this once vapaunic/metrics is merged. I created these tests under cv_lib/ since metrics are a part of that library. I imagine we will have tests under deepseismic_interpretation/, and the top level /tests for integration testing. Let me know if you have any comments on this test, or the structure. As agreed, I'm using pytest. Related work items: #16955 * merged tests into this branch * Merged PR 569: Minor PR: change to pre-commit configuration files Related work items: #18350 * Merged PR 586: Purging unused files and experiments Related work items: #20499 * moved prepare data under scripts * removed untested model configs * fixed weird bug in penobscot data loader * penobscot experiments working for hrnet, seresnet, no depth and patch depth * removed a section loader bug in the penobscot loader * fixed bugs in my previous 'fix' * removed redundant _open_mask from subclasses * Merged PR 601: Fixes to penobscot experiments A few changes: - Instructions in README on how to download and process Penobscot and F3 2D data sets - moved prepare_data scripts to the scripts/ directory - fixed a weird issue with a class method in Penobscot data loader - fixed a bug in section loader (_add_extra_channel in section loader was not necessary and was causing an issue) - removed config files that were not tested or
working in Penobscot experiments - modified default.py so it works if train.py is run without a config file Related work items: #20694 * Merged PR 605: added common metrics to Waldeland model in Ignite Related work items: #19550 * Removed redundant extract_metric_from * formatting changes in metrics * modified penobscot experiment to use new local metrics * modified section experiment to pass device to metrics * moved metrics out of dutchf3, modified distributed to work with the new metrics * fixed other experiments after new metrics * removed apex metrics from distributed train.py * added ignite-based metrics to dutch voxel experiment * removed apex metrics * modified penobscot test script to use new metrics * pytorch-ignite pre-release with new metrics until stable available * removed cell output from the F3 notebook * deleted .vscode * modified metric import in test_metrics.py * separated metrics out as a module * relative logger file path, modified section experiment * removed the REPO_PATH from init * created util logging function, and moved logging file to each experiment * modified demo experiment * modified penobscot experiment * modified dutchf3_voxel experiment * no logging in voxel2pixel * modified dutchf3 patch local experiment * modified patch distributed experiment * modified interpretation notebook * minor changes to comments * Updates notebook to use itkwidgets for interactive visualisation * Further updates * Fixes merge conflicts * removing files * Adding reproduction experiment instructions to readme * checking in
ablation study from ilkarman (#124): tests pass, but final results aren't communicated to GitHub; there is no way to trigger another build other than pushing a dummy commit
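On the "dummy commit" workaround above: git can produce an empty commit, which re-triggers push-based CI without editing any file. Demonstrated here in a throwaway repository (the commit message is illustrative):

```shell
# demo in a throwaway repo: an empty commit advances history without touching files
cd "$(mktemp -d)"
git init -q .
git -c user.name=ci -c user.email=ci@example.com \
    commit -q --allow-empty -m "Trigger CI rebuild"
git rev-list --count HEAD   # one commit, no file changes
```

In a real checkout, `git commit --allow-empty -m "Trigger CI rebuild" && git push` does the same job as the dummy-change workaround without polluting the tree.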
This commit is contained in:
Parent 341bb01b3b
Commit b75e6476c9
steps/setup_step.yml
@@ -0,0 +1,27 @@
parameters:
  storagename: #
  storagekey: #
  conda: seismic-interpretation

steps:

- bash: |
    echo "##vso[task.prependpath]$CONDA/bin"

- bash: |
    echo "Running setup..."

    # make sure we have the latest and greatest
    conda env create -f environment/anaconda/local/environment.yml python=3.6 --force
    conda init bash
    source activate ${{parameters.conda}}
    pip install -e interpretation
    pip install -e cv_lib
    # add this if pytorch stops detecting GPU
    # conda install pytorch torchvision cudatoolkit=9.2 -c pytorch

    # copy your model files like so - using dummy file to illustrate
    azcopy --quiet --source:https://${{parameters.storagename}}.blob.core.windows.net/models/model --source-key ${{parameters.storagekey}} --destination ./models/your_model_name
  displayName: Setup
  failOnStderr: True
steps/unit_test_steps.yml
@@ -0,0 +1,18 @@
parameters:
  conda: seismic-interpretation

steps:
- bash: |
    echo "Starting unit tests"
    source activate ${{parameters.conda}}
    pytest --durations=0 --junitxml 'reports/test-unit.xml' cv_lib/tests/
    echo "Unit test job passed"
  displayName: Unit Tests Job
  failOnStderr: True

- task: PublishTestResults@2
  displayName: 'Publish Test Results **/test-*.xml'
  inputs:
    testResultsFiles: '**/test-*.xml'
    failTaskOnFailedTests: true
  condition: succeededOrFailed()
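The unit-test step above runs pytest over `cv_lib/tests/` and exports a JUnit XML report for the publish task. For reference, a minimal pytest-style test module of the shape that step collects (the metric and tests are hypothetical, not actual cv_lib code):

```python
# hypothetical metric plus pytest-style tests; pytest discovers test_* functions
def pixelwise_accuracy(pred, target):
    """Fraction of positions where the prediction equals the target label."""
    assert len(pred) == len(target)
    correct = sum(p == t for p, t in zip(pred, target))
    return correct / len(pred)

def test_all_correct():
    assert pixelwise_accuracy([1, 2, 3], [1, 2, 3]) == 1.0

def test_half_correct():
    assert pixelwise_accuracy([1, 2], [1, 0]) == 0.5
```

Running `pytest --junitxml reports/test-unit.xml` on such a module produces the `**/test-*.xml` file the `PublishTestResults@2` task picks up.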
main_build.yml
@@ -0,0 +1,28 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.

# Pull requests against these branches will trigger this build
pr:
- master
- staging

# Any commit to these branches will trigger the build.
trigger:
- master
- staging

jobs:
# partially disable setup for now - done manually on build VM
- job: DeepSeismic
  displayName: Deep Seismic Main Build
  pool:
    name: $(AgentName)

  steps:
  - template: steps/setup_step.yml
    parameters:
      storagename: $(storageaccoutname)
      storagekey: $(storagekey)

  - template: steps/unit_test_steps.yml
.flake8
@@ -0,0 +1,17 @@
[flake8]
max-line-length = 120
max-complexity = 18
select = B,C,E,F,W,T4,B9
ignore =
    # slice notation whitespace, invalid
    E203
    # too many leading '#' for block comment
    E266
    # module level import not at top of file
    E402
    # line break before binary operator
    W503
    # blank line contains whitespace
    W293
    # line too long
    E501
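Several of the ignored codes above exist to keep flake8 compatible with black's output. E203 is the clearest case: black may leave a space before the colon when a slice bound is a complex expression, which flake8 would otherwise flag. An illustrative snippet:

```python
# flake8 reports E203 ("whitespace before ':'") on this slice, even though the
# spacing is the style black produces when a slice bound is an expression
x = [0, 1, 2, 3]
assert x[1 + 1 :] == [2, 3]
```

Ignoring E203 (and W503, which conflicts with black's preferred line-break placement) lets the formatter and the linter agree.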
.gitignore
@@ -89,6 +89,24 @@ venv/
ENV/
env.bak/
venv.bak/
wheels/


.dev_env
.azureml

# Logs
*.tfevents.*
**/runs
**/log
**/output

#
interpretation/environment/anaconda/local/src/*
interpretation/environment/anaconda/local/src/cv-lib
.code-workspace.code-workspace
**/.vscode
**/.idea

# Spyder project settings
.spyderproject
@@ -97,8 +115,4 @@ venv.bak/
# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
*.pth
.pre-commit-config.yaml
@@ -0,0 +1,17 @@
repos:
- repo: https://github.com/psf/black
  rev: stable
  hooks:
  - id: black
- repo: https://github.com/pre-commit/pre-commit-hooks
  rev: v1.2.3
  hooks:
  - id: flake8
- repo: local
  hooks:
  - id: jupytext
    name: jupytext
    entry: jupytext --from ipynb --pipe black --check flake8
    pass_filenames: true
    files: .ipynb
    language: python
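One caveat on the black hook above: `rev: stable` resolves to wherever that tag points when the hooks are installed, so results can drift between machines. Pinning an explicit release keeps hook runs reproducible; a sketch (the tag below is illustrative, use whichever version the team standardizes on):

```yaml
repos:
- repo: https://github.com/psf/black
  rev: 19.10b0   # hypothetical pinned tag instead of the floating "stable"
  hooks:
  - id: black
```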
.vscode/settings.json (deleted)
@@ -1,6 +0,0 @@
{
    "python.formatting.provider": "black",
    "python.linting.enabled": true,
    "python.linting.flake8Enabled": true,
    "python.linting.pylintEnabled": false,
}
@@ -0,0 +1,32 @@
Contributor
============

All names are sorted alphabetically by last name.
Contributors, please add your name to the list when you submit a patch to the project.


Contributors (sorted alphabetically)
-------------------------------------
To contributors: please add your name to the list when you submit a patch to the project.

* Ashish Bhatia
* Daniel Ciborowski
* George Iordanescu
* Ilia Karmanov
* Max Kaznady
* Vanja Paunic
* Mathew Salvaris


## How to be a contributor to the repository
This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
CONTRIBUTING.md
@@ -0,0 +1,88 @@
# Contribution Guidelines

Contributions are welcome! Here are a few things to know:

* [Steps to Contributing](#steps-to-contributing)
* [Coding Guidelines](#coding-guidelines)
* [Microsoft Contributor License Agreement](#microsoft-contributor-license-agreement)
* [Code of Conduct](#code-of-conduct)

## Steps to Contributing

**TL;DR for contributing: We use the staging branch to land all new features and fixes. To make a contribution, please create a branch from staging, make a modification in the code and create a PR to staging.**

Here are the basic steps to get started with your first contribution. Please reach out with any questions.
1. Use [open issues](https://github.com/Microsoft/DeepSeismic/issues) to discuss the proposed changes. Create an issue describing changes if necessary to collect feedback. Also, please use provided labels to tag issues so everyone can easily sort issues of interest.
2. [Fork the repo](https://help.github.com/articles/fork-a-repo/) so you can make and test local changes.
3. Create a new branch **from the staging branch** for the issue (please do not create a branch from master). We suggest prefixing the branch with your username and then a descriptive title (e.g. username/update_contributing_docs).
4. Create a test that replicates the issue.
5. Make code changes.
6. Ensure unit tests pass and code style / formatting is consistent. TODO: add docstring links.
7. Create a pull request against the **staging** branch.

Once the features included in a [milestone](https://github.com/Microsoft/DeepSeismic/milestones) are completed, we will merge contrib into staging. TODO: make a wiki with coding guidelines.

## Coding Guidelines

We strive to maintain high-quality code to make the utilities in the repository easy to understand, use, and extend. We also work hard to maintain a friendly and constructive environment. We've found that having clear expectations on the development process and consistent style helps to ensure everyone can contribute and collaborate effectively.

### Code formatting and style checking
We use git hooks to automate the process of formatting and style-checking the code. In particular, we use `black` as a code formatter, `flake8` for style checking, and the `pre-commit` Python framework, which ensures that both the code formatter and the style checker are run on the code during commit. If they execute with no issues, the commit is made; otherwise, the commit is denied until the stylistic or formatting changes are made.

Please follow these instructions to set up `pre-commit` in your environment.

```
pip install pre-commit
pre-commit install
```

The above will install the pre-commit package, and install the git hooks specified in `.pre-commit-config.yaml` into your `.git/` directory.

## Microsoft Contributor License Agreement

Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

TODO: add CLA-bot

## Code of Conduct

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).

For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

Apart from the official Code of Conduct developed by Microsoft, the Computer Vision team adopts the following behaviors, to ensure a great working environment:

#### Do not point fingers
Let's be constructive.

<details>
<summary><em>Click here to see some examples</em></summary>

"This method is missing docstrings" instead of "YOU forgot to put docstrings".

</details>

#### Provide code feedback based on evidence

When making code reviews, try to support your ideas based on evidence (papers, library documentation, Stack Overflow, etc.) rather than your personal preferences.

<details>
<summary><em>Click here to see some examples</em></summary>

"When reviewing this code, I saw that the Python implementation of the metrics is based on classes; however, [scikit-learn](https://scikit-learn.org/stable/modules/classes.html#sklearn-metrics-metrics) and [tensorflow](https://www.tensorflow.org/api_docs/python/tf/metrics) use functions. We should follow the standard in the industry."

</details>

#### Ask questions - do not give answers
Try to be empathetic.

<details>
<summary><em>Click here to see some examples</em></summary>

* Would it make more sense if ...?
* Have you considered this ... ?

</details>
Binary file not shown. (Image added - width/height omitted, size: 151 KiB)

LICENSE (43 lines)
@@ -1,21 +1,22 @@
MIT License

Copyright (c) Microsoft Corporation.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE

becomes:

MIT License

Copyright (c) Microsoft Corporation. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE
Diff between files not shown because of its large size. Load diff to view.

README.md (471 lines)
@@ -1,69 +1,402 @@
---
page_type: sample
languages:
- csharp
products:
- dotnet
description: "Add 150 character max description"
urlFragment: "update-this-to-unique-url-stub"
---

# DeepSeismic

![Build Status](https://dev.azure.com/best-practices/deepseismic/_apis/build/status/microsoft.DeepSeismic?branchName=master)
[![Build Status](https://dev.azure.com/best-practices/deepseismic/_apis/build/status/microsoft.DeepSeismic?branchName=master)](https://dev.azure.com/best-practices/deepseismic/_build/latest?definitionId=108&branchName=master)

# Official Microsoft Sample

<!--
Guidelines on README format: https://review.docs.microsoft.com/help/onboard/admin/samples/concepts/readme-template?branch=master

Guidance on onboarding samples to docs.microsoft.com/samples: https://review.docs.microsoft.com/help/onboard/admin/samples/process/onboarding?branch=master

Taxonomies for products and languages: https://review.docs.microsoft.com/new-hope/information-architecture/metadata/taxonomies?branch=master
-->

Give a short description for your sample here. What does it do and why is it important?

## Contents

Outline the file contents of the repository. It helps users navigate the codebase, build configuration and any related assets.

| File/folder       | Description                                |
|-------------------|--------------------------------------------|
| `src`             | Sample source code.                        |
| `.gitignore`      | Define what to ignore at commit time.      |
| `CHANGELOG.md`    | List of changes to the sample.             |
| `CONTRIBUTING.md` | Guidelines for contributing to the sample. |
| `README.md`       | This README file.                          |
| `LICENSE`         | The license for the sample.                |

## Prerequisites

Outline the required components and tools that a user might need to have on their machine in order to run the sample. This can be anything from frameworks, SDKs, OS versions or IDE releases.

## Setup

Explain how to prepare the sample once the user clones or downloads the repository. The section should outline every step necessary to install dependencies and set up any settings (for example, API keys and output folders).

## Running the sample

Outline step-by-step instructions to execute the sample and see its output. Include steps for executing the sample from the IDE, starting specific services in the Azure portal or anything related to the overall launch of the code.

## Key concepts

Provide users with more context on the tools and services used in the sample. Explain some of the code that is being used and how services interact with each other.

## Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us
the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide
a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions
provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or
contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
# DeepSeismic

![DeepSeismic](./assets/DeepSeismicLogo.jpg)

This repository shows you how to perform seismic imaging and interpretation on Azure. It empowers geophysicists and data scientists to run seismic experiments using state-of-the-art DSL-based PDE solvers and segmentation algorithms on Azure.

The repository provides sample notebooks, data loaders for seismic data, utilities, and out-of-the-box ML pipelines, organized as follows:
- **sample notebooks**: these can be found in the `examples` folder - they are standard Jupyter notebooks which highlight how to use the codebase by walking the user through a set of pre-made examples
- **experiments**: the goal is to provide runnable Python scripts which train and test (score) our machine learning models in the `experiments` folder. The models themselves are swappable, meaning a single train script can be used to run a different model on the same dataset by simply swapping out the configuration file which defines the model. Experiments are organized by model type and dataset - for example, "2D segmentation on the Dutch F3 dataset", "2D segmentation on the Penobscot dataset" and "3D segmentation on the Penobscot dataset" are all different experiments. As another example, to swap 2D segmentation models on the Dutch F3 dataset, one would just point the train and test scripts to a different configuration file within the same experiment.
- **pip-installable utilities**: we provide the `cv_lib` and `deepseismic_interpretation` utilities (more info below), which are used by both the sample notebooks and the experiments mentioned above

DeepSeismic currently focuses on Seismic Interpretation (3D segmentation, aka facies classification), with experimental code provided around Seismic Imaging.

### Quick Start

There are two ways to get started with the DeepSeismic codebase, which currently focuses on Interpretation:
- if you'd like to get an idea of how our interpretation (segmentation) models are used, simply review the [HRNet demo notebook](https://github.com/microsoft/DeepSeismic/blob/master/examples/interpretation/notebooks/HRNet_Penobscot_demo_notebook.ipynb)
- to actually run the code, you'll need to set up a compute environment (which includes setting up a GPU-enabled Linux VM and installing the appropriate Anaconda Python packages) and download the datasets which you'd like to work with - detailed steps for doing this are provided in the `Interpretation` section below.

If you run into any problems, chances are your problem has already been solved in the [Troubleshooting](#troubleshooting) section.

### Pre-run notebooks

Notebooks stored in the repository have their output intentionally cleared - you can find full auto-generated versions of the notebooks here:
- **HRNet Penobscot demo**: [[HTML](https://deepseismicstore.blob.core.windows.net/shared/HRNet_Penobscot_demo_notebook.html)] [[.ipynb](https://deepseismicstore.blob.core.windows.net/shared/HRNet_Penobscot_demo_notebook.ipynb)]
- **Dutch F3 dataset**: [[HTML](https://deepseismicstore.blob.core.windows.net/shared/F3_block_training_and_evaluation_local.html)] [[.ipynb](https://deepseismicstore.blob.core.windows.net/shared/F3_block_training_and_evaluation_local.ipynb)]

### Azure Machine Learning

[Azure Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/) enables you to train and deploy your machine learning models and pipelines at scale, and leverage open-source Python frameworks such as PyTorch, TensorFlow, and scikit-learn. To get started using the code in this repository with Azure Machine Learning, refer to the [Azure Machine Learning How-to](https://github.com/Azure/MachineLearningNotebooks/tree/master/how-to-use-azureml).

## Interpretation

For seismic interpretation, the repository provides extensible machine learning pipelines that show how you can leverage state-of-the-art segmentation algorithms (UNet, SEResNet, HRNet) for seismic interpretation, along with benchmark results from running these algorithms on various seismic datasets (Dutch F3 and Penobscot).

To run the examples available in the repo, please follow the instructions below to:
1) [Set up the environment](#setting-up-environment)
2) [Download the data sets](#dataset-download-and-preparation)
3) [Run example notebooks and scripts](#run-examples)

### Setting up Environment

Follow the instructions below to read about compute requirements and install the required libraries.

#### Compute environment

We recommend using a virtual machine to run the example notebooks and scripts. Specifically, you will need a GPU-powered Linux machine, as this repository is developed and tested on __Linux only__. The easiest way to get started is to use the [Azure Data Science Virtual Machine (DSVM) for Linux (Ubuntu)](https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/dsvm-ubuntu-intro). This VM comes with all the system requirements needed to create the conda environment described below and then run the notebooks in this repository.

For this repo, we recommend selecting a multi-GPU Ubuntu VM of type [Standard_NC12](https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sizes-gpu#nc-series). The machine is powered by NVIDIA Tesla K80 GPUs (or V100 GPUs for the NCv2 series), which can be found in most Azure regions.

> NOTE: For users new to Azure, your subscription may not come with a quota for GPUs. You may need to go into the Azure portal to increase your quota for GPU VMs. Learn more about how to do this here: https://docs.microsoft.com/en-us/azure/azure-subscription-service-limits.

#### Package Installation

To install the packages contained in this repository, navigate to the directory where you cloned the DeepSeismic repo and run:
```bash
conda env create -f environment/anaconda/local/environment.yml
```
This will create the conda environment needed to run the experiments.

Next, you will need to install the common package for interpretation:
```bash
conda activate seismic-interpretation
pip install -e interpretation
```

Then you will also need to install `cv_lib`, which contains computer vision related utilities:
```bash
pip install -e cv_lib
```

Both packages are installed in developer mode with the `-e` flag. This means that to update them you can simply go to the corresponding folder and pull the appropriate commit or branch.

During development, in case you need to update the environment due to a conda env file change, you can run
```bash
conda env update --file environment/anaconda/local/environment.yml
```
from the root of the DeepSeismic repo.

### Dataset download and preparation

This repository provides examples of how to run seismic interpretation on two publicly available annotated seismic datasets: [Penobscot](https://zenodo.org/record/1341774) and [F3 Netherlands](https://github.com/olivesgatech/facies_classification_benchmark). Their respective sizes (uncompressed on disk after downloading and pre-processing) are:
- **Penobscot**: 7.9 GB
- **Dutch F3**: 2.2 GB

Please make sure you have enough disk space to download either dataset.

We have experiments and notebooks which use one dataset or the other. Depending on which experiment or notebook you want to run, you'll need to download the corresponding dataset. We suggest you start by looking at the [HRNet demo notebook](https://github.com/microsoft/DeepSeismic/blob/master/examples/interpretation/notebooks/HRNet_Penobscot_demo_notebook.ipynb), which requires the Penobscot dataset.

#### Penobscot

To download the Penobscot dataset, run the [download_penobscot.sh](scripts/download_penobscot.sh) script, e.g.

```bash
data_dir="$HOME/data/penobscot"
mkdir -p "$data_dir"
./scripts/download_penobscot.sh "$data_dir"
```

Note that the specified download location should have the appropriate `write` permissions. On some Linux virtual machines you may want to place the data into the `/mnt` or `/data` folder, in which case you have to make sure you have write access there.

To keep things simple, we suggest using your home directory, although you might run out of space there. If this happens on an [Azure Data Science Virtual Machine](https://azure.microsoft.com/en-us/services/virtual-machines/data-science-virtual-machines/) you can resize the disk quite easily from the [Azure Portal](https://portal.azure.com) - please see the [Troubleshooting](#troubleshooting) section at the end of this README for [how to do this](#how-to-resize-data-science-virtual-machine-disk).

To prepare the data for the experiments (e.g. split into train/val/test), please run the following script (modifying arguments as desired):

```bash
python scripts/prepare_penobscot.py split_inline --data-dir="$HOME/data/penobscot" --val-ratio=.1 --test-ratio=.2
```

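The actual splitting logic lives in `scripts/prepare_penobscot.py`; as a rough sketch of the `split_inline` idea (the function below is illustrative only, not the repo's implementation), whole inlines are assigned to train/val/test by ratio, so that patches from the same inline never leak across splits:

```python
# Illustrative sketch only - see scripts/prepare_penobscot.py for the real logic.
# Whole inlines are assigned to a single split, so patches from one inline
# never appear in both train and test.
def split_inline(inline_ids, val_ratio=0.1, test_ratio=0.2):
    """Partition a list of inline ids into train/val/test by ratio."""
    n = len(inline_ids)
    n_test = int(n * test_ratio)
    n_val = int(n * val_ratio)
    test = inline_ids[:n_test]
    val = inline_ids[n_test:n_test + n_val]
    train = inline_ids[n_test + n_val:]
    return train, val, test
```
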
#### F3 Netherlands

To download the F3 Netherlands dataset for 2D experiments, please follow the data download instructions at [this GitHub repository](https://github.com/yalaudah/facies_classification_benchmark) (section Dataset).

Once you've downloaded the data set, make sure to create an empty `splits` directory under the downloaded `data` directory; you can re-use the same data directory as the one created earlier for the Penobscot dataset. This is where your training/validation/test splits will be saved.

```bash
cd data
mkdir splits
```

At this point, your `data` directory tree should look like this:

```
data
├── splits
├── test_once
│   ├── test1_labels.npy
│   ├── test1_seismic.npy
│   ├── test2_labels.npy
│   └── test2_seismic.npy
└── train
    ├── train_labels.npy
    └── train_seismic.npy
```

To prepare the data for the experiments (e.g. split into train/val/test), please run the following script:

```bash
# For section-based experiments
python scripts/prepare_dutchf3.py split_train_val section --data-dir=/mnt/dutchf3

# For patch-based experiments
python scripts/prepare_dutchf3.py split_train_val patch --data-dir=/mnt/dutchf3 --stride=50 --patch=100
```

Refer to the script itself for more argument options.

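If you want to double-check that the directory tree above is in place before running the prepare script, a small helper like the one below can do it (this is our own illustrative sketch, not part of the repo):

```python
# Small helper (illustrative, not part of the repo) that checks whether a
# Dutch F3 data directory matches the tree shown above.
from pathlib import Path

EXPECTED_FILES = [
    "test_once/test1_labels.npy",
    "test_once/test1_seismic.npy",
    "test_once/test2_labels.npy",
    "test_once/test2_seismic.npy",
    "train/train_labels.npy",
    "train/train_seismic.npy",
]

def missing_files(data_dir):
    """Return the expected entries that are absent under data_dir."""
    root = Path(data_dir)
    missing = [f for f in EXPECTED_FILES if not (root / f).is_file()]
    # the splits directory must exist (it may be empty before preparation)
    if not (root / "splits").is_dir():
        missing.append("splits/")
    return missing
```
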
||||
### Run Examples

#### Notebooks

We provide example notebooks under `examples/interpretation/notebooks/` to demonstrate how to train seismic interpretation models and evaluate them on the Penobscot and F3 datasets.

Make sure to run the notebooks in the conda environment we previously set up (`seismic-interpretation`). To register the conda environment in Jupyter, please run:

```bash
python -m ipykernel install --user --name seismic-interpretation
```

#### Experiments

We also provide scripts for a number of experiments we conducted using different segmentation approaches. These experiments are available under `experiments/interpretation` and can be used as examples. Within each experiment, start from the `train.sh` and `test.sh` scripts under the `local/` (single GPU) and `distributed/` (multiple GPUs) directories, which invoke the corresponding Python scripts, `train.py` and `test.py`. Take a look at the experiment configurations (see the Configuration Files section below) for experiment options and modify them if necessary.

Please refer to individual experiment README files for more information.
- [Penobscot](experiments/interpretation/penobscot/README.md)
- [F3 Netherlands Patch](experiments/interpretation/dutchf3_patch/README.md)
- [F3 Netherlands Section](experiments/interpretation/dutchf3_section/README.md)

#### Configuration Files

We use the [YACS](https://github.com/rbgirshick/yacs) configuration library to manage configuration options for the experiments. There are three ways to pass arguments to the experiment scripts (e.g. `train.py` or `test.py`):

- __default.py__ - A project config file `default.py` is a one-stop reference point for all configurable options, and provides sensible defaults for all arguments. If no arguments are passed to the `train.py` or `test.py` script (e.g. `python train.py`), the arguments are by default loaded from `default.py`. Please take a look at `default.py` to familiarize yourself with the experiment arguments used by the script you run.

- __yml config files__ - YAML configuration files under `configs/` are typically created one per experiment. These are meant for repeatable experiment runs and reproducible settings. Each configuration file only overrides the options that are changing in that experiment (e.g. options loaded from `default.py` during an experiment run will be overridden by arguments loaded from the yml file). As an example, to use a yml configuration file with the training script, run:

```bash
python train.py --cfg "configs/hrnet.yaml"
```

- __command line__ - Finally, options can be passed in through the `options` argument, and these will override arguments loaded from the configuration file. We created CLIs for all our scripts (using the Python Fire library), so you can pass these options via command-line arguments, like so:

```bash
python train.py DATASET.ROOT "/mnt/dutchf3" TRAIN.END_EPOCH 10
```

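The three mechanisms compose in increasing order of priority (defaults, then the yml file, then command-line options). YACS handles this internally; the plain-Python sketch below only illustrates the precedence, and the keys and values in it are made up for the example:

```python
# Illustrative sketch of the override precedence YACS applies:
# defaults < yml config file < command-line options.
def resolve_config(defaults, yml_overrides, cli_overrides):
    """Merge configuration dictionaries in increasing order of priority."""
    cfg = dict(defaults)
    cfg.update(yml_overrides)
    cfg.update(cli_overrides)
    return cfg

defaults = {"DATASET.ROOT": "/data", "TRAIN.END_EPOCH": 300, "TRAIN.MAX_LR": 0.01}
yml = {"TRAIN.END_EPOCH": 484}                            # e.g. from a configs/*.yaml
cli = {"DATASET.ROOT": "/mnt/dutchf3", "TRAIN.END_EPOCH": 10}

cfg = resolve_config(defaults, yml, cli)
# command-line options win over the yml file, which wins over defaults
```
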
||||
### Pretrained Models

#### HRNet

To achieve the same results as the benchmarks above, you will need to download the HRNet model [pretrained](https://github.com/HRNet/HRNet-Image-Classification) on ImageNet. We are specifically using the [HRNet-W48-C](https://1drv.ms/u/s!Aus8VCZ_C_33dKvqI6pBZlifgJk) pre-trained model; other HRNet variants are also available [here](https://github.com/HRNet/HRNet-Image-Classification) - you can navigate to those from the [main HRNet landing page](https://github.com/HRNet/HRNet-Object-Detection) for object detection.

Unfortunately, the OneDrive location which hosts the model uses a temporary authentication token, so there is no way for us to script the model download. There are two ways to get the pre-trained HRNet model onto the DSVM:
- download the model to your local drive using a web browser of your choice and then upload it to the DSVM using something like `scp`; navigate to the Azure Portal and copy the DSVM's public IP from the Overview panel of your DSVM (you can search for your DSVM by name in the Portal's search bar), then run `scp local_model_location username@DS_VM_public_IP:./model/save/path` to upload
- alternatively, you can use the same public IP to open a remote desktop session over SSH to your Linux VM using [X2Go](https://wiki.x2go.org/doku.php/download:start): this way you can open a web browser on the VM itself and download the model to the VM's disk

||||
### Viewers (optional)

For seismic interpretation (segmentation), if you want to visualize cross-sections of a 3D volume (both the input velocity model and the segmented output), you can use [segyviewer](https://github.com/equinor/segyviewer). To install and use segyviewer, please follow the instructions below.

#### segyviewer

To install [segyviewer](https://github.com/equinor/segyviewer), run:
```bash
conda create -n segyviewer python=2.7
conda activate segyviewer
conda install -c anaconda pyqt=4.11.4
pip install segyviewer
```

To visualize cross-sections of a 3D volume, you can run [segyviewer](https://github.com/equinor/segyviewer) like so:
```bash
segyviewer "${HOME}/data/dutchf3/data.segy"
```

||||
### Benchmarks

#### Dense Labels

This section contains benchmarks of different algorithms for seismic interpretation on 3D seismic datasets with densely-annotated data.

Below are the results from the models contained in this repo. To run them, check the instructions in the [benchmarks](benchmarks) folder. Alternatively, take a look at [examples](examples) for how to run them on your own dataset.

||||
#### Netherlands F3

| Source         | Experiment                  | PA    | FW IoU | MCA   |
|----------------|-----------------------------|-------|--------|-------|
| Alaudah et al. | Section-based               | 0.905 | 0.817  | 0.832 |
|                | Patch-based                 | 0.852 | 0.743  | 0.689 |
| DeepSeismic    | Patch-based+fixed           | 0.869 | 0.761  | 0.775 |
|                | SEResNet UNet+section depth | 0.917 | 0.849  | 0.834 |
|                | HRNet(patch)+patch_depth    | 0.908 | 0.843  | 0.837 |
|                | HRNet(patch)+section_depth  | 0.928 | 0.871  | 0.871 |

||||
#### Penobscot

Trained and tested on the full dataset. Inlines with artefacts were left in for training, validation and testing.
The dataset was split 70% training, 10% validation and 20% test. The results below are from the test set.

| Source      | Experiment                    | PA  | IoU  | MCA  |
|-------------|-------------------------------|-----|------|------|
| DeepSeismic | SEResNet UNet + section depth | 1.0 | 0.98 | 0.99 |
|             | HRNet(patch) + section depth  | 1.0 | 0.97 | 0.98 |

![Best Penobscot SEResNet](assets/penobscot_seresnet_best.png "Best performing inlines, Mask and Predictions from SEResNet")
![Worst Penobscot SEResNet](assets/penobscot_seresnet_worst.png "Worst performing inlines, Mask and Predictions from SEResNet")

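For reference, the metrics reported in the tables above - pixel accuracy (PA), mean class accuracy (MCA), and frequency-weighted IoU (FW IoU) - can all be computed from a per-class confusion matrix. The following is a standard-library sketch of the usual definitions (the repo computes these with its own utilities; this is only to clarify what the columns mean):

```python
# C[i][j] counts pixels of true class i predicted as class j.
def pixel_accuracy(C):
    """PA: fraction of all pixels that are classified correctly."""
    total = sum(sum(row) for row in C)
    return sum(C[i][i] for i in range(len(C))) / total

def mean_class_accuracy(C):
    """MCA: per-class accuracy (recall), averaged over classes present."""
    accs = [row[i] / sum(row) for i, row in enumerate(C) if sum(row) > 0]
    return sum(accs) / len(accs)

def freq_weighted_iou(C):
    """FW IoU: per-class IoU weighted by each class's pixel frequency."""
    n = len(C)
    total = sum(sum(row) for row in C)
    fwiou = 0.0
    for i in range(n):
        gt = sum(C[i])                         # pixels of true class i
        pred = sum(C[j][i] for j in range(n))  # pixels predicted as class i
        union = gt + pred - C[i][i]
        if union > 0:
            fwiou += (gt / total) * (C[i][i] / union)
    return fwiou
```
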
||||
#### Reproduce benchmarks

In order to reproduce the benchmarks, you will need to navigate to the [experiments](experiments) folder. In there, each of the experiments is split into different folders. To run the Netherlands F3 experiment, navigate to the [dutchf3_patch/local](experiments/dutchf3_patch/local) folder. In there is a training script ([train.sh](experiments/dutchf3_patch/local/train.sh)) which will run the training for any configuration you pass in. Once you have run the training, you will need to run the [test.sh](experiments/dutchf3_patch/local/test.sh) script. Make sure you specify the path to the best performing model from your training run, either by passing it in as an argument or by altering the YACS config file.

To reproduce the benchmarks for the Penobscot dataset, follow the same instructions but navigate to the [penobscot](penobscot) folder.

#### Scripts
- [parallel_training.sh](scripts/parallel_training.sh): Script to launch multiple jobs in parallel. Used mainly for local hyperparameter tuning. Look at the script for further instructions.

- [kill_windows.sh](scripts/kill_windows.sh): Script to kill multiple tmux windows. Used to kill jobs that parallel_training.sh might have started.

## Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

### Submitting a Pull Request

We try to keep the repo in a clean state, which means that we only enable read access to the repo - read access still enables one to submit a PR or an issue. To do so, fork the repo and submit a PR from a branch in your forked repo into our staging branch.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.

## Build Status

| Build | Branch | Status |
| --- | --- | --- |
| **Legal Compliance** | staging | [![Build Status](https://dev.azure.com/best-practices/deepseismic/_apis/build/status/microsoft.ComponentGovernance%20(seismic-deeplearning)?branchName=staging)](https://dev.azure.com/best-practices/deepseismic/_build/latest?definitionId=124&branchName=staging) |
| **Legal Compliance** | master | [![Build Status](https://dev.azure.com/best-practices/deepseismic/_apis/build/status/microsoft.ComponentGovernance%20(seismic-deeplearning)?branchName=master)](https://dev.azure.com/best-practices/deepseismic/_build/latest?definitionId=124&branchName=master) |
| **Tests** | staging | [![Build Status](https://dev.azure.com/best-practices/deepseismic/_apis/build/status/microsoft.Notebooks%20(seismic-deeplearning)?branchName=staging)](https://dev.azure.com/best-practices/deepseismic/_build/latest?definitionId=125&branchName=staging) |
| **Tests** | master | [![Build Status](https://dev.azure.com/best-practices/deepseismic/_apis/build/status/microsoft.Notebooks%20(seismic-deeplearning)?branchName=master)](https://dev.azure.com/best-practices/deepseismic/_build/latest?definitionId=125&branchName=master) |
| **Notebook Tests** | staging | [![Build Status](https://dev.azure.com/best-practices/deepseismic/_apis/build/status/microsoft.Tests%20(seismic-deeplearning)?branchName=staging)](https://dev.azure.com/best-practices/deepseismic/_build/latest?definitionId=126&branchName=staging) |
| **Notebook Tests** | master | [![Build Status](https://dev.azure.com/best-practices/deepseismic/_apis/build/status/microsoft.Tests%20(seismic-deeplearning)?branchName=master)](https://dev.azure.com/best-practices/deepseismic/_build/latest?definitionId=126&branchName=master) |

# Troubleshooting

For Data Science Virtual Machine conda package installation issues, first locate the Anaconda installation on the DSVM, for example by running:
```bash
which python
```
A typical output will be:
```bash
someusername@somevm:/projects/DeepSeismic$ which python
/anaconda/envs/py35/bin/python
```
which indicates that the Anaconda folder is __/anaconda__. We'll refer to this location in the instructions below, but you should update the commands according to your local Anaconda folder.

<details>
<summary><b>Data Science Virtual Machine conda package installation errors</b></summary>

It may happen that you don't have sufficient permissions to run conda commands or install packages in the Anaconda packages directory. To remedy this, please run the following commands:
```bash
rm -rf /anaconda/pkgs/*
sudo chown -R $(whoami) /anaconda
```

After these commands complete, try installing the packages again.

</details>

<details>
<summary><b>Data Science Virtual Machine conda package installation warnings</b></summary>

While creating the conda environment defined by `environment/anaconda/local/environment.yml` on an Ubuntu DSVM, you may see multiple warnings like this:
```
WARNING conda.gateways.disk.delete:unlink_or_rename_to_trash(140): Could not remove or rename /anaconda/pkgs/ipywidgets-7.5.1-py_0/site-packages/ipywidgets-7.5.1.dist-info/LICENSE. Please remove this file manually (you may need to reboot to free file handles)
```

If this happens, similar to the instructions above, stop the conda environment creation (type `Ctrl+C`) and then recursively change the ownership of the /anaconda directory from root to the current user by running this command:

```bash
sudo chown -R $USER /anaconda
```

After this command completes, try creating the conda environment from __environment/anaconda/local/environment.yml__ again.

</details>

<details>
<summary><b>Model training or scoring is not using GPU</b></summary>

To see if the GPU is being used while your model is being trained or used for inference, run
```bash
nvidia-smi
```
and confirm that you see your Python process using the GPU.

If not, you may want to try reverting to an older version of CUDA for use with PyTorch. After the environment has been set up, run the following command (by default we use CUDA 10) after running `conda activate seismic-interpretation` to activate the conda environment:
```bash
conda install pytorch torchvision cudatoolkit=9.2 -c pytorch
```

To test whether this setup worked, open `ipython` right afterwards and execute the following code:
```python
import torch
torch.cuda.is_available()
```

The output should say "True".

If the output is still "False", you may want to try setting your environment variable to specify the device manually - to test this, start a new `ipython` session and type:
```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
import torch
torch.cuda.is_available()
```

The output should say "True" this time. If it does, you can make the change permanent by adding
```bash
export CUDA_VISIBLE_DEVICES=0
```
to your `$HOME/.bashrc` file.

</details>

<details>
<summary><b>GPU out of memory errors</b></summary>

You should be able to see how much GPU memory your process is using by running
```bash
nvidia-smi
```
and checking whether this amount is close to the physical memory limit specified by the GPU manufacturer.

If you're getting close to the memory limit, you may want to lower the batch size in the model configuration file, specifically the `TRAIN.BATCH_SIZE_PER_GPU` and `VALIDATION.BATCH_SIZE_PER_GPU` settings.

</details>

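As a rough rule of thumb when tuning those batch-size settings, you can estimate the memory occupied by a single input batch (this is our own back-of-envelope sketch, not a repo utility; actual GPU usage also includes activations, gradients and optimizer state, which usually dominate):

```python
# Back-of-envelope estimate (illustrative only): bytes occupied by one
# float32 input batch of shape (batch, channels, height, width).
def input_batch_bytes(batch_size, channels, height, width, bytes_per_element=4):
    return batch_size * channels * height * width * bytes_per_element

# e.g. a batch of 32 single-channel 256x256 patches:
mib = input_batch_bytes(32, 1, 256, 256) / 2**20  # 8.0 MiB
```
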
<details>
<summary><b>How to resize Data Science Virtual Machine disk</b></summary>

1. Go to the [Azure Portal](https://portal.azure.com) and find your virtual machine by typing its name in the search bar at the very top of the page.

2. In the Overview panel on the left hand side, click the Stop button to stop the virtual machine.

3. Next, select Disks in the same panel on the left hand side.

4. Click the Name of the OS Disk - you'll be navigated to the Disk view. From this view, select Configuration on the left hand side, then increase Size in GB and hit the Save button.

5. Navigate back to the Virtual Machine view from Step 2 and click the Start button to start the virtual machine.

</details>

@ -0,0 +1,51 @@
AUTO_RESUME: False
CUDNN:
  BENCHMARK: True
  DETERMINISTIC: False
  ENABLED: True
DATASET:
  CLASS_WEIGHTS: [0.7151, 0.8811, 0.5156, 0.9346, 0.9683, 0.9852]
  NUM_CLASSES: 6
  ROOT:
GPUS: (0,)
LOG_CONFIG: logging.conf
LOG_DIR:
MODEL:
  IN_CHANNELS: 1
  NAME: patch_deconvnet
OUTPUT_DIR: output
PIN_MEMORY: True
PRINT_FREQ: 20
SEED: 42
TEST:
  CROSSLINE: True
  INLINE: True
  MODEL_PATH:
  SPLIT: Both
  TEST_STRIDE: 10
TRAIN:
  AUGMENTATION: True
  AUGMENTATIONS:
    PAD:
      HEIGHT: 256
      WIDTH: 256
    RESIZE:
      HEIGHT: 200
      WIDTH: 200
  BATCH_SIZE_PER_GPU: 32
  BEGIN_EPOCH: 0
  DEPTH: no
  END_EPOCH: 484
  MAX_LR: 0.01
  MEAN: 0.0009997
  MIN_LR: 0.001
  MODEL_DIR: models
  MOMENTUM: 0.9
  PATCH_SIZE: 99
  SNAPSHOTS: 5
  STD: 0.20977
  STRIDE: 50
  WEIGHT_DECAY: 0.0001
VALIDATION:
  BATCH_SIZE_PER_GPU: 32
WORKERS: 4
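
The `CLASS_WEIGHTS` in the config above compensate for class imbalance between facies. As an illustrative sketch (an assumption on our part - the repo applies these weights through its loss function, not through this code), a per-pixel weighted cross-entropy scales each pixel's loss by the weight of its true class, so rarer facies are not drowned out by the most frequent ones:

```python
import math

# Weights copied from the configuration above; rarer facies classes get a
# larger weight so the loss is not dominated by the most frequent classes.
CLASS_WEIGHTS = [0.7151, 0.8811, 0.5156, 0.9346, 0.9683, 0.9852]

def weighted_cross_entropy(probs, target):
    """Weighted cross-entropy for one pixel (illustrative, not repo code).

    probs: predicted class probabilities; target: true class index.
    """
    return -CLASS_WEIGHTS[target] * math.log(probs[target])
```
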
Binary file not shown. (After - Size: 151 KiB)
Binary file not shown. (After - Size: 48 KiB)
Binary file not shown. (After - Size: 50 KiB)
@ -1,19 +0,0 @@
# Starter pipeline
# Start with a minimal pipeline that you can customize to build and deploy your code.
# Add steps that build, run tests, deploy, and more:
# https://aka.ms/yaml

trigger:
- master

pool:
  vmImage: 'ubuntu-latest'

steps:
- script: echo Hello, world!
  displayName: 'Run a one-line script'

- script: |
    echo Add other tasks to build, test, and deploy your project.
    echo See https://aka.ms/yaml
  displayName: 'Run a multi-line script'
bin/ds
@ -1,6 +0,0 @@
#!/usr/bin/env python

from deepseismic import cli

if __name__ == "__main__":
    cli.main()
@ -0,0 +1,64 @@
{"Registrations":[
  {
    "component": {
      "type": "git",
      "git": {
        "repositoryUrl": "https://github.com/olivesgatech/facies_classification_benchmark",
        "commitHash": "12102683a1ae78f8fbc953823c35a43b151194b3"
      }
    },
    "license": "MIT"
  },
  {
    "component": {
      "type": "git",
      "git": {
        "repositoryUrl": "https://github.com/waldeland/CNN-for-ASI",
        "commitHash": "6f985cccecf9a811565d0b7cd919412569a22b7b"
      }
    },
    "license": "MIT"
  },
  {
    "component": {
      "type": "git",
      "git": {
        "repositoryUrl": "https://github.com/opesci/devito",
        "commitHash": "f6129286d9c0b3a8bfe07e724ac5b00dc762efee"
      }
    },
    "license": "MIT"
  },
  {
    "component": {
      "type": "git",
      "git": {
        "repositoryUrl": "https://github.com/pytorch/ignite",
        "commitHash": "38a4f37de759e33bc08441bde99bcb50f3d81f55"
      }
    },
    "license": "BSD-3-Clause"
  },
  {
    "component": {
      "type": "git",
      "git": {
        "repositoryUrl": "https://github.com/HRNet/HRNet-Semantic-Segmentation",
        "commitHash": "06142dc1c7026e256a7561c3e875b06622b5670f"
      }
    },
    "license": "MIT"
  },
  {
    "component": {
      "type": "git",
      "git": {
        "repositoryUrl": "https://github.com/dask/dask",
        "commitHash": "54019e9c05134585c9c40e4195206aa78e2ea61a"
      }
    },
    "license": "IPL-1.0"
  }
],
"Version": 1
}
@ -0,0 +1,8 @@
### Contrib folder

Code in this folder has not been tested, and is meant for exploratory work only.

We encourage submissions to the contrib folder. Once they are well-tested, please submit a pull request and work with the repository owners to graduate them to the main DeepSeismic repository.

Thank you.
@ -0,0 +1,6 @@
# Benchmarks

In this folder we show benchmarks using different algorithms. To facilitate the benchmark computation, we provide a set of wrapper functions that can be found in [benchmark_utils.py](benchmark_utils.py).

TODO
@ -0,0 +1,17 @@
First, make sure that the `${HOME}/data/dutch_f3` folder exists and you have write access.

Next, to get the main input dataset, the [Dutch F3 dataset](https://terranubis.com/datainfo/Netherlands-Offshore-F3-Block-Complete), navigate to the [MalenoV](https://github.com/bolgebrygg/MalenoV) project website and follow the links (which lead to [this](https://drive.google.com/drive/folders/0B7brcf-eGK8CbGhBdmZoUnhiTWs) download). Save this file as `${HOME}/data/dutch_f3/data.segy`.

To download the train and validation masks, run the following from the root of the repo:
```bash
./contrib/scripts/get_F3_voxel.sh ${HOME}/data/dutch_f3
```

This downloads the train and validation masks to the same location as `data.segy`.

That's it!

To run the training script, run `python train.py --cfg=configs/texture_net.yaml`.
@ -0,0 +1,41 @@
# TextureNet configuration

CUDNN:
  BENCHMARK: true
  DETERMINISTIC: false
  ENABLED: true
GPUS: (0,)
OUTPUT_DIR: 'output'
LOG_DIR: 'log'
WORKERS: 4
PRINT_FREQ: 10
LOG_CONFIG: logging.conf
SEED: 2019
WINDOW_SIZE: 65

DATASET:
  NUM_CLASSES: 2
  ROOT: /home/maxkaz/data/dutchf3
  FILENAME: data.segy

MODEL:
  NAME: texture_net
  IN_CHANNELS: 1
  NUM_FILTERS: 50

TRAIN:
  BATCH_SIZE_PER_GPU: 32
  END_EPOCH: 5000
  LR: 0.02
  MOMENTUM: 0.9
  WEIGHT_DECAY: 0.0001
  DEPTH: "voxel" # Options are No, Patch, Section and Voxel
  MODEL_DIR: "models"

VALIDATION:
  BATCH_SIZE_PER_GPU: 32

TEST:
  MODEL_PATH: ""
  SPLIT: 'Both' # Can be Both, Test1, Test2
@ -0,0 +1,82 @@
# ------------------------------------------------------------------------------
# Copyright (c) Microsoft
# Licensed under the MIT License.
# ------------------------------------------------------------------------------

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from yacs.config import CfgNode as CN

_C = CN()

# Cudnn related params
_C.CUDNN = CN()
_C.CUDNN.BENCHMARK = True
_C.CUDNN.DETERMINISTIC = False
_C.CUDNN.ENABLED = True

_C.GPUS = (0,)
_C.OUTPUT_DIR = "output"  # This will be the base directory for all output, such as logs and saved models
_C.LOG_DIR = ""  # This will be a subdirectory inside OUTPUT_DIR
_C.WORKERS = 4
_C.PRINT_FREQ = 20
_C.LOG_CONFIG = "logging.conf"
_C.SEED = 42
# size of voxel cube: WINDOW_SIZE x WINDOW_SIZE x WINDOW_SIZE; used for 3D models only
_C.WINDOW_SIZE = 65

# DATASET related params
_C.DATASET = CN()
_C.DATASET.NUM_CLASSES = 2
_C.DATASET.ROOT = ""
_C.DATASET.FILENAME = "data.segy"

# common params for NETWORK
_C.MODEL = CN()
_C.MODEL.NAME = "texture_net"
_C.MODEL.IN_CHANNELS = 1
_C.MODEL.NUM_FILTERS = 50
_C.MODEL.EXTRA = CN(new_allowed=True)

# training
_C.TRAIN = CN()
_C.TRAIN.BATCH_SIZE_PER_GPU = 32
# number of batches per epoch
_C.TRAIN.BATCH_PER_EPOCH = 10
# total number of epochs
_C.TRAIN.END_EPOCH = 200
_C.TRAIN.LR = 0.01
_C.TRAIN.MOMENTUM = 0.9
_C.TRAIN.WEIGHT_DECAY = 0.0001
_C.TRAIN.DEPTH = "voxel"  # Options are None, Patch and Section
_C.TRAIN.MODEL_DIR = "models"  # This will be a subdirectory inside OUTPUT_DIR

# validation
_C.VALIDATION = CN()
_C.VALIDATION.BATCH_SIZE_PER_GPU = 32

# TEST
_C.TEST = CN()
_C.TEST.MODEL_PATH = ""
_C.TEST.SPLIT = "Both"  # Can be Both, Test1, Test2


def update_config(cfg, options=None, config_file=None):
    cfg.defrost()

    if config_file:
        cfg.merge_from_file(config_file)

    if options:
        cfg.merge_from_list(options)

    cfg.freeze()


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "w") as f:
        print(_C, file=f)
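The precedence implemented by `update_config` (defaults, overridden by the config file, overridden by command-line options) can be illustrated without yacs. A pure-Python sketch, with hypothetical helper and key names:

```python
# Pure-Python sketch of the override precedence in update_config above:
# defaults < config file < command-line options.
def apply_overrides(defaults, file_cfg=None, options=None):
    cfg = dict(defaults)
    cfg.update(file_cfg or {})        # config file wins over defaults
    if options:                       # options arrive as a flat [KEY, value, ...]
        it = iter(options)            # list, like yacs' merge_from_list
        cfg.update(dict(zip(it, it))) # options win over everything
    return cfg

defaults = {"TRAIN.LR": 0.01, "SEED": 42, "WORKERS": 4}
cfg = apply_overrides(
    defaults,
    file_cfg={"TRAIN.LR": 0.02},   # e.g. from texture_net.yaml
    options=["SEED", 2019],        # e.g. from the command line
)
assert cfg == {"TRAIN.LR": 0.02, "SEED": 2019, "WORKERS": 4}
```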
@ -0,0 +1,34 @@
[loggers]
keys=root,__main__,event_handlers

[handlers]
keys=consoleHandler

[formatters]
keys=simpleFormatter

[logger_root]
level=INFO
handlers=consoleHandler

[logger___main__]
level=INFO
handlers=consoleHandler
qualname=__main__
propagate=0

[logger_event_handlers]
level=INFO
handlers=consoleHandler
qualname=event_handlers
propagate=0

[handler_consoleHandler]
class=StreamHandler
level=INFO
formatter=simpleFormatter
args=(sys.stdout,)

[formatter_simpleFormatter]
format=%(asctime)s - %(name)s - %(levelname)s - %(message)s
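A `logging.conf` file in this INI format is loaded with the standard library's `logging.config.fileConfig`. A self-contained sketch (it writes a trimmed copy of the config above to a temporary file rather than assuming a path on disk):

```python
import logging
import logging.config
import os
import tempfile

# Trimmed version of the logging.conf above: root logger -> console handler.
conf = """
[loggers]
keys=root

[handlers]
keys=consoleHandler

[formatters]
keys=simpleFormatter

[logger_root]
level=INFO
handlers=consoleHandler

[handler_consoleHandler]
class=StreamHandler
level=INFO
formatter=simpleFormatter
args=(sys.stdout,)

[formatter_simpleFormatter]
format=%(asctime)s - %(name)s - %(levelname)s - %(message)s
"""

with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as f:
    f.write(conf)
    conf_path = f.name

logging.config.fileConfig(conf_path, disable_existing_loggers=False)
logger = logging.getLogger(__name__)
logger.info("logging configured")
os.unlink(conf_path)
```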
@ -0,0 +1,230 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
# /* spell-checker: disable */

import logging
import logging.config
from os import path

import fire
import numpy as np
import torch
from torch.utils import data
from ignite.engine import Events
from ignite.handlers import ModelCheckpoint
from ignite.metrics import Loss
from ignite.utils import convert_tensor
from tqdm import tqdm

from deepseismic_interpretation.dutchf3.data import get_voxel_loader
from deepseismic_interpretation.models.texture_net import TextureNet

from cv_lib.utils import load_log_configuration
from cv_lib.event_handlers import (
    SnapshotHandler,
    logging_handlers,
    tensorboard_handlers,
)
from cv_lib.event_handlers.logging_handlers import Evaluator
from cv_lib.event_handlers.tensorboard_handlers import create_summary_writer

from cv_lib.segmentation.metrics import (
    pixelwise_accuracy,
    class_accuracy,
    mean_class_accuracy,
    class_iou,
    mean_iou,
)
from cv_lib.segmentation import extract_metric_from

# from cv_lib.segmentation.dutchf3.engine import (
#     create_supervised_evaluator,
#     create_supervised_trainer,
# )
# Use ignite generic versions for now
from ignite.engine import create_supervised_trainer, create_supervised_evaluator

from default import _C as config
from default import update_config


def _prepare_batch(batch, device=None, non_blocking=False, t_type=torch.FloatTensor):
    x, y = batch
    new_x = convert_tensor(torch.squeeze(x, 1), device=device, non_blocking=non_blocking)
    new_y = convert_tensor(torch.unsqueeze(y, 2), device=device, non_blocking=non_blocking)
    if device == "cuda":
        return (
            new_x.type(t_type).cuda(),
            torch.unsqueeze(new_y, 3).type(torch.LongTensor).cuda(),
        )
    else:
        return new_x.type(t_type), torch.unsqueeze(new_y, 3).type(torch.LongTensor)


def run(*options, cfg=None):
    """Run training and validation of model

    Notes:
        Options can be passed in via the options argument and loaded from the cfg file.
        Options from default.py will be overridden by options loaded from the cfg file.
        Options passed in via the options argument will override options loaded from the cfg file.

    Args:
        *options (str, int, optional): Options used to override what is loaded from the
                                       config. To see what options are available consult
                                       default.py
        cfg (str, optional): Location of config file to load. Defaults to None.
    """

    update_config(config, options=options, config_file=cfg)

    # Start logging
    load_log_configuration(config.LOG_CONFIG)
    logger = logging.getLogger(__name__)
    logger.debug(config.WORKERS)
    torch.backends.cudnn.benchmark = config.CUDNN.BENCHMARK

    torch.manual_seed(config.SEED)
    if torch.cuda.is_available():
        torch.cuda.manual_seed_all(config.SEED)
    np.random.seed(seed=config.SEED)

    # load the data
    TrainVoxelLoader = get_voxel_loader(config)

    train_set = TrainVoxelLoader(
        config.DATASET.ROOT,
        config.DATASET.FILENAME,
        split="train",
        window_size=config.WINDOW_SIZE,
        len=config.TRAIN.BATCH_SIZE_PER_GPU * config.TRAIN.BATCH_PER_EPOCH,
        batch_size=config.TRAIN.BATCH_SIZE_PER_GPU,
    )
    val_set = TrainVoxelLoader(
        config.DATASET.ROOT,
        config.DATASET.FILENAME,
        split="val",
        window_size=config.WINDOW_SIZE,
        len=config.TRAIN.BATCH_SIZE_PER_GPU * config.TRAIN.BATCH_PER_EPOCH,
        batch_size=config.TRAIN.BATCH_SIZE_PER_GPU,
    )

    n_classes = train_set.n_classes

    # set dataset length to batch size to be consistent with 5000 iterations
    # each of size 32 in the original Waldeland implementation
    train_loader = data.DataLoader(
        train_set, batch_size=config.TRAIN.BATCH_SIZE_PER_GPU, num_workers=config.WORKERS, shuffle=False,
    )
    val_loader = data.DataLoader(
        val_set, batch_size=config.VALIDATION.BATCH_SIZE_PER_GPU, num_workers=config.WORKERS, shuffle=False,
    )

    # this is how we import a model for CV - here we're importing a seismic
    # segmentation model
    model = TextureNet(n_classes=config.DATASET.NUM_CLASSES)

    optimizer = torch.optim.Adam(
        model.parameters(),
        lr=config.TRAIN.LR,
        # momentum=config.TRAIN.MOMENTUM,
        weight_decay=config.TRAIN.WEIGHT_DECAY,
    )

    device = "cpu"
    if torch.cuda.is_available():
        device = "cuda"
        model = model.cuda()

    loss = torch.nn.CrossEntropyLoss()

    trainer = create_supervised_trainer(model, optimizer, loss, prepare_batch=_prepare_batch, device=device)

    desc = "ITERATION - loss: {:.2f}"
    pbar = tqdm(initial=0, leave=False, total=len(train_loader), desc=desc.format(0))

    # add model checkpointing
    output_dir = path.join(config.OUTPUT_DIR, config.TRAIN.MODEL_DIR)
    checkpoint_handler = ModelCheckpoint(
        output_dir, "model", save_interval=1, n_saved=3, create_dir=True, require_empty=False,
    )

    criterion = torch.nn.CrossEntropyLoss(reduction="mean")

    # save model at each epoch
    trainer.add_event_handler(Events.EPOCH_COMPLETED, checkpoint_handler, {config.MODEL.NAME: model})

    def _select_pred_and_mask(model_out):
        # receive a tuple of (x, y_pred), y
        # so actually in line 51 of
        # cv_lib/cv_lib/segmentation/dutch_f3/metrics/__init__.py
        # we do the following line, so here we just select the model
        # _, y_pred = torch.max(model_out[0].squeeze(), 1, keepdim=True)
        y_pred = model_out[0].squeeze()
        y = model_out[1].squeeze()
        return (y_pred.squeeze(), y)

    evaluator = create_supervised_evaluator(
        model,
        metrics={
            "nll": Loss(criterion, device=device),
            "pixa": pixelwise_accuracy(n_classes, output_transform=_select_pred_and_mask, device=device),
            "cacc": class_accuracy(n_classes, output_transform=_select_pred_and_mask, device=device),
            "mca": mean_class_accuracy(n_classes, output_transform=_select_pred_and_mask, device=device),
            "ciou": class_iou(n_classes, output_transform=_select_pred_and_mask, device=device),
            "mIoU": mean_iou(n_classes, output_transform=_select_pred_and_mask, device=device),
        },
        device=device,
        prepare_batch=_prepare_batch,
    )

    # Set the validation run to start on the epoch completion of the training run
    trainer.add_event_handler(Events.EPOCH_COMPLETED, Evaluator(evaluator, val_loader))

    summary_writer = create_summary_writer(log_dir=path.join(output_dir, config.LOG_DIR))

    evaluator.add_event_handler(
        Events.EPOCH_COMPLETED,
        logging_handlers.log_metrics(
            "Validation results",
            metrics_dict={
                "mIoU": "Avg IoU :",
                "nll": "Avg loss :",
                "pixa": "Pixelwise Accuracy :",
                "mca": "Mean Class Accuracy :",
            },
        ),
    )
    evaluator.add_event_handler(
        Events.EPOCH_COMPLETED,
        tensorboard_handlers.log_metrics(
            summary_writer,
            trainer,
            "epoch",
            metrics_dict={"mIoU": "Validation/IoU", "nll": "Validation/Loss", "mca": "Validation/MCA",},
        ),
    )

    snapshot_duration = 1

    def snapshot_function():
        return (trainer.state.iteration % snapshot_duration) == 0

    checkpoint_handler = SnapshotHandler(
        path.join(output_dir, config.TRAIN.MODEL_DIR),
        config.MODEL.NAME,
        extract_metric_from("mIoU"),
        snapshot_function,
    )
    evaluator.add_event_handler(Events.EPOCH_COMPLETED, checkpoint_handler, {"model": model})

    logger.info("Starting training")
    trainer.run(train_loader, max_epochs=config.TRAIN.END_EPOCH // config.TRAIN.BATCH_PER_EPOCH)
    pbar.close()


if __name__ == "__main__":
    fire.Fire(run)
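`_prepare_batch` above is pure shape bookkeeping: the singleton channel axis is squeezed off the input, and the labels gain trailing singleton axes before being handed to the loss. A torch-free numpy sketch of the same axis manipulation (the loader's exact output shapes are an assumption here, chosen for illustration):

```python
import numpy as np

# Numpy analogue of the axis manipulation in _prepare_batch:
# x loses its singleton axis 1, y gains singleton axes 2 and 3.
batch_size, window = 4, 65
x = np.zeros((batch_size, 1, window, window, window), dtype=np.float32)
y = np.zeros((batch_size, 1), dtype=np.int64)

new_x = np.squeeze(x, axis=1)          # torch.squeeze(x, 1)
new_y = np.expand_dims(y, axis=2)      # torch.unsqueeze(y, 2)
new_y = np.expand_dims(new_y, axis=3)  # torch.unsqueeze(new_y, 3)

assert new_x.shape == (batch_size, window, window, window)
assert new_y.shape == (batch_size, 1, 1, 1)
```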
@ -0,0 +1,54 @@
# Voxel to Pixel approach to Seismic Interpretation

The code used in this approach is described in detail in the paper
<br />
**Convolutional Neural Networks for Automated Seismic Interpretation**,<br />
A. U. Waldeland, A. C. Jensen, L. Gelius and A. H. S. Solberg <br />
[*The Leading Edge, July 2018*](https://library.seg.org/doi/abs/10.1190/tle37070529.1)

There is also an EAGE E-lecture which you can watch: [*Seismic interpretation with deep learning*](https://www.youtube.com/watch?v=lm85Ap4OstM) (YouTube)

### Setup to get started
- make sure you follow the `README.md` file in the root of the repo to install all the proper dependencies.
- downgrade TensorFlow and PyTorch's CUDA:
  - downgrade TensorFlow by running `pip install tensorflow-gpu==1.14`
  - make sure PyTorch uses the downgraded CUDA: `pip install torch==1.3.1+cu92 torchvision==0.4.2+cu92 -f https://download.pytorch.org/whl/torch_stable.html`
- download the data by running `contrib/scripts/get_F3_voxel.sh` from the `contrib` folder of this repo. This will download the training and validation labels/masks.
- to get the main input dataset, the [Dutch F3 dataset](https://terranubis.com/datainfo/Netherlands-Offshore-F3-Block-Complete), navigate to the [MalenoV](https://github.com/bolgebrygg/MalenoV) project website and follow the links (which lead to [this](https://drive.google.com/drive/folders/0B7brcf-eGK8CbGhBdmZoUnhiTWs) download). Save this file as `interpretation/voxel2pixel/F3/data.segy`.

If you want to revert the downgraded packages, just run `conda env update -f environment/anaconda/local/environment.yml` from the root folder of the repo.

### Monitoring progress with TensorBoard
- from the `voxel2pixel` directory, run `tensorboard --logdir='log'` (all runtime logging information is written to the `log` folder)<br />
- open a web browser and go to localhost:6006<br />
More information can be found [here](https://www.tensorflow.org/get_started/summaries_and_tensorboard#launching_tensorboard).

### Usage
- `python train.py` will train the CNN and produce a model after a few hours on a decent gaming GPU with at least 6GB of onboard memory<br />
- `python test_parallel.py` - example of how the trained CNN can be applied to predict salt in a slice or in the full cube in a distributed fashion on a single multi-GPU machine (single-GPU mode is also supported). In addition, it shows how learned attributes can be extracted.<br />

### Files
In addition, it may be useful to have a look at these files:<br/>
- texture_net.py - this is where the network is defined<br/>
- batch.py - provides functionality to generate training batches with random augmentation<br/>
- data.py - loads/saves data sets in SEGY format and labeled slices as images<br/>
- tb_logger.py - connects to the TensorBoard functionality<br/>
- utils.py - some helper functions<br/>
- test_parallel.py - multi-GPU prediction script for scoring<br />

### Using a different data set and custom training labels
If you want to use a different data set, do the following:
- Make a new folder where you place the SEGY file.
- Make a folder for the training labels.
- Save images of the slices you want to train on as 'SLICETYPE_SLICENO.png' (or jpg), where SLICETYPE is either 'inline', 'crossline', or 'timeslice' and SLICENO is the slice number.
- Draw the classes on top of the seismic data, using a simple image-editing program with the class colors. Currently up to six classes are supported, indicated by the colors: red, blue, green, cyan, magenta and yellow.
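The 'SLICETYPE_SLICENO.png' naming convention described above can be parsed with a few lines of stdlib Python. A sketch, with the alias lists mirroring the spellings the repo accepts (function name hypothetical):

```python
import os

# Alias spellings for each slice type, as accepted by the repo's label loader.
INLINE = ["inline", "in-line", "iline", "y"]
CROSSLINE = ["crossline", "cross-line", "xline", "x"]
TIMESLICE = ["timeslice", "time-slice", "t", "z", "depthslice", "depth"]


def parse_label_filename(fname):
    """Return (canonical_slice_type, slice_number) for a label-mask filename."""
    stem, _ = os.path.splitext(os.path.basename(fname))
    slice_type, slice_no = stem.split("_")
    slice_type = slice_type.lower()
    if slice_type in INLINE:
        return "inline", int(slice_no)
    if slice_type in CROSSLINE:
        return "crossline", int(slice_no)
    if slice_type in TIMESLICE:
        return "timeslice", int(slice_no)
    raise ValueError("Unknown slice type: " + slice_type)


assert parse_label_filename("F3/train/inline_339.png") == ("inline", 339)
assert parse_label_filename("xline_500.jpg") == ("crossline", 500)
```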
@ -0,0 +1,351 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

# code modified from https://github.com/waldeland/CNN-for-ASI

import numpy as np


def get_random_batch(
    data_cube,
    label_coordinates,
    im_size,
    num_batch_size,
    random_flip=False,
    random_stretch=None,
    random_rot_xy=None,
    random_rot_z=None,
):
    """
    Returns a batch of augmented samples with center pixels randomly drawn from label_coordinates

    Args:
        data_cube: 3D numpy array with floating point velocity values
        label_coordinates: 3D coordinates of the labeled training slice
        im_size: size of the 3D voxel which we're cutting out around each label_coordinate
        num_batch_size: size of the batch
        random_flip: bool to perform random voxel flip
        random_stretch: maximum random stretch factor (None to disable)
        random_rot_xy: maximum random rotation (in degrees) around dim-0 and dim-1 (None to disable)
        random_rot_z: maximum random rotation (in degrees) around dim-2 (None to disable)

    Returns:
        a tuple of a batch numpy array of data with dimension
        (batch, 1, im_size[0], im_size[1], im_size[2]) and the associated labels as an array
        of size (batch).
    """

    # Make 3 im_size elements
    if isinstance(im_size, int):
        im_size = [im_size, im_size, im_size]

    # Output arrays
    batch = np.zeros([num_batch_size, 1, im_size[0], im_size[1], im_size[2]])
    ret_labels = np.zeros([num_batch_size])

    class_keys = list(label_coordinates)
    n_classes = len(class_keys)

    # Loop through batch
    n_for_class = 0
    class_ind = 0
    for i in range(num_batch_size):

        # Start by getting a grid centered around (0,0,0)
        grid = get_grid(im_size)

        # Apply random flip
        if random_flip:
            grid = augment_flip(grid)

        # Apply random rotations
        if random_rot_xy:
            grid = augment_rot_xy(grid, random_rot_xy)
        if random_rot_z:
            grid = augment_rot_z(grid, random_rot_z)

        # Apply random stretch
        if random_stretch:
            grid = augment_stretch(grid, random_stretch)

        # Pick random location from the label_coordinates for this class:
        coords_for_class = label_coordinates[class_keys[class_ind]]
        random_index = rand_int(0, coords_for_class.shape[1])
        coord = coords_for_class[:, random_index : random_index + 1]

        # Move grid to be centered around this location
        grid += coord

        # Interpolate samples at grid from the data:
        sample = trilinear_interpolation(data_cube, grid)

        # Insert in output arrays
        ret_labels[i] = class_ind
        batch[i, 0, :, :, :] = np.reshape(sample, (im_size[0], im_size[1], im_size[2]))

        # We seek to have a balanced batch with equally many samples from each class.
        n_for_class += 1
        if n_for_class + 1 > int(0.5 + num_batch_size / float(n_classes)):
            if class_ind < n_classes - 1:
                class_ind += 1
                n_for_class = 0

    return batch, ret_labels


def get_grid(im_size):
    """
    get_grid returns z,x,y coordinates centered around (0,0,0)

    Args:
        im_size: size of window

    Returns:
        numpy int array with size: 3 x im_size**3
    """
    win0 = np.linspace(-im_size[0] // 2, im_size[0] // 2, im_size[0])
    win1 = np.linspace(-im_size[1] // 2, im_size[1] // 2, im_size[1])
    win2 = np.linspace(-im_size[2] // 2, im_size[2] // 2, im_size[2])

    x0, x1, x2 = np.meshgrid(win0, win1, win2, indexing="ij")

    ex0 = np.expand_dims(x0.ravel(), 0)
    ex1 = np.expand_dims(x1.ravel(), 0)
    ex2 = np.expand_dims(x2.ravel(), 0)

    grid = np.concatenate((ex0, ex1, ex2), axis=0)

    return grid


def augment_flip(grid):
    """
    Random flip of non-depth axes.

    Args:
        grid: 3D coordinates of the voxel

    Returns:
        flipped grid coordinates
    """

    # Flip x axis
    if rand_bool():
        grid[1, :] = -grid[1, :]

    # Flip y axis
    if rand_bool():
        grid[2, :] = -grid[2, :]

    return grid


def augment_stretch(grid, stretch_factor):
    """
    Random stretch/scale

    Args:
        grid: 3D coordinate grid of the voxel
        stretch_factor: maximum magnitude of the random stretch
        TODO: change this to just call the function and not do -1,1 in rand_float

    Returns:
        stretched grid coordinates
    """
    stretch = rand_float(-stretch_factor, stretch_factor)
    grid *= 1 + stretch
    return grid


def augment_rot_xy(grid, random_rot_xy):
    """
    Random rotation

    Args:
        grid: coordinate grid list of 3D points
        random_rot_xy: maximum rotation (in degrees)
        TODO: change this to just call the function and not do -1,1 in rand_float

    Returns:
        randomly rotated grid
    """
    theta = np.deg2rad(rand_float(-random_rot_xy, random_rot_xy))
    x = grid[2, :] * np.cos(theta) - grid[1, :] * np.sin(theta)
    y = grid[2, :] * np.sin(theta) + grid[1, :] * np.cos(theta)
    grid[1, :] = x
    grid[2, :] = y
    return grid


def augment_rot_z(grid, random_rot_z):
    """
    Random tilt around z-axis (dim-2)

    Args:
        grid: coordinate grid list of 3D points
        random_rot_z: maximum tilt (in degrees)
        TODO: change this to just call the function and not do -1,1 in rand_float

    Returns:
        randomly tilted coordinate grid
    """
    theta = np.deg2rad(rand_float(-random_rot_z, random_rot_z))
    z = grid[0, :] * np.cos(theta) - grid[1, :] * np.sin(theta)
    x = grid[0, :] * np.sin(theta) + grid[1, :] * np.cos(theta)
    grid[0, :] = z
    grid[1, :] = x
    return grid


def trilinear_interpolation(input_array, indices):
    """
    Trilinear interpolation
    code taken from
    http://stackoverflow.com/questions/6427276/3d-interpolation-of-numpy-arrays-without-scipy

    Args:
        input_array: 3D data array
        indices: 3D grid coordinates

    Returns:
        interpolated input array
    """

    x_indices, y_indices, z_indices = indices[0:3]

    n0, n1, n2 = input_array.shape

    x0 = x_indices.astype(int)
    y0 = y_indices.astype(int)
    z0 = z_indices.astype(int)
    x1 = x0 + 1
    y1 = y0 + 1
    z1 = z0 + 1

    # put all samples outside datacube to 0
    inds_out_of_range = (
        (x0 < 0)
        | (x1 < 0)
        | (y0 < 0)
        | (y1 < 0)
        | (z0 < 0)
        | (z1 < 0)
        | (x0 >= n0)
        | (x1 >= n0)
        | (y0 >= n1)
        | (y1 >= n1)
        | (z0 >= n2)
        | (z1 >= n2)
    )

    x0[inds_out_of_range] = 0
    y0[inds_out_of_range] = 0
    z0[inds_out_of_range] = 0
    x1[inds_out_of_range] = 0
    y1[inds_out_of_range] = 0
    z1[inds_out_of_range] = 0

    x = x_indices - x0
    y = y_indices - y0
    z = z_indices - z0
    output = (
        input_array[x0, y0, z0] * (1 - x) * (1 - y) * (1 - z)
        + input_array[x1, y0, z0] * x * (1 - y) * (1 - z)
        + input_array[x0, y1, z0] * (1 - x) * y * (1 - z)
        + input_array[x0, y0, z1] * (1 - x) * (1 - y) * z
        + input_array[x1, y0, z1] * x * (1 - y) * z
        + input_array[x0, y1, z1] * (1 - x) * y * z
        + input_array[x1, y1, z0] * x * y * (1 - z)
        + input_array[x1, y1, z1] * x * y * z
    )

    output[inds_out_of_range] = 0
    return output


def rand_float(low, high):
    """
    Generate a random floating point number between two limits

    Args:
        low: low limit
        high: high limit

    Returns:
        single random floating point number
    """
    return (high - low) * np.random.random_sample() + low


def rand_int(low, high):
    """
    Generate a random integer between two limits

    Args:
        low: low limit
        high: high limit

    Returns:
        random integer between the two limits
    """
    return np.random.randint(low, high)


def rand_bool():
    """
    Generate a random boolean.

    Returns:
        Random boolean
    """
    return bool(np.random.randint(0, 2))


"""
TODO: the following is not needed and should be added as tests later.

# Test the batch-functions
if __name__ == "__main__":
    from data import read_segy, read_labels, get_slice
    import tb_logger
    import numpy as np
    import os

    data, data_info = read_segy(os.path.join("F3", "data.segy"))

    train_coordinates = {"1": np.expand_dims(np.array([50, 50, 50]), 1)}

    logger = tb_logger.TBLogger("log", "batch test")

    [batch, labels] = get_random_batch(data, train_coordinates, 65, 32)
    logger.log_images("normal", batch)

    [batch, labels] = get_random_batch(
        data, train_coordinates, 65, 32, random_flip=True
    )
    logger.log_images("flipping", batch)

    [batch, labels] = get_random_batch(
        data, train_coordinates, 65, 32, random_stretch=0.50
    )
    logger.log_images("stretching", batch)

    [batch, labels] = get_random_batch(
        data, train_coordinates, 65, 32, random_rot_xy=180
    )
    logger.log_images("rot", batch)

    [batch, labels] = get_random_batch(
        data, train_coordinates, 65, 32, random_rot_z=15
    )
    logger.log_images("dip", batch)

    train_cls_imgs, train_coordinates = read_labels(
        os.path.join("F3", "train"), data_info
    )
    [batch, labels] = get_random_batch(data, train_coordinates, 65, 32)
    logger.log_images("salt", batch[:16, :, :, :, :])
    logger.log_images("not salt", batch[16:, :, :, :, :])

    logger.log_images("data", data[:, :, 50])
"""
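A useful property for checking the corner-weighting scheme in `trilinear_interpolation` above: trilinear interpolation is exact for any field that is linear in the coordinates. A self-contained sketch verifying this on a small cube (the condensed `trilinear` helper below is local to this sketch and omits the out-of-range masking):

```python
import numpy as np


def trilinear(cube, pts):
    """Condensed corner-weighted trilinear interpolation (no bounds masking)."""
    x, y, z = pts
    x0, y0, z0 = x.astype(int), y.astype(int), z.astype(int)
    x1, y1, z1 = x0 + 1, y0 + 1, z0 + 1
    dx, dy, dz = x - x0, y - y0, z - z0
    return (cube[x0, y0, z0] * (1 - dx) * (1 - dy) * (1 - dz)
            + cube[x1, y0, z0] * dx * (1 - dy) * (1 - dz)
            + cube[x0, y1, z0] * (1 - dx) * dy * (1 - dz)
            + cube[x0, y0, z1] * (1 - dx) * (1 - dy) * dz
            + cube[x1, y0, z1] * dx * (1 - dy) * dz
            + cube[x0, y1, z1] * (1 - dx) * dy * dz
            + cube[x1, y1, z0] * dx * dy * (1 - dz)
            + cube[x1, y1, z1] * dx * dy * dz)


# Build a cube whose value is a linear function of its indices: f = 2i + 3j - k.
ii, jj, kk = np.meshgrid(np.arange(8), np.arange(8), np.arange(8), indexing="ij")
cube = 2.0 * ii + 3.0 * jj - 1.0 * kk

# Interpolate at fractional coordinates; exactness holds for linear fields.
pts = np.array([[1.25, 4.5], [2.75, 0.5], [3.5, 6.25]])  # rows: x, y, z
out = trilinear(cube, pts)
expected = 2.0 * pts[0] + 3.0 * pts[1] - 1.0 * pts[2]
assert np.allclose(out, expected)
```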
@ -0,0 +1,326 @@
|
|||
# Copyright (c) Microsoft. All rights reserved.
|
||||
# Licensed under the MIT license.
|
||||
|
||||
# code modified from https://github.com/waldeland/CNN-for-ASI
|
||||
|
||||
from __future__ import print_function
|
||||
from os.path import isfile, join
|
||||
|
||||
import segyio
|
||||
from os import listdir
|
||||
import numpy as np
|
||||
import scipy.misc
|
||||
|
||||
|
||||
def read_segy(filename):
|
||||
"""
|
||||
Read in a SEGY-format file given a filename
|
||||
|
||||
Args:
|
||||
filename: input filename
|
||||
|
||||
Returns:
|
||||
numpy data array and its info as a dictionary (tuple)
|
||||
|
||||
"""
|
||||
print("Loading data cube from", filename, "with:")
|
||||
|
||||
# Read full data cube
|
||||
data = segyio.tools.cube(filename)
|
||||
|
||||
# Put temporal axis first
|
||||
data = np.moveaxis(data, -1, 0)
|
||||
|
||||
# Make data cube fast to access
|
||||
data = np.ascontiguousarray(data, "float32")
|
||||
|
||||
# Read meta data
|
||||
segyfile = segyio.open(filename, "r")
|
||||
print(" Crosslines: ", segyfile.xlines[0], ":", segyfile.xlines[-1])
|
||||
print(" Inlines: ", segyfile.ilines[0], ":", segyfile.ilines[-1])
|
||||
print(" Timeslices: ", "1", ":", data.shape[0])
|
||||
|
||||
# Make dict with cube-info
|
||||
# TODO: read this from segy
|
||||
# Read dt and other params needed to do create a new
|
||||
data_info = {
|
||||
"crossline_start": segyfile.xlines[0],
|
||||
"inline_start": segyfile.ilines[0],
|
||||
"timeslice_start": 1,
|
||||
"shape": data.shape,
|
||||
}
|
||||
|
||||
return data, data_info
|
||||
|
||||
|
||||
def write_segy(out_filename, in_filename, out_cube):
|
||||
"""
|
||||
Writes out_cube to a segy-file (out_filename) with same header/size as in_filename
|
||||
|
||||
Args:
|
||||
out_filename:
|
||||
in_filename:
|
||||
out_cube:
|
||||
|
||||
Returns:
|
||||
|
||||
"""
|
||||
# Select last channel
|
||||
if type(out_cube) is list:
|
||||
out_cube = out_cube[-1]
|
||||
|
||||
print("Writing interpretation to " + out_filename)
|
||||
# Copy segy file
|
||||
from shutil import copyfile
|
||||
|
||||
copyfile(in_filename, out_filename)
|
||||
|
||||
# Moving temporal axis back again
|
||||
out_cube = np.moveaxis(out_cube, 0, -1)
|
||||
|
||||
# Open out-file
|
||||
with segyio.open(out_filename, "r+") as src:
|
||||
iline_start = src.ilines[0]
|
||||
dtype = src.iline[iline_start].dtype
|
||||
# loop through inlines and insert output
|
||||
for i in src.ilines:
|
||||
iline = out_cube[i - iline_start, :, :]
|
||||
src.iline[i] = np.ascontiguousarray(iline.astype(dtype))
|
||||
|
||||
# TODO: rewrite this whole function
|
||||
# Moving temporal axis first again - just in case the user want to keep working on it
|
||||
out_cube = np.moveaxis(out_cube, -1, 0)
|
||||
|
||||
print("Writing interpretation - Finished")
|
||||
return
|
||||
|
||||
|
||||
# Alternative spellings for each slice type
inline_alias = ["inline", "in-line", "iline", "y"]
crossline_alias = ["crossline", "cross-line", "xline", "x"]
timeslice_alias = ["timeslice", "time-slice", "t", "z", "depthslice", "depth"]

def read_labels(fname, data_info):
    """
    Read labels from an image.

    Args:
        fname: filename of labelling mask (image)
        data_info: dictionary describing the data

    Returns:
        list of labels and dictionary of label coordinates per class
    """
    label_imgs = []
    label_coordinates = {}

    # Parse slice type and slice number from the file name, e.g. "inline_339.png"
    tmp = fname.split("/")[-1].split("_")
    slice_type = tmp[0].lower()
    tmp = tmp[1].split(".")
    slice_no = int(tmp[0])

    if slice_type not in inline_alias + crossline_alias + timeslice_alias:
        print("File:", fname, "could not be loaded.", "Unknown slice type")
        return None

    if slice_type in inline_alias:
        slice_type = "inline"
    if slice_type in crossline_alias:
        slice_type = "crossline"
    if slice_type in timeslice_alias:
        slice_type = "timeslice"

    # Read file
    print("Loading labels for", slice_type, slice_no)
    img = scipy.misc.imread(fname)
    img = interpolate_to_fit_data(img, slice_type, slice_no, data_info)
    label_img = parse_labels_in_image(img)

    # Get coordinates for slice
    coords = get_coordinates_for_slice(slice_type, slice_no, data_info)

    # Loop through labels in label_img and append to label_coordinates
    for cls in np.unique(label_img):
        if cls > -1:
            if str(cls) not in label_coordinates.keys():
                label_coordinates[str(cls)] = np.array(np.zeros([3, 0]))
            inds_with_cls = label_img == cls
            cords_with_cls = coords[:, inds_with_cls.ravel()]
            label_coordinates[str(cls)] = np.concatenate((label_coordinates[str(cls)], cords_with_cls), 1)
            print(" ", str(np.sum(inds_with_cls)), "labels for class", str(cls))
    if len(np.unique(label_img)) == 1:
        print(" ", 0, "labels", str(cls))

    # Add label_img to output
    label_imgs.append([label_img, slice_type, slice_no])

    return label_imgs, label_coordinates
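The file-name convention assumed above (slice type, underscore, slice number, extension) can be sketched in isolation; the helper name `parse_label_filename` is illustrative only, not part of the module:

```python
from os.path import basename

def parse_label_filename(fname):
    """Split a mask filename like 'inline_339.png' into (slice_type, slice_no)."""
    stem = basename(fname)
    slice_type, rest = stem.split("_", 1)
    slice_no = int(rest.split(".")[0])
    return slice_type.lower(), slice_no

print(parse_label_filename("/data/masks/inline_339.png"))  # ('inline', 339)
```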


# Add colors to this table to make it possible to have more classes
class_color_coding = [
    [0, 0, 255],  # blue
    [0, 255, 0],  # green
    [0, 255, 255],  # cyan
    [255, 0, 0],  # red
    [255, 0, 255],  # magenta
    [255, 255, 0],  # yellow
]

def parse_labels_in_image(img):
    """
    Convert an RGB image to a class image.

    Args:
        img: 3- or 4-channel image array

    Returns:
        array of class labels (-1 where no class color matched)
    """
    label_img = np.int16(img[:, :, 0]) * 0 - 1  # -1 = no class

    # decompose color channels
    r = img[:, :, 0]
    g = img[:, :, 1]
    b = img[:, :, 2]

    # Alpha channel: zero out transparent pixels so they match no class color
    if img.shape[2] == 4:
        a = img[:, :, 3] // 255  # 1 for fully opaque pixels, 0 for transparent ones
        r = r * a
        g = g * a
        b = b * a

    tolerance = 1
    # Go through classes and find pixels with this class
    cls = 0
    for color in class_color_coding:
        # Find pixels with these labels
        inds = (
            (np.abs(r - color[0]) < tolerance) & (np.abs(g - color[1]) < tolerance) & (np.abs(b - color[2]) < tolerance)
        )
        label_img[inds] = cls
        cls += 1

    return label_img
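The color-to-class matching can be seen on a toy array; this is a self-contained sketch of the same tolerance test, with a two-color table for brevity (`labels_from_rgb` is a hypothetical name):

```python
import numpy as np

class_color_coding = [
    [0, 0, 255],   # blue  -> class 0
    [0, 255, 0],   # green -> class 1
]

def labels_from_rgb(img, tolerance=1):
    """Map each pixel to the index of the matching class color, or -1."""
    label_img = np.full(img.shape[:2], -1, dtype=np.int16)
    r, g, b = img[:, :, 0], img[:, :, 1], img[:, :, 2]
    for cls, (cr, cg, cb) in enumerate(class_color_coding):
        inds = (np.abs(r - cr) < tolerance) & (np.abs(g - cg) < tolerance) & (np.abs(b - cb) < tolerance)
        label_img[inds] = cls
    return label_img

# 1x3 "image": one blue pixel, one green pixel, one unlabeled dark pixel
img = np.array([[[0, 0, 255], [0, 255, 0], [10, 10, 10]]], dtype=np.int16)
print(labels_from_rgb(img).tolist())  # [[0, 1, -1]]
```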


def interpolate_to_fit_data(img, slice_type, slice_no, data_info):
    """
    Resize the label image to fit the data, if needed.

    Args:
        img: image array
        slice_type: inline, crossline or timeslice slice type
        slice_no: slice number
        data_info: data info dictionary extracted from the SEGY file

    Returns:
        resized image array

    """
    # Get wanted output size
    if slice_type == "inline":
        n0 = data_info["shape"][0]
        n1 = data_info["shape"][2]
    elif slice_type == "crossline":
        n0 = data_info["shape"][0]
        n1 = data_info["shape"][1]
    elif slice_type == "timeslice":
        n0 = data_info["shape"][1]
        n1 = data_info["shape"][2]
    return scipy.misc.imresize(img, (n0, n1), interp="nearest")
def get_coordinates_for_slice(slice_type, slice_no, data_info):
    """
    Get coordinates for a slice in the full cube.

    Args:
        slice_type: type of slice, e.g. inline, crossline, etc
        slice_no: slice number
        data_info: data info dictionary

    Returns:
        index coordinates of the voxels in the slice as a 3 x N array

    """
    ds = data_info["shape"]

    # Coordinates for cube
    x0, x1, x2 = np.meshgrid(
        np.linspace(0, ds[0] - 1, ds[0]),
        np.linspace(0, ds[1] - 1, ds[1]),
        np.linspace(0, ds[2] - 1, ds[2]),
        indexing="ij",
    )
    if slice_type == "inline":
        start = data_info["inline_start"]
        slice_no = slice_no - start

        x0 = x0[:, slice_no, :]
        x1 = x1[:, slice_no, :]
        x2 = x2[:, slice_no, :]
    elif slice_type == "crossline":
        start = data_info["crossline_start"]
        slice_no = slice_no - start
        x0 = x0[:, :, slice_no]
        x1 = x1[:, :, slice_no]
        x2 = x2[:, :, slice_no]

    elif slice_type == "timeslice":
        start = data_info["timeslice_start"]
        slice_no = slice_no - start
        x0 = x0[slice_no, :, :]
        x1 = x1[slice_no, :, :]
        x2 = x2[slice_no, :, :]

    # Collect indexes
    x0 = np.expand_dims(x0.ravel(), 0)
    x1 = np.expand_dims(x1.ravel(), 0)
    x2 = np.expand_dims(x2.ravel(), 0)
    coords = np.concatenate((x0, x1, x2), axis=0)

    return coords
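The meshgrid-then-slice pattern above produces one `(x0, x1, x2)` column per voxel in the selected plane; a minimal sketch on a toy cube (shapes only, same `indexing="ij"` convention):

```python
import numpy as np

# Toy cube of shape (2, 3, 4); take the inline slice at index 1.
ds = (2, 3, 4)
x0, x1, x2 = np.meshgrid(
    np.arange(ds[0]), np.arange(ds[1]), np.arange(ds[2]), indexing="ij"
)
coords = np.stack([a[:, 1, :].ravel() for a in (x0, x1, x2)], axis=0)
print(coords.shape)                # (3, 8): one column per voxel in the 2x4 slice
print(set(coords[1].tolist()))     # the inline index is fixed across the slice
```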


def get_slice(data, data_info, slice_type, slice_no, window=0):
    """
    Return a data slice.

    Args:
        data: input 3D voxel numpy array
        data_info: data info dictionary
        slice_type: type of slice, like inline, crossline, etc
        slice_no: slice number
        window: window size around center pixel

    Returns:
        2D slice of the voxel as a numpy array

    """
    if slice_type == "inline":
        start = data_info["inline_start"]

    elif slice_type == "crossline":
        start = data_info["crossline_start"]

    elif slice_type == "timeslice":
        start = data_info["timeslice_start"]

    slice_no = slice_no - start
    slice_data = data[:, slice_no - window : slice_no + window + 1, :]

    return np.squeeze(slice_data)
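With `window=w` the slicing above keeps `2*w + 1` adjacent planes around the center index; a quick shape check on a toy array:

```python
import numpy as np

data = np.arange(4 * 5 * 3).reshape(4, 5, 3)
center, w = 2, 1
block = data[:, center - w : center + w + 1, :]
print(block.shape)  # (4, 3, 3)
```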

@ -0,0 +1,181 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

# code modified from https://github.com/waldeland/CNN-for-ASI

from __future__ import print_function
from os.path import join

# TODO: make this nicer
try:
    import tensorflow as tf
except ImportError:
    print("Tensorflow could not be imported, therefore tensorboard cannot be used.")

from io import BytesIO
import matplotlib.pyplot as plt
import numpy as np
import torch
import datetime

# TODO: it looks like the majority of the methods of this class are static and as such they should be in utils
class TBLogger(object):
    """
    TensorBoard logger class
    """

    def __init__(self, log_dir, folder_name=""):

        self.log_dir = join(log_dir, folder_name + " " + datetime.datetime.now().strftime("%I%M%p, %B %d, %Y"))
        self.log_dir = self.log_dir.replace("//", "/")
        self.writer = tf.summary.FileWriter(self.log_dir)

    def log_scalar(self, tag, value, step=0):
        """
        Add a scalar summary.

        Args:
            tag: tag
            value: simple_value
            step: step

        """
        summary = tf.Summary(value=[tf.Summary.Value(tag=tag, simple_value=value)])
        self.writer.add_summary(summary, step)

    # TODO: this should probably be a static method - take care of this when re-writing the whole thing
    def make_list_of_2d_array(self, im):
        """
        Convert an image array into a list of 2D arrays.

        Args:
            im: image array (2D, 3D or 4D) or list of images

        Returns:
            list of 2D image arrays

        """
        if isinstance(im, list):
            return im
        ims = []
        if len(im.shape) == 2:
            ims.append(im)
        elif len(im.shape) == 3:
            for i in range(im.shape[0]):
                ims.append(np.squeeze(im[i, :, :]))

        elif len(im.shape) == 4:
            for i in range(im.shape[0]):
                ims.append(np.squeeze(im[i, 0, :, :]))
        return ims

    def log_images(self, tag, images, step=0, dim=2, max_imgs=50, cm="jet"):
        """
        Log images to TensorBoard.

        Args:
            tag: image tag
            images: list of images
            step: training step
            dim: image dimensionality (3 for voxel)
            max_imgs: max number of images
            cm: colormap

        """
        # Make sure images are in numpy format in case the input is a Torch variable
        images = self.convert_to_numpy(images)

        if len(images.shape) > 2:
            dim = 3

        # Make list of images
        if dim == 2:
            images = self.make_list_of_2d_array(images)

        # If 3D we make one list for each slice type
        if dim == 3:
            new_images_ts, new_images_il, new_images_cl = self.get_slices_from_3d(images)
            self.log_images(tag + "_timeslice", new_images_ts, step, 2, max_imgs)
            self.log_images(tag + "_inline", new_images_il, step, 2, max_imgs)
            self.log_images(tag + "_crossline", new_images_cl, step, 2, max_imgs)
            return

        im_summaries = []

        for nr, img in enumerate(images):

            # Grayscale
            if cm == "gray" or cm == "grey":
                img = img.astype("float")
                img = np.repeat(np.expand_dims(img, 2), 3, 2)
                img -= img.min()
                img /= img.max()
                img *= 255
                img = img.astype("uint8")

            # Write the image to a string
            s = BytesIO()
            plt.imsave(s, img, format="png")

            # Create an Image object
            img_sum = tf.Summary.Image(encoded_image_string=s.getvalue(), height=img.shape[0], width=img.shape[1])
            # Create a Summary value
            im_summaries.append(tf.Summary.Value(tag="%s/%d" % (tag, nr), image=img_sum))

            # if nr == max_imgs-1:
            #     break

        # Create and write Summary
        summary = tf.Summary(value=im_summaries)
        self.writer.add_summary(summary, step)

    # TODO: probably another static method
    def get_slices_from_3d(self, img):
        """
        Cuts out the middle slices from an image cube.

        Args:
            img: image array

        """
        new_images_ts = []
        new_images_il = []
        new_images_cl = []

        # use integer division for the center index - float indices raise a TypeError in Python 3
        if len(img.shape) == 3:
            new_images_ts.append(np.squeeze(img[img.shape[0] // 2, :, :]))
            new_images_il.append(np.squeeze(img[:, img.shape[1] // 2, :]))
            new_images_cl.append(np.squeeze(img[:, :, img.shape[2] // 2]))

        elif len(img.shape) == 4:
            for i in range(img.shape[0]):
                new_images_ts.append(np.squeeze(img[i, img.shape[1] // 2, :, :]))
                new_images_il.append(np.squeeze(img[i, :, img.shape[2] // 2, :]))
                new_images_cl.append(np.squeeze(img[i, :, :, img.shape[3] // 2]))

        elif len(img.shape) == 5:
            for i in range(img.shape[0]):
                new_images_ts.append(np.squeeze(img[i, 0, img.shape[2] // 2, :, :]))
                new_images_il.append(np.squeeze(img[i, 0, :, img.shape[3] // 2, :]))
                new_images_cl.append(np.squeeze(img[i, 0, :, :, img.shape[4] // 2]))

        return new_images_ts, new_images_il, new_images_cl

    # TODO: another static method most likely
    def convert_to_numpy(self, im):
        """
        Convert a torch variable to a numpy array.

        Args:
            im: image array

        """
        if type(im) == torch.autograd.Variable:
            # Put on CPU
            im = im.cpu()
            # Get np-data
            im = im.data.numpy()
        return im

@ -0,0 +1,426 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

# code modified from https://github.com/waldeland/CNN-for-ASI
from __future__ import print_function

import os

# set default number of GPUs which are discoverable
N_GPU = 4
DEVICE_IDS = list(range(N_GPU))
os.environ["CUDA_VISIBLE_DEVICES"] = ",".join([str(x) for x in DEVICE_IDS])

# static parameters
RESOLUTION = 1
# these match how the model is trained
N_CLASSES = 2
IM_SIZE = 65

import random
import argparse
import json

import torch
import torch.nn as nn
import torch.backends.cudnn as cudnn
from torch.utils.data import Dataset, DataLoader
import torch.distributed as dist

if torch.cuda.is_available():
    device_str = os.environ["CUDA_VISIBLE_DEVICES"]
    device = torch.device("cuda:" + device_str)
else:
    raise Exception("No GPU detected for parallel scoring!")

# ability to perform multiprocessing
import multiprocessing

from os.path import join
from data import read_segy, get_slice
from texture_net import TextureNet
import itertools
import numpy as np
import tb_logger
from data import write_segy

# graphical progress bar
from tqdm import tqdm

class ModelWrapper(nn.Module):
    """
    Wrap TextureNet for (Distributed)DataParallel to invoke the classify method
    """

    def __init__(self, texture_model):
        super(ModelWrapper, self).__init__()
        self.texture_model = texture_model

    def forward(self, input_net):
        return self.texture_model.classify(input_net)


class MyDataset(Dataset):
    def __init__(self, data, window, coord_list):

        # main array
        self.data = data
        self.coord_list = coord_list
        self.window = window
        self.len = len(coord_list)

    def __getitem__(self, index):

        # TODO: can we specify a pixel mathematically by index?
        pixel = self.coord_list[index]
        x, y, z = pixel
        # TODO: current bottleneck - can we slice out voxels any faster
        small_cube = self.data[
            x - self.window : x + self.window + 1,
            y - self.window : y + self.window + 1,
            z - self.window : z + self.window + 1,
        ]

        return small_cube[np.newaxis, :, :, :], pixel

    def __len__(self):
        return self.len
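Each dataset item above is a `(2*window + 1)^3` sub-cube centered on one voxel, with a channel axis prepended; a shape-only sketch in NumPy (same indexing, torch not required):

```python
import numpy as np

data = np.arange(8 * 8 * 8).reshape(8, 8, 8)
w = 2
x, y, z = 4, 4, 4
small_cube = data[x - w : x + w + 1, y - w : y + w + 1, z - w : z + w + 1]
print(small_cube.shape)              # (5, 5, 5)
print(small_cube[np.newaxis].shape)  # (1, 5, 5, 5) - channel axis added as in __getitem__
```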


def main_worker(gpu, ngpus_per_node, args):
    """
    Main worker function. Given the gpu parameter and the number of GPUs per node,
    it can figure out its rank.

    :param gpu: rank of the process if gpu >= ngpus_per_node, otherwise just the GPU ID this worker runs on
    :param ngpus_per_node: total number of GPUs available on this node
    :param args: various arguments for the code in the worker
    :return: nothing
    """
    print("I got GPU", gpu)

    args.rank = gpu

    # loop around in round-robin fashion if we want to run multiple processes per GPU
    args.gpu = gpu % ngpus_per_node

    # initialize the distributed process and join the group
    print(
        "setting rank", args.rank, "world size", args.world_size, args.dist_backend, args.dist_url,
    )
    dist.init_process_group(
        backend=args.dist_backend, init_method=args.dist_url, world_size=args.world_size, rank=args.rank,
    )

    # set default GPU device for this worker
    torch.cuda.set_device(args.gpu)
    # set up device for the rest of the code
    local_device = torch.device("cuda:" + str(args.gpu))

    # Load trained model (run train.py to create the trained model)
    network = TextureNet(n_classes=N_CLASSES)
    model_state_dict = torch.load(join(args.data, "saved_model.pt"), map_location=local_device)
    network.load_state_dict(model_state_dict)
    network.eval()
    network.cuda(args.gpu)

    # set the scoring wrapper also to eval mode
    model = ModelWrapper(network)
    model.eval()
    model.cuda(args.gpu)

    # When using a single GPU per process and per
    # DistributedDataParallel, we need to divide the batch size
    # ourselves based on the total number of GPUs we have.
    # Min batch size is 1
    args.batch_size = max(int(args.batch_size / ngpus_per_node), 1)
    # obsolete: number of data loading workers - this is only used when reading from disk, which we're not
    # args.workers = int((args.workers + ngpus_per_node - 1) / ngpus_per_node)

    # wrap the model for distributed use - for scoring this is not needed
    # model = torch.nn.parallel.DistributedDataParallel(model, device_ids=[args.gpu])

    # set to benchmark mode because we're running the same workload multiple times
    cudnn.benchmark = True

    # Read 3D cube
    # NOTE: we cannot pass this data manually as serialization of data into each python process is costly,
    # so each worker has to load the data on its own.
    data, data_info = read_segy(join(args.data, "data.segy"))

    # Get half window size
    window = IM_SIZE // 2

    # reduce data size for debugging
    if args.debug:
        data = data[0 : 3 * window]

    # generate full list of coordinates
    # memory footprint of this isn't large yet, so no need to wrap it in a generator
    nx, ny, nz = data.shape
    x_list = range(window, nx - window)
    y_list = range(window, ny - window)
    z_list = range(window, nz - window)

    print("-- generating coord list --")
    # TODO: is there any way to use a generator with the pyTorch data loader?
    coord_list = list(itertools.product(x_list, y_list, z_list))

    # we need to map the data manually to each rank - DistributedDataParallel doesn't do this at score time
    print("take a subset of coord_list by chunk")
    coord_list = list(np.array_split(np.array(coord_list), args.world_size)[args.rank])
    coord_list = [tuple(x) for x in coord_list]

    # we only score the first batch in debug mode
    if args.debug:
        coord_list = coord_list[0 : args.batch_size]

    # prepare the data
    print("setup dataset")
    # TODO: RuntimeError: cannot pin 'torch.cuda.FloatTensor' - only dense CPU tensors can be pinned
    data_torch = torch.cuda.FloatTensor(data).cuda(args.gpu, non_blocking=True)
    dataset = MyDataset(data_torch, window, coord_list)

    # not sampling like in training
    # datasampler = DistributedSampler(dataset)
    # just set some default epoch
    # datasampler.set_epoch(1)

    # we use 0 workers because we're reading from memory
    print("setting up loader")
    my_loader = DataLoader(
        dataset=dataset,
        batch_size=args.batch_size,
        shuffle=False,
        num_workers=0,
        pin_memory=False,
        sampler=None,
        # sampler=datasampler
    )

    print("running loop")

    pixels_x = []
    pixels_y = []
    pixels_z = []
    predictions = []

    # Loop through center pixels in output cube
    with torch.no_grad():
        print("no grad")
        for (chunk, pixel) in tqdm(my_loader):
            data_input = chunk.cuda(args.gpu, non_blocking=True)
            output = model(data_input)
            # save and deal with it later on CPU
            # we want to make sure order is preserved
            pixels_x += pixel[0].tolist()
            pixels_y += pixel[1].tolist()
            pixels_z += pixel[2].tolist()
            predictions += output.tolist()
            # just score a single batch in debug mode
            if args.debug:
                break

    # TODO: legacy Queue Manager code from multiprocessing which we left here for illustration purposes
    # result_queue.append([deepcopy(coord_list), deepcopy(predictions)])
    # result_queue.append([coord_list, predictions])
    # transform pixels into x, y, z list format
    with open("results_{}.json".format(args.rank), "w") as f:
        json.dump(
            {
                "pixels_x": pixels_x,
                "pixels_y": pixels_y,
                "pixels_z": pixels_z,
                "preds": [int(x[0][0][0][0]) for x in predictions],
            },
            f,
        )

    # TODO: we cannot use pickle to dump from multiprocess - processes lock up
    # with open("result_predictions_{}.pkl".format(args.rank), "wb") as f:
    #     print("dumping predictions pickle file")
    #     pickle.dump(predictions, f)


parser = argparse.ArgumentParser(description="Seismic Distributed Scoring")
parser.add_argument("-d", "--data", default="/home/maxkaz/data/dutchf3", type=str, help="default dataset folder name")
parser.add_argument(
    "-s",
    "--slice",
    default="inline",
    type=str,
    choices=["inline", "crossline", "timeslice", "full"],
    help="slice type which we want to score on",
)
parser.add_argument(
    "-n", "--slice-num", default=339, type=int, help="slice number which we want to score",
)
parser.add_argument(
    "-b", "--batch-size", default=2 ** 11, type=int, help="batch size which we use for scoring",
)
parser.add_argument(
    "-p", "--n-proc-per-gpu", default=1, type=int, help="number of processes to run per GPU",
)
parser.add_argument(
    "--dist-url", default="tcp://127.0.0.1:12345", type=str, help="url used to set up distributed training",
)
parser.add_argument("--dist-backend", default="nccl", type=str, help="distributed backend")
parser.add_argument("--seed", default=0, type=int, help="default random number seed")
parser.add_argument(
    "--debug", action="store_true", help="debug flag - if on we will only process one batch",
)


def main():

    # use distributed scoring
    if RESOLUTION != 1:
        raise Exception("Currently we only support pixel-level scoring")

    args = parser.parse_args()

    args.gpu = None
    args.rank = 0

    # world size is the total number of processes we want to run across all nodes and GPUs
    args.world_size = N_GPU * args.n_proc_per_gpu

    if args.debug:
        args.batch_size = 4

    # fix away any kind of randomness - although for scoring it should not matter
    random.seed(args.seed)
    torch.manual_seed(args.seed)
    cudnn.deterministic = True

    print("RESOLUTION {}".format(RESOLUTION))

    ##########################################################################
    print("-- scoring on GPU --")

    ngpus_per_node = torch.cuda.device_count()
    print("nGPUs per node", ngpus_per_node)

    """
    First, read this: https://thelaziestprogrammer.com/python/a-multiprocessing-pool-pickle

    OK, so there are a few ways in which we can spawn a running process with pyTorch:
    1) Default mp.spawn should work just fine but won't let us access internals
    2) So we copied out the code from mp.spawn below to control how processes get created
    3) One could spawn their own processes, but that would not be thread-safe with CUDA; the line
       "mp = multiprocessing.get_context('spawn')" guarantees we use the proper pyTorch context

    Input data serialization is too costly; in general so is output data serialization, as noted here:
    https://docs.python.org/3/library/multiprocessing.html

    Feeding data into each process is too costly, so each process loads its own data.

    For deserialization we could try (and fail) using:
    1) Multiprocessing queue manager
        manager = Manager()
        return_dict = manager.dict()
        OR
        result_queue = multiprocessing.Queue()
        CALLING
        with Manager() as manager:
            results_list = manager.list()
            mp.spawn(main_worker, nprocs=args.world_size, args=(ngpus_per_node, results_list/dict/queue, args))
            results = deepcopy(results_list)
    2) pickling results to disk.

    Turns out that for the reasons mentioned in the first article both approaches are too costly.

    The only reasonable way to deserialize data from a Python process is to write it to text, in which case
    writing JSON is a saner approach: https://www.datacamp.com/community/tutorials/pickle-python-tutorial
    """

    # invoke processes manually, suppressing the error queue
    mp = multiprocessing.get_context("spawn")
    # error_queues = []
    processes = []
    for i in range(args.world_size):
        # error_queue = mp.SimpleQueue()
        process = mp.Process(target=main_worker, args=(i, ngpus_per_node, args), daemon=False)
        process.start()
        # error_queues.append(error_queue)
        processes.append(process)

    # block on wait
    for process in processes:
        process.join()

    print("-- aggregating results --")

    # Read 3D cube
    data, data_info = read_segy(join(args.data, "data.segy"))

    # Log to tensorboard - input slice
    logger = tb_logger.TBLogger("log", "Test")
    logger.log_images(
        args.slice + "_" + str(args.slice_num), get_slice(data, data_info, args.slice, args.slice_num), cm="gray",
    )

    x_coords = []
    y_coords = []
    z_coords = []
    predictions = []
    for i in range(args.world_size):
        with open("results_{}.json".format(i), "r") as f:
            results_dict = json.load(f)

        x_coords += results_dict["pixels_x"]
        y_coords += results_dict["pixels_y"]
        z_coords += results_dict["pixels_z"]
        predictions += results_dict["preds"]

    """
    Because of Python's GIL, having multiple workers write to the same array is not efficient - basically
    the only way we can have shared memory is with threading, but thanks to the GIL only one thread can
    execute at a time, so we end up with the overhead of managing multiple threads while writes happen
    sequentially.

    A much faster alternative is to just invoke the underlying compiled (C) code through array indexing.

    So basically instead of the following:

    NUM_CORES = multiprocessing.cpu_count()
    print("Post-processing will run on {} CPU cores on your machine.".format(NUM_CORES))

    def worker(classified_cube, coord):
        x, y, z = coord
        ind = new_coord_list.index(coord)
        # print (coord, ind)
        pred_class = predictions[ind]
        classified_cube[x, y, z] = pred_class

    # launch workers in parallel with memory sharing ("threading" backend)
    _ = Parallel(n_jobs=4*NUM_CORES, backend="threading")(
        delayed(worker)(classified_cube, coord) for coord in tqdm(pixels)
    )

    We do this:
    """

    # placeholder for results
    classified_cube = np.zeros(data.shape)
    # store final results
    classified_cube[x_coords, y_coords, z_coords] = predictions

    print("-- writing segy --")
    in_file = join(args.data, "data.segy")
    out_file = join(args.data, "salt_{}.segy".format(RESOLUTION))
    write_segy(out_file, in_file, classified_cube)

    print("-- logging prediction --")
    # log prediction to tensorboard
    logger = tb_logger.TBLogger("log", "Test_scored")
    logger.log_images(
        args.slice + "_" + str(args.slice_num),
        get_slice(classified_cube, data_info, args.slice, args.slice_num),
        cm="binary",
    )


if __name__ == "__main__":
    main()
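The scatter-write used for `classified_cube` relies on NumPy integer-array (fancy) indexing, which performs the whole assignment in compiled code instead of a Python-level loop or threads; a minimal sketch:

```python
import numpy as np

# One vectorized assignment writes every (x, y, z, prediction) tuple at once.
classified_cube = np.zeros((3, 3, 3), dtype=int)
x_coords = [0, 1, 2]
y_coords = [0, 1, 2]
z_coords = [2, 1, 0]
predictions = [7, 8, 9]
classified_cube[x_coords, y_coords, z_coords] = predictions
print(classified_cube[0, 0, 2], classified_cube[1, 1, 1], classified_cube[2, 2, 0])  # 7 8 9
```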

@ -0,0 +1,157 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

# code modified from https://github.com/waldeland/CNN-for-ASI

import torch
from torch import nn

from utils import gpu_no_of_var


class TextureNet(nn.Module):
    def __init__(self, n_classes=2, n_filters=50):
        super(TextureNet, self).__init__()

        # Network definition
        # Conv3d parameters: in_channels, out_channels, kernel_size, stride (downsampling factor)
        self.net = nn.Sequential(
            nn.Conv3d(1, n_filters, 5, 4, padding=2),
            nn.BatchNorm3d(n_filters),
            # nn.Dropout3d()  # Dropout can be added like this ...
            nn.ReLU(),
            nn.Conv3d(n_filters, n_filters, 3, 2, padding=1, bias=False),
            nn.BatchNorm3d(n_filters),
            nn.ReLU(),
            nn.Conv3d(n_filters, n_filters, 3, 2, padding=1, bias=False),
            nn.BatchNorm3d(n_filters),
            nn.ReLU(),
            nn.Conv3d(n_filters, n_filters, 3, 2, padding=1, bias=False),
            nn.BatchNorm3d(n_filters),
            nn.ReLU(),
            nn.Conv3d(n_filters, n_filters, 3, 3, padding=1, bias=False),
            nn.BatchNorm3d(n_filters),
            nn.ReLU(),
            nn.Conv3d(
                n_filters, n_classes, 1, 1
            ),  # This is the equivalent of a fully connected layer since the input has width/height/depth = 1
            nn.ReLU(),
        )
        # The filter weights are randomly initialized by default

    def forward(self, x):
        """
        Compute the network output.

        Args:
            x: network input - torch tensor

        Returns:
            output from the neural network

        """
        return self.net(x)

    def classify(self, x):
        """
        Classification wrapper.

        Args:
            x: input tensor for classification

        Returns:
            classification result

        """
        x = self.net(x)
        _, class_no = torch.max(x, 1, keepdim=True)
        return class_no

    # Functions to get output from intermediate feature layers
    def f1(self, x):
        """
        Wrapper to obtain a particular network layer.

        Args:
            x: input tensor for classification

        Returns:
            requested layer

        """
        return self.getFeatures(x, 0)

    def f2(self, x):
        """
        Wrapper to obtain a particular network layer.

        Args:
            x: input tensor for classification

        Returns:
            requested layer

        """
        return self.getFeatures(x, 1)

    def f3(self, x):
        """
        Wrapper to obtain a particular network layer.

        Args:
            x: input tensor for classification

        Returns:
            requested layer

        """
        return self.getFeatures(x, 2)

    def f4(self, x):
        """
        Wrapper to obtain a particular network layer.

        Args:
            x: input tensor for classification

        Returns:
            requested layer

        """
        return self.getFeatures(x, 3)

    def f5(self, x):
        """
        Wrapper to obtain a particular network layer.

        Args:
            x: input tensor for classification

        Returns:
            requested layer

        """
        return self.getFeatures(x, 4)

def getFeatures(self, x, layer_no):
|
||||
"""
|
||||
Main call method to call the wrapped layers
|
||||
|
||||
Args:
|
||||
x: input tensor for classification
|
||||
layer_no: number of hidden layer we want to extract
|
||||
|
||||
Returns:
|
||||
requested layer
|
||||
|
||||
"""
|
||||
layer_indexes = [0, 3, 6, 9, 12]
|
||||
|
||||
# Make new network that has the layers up to the requested output
|
||||
tmp_net = nn.Sequential()
|
||||
layers = list(self.net.children())[0 : layer_indexes[layer_no] + 1]
|
||||
for i in range(len(layers)):
|
||||
tmp_net.add_module(str(i), layers[i])
|
||||
if type(gpu_no_of_var(self)) == int:
|
||||
tmp_net.cuda(gpu_no_of_var(self))
|
||||
return tmp_net(x)
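The `getFeatures` pattern above (truncate a sequential network at a chosen layer index and run the input through that prefix) can be sketched without any deep-learning framework. In this minimal stand-in, each "layer" is a plain function and `run_pipeline` plays the role of `nn.Sequential`; all names and values here are illustrative, not part of the repository:

```python
# Minimal sketch of the truncated-pipeline idea behind getFeatures.
# A feature extractor is just the composition of the first
# layer_indexes[layer_no] + 1 layers of the full pipeline.

def run_pipeline(layers, x):
    """Apply each layer in order, like nn.Sequential."""
    for layer in layers:
        x = layer(x)
    return x

def get_features(layers, layer_indexes, x, layer_no):
    """Run x through the prefix of the pipeline ending at the requested layer."""
    prefix = layers[: layer_indexes[layer_no] + 1]
    return run_pipeline(prefix, x)

# Toy 5-"layer" pipeline over integers
layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v + 3, lambda v: v * 4, lambda v: v - 5]
layer_indexes = [0, 2, 4]  # intermediate outputs after layers 0, 2 and 4

full = run_pipeline(layers, 1)                     # ((1 + 1) * 2 + 3) * 4 - 5 = 23
feat = get_features(layers, layer_indexes, 1, 1)   # (1 + 1) * 2 + 3 = 7
```

The real implementation additionally moves the truncated network to the same CUDA device as the parent model before running it.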
@ -0,0 +1,136 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

# code modified from https://github.com/waldeland/CNN-for-ASI

from __future__ import print_function
from os.path import join
import torch
from torch import nn
from data import read_segy, read_labels, get_slice
from batch import get_random_batch
from torch.autograd import Variable
from texture_net import TextureNet
import tb_logger
import utils

# Parameters
ROOT_PATH = "/home/maxkaz/data/dutchf3"
INPUT_VOXEL = "data.segy"
TRAIN_MASK = "inline_339.png"
VAL_MASK = "inline_405.png"
IM_SIZE = 65
# If you have a GPU with little memory, try reducing this to 16 (may degrade results)
BATCH_SIZE = 32
# Switch to toggle GPU usage
USE_GPU = True
# Log progress to TensorBoard
LOG_TENSORBOARD = True

if LOG_TENSORBOARD:
    logger = tb_logger.TBLogger("log", "Train")

# This is the network definition proposed in the paper
network = TextureNet(n_classes=2)

# Loss function - the softmax is included in CrossEntropyLoss
cross_entropy = nn.CrossEntropyLoss()

# Optimizer to control step size in gradient descent
optimizer = torch.optim.Adam(network.parameters())

# Transfer model to GPU
if USE_GPU and torch.cuda.is_available():
    network = network.cuda()

# Load the data cube and labels
data, data_info = read_segy(join(ROOT_PATH, INPUT_VOXEL))
train_class_imgs, train_coordinates = read_labels(join(ROOT_PATH, TRAIN_MASK), data_info)
val_class_imgs, _ = read_labels(join(ROOT_PATH, VAL_MASK), data_info)

# Plot training/validation data with labels
if LOG_TENSORBOARD:
    for class_img in train_class_imgs + val_class_imgs:
        logger.log_images(
            class_img[1] + "_" + str(class_img[2]), get_slice(data, data_info, class_img[1], class_img[2]), cm="gray",
        )
        logger.log_images(
            class_img[1] + "_" + str(class_img[2]) + "_true_class", class_img[0],
        )

# Training loop
for i in range(5000):

    # Get random training batch with augmentation
    # This is the bottleneck for training and could be done more efficiently on the GPU...
    [batch, labels] = get_random_batch(
        data,
        train_coordinates,
        IM_SIZE,
        BATCH_SIZE,
        random_flip=True,
        random_stretch=0.2,
        random_rot_xy=180,
        random_rot_z=15,
    )

    # Format data as torch variables
    batch = Variable(torch.Tensor(batch).float())
    labels = Variable(torch.Tensor(labels).long())

    # Transfer data to GPU
    if USE_GPU and torch.cuda.is_available():
        batch = batch.cuda()
        labels = labels.cuda()

    # Set network to training phase
    network.train()

    # Run the samples through the network
    output = network(batch)

    # Compute loss
    loss = cross_entropy(torch.squeeze(output), labels)

    # Zero out gradients accumulated from the previous iteration, then
    # do back-propagation to get gradients of weights w.r.t. loss
    optimizer.zero_grad()
    loss.backward()

    # Ask the optimizer to adjust the parameters in the direction of lower loss
    optimizer.step()

    # Every 10th iteration - print training loss
    if i % 10 == 0:
        network.eval()

        # Log training loss/accuracy
        print("Iteration:", i, "Training loss:", utils.var_to_np(loss))
        if LOG_TENSORBOARD:
            logger.log_scalar("training_loss", utils.var_to_np(loss), i)
        for k, v in utils.compute_accuracy(torch.argmax(output, 1), labels).items():
            if LOG_TENSORBOARD:
                logger.log_scalar("training_" + k, v, i)
            print(" -", k, v, "%")

    # Every 100th iteration - log prediction images
    if i % 100 == 0 and LOG_TENSORBOARD:
        network.eval()

        # Output predicted train/validation class/probability images
        for class_img in train_class_imgs + val_class_imgs:

            slice = class_img[1]
            slice_no = class_img[2]

            class_img = utils.interpret(
                network.classify, data, data_info, slice, slice_no, IM_SIZE, 16, return_full_size=True, use_gpu=USE_GPU,
            )
            logger.log_images(slice + "_" + str(slice_no) + "_pred_class", class_img[0], step=i)

            class_img = utils.interpret(
                network, data, data_info, slice, slice_no, IM_SIZE, 16, return_full_size=True, use_gpu=USE_GPU,
            )
            logger.log_images(slice + "_" + str(slice_no) + "_pred_prob", class_img[0], i)

# Store trained network
torch.save(network.state_dict(), join(ROOT_PATH, "saved_model.pt"))
@ -0,0 +1,337 @@
# Copyright (c) Microsoft. All rights reserved.
# Licensed under the MIT license.

# code modified from https://github.com/waldeland/CNN-for-ASI

from __future__ import print_function

import torch
import numpy as np
from torch.autograd import Variable
from scipy.interpolate import interpn
import sys
import time

# global parameters
ST = 0
LAST_UPDATE = 0


def interpret(
    network, data, data_info, slice, slice_no, im_size, subsampl, return_full_size=True, use_gpu=True,
):
    """
    Down-samples a slice from the classified image and upsamples it to full resolution if needed. Given a
    full 3D classified voxel at a particular resolution (say we classify every n-th pixel, as given by the
    subsampl argument below), we take a particular slice from the voxel and optionally blow it up to full
    resolution as if we had classified every single pixel.

    Args:
        network: pytorch model definition
        data: input voxel
        data_info: input voxel information
        slice: slice type which we want to interpret
        slice_no: slice number
        im_size: size of the mini-cube classified around each center pixel
        subsampl: subsampling stride, i.e. we classify every subsampl-th pixel
        return_full_size: boolean flag, enable if you want to return full size without downsampling
        use_gpu: boolean flag to use the GPU

    Returns:
        upsampled slice

    """

    # Wrap np.linspace in a compact function call
    ls = lambda N: np.linspace(0, N - 1, N, dtype="int")

    # Size of cube
    N0, N1, N2 = data.shape

    # Coords for full cube
    x0_range = ls(N0)
    x1_range = ls(N1)
    x2_range = ls(N2)

    # Coords for subsampled cube
    pred_points = (x0_range[::subsampl], x1_range[::subsampl], x2_range[::subsampl])

    # Select slice
    if slice == "full":
        class_cube = data[::subsampl, ::subsampl, ::subsampl] * 0

    elif slice == "inline":
        slice_no = slice_no - data_info["inline_start"]
        class_cube = data[::subsampl, 0:1, ::subsampl] * 0
        x1_range = np.array([slice_no])
        pred_points = (pred_points[0], pred_points[2])

    elif slice == "crossline":
        slice_no = slice_no - data_info["crossline_start"]
        class_cube = data[::subsampl, ::subsampl, 0:1] * 0
        x2_range = np.array([slice_no])
        pred_points = (pred_points[0], pred_points[1])

    elif slice == "timeslice":
        slice_no = slice_no - data_info["timeslice_start"]
        class_cube = data[0:1, ::subsampl, ::subsampl] * 0
        x0_range = np.array([slice_no])
        pred_points = (pred_points[1], pred_points[2])

    # Grid for small class slice/cube
    n0, n1, n2 = class_cube.shape
    x0_grid, x1_grid, x2_grid = np.meshgrid(ls(n0), ls(n1), ls(n2), indexing="ij")

    # Grid for full slice/cube
    X0_grid, X1_grid, X2_grid = np.meshgrid(x0_range, x1_range, x2_range, indexing="ij")

    # Indexes for large cube at small cube pixels
    X0_grid_sub = X0_grid[::subsampl, ::subsampl, ::subsampl]
    X1_grid_sub = X1_grid[::subsampl, ::subsampl, ::subsampl]
    X2_grid_sub = X2_grid[::subsampl, ::subsampl, ::subsampl]

    # Get half window size
    w = im_size // 2

    # Loop through center pixels in output cube
    for i in range(X0_grid_sub.size):

        # Get coordinates in small and large cube
        x0 = x0_grid.ravel()[i]
        x1 = x1_grid.ravel()[i]
        x2 = x2_grid.ravel()[i]

        X0 = X0_grid_sub.ravel()[i]
        X1 = X1_grid_sub.ravel()[i]
        X2 = X2_grid_sub.ravel()[i]

        # Only compute when a full im_size^3 cube can be extracted around the center pixel
        if X0 > w and X1 > w and X2 > w and X0 < N0 - w + 1 and X1 < N1 - w + 1 and X2 < N2 - w + 1:

            # Get mini-cube around center pixel
            mini_cube = data[X0 - w : X0 + w + 1, X1 - w : X1 + w + 1, X2 - w : X2 + w + 1]

            # Get predicted "probabilities"
            mini_cube = Variable(torch.FloatTensor(mini_cube[np.newaxis, np.newaxis, :, :, :]))
            if use_gpu:
                mini_cube = mini_cube.cuda()
            out = network(mini_cube)
            out = out.data.cpu().numpy()

            out = out[:, :, out.shape[2] // 2, out.shape[3] // 2, out.shape[4] // 2]
            out = np.squeeze(out)

            # Make one output per output channel
            if not isinstance(class_cube, list):
                class_cube = np.split(np.repeat(class_cube[:, :, :, np.newaxis], out.size, 3), out.size, axis=3)

            # Insert into output (inner index renamed to j to avoid clobbering the outer loop variable i)
            if out.size == 1:
                class_cube[0][x0, x1, x2] = out
            else:
                for j in range(out.size):
                    class_cube[j][x0, x1, x2] = out[j]

        # Keep user informed about progress
        if slice == "full":
            print_progress_bar(i, x0_grid.size)

    # Resize to input size
    if return_full_size:
        if slice == "full":
            print("Interpolating down-sampled results to fit input cube")

        N = X0_grid.size

        # Output grid
        if slice == "full":
            grid_output_cube = np.concatenate(
                [X0_grid.reshape([N, 1]), X1_grid.reshape([N, 1]), X2_grid.reshape([N, 1])], 1,
            )
        elif slice == "inline":
            grid_output_cube = np.concatenate([X0_grid.reshape([N, 1]), X2_grid.reshape([N, 1])], 1)
        elif slice == "crossline":
            grid_output_cube = np.concatenate([X0_grid.reshape([N, 1]), X1_grid.reshape([N, 1])], 1)
        elif slice == "timeslice":
            grid_output_cube = np.concatenate([X1_grid.reshape([N, 1]), X2_grid.reshape([N, 1])], 1)

        # Interpolation
        for i in range(len(class_cube)):
            is_int = (
                np.sum(
                    np.unique(class_cube[i]).astype("float") - np.unique(class_cube[i]).astype("int32").astype("float")
                )
                == 0
            )
            class_cube[i] = interpn(
                pred_points,
                class_cube[i].astype("float").squeeze(),
                grid_output_cube,
                method="linear",
                fill_value=0,
                bounds_error=False,
            )
            class_cube[i] = class_cube[i].reshape([x0_range.size, x1_range.size, x2_range.size])

            # If output is class labels we convert the interpolated array to ints
            if is_int:
                class_cube[i] = class_cube[i].astype("int32")

    if slice == "full":
        print("Finished interpolating")

    # Squeeze outputs
    for i in range(len(class_cube)):
        class_cube[i] = class_cube[i].squeeze()

    return class_cube
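The center-pixel selection in `interpret` (stride `subsampl`, half-window `w`) can be sketched in plain NumPy. Note that the code's bounds check `X0 < N0 - w + 1` admits one boundary plane whose window NumPy slicing silently clips to less than `im_size`; the toy below uses the slightly stricter `X0 < N0 - w` so every kept center yields a full cube. All shapes and values here are illustrative, not the real survey dimensions:

```python
import numpy as np

# Toy cube and parameters (hypothetical values, not the Dutch F3 data)
data = np.arange(10 * 10 * 10, dtype=float).reshape(10, 10, 10)
subsampl, im_size = 4, 5
w = im_size // 2  # half window size, as in interpret()

N0, N1, N2 = data.shape
centers = [
    (X0, X1, X2)
    for X0 in range(0, N0, subsampl)
    for X1 in range(0, N1, subsampl)
    for X2 in range(0, N2, subsampl)
    # stricter variant of interpret()'s bounds check: keep only centers
    # around which a full im_size^3 window fits without clipping
    if w < X0 < N0 - w and w < X1 < N1 - w and w < X2 < N2 - w
]

# Every kept center yields a full (im_size, im_size, im_size) mini-cube
for X0, X1, X2 in centers:
    mini = data[X0 - w : X0 + w + 1, X1 - w : X1 + w + 1, X2 - w : X2 + w + 1]
    assert mini.shape == (im_size, im_size, im_size)
```

With a 10-voxel cube, stride 4 and window 5, only the center (4, 4, 4) survives the check; the real code then classifies each such mini-cube and interpolates the sparse predictions back to full resolution with `scipy.interpolate.interpn`.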


# TODO: this should probably be replaced with TQDM
def print_progress_bar(iteration, total, prefix="", suffix="", decimals=1, length=100, fill="="):
    """
    Provides a progress bar implementation.

    Adapted from https://stackoverflow.com/questions/3173320/text-progress-bar-in-the-console/14879561#14879561

    Args:
        iteration: iteration number
        total: total number of iterations
        prefix: comment prefix in display
        suffix: comment suffix in display
        decimals: how many decimals to display
        length: character length of progress bar
        fill: character to display as progress bar

    """
    global ST, LAST_UPDATE

    # Expect iteration to go from 0 to N-1
    iteration = iteration + 1

    # Only update every 5 seconds
    if time.time() - LAST_UPDATE < 5:
        if iteration == total:
            time.sleep(1)
        else:
            return

    if iteration <= 1:
        ST = time.time()
        exp_h = ""
        exp_m = ""
        exp_s = ""
    elif iteration == total:
        exp_time = time.time() - ST
        exp_h = int(exp_time / 3600)
        exp_m = int(exp_time / 60 - exp_h * 60.0)
        exp_s = int(exp_time - exp_m * 60.0 - exp_h * 3600.0)
    else:
        exp_time = (time.time() - ST) / (iteration - 1) * total - (time.time() - ST)
        exp_h = int(exp_time / 3600)
        exp_m = int(exp_time / 60 - exp_h * 60.0)
        exp_s = int(exp_time - exp_m * 60.0 - exp_h * 3600.0)

    percent = ("{0:." + str(decimals) + "f}").format(100 * (iteration / float(total)))
    filled_length = int(length * iteration // total)
    bar = fill * filled_length + "-" * (length - filled_length)
    if iteration != total:
        print("\r%s |%s| %s%% %s - %sh %smin %ss left" % (prefix, bar, percent, suffix, exp_h, exp_m, exp_s))
    else:
        print("\r%s |%s| %s%% %s - %sh %smin %ss " % (prefix, bar, percent, suffix, exp_h, exp_m, exp_s))
    sys.stdout.write("\033[F")
    # Print new line on complete
    if iteration == total:
        print("")
    # Remember when the bar was last redrawn (throttles updates to every 5 seconds)
    LAST_UPDATE = time.time()
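The percent/fill arithmetic assembled above can be checked in isolation. This standalone helper (a hypothetical extraction, not part of the repository) reproduces just the bar-string formatting, without the timing globals:

```python
def format_bar(iteration, total, decimals=1, length=20, fill="="):
    """Return the textual progress bar for a 1-based iteration count."""
    percent = ("{0:." + str(decimals) + "f}").format(100 * (iteration / float(total)))
    filled_length = int(length * iteration // total)
    bar = fill * filled_length + "-" * (length - filled_length)
    return "|%s| %s%%" % (bar, percent)

print(format_bar(5, 10))   # |==========----------| 50.0%
print(format_bar(10, 10))  # |====================| 100.0%
```

The full function prepends `prefix`, appends `suffix` and an ETA estimate, and uses `\r` plus the ANSI cursor-up escape to redraw in place.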


# TODO: rewrite this whole function to get rid of excepts
# TODO: also not sure what this function is for - it's almost as if it's not needed - try to remove it.
def gpu_no_of_var(var):
    """
    Returns the GPU number that a torch tensor or module lives on, or False if it is on the CPU

    Args:
        var: torch tensor or module

    Returns:
        The CUDA device that the torch tensor is on, or False if the tensor is not on a GPU

    """
    try:
        is_cuda = next(var.parameters()).is_cuda
    except (AttributeError, StopIteration):
        is_cuda = var.is_cuda

    if is_cuda:
        try:
            return next(var.parameters()).get_device()
        except (AttributeError, StopIteration):
            return var.get_device()
    else:
        return False


# TODO: remove all the try except statements
def var_to_np(var):
    """
    Take a PyTorch tensor and convert it to a numpy array of the same shape, as the name suggests.

    Args:
        var: input variable

    Returns:
        numpy array of the tensor

    """
    if isinstance(var, np.ndarray):
        return var

    # If input is a list we do this for all elements
    if isinstance(var, list):
        out = []
        for v in var:
            out.append(var_to_np(v))
        return out

    try:
        var = var.cpu()
    except AttributeError:
        pass
    try:
        var = var.data
    except AttributeError:
        pass
    try:
        var = var.numpy()
    except AttributeError:
        pass

    if isinstance(var, tuple):
        var = var[0]
    return var


def compute_accuracy(predicted_class, labels):
    """
    Computes the accuracy performance metric

    Args:
        predicted_class: pyTorch tensor with predictions
        labels: pyTorch tensor with ground truth labels

    Returns:
        Accuracy as a dictionary, per class, plus the average class accuracy across classes

    """
    labels = var_to_np(labels)
    predicted_class = var_to_np(predicted_class)

    accuracies = {}
    for cls in np.unique(labels):
        if cls >= 0:
            accuracies["accuracy_class_" + str(cls)] = int(np.mean(predicted_class[labels == cls] == cls) * 100)
    accuracies["average_class_accuracy"] = np.mean([acc for acc in accuracies.values()])
    return accuracies
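As a quick sanity check, the per-class accuracy computation behaves like this on a toy pair of label arrays (values chosen arbitrarily; negative labels mark unlabeled pixels and are skipped, as in the function above):

```python
import numpy as np

labels = np.array([0, 0, 1, 1, 1, -1])    # -1 marks an unlabeled pixel
predicted = np.array([0, 1, 1, 1, 0, 0])

accuracies = {}
for cls in np.unique(labels):
    if cls >= 0:
        # percentage of pixels of this class that were predicted correctly
        accuracies["accuracy_class_" + str(cls)] = int(np.mean(predicted[labels == cls] == cls) * 100)
accuracies["average_class_accuracy"] = np.mean([acc for acc in accuracies.values()])

print(accuracies)
# class 0: 1 of 2 correct -> 50; class 1: 2 of 3 correct -> 66; average 58.0
```

Note the `int(...)` truncation means per-class values are whole percentages, while the average may be fractional.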
@ -0,0 +1,60 @@
# DeepSeismic

## Imaging

This tutorial shows how to run [devito](https://www.devitoproject.org/) tutorial [notebooks](https://github.com/opesci/devito/tree/master/examples/seismic/tutorials) in Azure Machine Learning ([Azure ML](https://docs.microsoft.com/en-us/azure/machine-learning/)) using the [Azure Machine Learning Python SDK](https://docs.microsoft.com/en-us/azure/machine-learning/service/tutorial-1st-experiment-sdk-setup).

For the best experience, use a Linux (Ubuntu) Azure [DSVM](https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/dsvm-ubuntu-intro) and Jupyter Notebook with the AzureML Python SDK and the [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) to run the notebooks (see the __Setting up Environment__ section below).

Devito is a domain-specific language (DSL) and code generation framework for the design of highly optimized finite difference kernels via symbolic computation, for use in inversion methods. Here we show how ```devito``` can be used in the cloud by leveraging the AzureML experimentation framework as a transparent and scalable platform for generic computation workloads. We focus on full waveform inversion (__FWI__) problems, where non-linear data-fitting procedures are applied to compute estimates of subsurface properties from seismic data.


### Setting up Environment

The [conda environment](https://docs.conda.io/projects/conda/en/latest/user-guide/concepts/environments.html) that encapsulates all the dependencies needed to run the notebooks described above can be created using the fwi_dev_conda_environment.yml file. See [here](https://github.com/Azure/MachineLearningNotebooks/blob/master/NBSETUP.md) for generic instructions on how to install and run the AzureML Python SDK in Jupyter Notebooks.

To create the conda environment, run:
```
conda env create -f fwi_dev_conda_environment.yml
```

Then one can see the created environment within the list of available environments and export it as a .yml file:
```
conda env list
conda env export --name fwi_dev_conda_environment -f ./contrib/fwi/azureml_devito/fwi_dev_conda_environment_exported.yml
```
The created conda environment needs to be activated, followed by the installation of its corresponding IPython kernel:
```
conda activate fwi_dev_conda_environment
python -m ipykernel install --user --name fwi_dev_conda_environment --display-name "fwi_dev_conda_environment Python"
```

Finally, start Jupyter notebook from within the activated environment:
```
jupyter notebook
```
One can then choose the __fwi_dev_conda_environment Python__ kernel defined above, either when a notebook is opened for the first time or via the "Kernel/Change kernel" notebook menu.


The [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest) is also used to create an ACR in notebook 000_Setup_GeophysicsTutorial_FWI_Azure_devito, and then to push and pull docker images. One can also create the ACR via the Azure [portal](https://azure.microsoft.com/).

### Run devito in Azure
The devito FWI examples are run in AzureML using four notebooks:
- ```000_Setup_GeophysicsTutorial_FWI_Azure_devito.ipynb```: sets up Azure resources (like resource groups, an AzureML [workspace](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-workspace), and an Azure (docker) [container registry](https://azure.microsoft.com/en-us/services/container-registry/)).
- ```010_CreateExperimentationDockerImage_GeophysicsTutorial_FWI_Azure_devito.ipynb```: creates a custom docker file and the associated image that contains the ```devito``` [github repository](https://github.com/opesci/devito.git) (including the devito FWI tutorial [notebooks](https://github.com/opesci/devito/tree/master/examples/seismic/tutorials)) and runs the official devito install [tests](https://github.com/opesci/devito/tree/master/tests).
- ```020_UseAzureMLEstimatorForExperimentation_GeophysicsTutorial_FWI_Azure_devito.ipynb```: shows how the devito FWI tutorial [notebooks](https://github.com/opesci/devito/tree/master/examples/seismic/tutorials) can be run in AzureML using Azure Machine Learning [generic](https://docs.microsoft.com/en-us/python/api/azureml-train-core/azureml.train.estimator?view=azure-ml-py) [estimators](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-train-ml-models) with custom docker images. FWI computation takes place on a managed AzureML [remote compute cluster](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-set-up-training-targets).

```Devito``` FWI computation artifacts (images, and notebooks with data processing output results) are tracked under the AzureML workspace and can later be downloaded and visualized.

Two ways of running devito code are shown:

(1) using __custom code__ (slightly modified graphing functions that save images to files). The AzureML experimentation job is defined by the devito code packaged as a py file. The experimentation job (defined by the [azureml.core.experiment.Experiment](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.experiment.experiment?view=azure-ml-py) class) can be used to track metrics or other artifacts (images) that are available in the Azure portal.

(2) using [__papermill__](https://github.com/nteract/papermill) invoked via its Python API to run unedited devito demo notebooks (including the [dask](https://dask.org/) local cluster [example](https://github.com/opesci/devito/blob/master/examples/seismic/tutorials/04_dask.ipynb)) on the remote compute target, with the results saved as notebooks that are available in the Azure portal.

- ```030_ScaleJobsUsingAzuremL_GeophysicsTutorial_FWI_Azure_devito.ipynb```: shows how the devito FWI tutorial notebooks can be run in parallel on the elastically allocated AzureML [remote compute cluster](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-set-up-training-targets) created before. By submitting multiple jobs via azureml.core.Experiment submit(azureml.train.estimator.Estimator), one can use the [portal](https://portal.azure.com) to visualize the elastic allocation of AzureML [remote compute cluster](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-set-up-training-targets) nodes.
@ -0,0 +1,17 @@
name: fwi_dev_conda_environment

channels:
  - anaconda
dependencies:
  - python=3.7
  - numpy
  - notebook
  - ipykernel #nb_conda
  - scikit-learn
  - pip
  - pip:
      - python-dotenv
      - papermill[azure]
      - azureml-sdk[notebooks,automl,explain]==1.0.76
      - docker
@ -0,0 +1,211 @@
name: fwi_dev_conda_environment
channels:
  - anaconda
  - defaults
dependencies:
  - attrs=19.3.0=py_0
  - backcall=0.1.0=py37_0
  - blas=1.0=mkl
  - bleach=3.1.0=py37_0
  - ca-certificates=2019.11.27=0
  - certifi=2019.11.28=py37_0
  - decorator=4.4.1=py_0
  - defusedxml=0.6.0=py_0
  - entrypoints=0.3=py37_0
  - gmp=6.1.2=hb3b607b_0
  - importlib_metadata=1.1.0=py37_0
  - intel-openmp=2019.5=281
  - ipykernel=5.1.3=py37h39e3cac_0
  - ipython=7.10.1=py37h39e3cac_0
  - ipython_genutils=0.2.0=py37_0
  - jedi=0.15.1=py37_0
  - jinja2=2.10.3=py_0
  - joblib=0.14.0=py_0
  - jsonschema=3.2.0=py37_0
  - jupyter_client=5.3.4=py37_0
  - jupyter_core=4.6.1=py37_0
  - libedit=3.1.20181209=hc058e9b_0
  - libffi=3.2.1=h4deb6c0_3
  - libgcc-ng=9.1.0=hdf63c60_0
  - libgfortran-ng=7.3.0=hdf63c60_0
  - libsodium=1.0.16=h1bed415_0
  - libstdcxx-ng=9.1.0=hdf63c60_0
  - markupsafe=1.1.1=py37h7b6447c_0
  - mistune=0.8.4=py37h7b6447c_0
  - mkl=2019.5=281
  - mkl-service=2.3.0=py37he904b0f_0
  - mkl_fft=1.0.15=py37ha843d7b_0
  - mkl_random=1.1.0=py37hd6b4f25_0
  - more-itertools=7.2.0=py37_0
  - nbconvert=5.6.1=py37_0
  - nbformat=4.4.0=py37_0
  - ncurses=6.1=he6710b0_1
  - notebook=6.0.2=py37_0
  - openssl=1.1.1=h7b6447c_0
  - pandoc=2.2.3.2=0
  - pandocfilters=1.4.2=py37_1
  - parso=0.5.1=py_0
  - pexpect=4.7.0=py37_0
  - pickleshare=0.7.5=py37_0
  - pip=19.3.1=py37_0
  - prometheus_client=0.7.1=py_0
  - prompt_toolkit=3.0.2=py_0
  - ptyprocess=0.6.0=py37_0
  - pygments=2.5.2=py_0
  - pyrsistent=0.15.6=py37h7b6447c_0
  - python=3.7.5=h0371630_0
  - python-dateutil=2.8.1=py_0
  - pyzmq=18.1.0=py37he6710b0_0
  - readline=7.0=h7b6447c_5
  - send2trash=1.5.0=py37_0
  - setuptools=42.0.2=py37_0
  - six=1.13.0=py37_0
  - sqlite=3.30.1=h7b6447c_0
  - terminado=0.8.3=py37_0
  - testpath=0.4.4=py_0
  - tk=8.6.8=hbc83047_0
  - tornado=6.0.3=py37h7b6447c_0
  - traitlets=4.3.3=py37_0
  - wcwidth=0.1.7=py37_0
  - webencodings=0.5.1=py37_1
  - xz=5.2.4=h14c3975_4
  - zeromq=4.3.1=he6710b0_3
  - zipp=0.6.0=py_0
  - zlib=1.2.11=h7b6447c_3
  - pip:
      - adal==1.2.2
      - ansiwrap==0.8.4
      - applicationinsights==0.11.9
      - azure-common==1.1.23
      - azure-core==1.1.1
      - azure-datalake-store==0.0.48
      - azure-graphrbac==0.61.1
      - azure-mgmt-authorization==0.60.0
      - azure-mgmt-containerregistry==2.8.0
      - azure-mgmt-keyvault==2.0.0
      - azure-mgmt-resource==7.0.0
      - azure-mgmt-storage==7.0.0
      - azure-storage-blob==12.1.0
      - azureml-automl-core==1.0.76
      - azureml-automl-runtime==1.0.76.1
      - azureml-contrib-notebook==1.0.76
      - azureml-core==1.0.76
      - azureml-dataprep==1.1.33
      - azureml-dataprep-native==13.1.0
      - azureml-defaults==1.0.76
      - azureml-explain-model==1.0.76
      - azureml-interpret==1.0.76
      - azureml-model-management-sdk==1.0.1b6.post1
      - azureml-pipeline==1.0.76
      - azureml-pipeline-core==1.0.76
      - azureml-pipeline-steps==1.0.76
      - azureml-sdk==1.0.76
      - azureml-telemetry==1.0.76
      - azureml-train==1.0.76
      - azureml-train-automl==1.0.76
      - azureml-train-automl-client==1.0.76
      - azureml-train-automl-runtime==1.0.76.1
      - azureml-train-core==1.0.76
      - azureml-train-restclients-hyperdrive==1.0.76
      - azureml-widgets==1.0.76
      - backports-tempfile==1.0
      - backports-weakref==1.0.post1
      - boto==2.49.0
      - boto3==1.10.37
      - botocore==1.13.37
      - cffi==1.13.2
      - chardet==3.0.4
      - click==7.0
      - cloudpickle==1.2.2
      - configparser==3.7.4
      - contextlib2==0.6.0.post1
      - cryptography==2.8
      - cycler==0.10.0
      - cython==0.29.14
      - dill==0.3.1.1
      - distro==1.4.0
      - docker==4.1.0
      - docutils==0.15.2
      - dotnetcore2==2.1.11
      - fire==0.2.1
      - flake8==3.7.9
      - flask==1.0.3
      - fusepy==3.0.1
      - future==0.18.2
      - gensim==3.8.1
      - gunicorn==19.9.0
      - idna==2.8
      - imageio==2.6.1
      - interpret-community==0.2.3
      - interpret-core==0.1.19
      - ipywidgets==7.5.1
      - isodate==0.6.0
      - itsdangerous==1.1.0
      - jeepney==0.4.1
      - jmespath==0.9.4
      - json-logging-py==0.2
      - jsonform==0.0.2
      - jsonpickle==1.2
      - jsonsir==0.0.2
      - keras2onnx==1.6.0
      - kiwisolver==1.1.0
      - liac-arff==2.4.0
      - lightgbm==2.3.0
      - matplotlib==3.1.2
      - mccabe==0.6.1
      - msrest==0.6.10
      - msrestazure==0.6.2
      - ndg-httpsclient==0.5.1
      - networkx==2.4
      - nimbusml==1.6.1
      - numpy==1.16.2
      - oauthlib==3.1.0
      - onnx==1.6.0
      - onnxconverter-common==1.6.0
      - onnxmltools==1.4.1
      - packaging==19.2
      - pandas==0.23.4
      - papermill==1.2.1
      - pathspec==0.6.0
      - patsy==0.5.1
      - pillow==6.2.1
      - pmdarima==1.1.1
      - protobuf==3.11.1
      - psutil==5.6.7
      - pyasn1==0.4.8
      - pycodestyle==2.5.0
      - pycparser==2.19
      - pyflakes==2.1.1
      - pyjwt==1.7.1
      - pyopenssl==19.1.0
      - pyparsing==2.4.5
      - python-dotenv==0.10.3
      - python-easyconfig==0.1.7
      - pytz==2019.3
      - pywavelets==1.1.1
      - pyyaml==5.2
      - requests==2.22.0
      - requests-oauthlib==1.3.0
      - resource==0.2.1
      - ruamel-yaml==0.15.89
      - s3transfer==0.2.1
      - scikit-image==0.16.2
      - scikit-learn==0.20.3
      - scipy==1.1.0
      - secretstorage==3.1.1
      - shap==0.29.3
      - skl2onnx==1.4.9
      - sklearn-pandas==1.7.0
      - smart-open==1.9.0
      - statsmodels==0.10.2
      - tenacity==6.0.0
      - termcolor==1.1.0
      - textwrap3==0.9.2
      - tqdm==4.40.2
      - typing-extensions==3.7.4.1
      - urllib3==1.25.7
      - websocket-client==0.56.0
      - werkzeug==0.16.0
      - wheel==0.30.0
      - widgetsnbextension==3.5.1
prefix: /data/anaconda/envs/fwi_dev_conda_environment
@ -0,0 +1,923 @@
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Copyright (c) Microsoft Corporation. \n",
|
||||
"Licensed under the MIT License."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# FWI in Azure project\n",
|
||||
"\n",
|
||||
"## Set-up AzureML resources\n",
|
||||
"\n",
|
||||
"This project ports devito (https://github.com/opesci/devito) into Azure and runs tutorial notebooks at:\n",
|
||||
"https://nbviewer.jupyter.org/github/opesci/devito/blob/master/examples/seismic/tutorials/\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"In this notebook we setup AzureML resources. This notebook should be run once and will enable all subsequent notebooks.\n",
|
||||
"\n",
|
||||
"<a id='user_input_requiring_steps'></a>\n",
|
||||
"User input requiring steps:\n",
|
||||
" - [Fill in and save sensitive information](#dot_env_description)\n",
|
||||
" - [Azure login](#Azure_login) (may be required first time the notebook is run) \n",
|
||||
" - [Set __create_ACR_FLAG__ to true to trigger ACR creation and to save of ACR login info](#set_create_ACR_flag)\n",
|
||||
" - [Azure CLI login ](#Azure_cli_login) (may be required once to create an [ACR](https://azure.microsoft.com/en-us/services/container-registry/)) \n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Allow multiple displays per cell\n",
|
||||
"from IPython.core.interactiveshell import InteractiveShell\n",
|
||||
"InteractiveShell.ast_node_interactivity = \"all\" "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Azure Machine Learning and Pipeline SDK-specific imports"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import sys, os\n",
|
||||
"import shutil\n",
|
||||
"import urllib\n",
|
||||
"import azureml.core\n",
|
||||
"from azureml.core import Workspace, Experiment\n",
|
||||
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
|
||||
"from azureml.core.compute_target import ComputeTargetException\n",
|
||||
"import platform, dotenv\n",
|
||||
"import pathlib"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Azure ML SDK Version: 1.0.76\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'Linux-4.15.0-1064-azure-x86_64-with-debian-stretch-sid'"
|
||||
]
|
||||
},
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'/datadrive01/prj/DeepSeismic/contrib/fwi/azureml_devito/notebooks'"
|
||||
]
|
||||
},
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(\"Azure ML SDK Version: \", azureml.core.VERSION)\n",
|
||||
"platform.platform()\n",
|
||||
"os.getcwd()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### 1. Create utilities file\n",
|
||||
"\n",
|
||||
"##### 1.1 Define utilities file (project_utils.py) path\n",
|
||||
"The utilities file created here contains code for Azure resource access authorization, and project configuration settings such as directories and file names in the __project_consts__ class."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"utils_file_name = 'project_utils'\n",
|
||||
"auxiliary_files_dir = os.path.join(*(['.', 'src']))\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"utils_path_name = os.path.join(os.getcwd(), auxiliary_files_dir)\n",
|
||||
"utils_full_name = os.path.join(utils_path_name, os.path.join(*([utils_file_name+'.py'])))\n",
|
||||
"os.makedirs(utils_path_name, exist_ok=True)\n",
|
||||
" \n",
|
||||
"def ls_l(a_dir):\n",
|
||||
" return ([f for f in os.listdir(a_dir) if os.path.isfile(os.path.join(a_dir, f))]) "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### 1.2. Edit/create project_utils.py file"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Overwriting /datadrive01/prj/DeepSeismic/contrib/fwi/azureml_devito/notebooks/./src/project_utils.py\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"%%writefile $utils_full_name\n",
|
||||
"\n",
|
||||
"from azureml.core.authentication import ServicePrincipalAuthentication\n",
|
||||
"from azureml.core.authentication import AzureCliAuthentication\n",
|
||||
"from azureml.core.authentication import InteractiveLoginAuthentication\n",
|
||||
"from azureml.core.authentication import AuthenticationException\n",
|
||||
"import dotenv, logging, pathlib, os\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# credit Mathew Salvaris\n",
|
||||
"def get_auth(env_path):\n",
|
||||
" \"\"\"Tries to get authorization info by first trying to get Service Principal info, then CLI, then interactive. \n",
|
||||
" \"\"\"\n",
|
||||
" logger = logging.getLogger(__name__)\n",
|
||||
" crt_sp_pwd = os.environ.get(\"SP_PASSWORD\", None)\n",
|
||||
" if crt_sp_pwd:\n",
|
||||
" logger.debug(\"Trying to create Workspace with Service Principal\")\n",
|
||||
" aml_sp_password = crt_sp_pwd\n",
|
||||
" aml_sp_tennant_id = dotenv.get_key(env_path, 'SP_TENANT_ID')\n",
|
||||
" aml_sp_username = dotenv.get_key(env_path, 'SP_APPLICATION_ID')\n",
|
||||
" auth = ServicePrincipalAuthentication(\n",
|
||||
" tenant_id=aml_sp_tennant_id,\n",
|
||||
" username=aml_sp_username,\n",
|
||||
" password=aml_sp_password,\n",
|
||||
" )\n",
|
||||
" else:\n",
|
||||
" logger.debug(\"Trying to create Workspace with CLI Authentication\")\n",
|
||||
" try:\n",
|
||||
" auth = AzureCliAuthentication()\n",
|
||||
" auth.get_authentication_header()\n",
|
||||
" except AuthenticationException:\n",
|
||||
" logger.debug(\"Trying to create Workspace with Interactive login\")\n",
|
||||
" auth = InteractiveLoginAuthentication()\n",
|
||||
"\n",
|
||||
" return auth \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def set_dotenv_info(dotenv_file_path, env_dict):\n",
|
||||
" \"\"\"Use dict loop to set multiple keys in dotenv file.\n",
|
||||
" Minimal file error management.\n",
|
||||
" \"\"\"\n",
|
||||
" logger = logging.getLogger(__name__)\n",
|
||||
" if bool(env_dict):\n",
|
||||
" dotenv_file = pathlib.Path(dotenv_file_path)\n",
|
||||
" if not dotenv_file.is_file():\n",
|
||||
" logger.debug('dotenv file not found, will create \"{}\" using the sensitive info you provided.'.format(dotenv_file_path))\n",
|
||||
" dotenv_file.touch()\n",
|
||||
" else:\n",
|
||||
" logger.debug('dotenv file \"{}\" found, will (over)write it with current sensitive info you provided.'.format(dotenv_file_path))\n",
|
||||
" \n",
|
||||
" for crt_key, crt_val in env_dict.items():\n",
|
||||
" dotenv.set_key(dotenv_file_path, crt_key, crt_val)\n",
|
||||
"\n",
|
||||
" else:\n",
|
||||
" logger.debug(\\\n",
|
||||
" 'Trying to save empty env_dict variable into {}, please set your sensitive info in a dictionary.'\\\n",
|
||||
" .format(dotenv_file_path)) \n",
|
||||
" \n",
|
||||
"\n",
|
||||
"class project_consts(object):\n",
|
||||
" \"\"\"Keep project's file names and directory structure in one place.\n",
|
||||
" Minimal setattr error management.\n",
|
||||
" \"\"\"\n",
|
||||
" \n",
|
||||
" AML_WORKSPACE_CONFIG_DIR = ['.', '..', 'not_shared']\n",
|
||||
" AML_EXPERIMENT_DIR = ['.', '..', 'temp']\n",
|
||||
" AML_WORKSPACE_CONFIG_FILE_NAME = 'aml_ws_config.json'\n",
|
||||
" DOTENV_FILE_PATH = AML_WORKSPACE_CONFIG_DIR + ['general.env'] \n",
|
||||
" DOCKER_DOTENV_FILE_PATH = AML_WORKSPACE_CONFIG_DIR + ['dockerhub.env'] \n",
|
||||
"\n",
|
||||
" def __setattr__(self, *_):\n",
|
||||
" raise TypeError\n",
|
||||
"\n",
|
||||
" \n",
|
||||
"if __name__==\"__main__\":\n",
|
||||
" \"\"\"Basic function/class tests.\n",
|
||||
" \"\"\"\n",
|
||||
" import sys, os\n",
|
||||
" prj_consts = project_consts()\n",
|
||||
" logger = logging.getLogger(__name__)\n",
|
||||
" logging.basicConfig(level=logging.DEBUG) # Logging Levels: DEBUG\t10, NOTSET\t0\n",
|
||||
" logger.debug('AML ws file = {}'.format(os.path.join(*([os.path.join(*(prj_consts.AML_WORKSPACE_CONFIG_DIR)),\n",
|
||||
" prj_consts.AML_WORKSPACE_CONFIG_FILE_NAME]))))\n",
|
||||
"\n",
|
||||
" crt_dotenv_file_path = os.path.join(*(prj_consts.DOTENV_FILE_PATH))\n",
|
||||
" set_dotenv_info(crt_dotenv_file_path, {})\n",
|
||||
" "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### 1.3. Import utilities functions defined above"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[None]"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"def add_path_to_sys_path(path_to_append):\n",
|
||||
" if not (any(path_to_append in paths for paths in sys.path)):\n",
|
||||
" sys.path.append(path_to_append)\n",
|
||||
" \n",
|
||||
"paths_to_append = [os.path.join(os.getcwd(), auxiliary_files_dir)]\n",
|
||||
"[add_path_to_sys_path(crt_path) for crt_path in paths_to_append]\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"import project_utils\n",
|
||||
"prj_consts = project_utils.project_consts()\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### 2. Set-up the AML SDK infrastructure\n",
|
||||
"\n",
|
||||
"* Create Azure resource group (rsg) and workspace \n",
|
||||
"* save sensitive info using [python-dotenv](https://github.com/theskumar/python-dotenv) \n",
|
||||
" \n",
|
||||
"Notebook repeatability notes:\n",
|
||||
"* The notebook tries to find and use an existing Azure resource group (rsg) defined by __crt_resource_group__. It creates a new one if needed. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<a id='set_create_ACR_flag'></a>\n",
|
||||
"\n",
|
||||
"##### Create an [ACR](https://azure.microsoft.com/en-us/services/container-registry/) the first time this notebook is run. \n",
|
||||
"Either Docker Hub or ACR can be used to store the experimentation image. To create the ACR, set: \n",
|
||||
"```\n",
|
||||
"create_ACR_FLAG=True \n",
|
||||
"```\n",
|
||||
"It will create an ACR by running the several steps described below in section 2.7, __Create an ACR__. \n",
|
||||
" \n",
|
||||
" \n",
|
||||
"[Back](#user_input_requiring_steps) to summary of user input requiring steps."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"create_ACR_FLAG = False #True False"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"sensitive_info = {}"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<a id='dot_env_description'></a>\n",
|
||||
"##### 2.1. Input here sensitive and configuration information\n",
|
||||
"[dotenv](https://github.com/theskumar/python-dotenv) is used to hide sensitive info, like Azure subscription name/ID. The serialized info needs to be manually input once. \n",
|
||||
" \n",
|
||||
"* REQUIRED ACTION for the 2 cells below: uncomment them, add the required info in the first cell below, and run both cells once. \n",
|
||||
" The sensitive information will be packed in the __sensitive_info__ dictionary variable, which will then be saved in a following cell in an .env file (__dotenv_file_path__) that should be git-ignored. \n",
|
||||
"\n",
|
||||
"* OPTIONAL STEP: After running the two cells below once to save the __sensitive_info__ dictionary variable with your custom info, you can comment them out and leave the __sensitive_info__ variable defined above as an empty python dictionary. \n",
|
||||
" \n",
|
||||
" \n",
|
||||
"__Notes__:\n",
|
||||
"* An empty __sensitive_info__ dictionary is ignored by the __set_dotenv_info__ function defined above in project_utils.py. \n",
|
||||
"* The saved .env file will be used thereafter in each cell that starts with %dotenv. \n",
|
||||
"* The saved .env file contains user specific information and it should __not__ be version-controlled in git.\n",
|
||||
"* If you would like to [use service principal authentication](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/manage-azureml-service/authentication-in-azureml/authentication-in-azure-ml.ipynb) make sure you provide the optional values as well (see get_auth function definition in project_utils.py file created above for details).\n",
|
||||
"\n",
|
||||
"[Back](#user_input_requiring_steps) to summary of user input requiring steps."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# subscription_id = \"\"\n",
|
||||
"# resource_group = \"ghiordanfwirsg01\"\n",
|
||||
"# workspace_name = \"ghiordanfwiws\"\n",
|
||||
"# workspace_region = \"eastus2\"\n",
|
||||
"# gpu_cluster_name = \"gpuclstfwi02\"\n",
|
||||
"# gpucluster_admin_user_name = \"\"\n",
|
||||
"# gpucluster_admin_user_password = \"\"\n",
|
||||
"\n",
|
||||
"# experimentation_docker_image_name = \"fwi01_azureml\"\n",
|
||||
"# experimentation_docker_image_tag = \"sdk.v1.0.60\"\n",
|
||||
"# docker_container_mount_point = os.getcwd() # use project directory or a subdirectory\n",
|
||||
"\n",
|
||||
"# docker_login = \"georgedockeraccount\"\n",
|
||||
"# docker_pwd = \"\"\n",
|
||||
"\n",
|
||||
"# acr_name=\"fwi01acr\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# sensitive_info = {\n",
|
||||
"# 'SUBSCRIPTION_ID':subscription_id,\n",
|
||||
"# 'RESOURCE_GROUP':resource_group, \n",
|
||||
"# 'WORKSPACE_NAME':workspace_name, \n",
|
||||
"# 'WORKSPACE_REGION':workspace_region,\n",
|
||||
"# 'GPU_CLUSTER_NAME':gpu_cluster_name,\n",
|
||||
"# 'GPU_CLUSTER_ADMIN_USER_NAME':gpucluster_admin_user_name,\n",
|
||||
"# 'GPU_CLUSTER_ADMIN_USER_PASSWORD':gpucluster_admin_user_password,\n",
|
||||
"# 'EXPERIMENTATION_DOCKER_IMAGE_NAME':experimentation_docker_image_name,\n",
|
||||
"# 'EXPERIMENTATION_DOCKER_IMAGE_TAG':experimentation_docker_image_tag,\n",
|
||||
"# 'DOCKER_CONTAINER_MOUNT_POINT':docker_container_mount_point,\n",
|
||||
"# 'DOCKER_LOGIN':docker_login,\n",
|
||||
"# 'DOCKER_PWD':docker_pwd,\n",
|
||||
"# 'ACR_NAME':acr_name\n",
|
||||
"# }"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### 2.2. Save sensitive info\n",
|
||||
"An empty __sensitive_info__ variable will be ignored. \n",
|
||||
"A non-empty __sensitive_info__ variable will overwrite info in an existing .env file."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'./../not_shared/general.env'"
|
||||
]
|
||||
},
|
||||
"execution_count": 11,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"%load_ext dotenv\n",
|
||||
"dotenv_file_path = os.path.join(*(prj_consts.DOTENV_FILE_PATH)) \n",
|
||||
"os.makedirs(os.path.join(*(prj_consts.DOTENV_FILE_PATH[:-1])), exist_ok=True)\n",
|
||||
"pathlib.Path(dotenv_file_path).touch()\n",
|
||||
"\n",
|
||||
"# # show .env file path\n",
|
||||
"# !pwd\n",
|
||||
"dotenv_file_path\n",
|
||||
"\n",
|
||||
"#save your sensitive info\n",
|
||||
"project_utils.set_dotenv_info(dotenv_file_path, sensitive_info)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### 2.3. Use (load) saved sensitive info\n",
|
||||
"This is how sensitive info will be retrieved in other notebooks."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%dotenv $dotenv_file_path\n",
|
||||
"\n",
|
||||
"subscription_id = os.getenv('SUBSCRIPTION_ID')\n",
|
||||
"# # print a bit of subscription ID, to show dotenv file was found and loaded \n",
|
||||
"# subscription_id[:2]\n",
|
||||
"\n",
|
||||
"crt_resource_group = os.getenv('RESOURCE_GROUP')\n",
|
||||
"crt_workspace_name = os.getenv('WORKSPACE_NAME')\n",
|
||||
"crt_workspace_region = os.getenv('WORKSPACE_REGION') "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### 2.4. Access your workspace\n",
|
||||
"\n",
|
||||
"* In AML SDK we can get a ws in two ways: \n",
|
||||
" - via Workspace(subscription_id = ...) \n",
|
||||
" - via Workspace.from_config(path=some_file_path). \n",
|
||||
" \n",
|
||||
"For demo purposes, both ways are shown in this notebook.\n",
|
||||
"\n",
|
||||
"* At first notebook run:\n",
|
||||
" - the AML workspace ws is typically not found, so a new ws object is created and persisted on disk.\n",
|
||||
" - If the ws has been created in other ways (e.g. via the Azure portal), it may be persisted on disk by calling ws1.write_config(...)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"workspace_config_dir = os.path.join(*(prj_consts.AML_WORKSPACE_CONFIG_DIR))\n",
|
||||
"workspace_config_file = prj_consts.AML_WORKSPACE_CONFIG_FILE_NAME\n",
|
||||
"\n",
|
||||
"# # print debug info if needed \n",
|
||||
"# workspace_config_dir \n",
|
||||
"# ls_l(os.path.join(os.getcwd(), os.path.join(*([workspace_config_dir]))))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<a id='Azure_login'></a>\n",
|
||||
"###### Login into Azure may be required here\n",
|
||||
"[Back](#user_input_requiring_steps) to summary of user input requiring steps."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 14,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"WARNING - Warning: Falling back to use azure cli login credentials.\n",
|
||||
"If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.\n",
|
||||
"Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Workspace configuration loading succeeded. \n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"try:\n",
|
||||
" ws1 = Workspace(\n",
|
||||
" subscription_id = subscription_id, \n",
|
||||
" resource_group = crt_resource_group, \n",
|
||||
" workspace_name = crt_workspace_name,\n",
|
||||
" auth=project_utils.get_auth(dotenv_file_path))\n",
|
||||
" print(\"Workspace configuration loading succeeded. \")\n",
|
||||
"# ws1.write_config(path=os.path.join(os.getcwd(), os.path.join(*([workspace_config_dir]))),\n",
|
||||
"# file_name=workspace_config_file)\n",
|
||||
" del ws1 # ws will be (re)created later using from_config() function\n",
|
||||
"except Exception as e :\n",
|
||||
" print('Exception msg: {}'.format(str(e )))\n",
|
||||
" print(\"Workspace not accessible. Will create a new workspace below\")\n",
|
||||
" \n",
|
||||
" workspace_region = crt_workspace_region\n",
|
||||
"\n",
|
||||
" # Create the workspace using the specified parameters\n",
|
||||
" ws2 = Workspace.create(name = crt_workspace_name,\n",
|
||||
" subscription_id = subscription_id,\n",
|
||||
" resource_group = crt_resource_group, \n",
|
||||
" location = workspace_region,\n",
|
||||
" create_resource_group = True,\n",
|
||||
" exist_ok = False)\n",
|
||||
" ws2.get_details()\n",
|
||||
"\n",
|
||||
" # persist the subscription id, resource group name, and workspace name in aml_config/config.json.\n",
|
||||
" ws2.write_config(path=os.path.join(os.getcwd(), os.path.join(*([workspace_config_dir]))),\n",
|
||||
" file_name=workspace_config_file)\n",
|
||||
" \n",
|
||||
" # Delete ws2 and use ws = Workspace.from_config() as shown below to recover the ws, rather than rely on what we get from one-time creation\n",
|
||||
" del ws2"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### 2.5. Demo access to created workspace\n",
|
||||
"\n",
|
||||
"From now on, even in other notebooks, the provisioned AML workspace will be accessible using Workspace.from_config() as shown below:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 15,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# path arg is:\n",
|
||||
"# - a file path which explicitly lists aml_config subdir for function from_config() \n",
|
||||
"# - a dir path with a silently added <<aml_config>> subdir for function write_config(). \n",
|
||||
"ws = Workspace.from_config(path=os.path.join(os.getcwd(), \n",
|
||||
" os.path.join(*([workspace_config_dir, '.azureml', workspace_config_file]))))\n",
|
||||
"# # print debug info if needed\n",
|
||||
"# print(ws.name, ws.resource_group, ws.location, ws.subscription_id[0], sep = '\\n')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### 2.6. Create compute cluster used in following notebooks"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 16,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'gpuclstfwi02'"
|
||||
]
|
||||
},
|
||||
"execution_count": 16,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"gpu_cluster_name = os.getenv('GPU_CLUSTER_NAME')\n",
|
||||
"gpu_cluster_name"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 17,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Found existing gpu cluster\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"max_nodes_value = 3\n",
|
||||
"\n",
|
||||
"try:\n",
|
||||
" gpu_cluster = ComputeTarget(workspace=ws, name=gpu_cluster_name)\n",
|
||||
" print(\"Found existing gpu cluster\")\n",
|
||||
"except ComputeTargetException:\n",
|
||||
" print(\"Could not find gpu cluster, please create one\")\n",
|
||||
" \n",
|
||||
"# # Specify the configuration for the new cluster, add admin_user_ssh_key='ssh-rsa ... ghiordan@microsoft.com' if needed\n",
|
||||
"# compute_config = AmlCompute.provisioning_configuration(vm_size=\"Standard_NC12\",\n",
|
||||
"# min_nodes=0,\n",
|
||||
"# max_nodes=max_nodes_value,\n",
|
||||
"# admin_username=os.getenv('GPU_CLUSTER_ADMIN_USER_NAME'), \n",
|
||||
"# admin_user_password=os.getenv('GPU_CLUSTER_ADMIN_USER_PASSWORD'))\n",
|
||||
"# # Create the cluster with the specified name and configuration\n",
|
||||
"# gpu_cluster = ComputeTarget.create(ws, gpu_cluster_name, compute_config)\n",
|
||||
"\n",
|
||||
"# # Wait for the cluster to complete, show the output log\n",
|
||||
"# gpu_cluster.wait_for_completion(show_output=True)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### 2.7. Create an [ACR](https://docs.microsoft.com/en-us/azure/container-registry/) if you have not done so using the [portal](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-get-started-portal) \n",
|
||||
" - Follow the 4 ACR steps described below. \n",
|
||||
" - Uncomment cell lines as needed to log in and see command responses while you set the right subscription and then create the ACR. \n",
|
||||
" - You need [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli) to run the commands below. \n",
|
||||
"\n",
|
||||
"<a id='Azure_cli_login'></a>\n",
|
||||
"##### ACR Step 1. Select ACR subscription (az cli login into Azure may be required here)\n",
|
||||
"[Back](#user_input_requiring_steps) to summary of user input requiring steps."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 18,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"azure-cli 2.0.58 *\r\n",
|
||||
"\r\n",
|
||||
"acr 2.2.0 *\r\n",
|
||||
"acs 2.3.17 *\r\n",
|
||||
"advisor 2.0.0 *\r\n",
|
||||
"ams 0.4.1 *\r\n",
|
||||
"appservice 0.2.13 *\r\n",
|
||||
"backup 1.2.1 *\r\n",
|
||||
"batch 3.4.1 *\r\n",
|
||||
"batchai 0.4.7 *\r\n",
|
||||
"billing 0.2.0 *\r\n",
|
||||
"botservice 0.1.6 *\r\n",
|
||||
"cdn 0.2.0 *\r\n",
|
||||
"cloud 2.1.0 *\r\n",
|
||||
"cognitiveservices 0.2.4 *\r\n",
|
||||
"command-modules-nspkg 2.0.2 *\r\n",
|
||||
"configure 2.0.20 *\r\n",
|
||||
"consumption 0.4.2 *\r\n",
|
||||
"container 0.3.13 *\r\n",
|
||||
"core 2.0.58 *\r\n",
|
||||
"cosmosdb 0.2.7 *\r\n",
|
||||
"dla 0.2.4 *\r\n",
|
||||
"dls 0.1.8 *\r\n",
|
||||
"dms 0.1.2 *\r\n",
|
||||
"eventgrid 0.2.1 *\r\n",
|
||||
"eventhubs 0.3.3 *\r\n",
|
||||
"extension 0.2.3 *\r\n",
|
||||
"feedback 2.1.4 *\r\n",
|
||||
"find 0.2.13 *\r\n",
|
||||
"hdinsight 0.3.0 *\r\n",
|
||||
"interactive 0.4.1 *\r\n",
|
||||
"iot 0.3.6 *\r\n",
|
||||
"iotcentral 0.1.6 *\r\n",
|
||||
"keyvault 2.2.11 *\r\n",
|
||||
"kusto 0.1.0 *\r\n",
|
||||
"lab 0.1.5 *\r\n",
|
||||
"maps 0.3.3 *\r\n",
|
||||
"monitor 0.2.10 *\r\n",
|
||||
"network 2.3.2 *\r\n",
|
||||
"nspkg 3.0.3 *\r\n",
|
||||
"policyinsights 0.1.1 *\r\n",
|
||||
"profile 2.1.3 *\r\n",
|
||||
"rdbms 0.3.7 *\r\n",
|
||||
"redis 0.4.0 *\r\n",
|
||||
"relay 0.1.3 *\r\n",
|
||||
"reservations 0.4.1 *\r\n",
|
||||
"resource 2.1.10 *\r\n",
|
||||
"role 2.4.0 *\r\n",
|
||||
"search 0.1.1 *\r\n",
|
||||
"security 0.1.0 *\r\n",
|
||||
"servicebus 0.3.3 *\r\n",
|
||||
"servicefabric 0.1.12 *\r\n",
|
||||
"signalr 1.0.0 *\r\n",
|
||||
"sql 2.1.9 *\r\n",
|
||||
"sqlvm 0.1.0 *\r\n",
|
||||
"storage 2.3.1 *\r\n",
|
||||
"telemetry 1.0.1 *\r\n",
|
||||
"vm 2.2.15 *\r\n",
|
||||
"\r\n",
|
||||
"Extensions:\r\n",
|
||||
"azure-ml-admin-cli 0.0.1\r\n",
|
||||
"azure-cli-ml Unknown\r\n",
|
||||
"\r\n",
|
||||
"Python location '/opt/az/bin/python3'\r\n",
|
||||
"Extensions directory '/opt/az/extensions'\r\n",
|
||||
"\r\n",
|
||||
"Python (Linux) 3.6.5 (default, Feb 12 2019, 02:10:43) \r\n",
|
||||
"[GCC 5.4.0 20160609]\r\n",
|
||||
"\r\n",
|
||||
"Legal docs and information: aka.ms/AzureCliLegal\r\n",
|
||||
"\r\n",
|
||||
"\r\n",
|
||||
"\u001b[33mYou have 57 updates available. Consider updating your CLI installation.\u001b[0m\r\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"!az --version\n",
|
||||
"if create_ACR_FLAG:\n",
|
||||
" !az login\n",
|
||||
" response01 = ! az account list --all --refresh -o table\n",
|
||||
" response02 = ! az account set --subscription $subscription_id\n",
|
||||
" response03 = ! az account list -o table\n",
|
||||
" response04 = ! $cli_command\n",
|
||||
"\n",
|
||||
" response01\n",
|
||||
" response02\n",
|
||||
" response03\n",
|
||||
" response04"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### ACR Step 2. Create the ACR"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 19,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'az acr create --resource-group ghiordanfwirsg01 --name fwi01acr --sku Basic'"
|
||||
]
|
||||
},
|
||||
"execution_count": 19,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"[' \"loginServer\": \"fwi01acr.azurecr.io\",',\n",
|
||||
" ' \"name\": \"fwi01acr\",',\n",
|
||||
" ' \"networkRuleSet\": null,',\n",
|
||||
" ' \"provisioningState\": \"Succeeded\",',\n",
|
||||
" ' \"resourceGroup\": \"ghiordanfwirsg01\",',\n",
|
||||
" ' \"sku\": {',\n",
|
||||
" ' \"name\": \"Basic\",',\n",
|
||||
" ' \"tier\": \"Basic\"',\n",
|
||||
" ' },',\n",
|
||||
" ' \"status\": null,',\n",
|
||||
" ' \"storageAccount\": null,',\n",
|
||||
" ' \"tags\": {},',\n",
|
||||
" ' \"type\": \"Microsoft.ContainerRegistry/registries\"',\n",
|
||||
" '}']"
|
||||
]
|
||||
},
|
||||
"execution_count": 19,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"%dotenv $dotenv_file_path\n",
|
||||
"acr_name = os.getenv('ACR_NAME')\n",
|
||||
"\n",
|
||||
"cli_command='az acr create --resource-group '+ crt_resource_group +' --name ' + acr_name + ' --sku Basic'\n",
|
||||
"cli_command\n",
|
||||
"\n",
|
||||
"response = !$cli_command\n",
|
||||
"response[-14:]"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### ACR Step 3. Also enable password and login via __[--admin-enabled true](https://docs.microsoft.com/en-us/azure/container-registry/container-registry-authentication)__ and then use the az cli or portal to set up the credentials"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 20,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"'az acr update -n fwi01acr --admin-enabled true'"
|
||||
]
|
||||
},
|
||||
"execution_count": 20,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# per https://docs.microsoft.com/en-us/azure/container-registry/container-registry-authentication\n",
|
||||
"cli_command='az acr update -n '+acr_name+' --admin-enabled true'\n",
|
||||
"cli_command\n",
|
||||
"\n",
|
||||
"response = !$cli_command\n",
|
||||
"# response"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### ACR Step 4. Save the ACR password and login"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 21,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# create_ACR_FLAG=False\n",
|
||||
"if create_ACR_FLAG:\n",
|
||||
" import subprocess\n",
|
||||
" cli_command = 'az acr credential show -n '+acr_name\n",
|
||||
"\n",
|
||||
"    acr_username = subprocess.Popen(cli_command+' --query username',shell=True,stdout=subprocess.PIPE, stderr=subprocess.PIPE).\\\n",
|
||||
"        communicate()[0].decode(\"utf-8\").split()[0].strip('\\\"')\n",
|
||||
"\n",
|
||||
"    acr_password = subprocess.Popen(cli_command+' --query passwords[0].value',shell=True,stdout=subprocess.PIPE, stderr=subprocess.PIPE).\\\n",
|
||||
"        communicate()[0].decode(\"utf-8\").split()[0].strip('\\\"')\n",
|
||||
"\n",
|
||||
"    response = dotenv.set_key(dotenv_file_path, 'ACR_PASSWORD', acr_password)\n",
|
||||
"    response = dotenv.set_key(dotenv_file_path, 'ACR_USERNAME', acr_username)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 22,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%reload_ext dotenv\n",
|
||||
"%dotenv -o $dotenv_file_path\n",
|
||||
"\n",
|
||||
"# print acr password and login info saved in dotenv file\n",
|
||||
"if create_ACR_FLAG:\n",
|
||||
" os.getenv('ACR_PASSWORD')\n",
|
||||
" os.getenv('ACR_USERNAME')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"print('Finished running 000_Setup_GeophysicsTutorial_FWI_Azure_devito!')"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python [conda env:fwi_dev_conda_environment] *",
|
||||
"language": "python",
|
||||
"name": "conda-env-fwi_dev_conda_environment-py"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.7.5"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 2
|
||||
}
|
File diffs hidden because one or more lines are too long
|
@ -0,0 +1,6 @@
|
|||
This folder contains a variety of scripts which might be useful.
|
||||
|
||||
# Ablation Study
|
||||
|
||||
Contained in `ablation.sh`, the script demonstrates running the HRNet model with various patch sizes.
|
||||
|
|
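The ablation script repeats the same prepare/train pair for each patch size. A dry-run sketch of that pattern (illustrative only, not part of the repo; paths and flags mirror `ablation.sh`, and `ablation_commands` is a hypothetical helper that just prints the commands instead of running them):

```shell
#!/bin/bash
# Dry-run sketch: print the command triples ablation.sh runs,
# one prepare call plus two train calls (patch vs. section depth) per patch size.
# Nothing here touches data; paths like /mnt/dutch are assumptions from ablation.sh.
ablation_commands () {
  for PATCH in 100 150 200 250; do
    echo "python scripts/prepare_dutchf3.py split_train_val patch --data-dir=/mnt/dutch --stride=50 --patch=${PATCH}"
    echo "python train.py OUTPUT_DIR /data/output/hrnet_patch TRAIN.DEPTH patch TRAIN.PATCH_SIZE ${PATCH} --cfg configs/hrnet.yaml"
    echo "python train.py OUTPUT_DIR /data/output/hrnet_section TRAIN.DEPTH section TRAIN.PATCH_SIZE ${PATCH} --cfg configs/hrnet.yaml"
  done
}
ablation_commands
```

Replacing the echoes with the real invocations would reproduce the explicit per-size blocks in `ablation.sh`.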
@ -0,0 +1,24 @@
#!/bin/bash

source activate seismic-interpretation

# Patch_Size 100: Patch vs Section Depth
python scripts/prepare_dutchf3.py split_train_val patch --data-dir=/mnt/dutch --stride=50 --patch=100
python train.py OUTPUT_DIR /data/output/hrnet_patch TRAIN.DEPTH patch TRAIN.PATCH_SIZE 100 --cfg 'configs/hrnet.yaml'
python train.py OUTPUT_DIR /data/output/hrnet_section TRAIN.DEPTH section TRAIN.PATCH_SIZE 100 --cfg 'configs/hrnet.yaml'

# Patch_Size 150: Patch vs Section Depth
python scripts/prepare_dutchf3.py split_train_val patch --data-dir=/mnt/dutch --stride=50 --patch=150
python train.py OUTPUT_DIR /data/output/hrnet_patch TRAIN.DEPTH patch TRAIN.PATCH_SIZE 150 --cfg 'configs/hrnet.yaml'
python train.py OUTPUT_DIR /data/output/hrnet_section TRAIN.DEPTH section TRAIN.PATCH_SIZE 150 --cfg 'configs/hrnet.yaml'

# Patch_Size 200: Patch vs Section Depth
python scripts/prepare_dutchf3.py split_train_val patch --data-dir=/mnt/dutch --stride=50 --patch=200
python train.py OUTPUT_DIR /data/output/hrnet_patch TRAIN.DEPTH patch TRAIN.PATCH_SIZE 200 --cfg 'configs/hrnet.yaml'
python train.py OUTPUT_DIR /data/output/hrnet_section TRAIN.DEPTH section TRAIN.PATCH_SIZE 200 --cfg 'configs/hrnet.yaml'

# Patch_Size 250: Patch vs Section Depth
python scripts/prepare_dutchf3.py split_train_val patch --data-dir=/mnt/dutch --stride=50 --patch=250
python train.py OUTPUT_DIR /data/output/hrnet_patch TRAIN.DEPTH patch TRAIN.PATCH_SIZE 250 TRAIN.AUGMENTATIONS.RESIZE.HEIGHT 250 TRAIN.AUGMENTATIONS.RESIZE.WIDTH 250 --cfg 'configs/hrnet.yaml'
python train.py OUTPUT_DIR /data/output/hrnet_section TRAIN.DEPTH section TRAIN.PATCH_SIZE 250 TRAIN.AUGMENTATIONS.RESIZE.HEIGHT 250 TRAIN.AUGMENTATIONS.RESIZE.WIDTH 250 --cfg 'configs/hrnet.yaml'
@ -0,0 +1,27 @@
#!/bin/bash
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.
#
# Example:
# download_hrnet.sh /data/models hrnet.pth
#

echo Using "$1" as the download directory

if [ ! -d "$1" ]
then
    echo "Directory does not exist - creating..."
    mkdir -p "$1"
fi

full_path=$1/$2

echo "Downloading to ${full_path}"

wget --header 'Host: optgaw.dm.files.1drv.com' \
    --user-agent 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0' \
    --header 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' \
    --header 'Accept-Language: en-GB,en;q=0.5' \
    --referer 'https://onedrive.live.com/' \
    --header 'Upgrade-Insecure-Requests: 1' 'https://optgaw.dm.files.1drv.com/y4m14W1OEuoniQMCT4m64UV8CSQT-dFe2ZRhU0LAZSal80V4phgVIlTYxI2tUi6BPVOy7l5rK8MKpZNywVvtz-NKL2ZWq-UYRL6MAjbLgdFA6zyW8RRrKBe_FcqcWr4YTXeJ18xfVqco6CdGZHFfORBE6EtFxEIrHWNjM032dWZLdqZ0eXd7RZTrHs1KKYa92zcs0Rj91CAyIK4hIaOomzEWA/hrnetv2_w48_imagenet_pretrained.pth?download&psid=1' \
    --output-document "${full_path}"
@ -0,0 +1,24 @@
#!/bin/bash
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.

echo "Make sure you also download Dutch F3 data from https://github.com/bolgebrygg/MalenoV"
# fetch Dutch F3 from the MalenoV project.
# wget https://drive.google.com/open?id=0B7brcf-eGK8CUUZKLXJURFNYeXM -O interpretation/voxel2pixel/F3/data.segy

if [ $# -eq 0 ]
then
    downdirtrain='experiments/interpretation/voxel2pixel/F3/train'
    downdirval='experiments/interpretation/voxel2pixel/F3/val'
else
    downdirtrain=$1
    downdirval=$1
fi

mkdir -p ${downdirtrain}
mkdir -p ${downdirval}

echo "Downloading train label to $downdirtrain and validation label to $downdirval"
wget https://github.com/waldeland/CNN-for-ASI/raw/master/F3/train/inline_339.png -O ${downdirtrain}/inline_339.png
wget https://github.com/waldeland/CNN-for-ASI/raw/master/F3/val/inline_405.png -O ${downdirval}/inline_405.png
echo "Download complete"
@ -0,0 +1 @@
[Mathew Salvaris] [@msalvaris](http://github.com/msalvaris/)
@ -0,0 +1,11 @@
# CVLib

A set of utility functions for computer vision

## Install

```bash
pip install -e .
```

This will install the `cv_lib` package.
@ -0,0 +1,4 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

__version__ = "0.0.1"
@ -0,0 +1,42 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

from ignite.handlers import ModelCheckpoint
import glob
import os
from shutil import copyfile


class SnapshotHandler:
    def __init__(self, dir_name, filename_prefix, score_function, snapshot_function):
        self._model_save_location = dir_name
        self._running_model_prefix = filename_prefix + "_running"
        self._snapshot_prefix = filename_prefix + "_snapshot"
        self._snapshot_function = snapshot_function
        self._snapshot_num = 1
        self._score_function = score_function
        self._checkpoint_handler = self._create_checkpoint_handler()

    def _create_checkpoint_handler(self):
        return ModelCheckpoint(
            self._model_save_location,
            self._running_model_prefix,
            score_function=self._score_function,
            n_saved=1,
            create_dir=True,
            save_as_state_dict=True,
            require_empty=False,
        )

    def __call__(self, engine, to_save):
        self._checkpoint_handler(engine, to_save)
        if self._snapshot_function():
            files = glob.glob(os.path.join(self._model_save_location, self._running_model_prefix + "*"))
            print(files)
            # Slice off the prefix by length; str.lstrip would strip a character set, not the literal prefix
            name_postfix = os.path.basename(files[0])[len(self._running_model_prefix):]
            copyfile(
                files[0],
                os.path.join(self._model_save_location, f"{self._snapshot_prefix}{self._snapshot_num}{name_postfix}"),
            )
            self._checkpoint_handler = self._create_checkpoint_handler()  # Reset the checkpoint handler
            self._snapshot_num += 1
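As an aside on the prefix-stripping step in `SnapshotHandler.__call__`: Python's `str.lstrip` takes a *set of characters*, not a literal prefix, so it can eat extra leading characters of the remaining name. A minimal standalone sketch of the safe length-based slice (the names here are illustrative, not project code):

```python
def strip_prefix(name: str, prefix: str) -> str:
    """Remove a literal prefix from name by slicing, not str.lstrip."""
    return name[len(prefix):] if name.startswith(prefix) else name

running = "model_running_checkpoint_12.pth"
# lstrip treats its argument as a character set and over-strips here:
assert running.lstrip("model_running") != "_checkpoint_12.pth"
# slicing by prefix length removes exactly the prefix:
assert strip_prefix(running, "model_running") == "_checkpoint_12.pth"
```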
@ -0,0 +1,90 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import logging
import logging.config
from toolz import curry

import numpy as np

np.set_printoptions(precision=3)


@curry
def log_training_output(engine, log_interval=100):
    logger = logging.getLogger(__name__)

    if engine.state.iteration % log_interval == 0:
        logger.info(f"Epoch: {engine.state.epoch} Iter: {engine.state.iteration} loss {engine.state.output['loss']}")


@curry
def log_lr(optimizer, engine):
    logger = logging.getLogger(__name__)
    lr = [param_group["lr"] for param_group in optimizer.param_groups]
    logger.info(f"lr - {lr}")


_DEFAULT_METRICS = {"pixacc": "Avg accuracy :", "nll": "Avg loss :"}


@curry
def log_metrics(log_msg, engine, metrics_dict=_DEFAULT_METRICS):
    logger = logging.getLogger(__name__)
    metrics = engine.state.metrics
    metrics_msg = " ".join([f"{metrics_dict[k]} {metrics[k]:.2f}" for k in metrics_dict])
    logger.info(f"{log_msg} - Epoch {engine.state.epoch} [{engine.state.max_epochs}] " + metrics_msg)


@curry
def log_class_metrics(log_msg, engine, metrics_dict):
    logger = logging.getLogger(__name__)
    metrics = engine.state.metrics
    metrics_msg = "\n".join(f"{metrics_dict[k]} {metrics[k].numpy()}" for k in metrics_dict)
    logger.info(f"{log_msg} - Epoch {engine.state.epoch} [{engine.state.max_epochs}]\n" + metrics_msg)


class Evaluator:
    def __init__(self, evaluation_engine, data_loader):
        self._evaluation_engine = evaluation_engine
        self._data_loader = data_loader

    def __call__(self, engine):
        self._evaluation_engine.run(self._data_loader)


class HorovodLRScheduler:
    """
    Horovod: using `lr = base_lr * hvd.size()` from the very beginning leads to worse final
    accuracy. Scale the learning rate `lr = base_lr` ---> `lr = base_lr * hvd.size()` during
    the first five epochs. See https://arxiv.org/abs/1706.02677 for details.
    After the warmup, reduce the learning rate by 10x at the 30th, 60th and 80th epochs.
    """

    def __init__(
        self, base_lr, warmup_epochs, cluster_size, data_loader, optimizer, batches_per_allreduce,
    ):
        self._warmup_epochs = warmup_epochs
        self._cluster_size = cluster_size
        self._data_loader = data_loader
        self._optimizer = optimizer
        self._base_lr = base_lr
        self._batches_per_allreduce = batches_per_allreduce
        self._logger = logging.getLogger(__name__)

    def __call__(self, engine):
        epoch = engine.state.epoch
        if epoch < self._warmup_epochs:
            epoch += float(engine.state.iteration + 1) / len(self._data_loader)
            lr_adj = 1.0 / self._cluster_size * (epoch * (self._cluster_size - 1) / self._warmup_epochs + 1)
        elif epoch < 30:
            lr_adj = 1.0
        elif epoch < 60:
            lr_adj = 1e-1
        elif epoch < 80:
            lr_adj = 1e-2
        else:
            lr_adj = 1e-3
        for param_group in self._optimizer.param_groups:
            param_group["lr"] = self._base_lr * self._cluster_size * self._batches_per_allreduce * lr_adj
            self._logger.debug(f"Adjust learning rate {param_group['lr']}")
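The warmup-then-step schedule computed in `HorovodLRScheduler.__call__` can be sketched in isolation. A minimal standalone version of the multiplier logic (the defaults `warmup_epochs=5` and `cluster_size=4` are illustrative values, not project defaults):

```python
def lr_adjustment(epoch, warmup_epochs=5, cluster_size=4):
    """Multiplier applied to base_lr * cluster_size * batches_per_allreduce."""
    if epoch < warmup_epochs:
        # Ramp linearly from 1/cluster_size up to 1.0 over the warmup epochs
        return 1.0 / cluster_size * (epoch * (cluster_size - 1) / warmup_epochs + 1)
    elif epoch < 30:
        return 1.0
    elif epoch < 60:
        return 1e-1
    elif epoch < 80:
        return 1e-2
    return 1e-3

assert lr_adjustment(0) == 0.25   # 0.25 * cluster_size(=4) recovers base_lr at the start
assert lr_adjustment(5) == 1.0    # warmup finished: full scaled lr
assert lr_adjustment(45) == 1e-1  # first 10x decay
assert lr_adjustment(100) == 1e-3
```

Multiplying this back by `cluster_size` shows the schedule starts at `base_lr` and reaches `base_lr * cluster_size` by the end of warmup, as the docstring describes.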
@ -0,0 +1,69 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

from toolz import curry
import torchvision
import logging
import logging.config

try:
    from tensorboardX import SummaryWriter
except ImportError:
    raise RuntimeError("No tensorboardX package is found. Please install with the command: \npip install tensorboardX")


def create_summary_writer(log_dir):
    writer = SummaryWriter(logdir=log_dir)
    return writer


def _log_model_output(log_label, summary_writer, engine):
    summary_writer.add_scalar(log_label, engine.state.output["loss"], engine.state.iteration)


@curry
def log_training_output(summary_writer, engine):
    _log_model_output("training/loss", summary_writer, engine)


@curry
def log_validation_output(summary_writer, engine):
    _log_model_output("validation/loss", summary_writer, engine)


@curry
def log_lr(summary_writer, optimizer, log_interval, engine):
    """Log the current learning rate to TensorBoard.

    Args:
        summary_writer: tensorboardX SummaryWriter
        optimizer: optimizer whose param_groups hold the learning rate
        log_interval (str): engine.state attribute to use as the x-axis, "iteration" or "epoch"
        engine: ignite engine whose state supplies the step value
    """
    lr = [param_group["lr"] for param_group in optimizer.param_groups]
    summary_writer.add_scalar("lr", lr[0], getattr(engine.state, log_interval))


_DEFAULT_METRICS = {"accuracy": "Avg accuracy :", "nll": "Avg loss :"}


@curry
def log_metrics(summary_writer, train_engine, log_interval, engine, metrics_dict=_DEFAULT_METRICS):
    metrics = engine.state.metrics
    for m in metrics_dict:
        summary_writer.add_scalar(metrics_dict[m], metrics[m], getattr(train_engine.state, log_interval))


def create_image_writer(summary_writer, label, output_variable, normalize=False, transform_func=lambda x: x):
    logger = logging.getLogger(__name__)

    def write_to(engine):
        try:
            data_tensor = transform_func(engine.state.output[output_variable])
            image_grid = torchvision.utils.make_grid(data_tensor, normalize=normalize, scale_each=True)
            summary_writer.add_image(label, image_grid, engine.state.epoch)
        except KeyError:
            logger.warning("Predictions and/or ground truth labels not available to report")

    return write_to
@ -0,0 +1,17 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

from toolz import curry
import torch.nn.functional as F


@curry
def extract_metric_from(metric, engine):
    metrics = engine.state.metrics
    return metrics[metric]


@curry
def padded_val_transform(pad_left, fine_size, x, y, y_pred):
    y_pred = y_pred[:, :, pad_left : pad_left + fine_size, pad_left : pad_left + fine_size].contiguous()
    return {"image": x, "y_pred": F.sigmoid(y_pred).detach(), "mask": y.detach()}
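`padded_val_transform` crops a `fine_size`-sized window starting at offset `pad_left` from the last two tensor dimensions. The same index arithmetic on plain nested lists, as a torch-free sketch:

```python
def unpad_2d(grid, pad_left, fine_size):
    """Crop a fine_size x fine_size window starting at (pad_left, pad_left)."""
    return [row[pad_left : pad_left + fine_size] for row in grid[pad_left : pad_left + fine_size]]

grid = [[r * 4 + c for c in range(4)] for r in range(4)]  # 4x4 grid, values 0..15
assert unpad_2d(grid, 1, 2) == [[5, 6], [9, 10]]
```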
@ -0,0 +1,221 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import math
import numbers
import random
import numpy as np

from PIL import Image, ImageOps


class Compose(object):
    def __init__(self, augmentations):
        self.augmentations = augmentations

    def __call__(self, img, mask):

        img, mask = Image.fromarray(img, mode=None), Image.fromarray(mask, mode="L")
        assert img.size == mask.size

        for a in self.augmentations:
            img, mask = a(img, mask)
        return np.array(img), np.array(mask, dtype=np.uint8)


class AddNoise(object):
    def __call__(self, img, mask):
        noise = np.random.normal(loc=0, scale=0.02, size=(img.size[1], img.size[0]))
        return img + noise, mask


class RandomCrop(object):
    def __init__(self, size, padding=0):
        if isinstance(size, numbers.Number):
            self.size = (int(size), int(size))
        else:
            self.size = size
        self.padding = padding

    def __call__(self, img, mask):
        if self.padding > 0:
            img = ImageOps.expand(img, border=self.padding, fill=0)
            mask = ImageOps.expand(mask, border=self.padding, fill=0)

        assert img.size == mask.size
        w, h = img.size
        th, tw = self.size
        if w == tw and h == th:
            return img, mask
        if w < tw or h < th:
            return (
                img.resize((tw, th), Image.BILINEAR),
                mask.resize((tw, th), Image.NEAREST),
            )

        x1 = random.randint(0, w - tw)
        y1 = random.randint(0, h - th)
        return (
            img.crop((x1, y1, x1 + tw, y1 + th)),
            mask.crop((x1, y1, x1 + tw, y1 + th)),
        )
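The coordinate arithmetic in `RandomCrop` reduces to picking a top-left corner uniformly within the slack between the image size and the crop size. A deterministic sketch of that box computation (function and parameter names are illustrative; `fx`, `fy` stand in for the random draws):

```python
def crop_box(w, h, tw, th, fx, fy):
    """Top-left-anchored box of a tw x th crop in a w x h image; fx, fy in [0, 1]."""
    x1 = int(fx * (w - tw))
    y1 = int(fy * (h - th))
    return (x1, y1, x1 + tw, y1 + th)

assert crop_box(100, 80, 60, 40, 0.5, 0.5) == (20, 20, 80, 60)
assert crop_box(100, 80, 100, 80, 0.9, 0.9) == (0, 0, 100, 80)  # no slack: full image
```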
class CenterCrop(object):
    def __init__(self, size):
        if isinstance(size, numbers.Number):
            self.size = (int(size), int(size))
        else:
            self.size = size

    def __call__(self, img, mask):
        assert img.size == mask.size
        w, h = img.size
        th, tw = self.size
        x1 = int(round((w - tw) / 2.0))
        y1 = int(round((h - th) / 2.0))
        return (
            img.crop((x1, y1, x1 + tw, y1 + th)),
            mask.crop((x1, y1, x1 + tw, y1 + th)),
        )


class RandomHorizontallyFlip(object):
    def __call__(self, img, mask):
        if random.random() < 0.5:
            # Note: we use FLIP_TOP_BOTTOM here intentionally. Due to the dimensions of the image,
            # it ends up being a horizontal flip.
            return (
                img.transpose(Image.FLIP_TOP_BOTTOM),
                mask.transpose(Image.FLIP_TOP_BOTTOM),
            )
        return img, mask


class RandomVerticallyFlip(object):
    def __call__(self, img, mask):
        if random.random() < 0.5:
            return (
                img.transpose(Image.FLIP_LEFT_RIGHT),
                mask.transpose(Image.FLIP_LEFT_RIGHT),
            )
        return img, mask


class FreeScale(object):
    def __init__(self, size):
        self.size = tuple(reversed(size))  # size: (h, w)

    def __call__(self, img, mask):
        assert img.size == mask.size
        return (
            img.resize(self.size, Image.BILINEAR),
            mask.resize(self.size, Image.NEAREST),
        )


class Scale(object):
    def __init__(self, size):
        self.size = size

    def __call__(self, img, mask):
        assert img.size == mask.size
        w, h = img.size
        if (w >= h and w == self.size) or (h >= w and h == self.size):
            return img, mask
        if w > h:
            ow = self.size
            oh = int(self.size * h / w)
            return (
                img.resize((ow, oh), Image.BILINEAR),
                mask.resize((ow, oh), Image.NEAREST),
            )
        else:
            oh = self.size
            ow = int(self.size * w / h)
            return (
                img.resize((ow, oh), Image.BILINEAR),
                mask.resize((ow, oh), Image.NEAREST),
            )
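`Scale` preserves aspect ratio by pinning the longer side to `size` and scaling the other side proportionally. The output-size computation alone, as a standalone sketch:

```python
def scaled_size(w, h, size):
    """Target (w, h) with the longer side equal to size and aspect ratio kept."""
    if (w >= h and w == size) or (h >= w and h == size):
        return w, h  # already at target size
    if w > h:
        return size, int(size * h / w)
    return int(size * w / h), size

assert scaled_size(200, 100, 50) == (50, 25)
assert scaled_size(100, 200, 50) == (25, 50)
assert scaled_size(50, 30, 50) == (50, 30)  # no-op when longer side already matches
```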
class RandomSizedCrop(object):
    def __init__(self, size):
        self.size = size

    def __call__(self, img, mask):
        assert img.size == mask.size
        for attempt in range(10):
            area = img.size[0] * img.size[1]
            target_area = random.uniform(0.45, 1.0) * area
            aspect_ratio = random.uniform(0.5, 2)

            w = int(round(math.sqrt(target_area * aspect_ratio)))
            h = int(round(math.sqrt(target_area / aspect_ratio)))

            if random.random() < 0.5:
                w, h = h, w

            if w <= img.size[0] and h <= img.size[1]:
                x1 = random.randint(0, img.size[0] - w)
                y1 = random.randint(0, img.size[1] - h)

                img = img.crop((x1, y1, x1 + w, y1 + h))
                mask = mask.crop((x1, y1, x1 + w, y1 + h))
                assert img.size == (w, h)

                return (
                    img.resize((self.size, self.size), Image.BILINEAR),
                    mask.resize((self.size, self.size), Image.NEAREST),
                )

        # Fallback
        scale = Scale(self.size)
        crop = CenterCrop(self.size)
        return crop(*scale(img, mask))


class RandomRotate(object):
    def __init__(self, degree):
        self.degree = degree

    def __call__(self, img, mask):
        """
        PIL automatically pads the borders of rotated images with zeros. To fix this
        issue, the code below sets any location in the labels (mask) that is zero to
        255 (the value used for ignore_index).
        """
        rotate_degree = random.random() * 2 * self.degree - self.degree

        img = img.rotate(rotate_degree, Image.BILINEAR)
        mask = mask.rotate(rotate_degree, Image.NEAREST)

        binary_mask = Image.fromarray(np.ones([mask.size[1], mask.size[0]]))
        binary_mask = binary_mask.rotate(rotate_degree, Image.NEAREST)
        binary_mask = np.array(binary_mask)

        mask_arr = np.array(mask)
        mask_arr[binary_mask == 0] = 255
        mask = Image.fromarray(mask_arr)

        return img, mask


class RandomSized(object):
    def __init__(self, size):
        self.size = size
        self.scale = Scale(self.size)
        self.crop = RandomCrop(self.size)

    def __call__(self, img, mask):
        assert img.size == mask.size

        w = int(random.uniform(0.5, 2) * img.size[0])
        h = int(random.uniform(0.5, 2) * img.size[1])

        img, mask = (
            img.resize((w, h), Image.BILINEAR),
            mask.resize((w, h), Image.NEAREST),
        )

        return self.crop(*self.scale(img, mask))
@ -0,0 +1,130 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import torch

from ignite.engine.engine import Engine, State, Events
from ignite.utils import convert_tensor
import torch.nn.functional as F
from toolz import curry
import numpy as np


def _upscale_model_output(y_pred, y):
    ph, pw = y_pred.size(2), y_pred.size(3)
    h, w = y.size(2), y.size(3)
    if ph != h or pw != w:
        y_pred = F.upsample(input=y_pred, size=(h, w), mode="bilinear")
    return y_pred


def create_supervised_trainer(
    model,
    optimizer,
    loss_fn,
    prepare_batch,
    device=None,
    non_blocking=False,
    output_transform=lambda x, y, y_pred, loss: {"loss": loss.item()},
):
    if device:
        model.to(device)

    def _update(engine, batch):
        model.train()
        optimizer.zero_grad()
        x, y = prepare_batch(batch, device=device, non_blocking=non_blocking)
        y_pred = model(x)
        y_pred = _upscale_model_output(y_pred, y)
        loss = loss_fn(y_pred.squeeze(1), y.squeeze(1))
        loss.backward()
        optimizer.step()
        return output_transform(x, y, y_pred, loss)

    return Engine(_update)


@curry
def val_transform(x, y, y_pred):
    return {"image": x, "y_pred": y_pred.detach(), "mask": y.detach()}


def create_supervised_evaluator(
    model, prepare_batch, metrics=None, device=None, non_blocking=False, output_transform=val_transform,
):
    metrics = metrics or {}

    if device:
        model.to(device)

    def _inference(engine, batch):
        model.eval()
        with torch.no_grad():
            x, y = prepare_batch(batch, device=device, non_blocking=non_blocking)
            y_pred = model(x)
            y_pred = _upscale_model_output(y_pred, x)
            return output_transform(x, y, y_pred)

    engine = Engine(_inference)

    for name, metric in metrics.items():
        metric.attach(engine, name)

    return engine


def create_supervised_trainer_apex(
    model,
    optimizer,
    loss_fn,
    prepare_batch,
    device=None,
    non_blocking=False,
    output_transform=lambda x, y, y_pred, loss: {"loss": loss.item()},
):
    from apex import amp

    if device:
        model.to(device)

    def _update(engine, batch):
        model.train()
        optimizer.zero_grad()
        x, y = prepare_batch(batch, device=device, non_blocking=non_blocking)
        y_pred = model(x)
        loss = loss_fn(y_pred.squeeze(1), y.squeeze(1))
        with amp.scale_loss(loss, optimizer) as scaled_loss:
            scaled_loss.backward()
        optimizer.step()
        return output_transform(x, y, y_pred, loss)

    return Engine(_update)


# def create_supervised_evaluator_apex(
#     model,
#     prepare_batch,
#     metrics=None,
#     device=None,
#     non_blocking=False,
#     output_transform=lambda x, y, y_pred: (x, y, y_pred),
# ):
#     metrics = metrics or {}

#     if device:
#         model.to(device)

#     def _inference(engine, batch):
#         model.eval()
#         with torch.no_grad():
#             x, y = prepare_batch(batch, device=device, non_blocking=non_blocking)
#             y_pred = model(x)
#             return output_transform(x, y, y_pred)

#     engine = Engine(_inference)

#     for name, metric in metrics.items():
#         metric.attach(engine, name)

#     return engine
@ -0,0 +1,46 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import numpy as np
import torch
from git import Repo
from datetime import datetime
import os


def np_to_tb(array):
    # if 2D :
    if array.ndim == 2:
        # HW => CHW
        array = np.expand_dims(array, axis=0)
        # CHW => NCHW
        array = np.expand_dims(array, axis=0)
    elif array.ndim == 3:
        # HWC => CHW
        array = array.transpose(2, 0, 1)
        # CHW => NCHW
        array = np.expand_dims(array, axis=0)

    array = torch.from_numpy(array)
    return array


def current_datetime():
    return datetime.now().strftime("%b%d_%H%M%S")


def git_branch():
    repo = Repo(search_parent_directories=True)
    return repo.active_branch.name


def git_hash():
    repo = Repo(search_parent_directories=True)
    return repo.active_branch.commit.hexsha


def generate_path(base_path, *directories):
    path = os.path.join(base_path, *directories)
    if not os.path.exists(path):
        os.makedirs(path)
    return path
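`np_to_tb` only reshapes: an HW array gains two leading axes to become NCHW, and an HWC array first moves channels to the front, then gains a batch axis. The shape bookkeeping in isolation, as a sketch on plain tuples (no numpy or torch assumed):

```python
def to_nchw_shape(shape):
    """Shape that np_to_tb produces for a 2D (HW) or 3D (HWC) input array."""
    if len(shape) == 2:
        h, w = shape
        return (1, 1, h, w)  # HW -> NCHW
    if len(shape) == 3:
        h, w, c = shape
        return (1, c, h, w)  # HWC -> CHW -> NCHW
    return shape

assert to_nchw_shape((64, 128)) == (1, 1, 64, 128)
assert to_nchw_shape((64, 128, 3)) == (1, 3, 64, 128)
```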
@ -0,0 +1,94 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import torch
import ignite


def pixelwise_accuracy(num_classes, output_transform=lambda x: x, device=None):
    """Calculates pixelwise accuracy

    Args:
        num_classes (int): number of classes
        output_transform (callable, optional): a callable that is used to transform the
            output into the form expected by the metric.

    Returns:
        MetricsLambda

    """
    cm = ignite.metrics.ConfusionMatrix(num_classes=num_classes, output_transform=output_transform, device=device)
    # Increase floating point precision and pass to CPU
    cm = cm.type(torch.DoubleTensor)

    pix_cls = ignite.metrics.confusion_matrix.cmAccuracy(cm)

    return pix_cls


def class_accuracy(num_classes, output_transform=lambda x: x, device=None):
    """Calculates class accuracy

    Args:
        num_classes (int): number of classes
        output_transform (callable, optional): a callable that is used to transform the
            output into the form expected by the metric.

    Returns:
        MetricsLambda

    """
    cm = ignite.metrics.ConfusionMatrix(num_classes=num_classes, output_transform=output_transform, device=device)
    # Increase floating point precision and pass to CPU
    cm = cm.type(torch.DoubleTensor)

    acc_cls = cm.diag() / (cm.sum(dim=1) + 1e-15)

    return acc_cls


def mean_class_accuracy(num_classes, output_transform=lambda x: x, device=None):
    """Calculates mean class accuracy

    Args:
        num_classes (int): number of classes
        output_transform (callable, optional): a callable that is used to transform the
            output into the form expected by the metric.

    Returns:
        MetricsLambda

    """
    return class_accuracy(num_classes=num_classes, output_transform=output_transform, device=device).mean()


def class_iou(num_classes, output_transform=lambda x: x, device=None, ignore_index=None):
    """Calculates per-class intersection-over-union

    Args:
        num_classes (int): number of classes
        output_transform (callable, optional): a callable that is used to transform the
            output into the form expected by the metric.

    Returns:
        MetricsLambda

    """
    cm = ignite.metrics.ConfusionMatrix(num_classes=num_classes, output_transform=output_transform, device=device)
    return ignite.metrics.IoU(cm, ignore_index=ignore_index)


def mean_iou(num_classes, output_transform=lambda x: x, device=None, ignore_index=None):
    """Calculates mean intersection-over-union

    Args:
        num_classes (int): number of classes
        output_transform (callable, optional): a callable that is used to transform the
            output into the form expected by the metric.

    Returns:
        MetricsLambda

    """
    cm = ignite.metrics.ConfusionMatrix(num_classes=num_classes, output_transform=output_transform, device=device)
    return ignite.metrics.mIoU(cm, ignore_index=ignore_index)
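The per-class accuracy computed in `class_accuracy` is simply each diagonal entry of the confusion matrix divided by its row sum, with a small epsilon guarding against division by zero. A sketch of the same arithmetic with plain nested lists:

```python
def per_class_accuracy(cm, eps=1e-15):
    """cm[i][j] = count of samples with true class i predicted as class j."""
    return [cm[i][i] / (sum(cm[i]) + eps) for i in range(len(cm))]

cm = [[8, 2], [1, 9]]  # class 0: 8/10 correct, class 1: 9/10 correct
acc = per_class_accuracy(cm)
assert abs(acc[0] - 0.8) < 1e-9
assert abs(acc[1] - 0.9) < 1e-9
```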
@ -0,0 +1,10 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import cv_lib.segmentation.models.seg_hrnet  # noqa: F401
import cv_lib.segmentation.models.resnet_unet  # noqa: F401
import cv_lib.segmentation.models.unet  # noqa: F401
import cv_lib.segmentation.models.section_deconvnet  # noqa: F401
import cv_lib.segmentation.models.patch_deconvnet  # noqa: F401
import cv_lib.segmentation.models.patch_deconvnet_skip  # noqa: F401
import cv_lib.segmentation.models.section_deconvnet_skip  # noqa: F401
@ -0,0 +1,308 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import torch.nn as nn


class patch_deconvnet(nn.Module):
    def __init__(self, n_classes=4, learned_billinear=False):
        super(patch_deconvnet, self).__init__()
        self.learned_billinear = learned_billinear
        self.n_classes = n_classes
        self.unpool = nn.MaxUnpool2d(2, stride=2)
        self.conv_block1 = nn.Sequential(
            # conv1_1
            nn.Conv2d(1, 64, 3, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv1_2
            nn.Conv2d(64, 64, 3, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool1
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_1

        # 48*48

        self.conv_block2 = nn.Sequential(
            # conv2_1
            nn.Conv2d(64, 128, 3, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv2_2
            nn.Conv2d(128, 128, 3, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool2
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_2

        # 24*24

        self.conv_block3 = nn.Sequential(
            # conv3_1
            nn.Conv2d(128, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv3_2
            nn.Conv2d(256, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv3_3
            nn.Conv2d(256, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool3
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_3

        # 12*12

        self.conv_block4 = nn.Sequential(
            # conv4_1
            nn.Conv2d(256, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv4_2
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv4_3
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool4
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_4

        # 6*6

        self.conv_block5 = nn.Sequential(
            # conv5_1
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv5_2
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv5_3
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool5
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_5

        # 3*3

        self.conv_block6 = nn.Sequential(
            # fc6
            nn.Conv2d(512, 4096, 3),
            # set the filter size and no padding to make the output 1*1
            nn.BatchNorm2d(4096, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        # 1*1

        self.conv_block7 = nn.Sequential(
            # fc7
            nn.Conv2d(4096, 4096, 1),
            # set the filter size to keep the output 1*1
            nn.BatchNorm2d(4096, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.deconv_block8 = nn.Sequential(
            # fc6-deconv
            nn.ConvTranspose2d(4096, 512, 3, stride=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        # 3*3

        self.unpool_block9 = nn.Sequential(
            # unpool5
            nn.MaxUnpool2d(2, stride=2),
        )
        # usage: unpool(output, indices)

        # 6*6

        self.deconv_block10 = nn.Sequential(
            # deconv5_1
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv5_2
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv5_3
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.unpool_block11 = nn.Sequential(
            # unpool4
            nn.MaxUnpool2d(2, stride=2),
        )

        # 12*12

        self.deconv_block12 = nn.Sequential(
            # deconv4_1
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv4_2
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv4_3
            nn.ConvTranspose2d(512, 256, 3, stride=1, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.unpool_block13 = nn.Sequential(
            # unpool3
            nn.MaxUnpool2d(2, stride=2),
        )

        # 24*24

        self.deconv_block14 = nn.Sequential(
            # deconv3_1
            nn.ConvTranspose2d(256, 256, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv3_2
|
||||
nn.ConvTranspose2d(256, 256, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv3_3
|
||||
nn.ConvTranspose2d(256, 128, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
)
|
||||
|
||||
self.unpool_block15 = nn.Sequential(
|
||||
# unpool2
|
||||
nn.MaxUnpool2d(2, stride=2),
|
||||
)
|
||||
|
||||
# 48*48
|
||||
|
||||
self.deconv_block16 = nn.Sequential(
|
||||
# deconv2_1
|
||||
nn.ConvTranspose2d(128, 128, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv2_2
|
||||
nn.ConvTranspose2d(128, 64, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
)
|
||||
|
||||
self.unpool_block17 = nn.Sequential(
|
||||
# unpool1
|
||||
nn.MaxUnpool2d(2, stride=2),
|
||||
)
|
||||
|
||||
# 96*96
|
||||
|
||||
self.deconv_block18 = nn.Sequential(
|
||||
# deconv1_1
|
||||
nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv1_2
|
||||
nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
)
|
||||
|
||||
self.seg_score19 = nn.Sequential(
|
||||
# seg-score
|
||||
nn.Conv2d(64, self.n_classes, 1),
|
||||
)
|
||||
|
||||
if self.learned_billinear:
|
||||
raise NotImplementedError
|
||||
|
||||
def forward(self, x):
|
||||
size0 = x.size()
|
||||
conv1, indices1 = self.conv_block1(x)
|
||||
size1 = conv1.size()
|
||||
conv2, indices2 = self.conv_block2(conv1)
|
||||
size2 = conv2.size()
|
||||
conv3, indices3 = self.conv_block3(conv2)
|
||||
size3 = conv3.size()
|
||||
conv4, indices4 = self.conv_block4(conv3)
|
||||
size4 = conv4.size()
|
||||
conv5, indices5 = self.conv_block5(conv4)
|
||||
|
||||
conv6 = self.conv_block6(conv5)
|
||||
conv7 = self.conv_block7(conv6)
|
||||
conv8 = self.deconv_block8(conv7)
|
||||
conv9 = self.unpool(conv8, indices5, output_size=size4)
|
||||
conv10 = self.deconv_block10(conv9)
|
||||
conv11 = self.unpool(conv10, indices4, output_size=size3)
|
||||
conv12 = self.deconv_block12(conv11)
|
||||
conv13 = self.unpool(conv12, indices3, output_size=size2)
|
||||
conv14 = self.deconv_block14(conv13)
|
||||
conv15 = self.unpool(conv14, indices2, output_size=size1)
|
||||
conv16 = self.deconv_block16(conv15)
|
||||
conv17 = self.unpool(conv16, indices1, output_size=size0)
|
||||
conv18 = self.deconv_block18(conv17)
|
||||
out = self.seg_score19(conv18)
|
||||
|
||||
return out
|
||||
|
||||
def init_vgg16_params(self, vgg16, copy_fc8=True):
|
||||
blocks = [
|
||||
self.conv_block1,
|
||||
self.conv_block2,
|
||||
self.conv_block3,
|
||||
self.conv_block4,
|
||||
self.conv_block5,
|
||||
]
|
||||
|
||||
ranges = [[0, 4], [5, 9], [10, 16], [17, 23], [24, 29]]
|
||||
features = list(vgg16.features.children())
|
||||
i_layer = 0
|
||||
# copy convolutional filters from vgg16
|
||||
for idx, conv_block in enumerate(blocks):
|
||||
for l1, l2 in zip(features[ranges[idx][0] : ranges[idx][1]], conv_block):
|
||||
if isinstance(l1, nn.Conv2d) and isinstance(l2, nn.Conv2d):
|
||||
if i_layer == 0:
|
||||
l2.weight.data = (
|
||||
(l1.weight.data[:, 0, :, :] + l1.weight.data[:, 1, :, :] + l1.weight.data[:, 2, :, :]) / 3.0
|
||||
).view(l2.weight.size())
|
||||
l2.bias.data = l1.bias.data
|
||||
i_layer = i_layer + 1
|
||||
else:
|
||||
assert l1.weight.size() == l2.weight.size()
|
||||
assert l1.bias.size() == l2.bias.size()
|
||||
l2.weight.data = l1.weight.data
|
||||
l2.bias.data = l1.bias.data
|
||||
i_layer = i_layer + 1
|
||||
|
||||
|
||||
def get_seg_model(cfg, **kwargs):
|
||||
assert (
|
||||
cfg.MODEL.IN_CHANNELS == 1
|
||||
), f"Patch deconvnet is not implemented to accept {cfg.MODEL.IN_CHANNELS} channels. Please only pass 1 for cfg.MODEL.IN_CHANNELS"
|
||||
model = patch_deconvnet(n_classes=cfg.DATASET.NUM_CLASSES)
|
||||
|
||||
return model
|
|
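The encoder/decoder pairing above hinges on `MaxPool2d(..., return_indices=True)` handing back the argmax locations that the shared `self.unpool` (`MaxUnpool2d`) later consumes. A minimal sketch (not part of the repo) of that round trip:

```python
import torch
import torch.nn as nn

# The conv blocks above end in MaxPool2d(..., return_indices=True), so each
# block returns a (output, indices) pair; the decoder's shared MaxUnpool2d
# uses those indices to place values back at their pre-pool positions.
pool = nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True)
unpool = nn.MaxUnpool2d(2, stride=2)

x = torch.arange(16.0).reshape(1, 1, 4, 4)
pooled, indices = pool(x)  # pooled: (1, 1, 2, 2), maxima of each 2x2 window
restored = unpool(pooled, indices, output_size=x.size())

# Non-maximal positions come back as zeros; maxima return to their slots.
assert restored.shape == x.shape
assert restored[0, 0, 1, 1].item() == 5.0  # max of the top-left window
assert restored[0, 0, 0, 0].item() == 0.0  # non-max position is zeroed
```

Passing `output_size=` (as `forward` does with the cached `sizeN` values) matters because `ceil_mode=True` pooling over odd spatial sizes would otherwise make the unpooled shape ambiguous.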
@@ -0,0 +1,307 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import torch.nn as nn


class patch_deconvnet_skip(nn.Module):
    def __init__(self, n_classes=4, learned_billinear=False):
        super(patch_deconvnet_skip, self).__init__()
        self.learned_billinear = learned_billinear
        self.n_classes = n_classes
        self.unpool = nn.MaxUnpool2d(2, stride=2)
        self.conv_block1 = nn.Sequential(
            # conv1_1
            nn.Conv2d(1, 64, 3, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv1_2
            nn.Conv2d(64, 64, 3, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool1
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_1

        # 48*48

        self.conv_block2 = nn.Sequential(
            # conv2_1
            nn.Conv2d(64, 128, 3, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv2_2
            nn.Conv2d(128, 128, 3, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool2
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_2

        # 24*24

        self.conv_block3 = nn.Sequential(
            # conv3_1
            nn.Conv2d(128, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv3_2
            nn.Conv2d(256, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv3_3
            nn.Conv2d(256, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool3
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_3

        # 12*12

        self.conv_block4 = nn.Sequential(
            # conv4_1
            nn.Conv2d(256, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv4_2
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv4_3
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool4
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_4

        # 6*6

        self.conv_block5 = nn.Sequential(
            # conv5_1
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv5_2
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv5_3
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool5
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_5

        # 3*3

        self.conv_block6 = nn.Sequential(
            # fc6
            nn.Conv2d(512, 4096, 3),
            # set the filter size and no padding to make output into 1*1
            nn.BatchNorm2d(4096, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        # 1*1

        self.conv_block7 = nn.Sequential(
            # fc7
            nn.Conv2d(4096, 4096, 1),
            # set the filter size to make output into 1*1
            nn.BatchNorm2d(4096, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.deconv_block8 = nn.Sequential(
            # fc6-deconv
            nn.ConvTranspose2d(4096, 512, 3, stride=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        # 3*3

        self.unpool_block9 = nn.Sequential(
            # unpool5
            nn.MaxUnpool2d(2, stride=2),
        )
        # usage: unpool(output, indices)

        # 6*6

        self.deconv_block10 = nn.Sequential(
            # deconv5_1
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv5_2
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv5_3
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.unpool_block11 = nn.Sequential(
            # unpool4
            nn.MaxUnpool2d(2, stride=2),
        )

        # 12*12

        self.deconv_block12 = nn.Sequential(
            # deconv4_1
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv4_2
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv4_3
            nn.ConvTranspose2d(512, 256, 3, stride=1, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.unpool_block13 = nn.Sequential(
            # unpool3
            nn.MaxUnpool2d(2, stride=2),
        )

        # 24*24

        self.deconv_block14 = nn.Sequential(
            # deconv3_1
            nn.ConvTranspose2d(256, 256, 3, stride=1, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv3_2
            nn.ConvTranspose2d(256, 256, 3, stride=1, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv3_3
            nn.ConvTranspose2d(256, 128, 3, stride=1, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.unpool_block15 = nn.Sequential(
            # unpool2
            nn.MaxUnpool2d(2, stride=2),
        )

        # 48*48

        self.deconv_block16 = nn.Sequential(
            # deconv2_1
            nn.ConvTranspose2d(128, 128, 3, stride=1, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv2_2
            nn.ConvTranspose2d(128, 64, 3, stride=1, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.unpool_block17 = nn.Sequential(
            # unpool1
            nn.MaxUnpool2d(2, stride=2),
        )

        # 96*96

        self.deconv_block18 = nn.Sequential(
            # deconv1_1
            nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv1_2
            nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.seg_score19 = nn.Sequential(
            # seg-score
            nn.Conv2d(64, self.n_classes, 1),
        )

        if self.learned_billinear:
            raise NotImplementedError

    def forward(self, x):
        size0 = x.size()
        conv1, indices1 = self.conv_block1(x)
        size1 = conv1.size()
        conv2, indices2 = self.conv_block2(conv1)
        size2 = conv2.size()
        conv3, indices3 = self.conv_block3(conv2)
        size3 = conv3.size()
        conv4, indices4 = self.conv_block4(conv3)
        size4 = conv4.size()
        conv5, indices5 = self.conv_block5(conv4)

        conv6 = self.conv_block6(conv5)
        conv7 = self.conv_block7(conv6)
        conv8 = self.deconv_block8(conv7) + conv5
        conv9 = self.unpool(conv8, indices5, output_size=size4)
        conv10 = self.deconv_block10(conv9) + conv4
        conv11 = self.unpool(conv10, indices4, output_size=size3)
        conv12 = self.deconv_block12(conv11) + conv3
        conv13 = self.unpool(conv12, indices3, output_size=size2)
        conv14 = self.deconv_block14(conv13) + conv2
        conv15 = self.unpool(conv14, indices2, output_size=size1)
        conv16 = self.deconv_block16(conv15) + conv1
        conv17 = self.unpool(conv16, indices1, output_size=size0)
        conv18 = self.deconv_block18(conv17)
        out = self.seg_score19(conv18)

        return out

    def init_vgg16_params(self, vgg16, copy_fc8=True):
        blocks = [
            self.conv_block1,
            self.conv_block2,
            self.conv_block3,
            self.conv_block4,
            self.conv_block5,
        ]

        ranges = [[0, 4], [5, 9], [10, 16], [17, 23], [24, 29]]
        features = list(vgg16.features.children())
        i_layer = 0
        # copy convolutional filters from vgg16
        for idx, conv_block in enumerate(blocks):
            for l1, l2 in zip(features[ranges[idx][0] : ranges[idx][1]], conv_block):
                if isinstance(l1, nn.Conv2d) and isinstance(l2, nn.Conv2d):
                    if i_layer == 0:
                        l2.weight.data = (
                            (l1.weight.data[:, 0, :, :] + l1.weight.data[:, 1, :, :] + l1.weight.data[:, 2, :, :]) / 3.0
                        ).view(l2.weight.size())
                        l2.bias.data = l1.bias.data
                        i_layer = i_layer + 1
                    else:
                        assert l1.weight.size() == l2.weight.size()
                        assert l1.bias.size() == l2.bias.size()
                        l2.weight.data = l1.weight.data
                        l2.bias.data = l1.bias.data
                        i_layer = i_layer + 1


def get_seg_model(cfg, **kwargs):
    assert (
        cfg.MODEL.IN_CHANNELS == 1
    ), f"Patch deconvnet is not implemented to accept {cfg.MODEL.IN_CHANNELS} channels. Please only pass 1 for cfg.MODEL.IN_CHANNELS"
    model = patch_deconvnet_skip(n_classes=cfg.DATASET.NUM_CLASSES)
    return model
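Because these deconvnets take single-channel seismic input while VGG16's first conv expects RGB, `init_vgg16_params` averages the three input-filter planes of the first layer into one before copying. A minimal sketch of that trick, using stand-in layers (`rgb_conv`/`gray_conv` are hypothetical, not names from the repo):

```python
import torch
import torch.nn as nn

# rgb_conv stands in for vgg16.features[0] (3-channel input);
# gray_conv stands in for the model's first conv (1-channel input).
rgb_conv = nn.Conv2d(3, 64, 3, padding=1)
gray_conv = nn.Conv2d(1, 64, 3, padding=1)

# Average the R, G, B filter planes into a single plane, as in
# init_vgg16_params' i_layer == 0 branch.
w = rgb_conv.weight.data
gray_conv.weight.data = (
    (w[:, 0, :, :] + w[:, 1, :, :] + w[:, 2, :, :]) / 3.0
).view(gray_conv.weight.size())
gray_conv.bias.data = rgb_conv.bias.data

assert gray_conv.weight.shape == (64, 1, 3, 3)
assert torch.allclose(gray_conv.weight.data[:, 0], w.mean(dim=1))
```

Every later conv layer has matching shapes on both sides, so the loop copies those weights verbatim after asserting size equality.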
@@ -0,0 +1,365 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision


class FPAv2(nn.Module):
    def __init__(self, input_dim, output_dim):
        super(FPAv2, self).__init__()
        self.glob = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Conv2d(input_dim, output_dim, kernel_size=1, bias=False),)

        self.down2_1 = nn.Sequential(
            nn.Conv2d(input_dim, input_dim, kernel_size=5, stride=2, padding=2, bias=False),
            nn.BatchNorm2d(input_dim),
            nn.ELU(True),
        )
        self.down2_2 = nn.Sequential(
            nn.Conv2d(input_dim, output_dim, kernel_size=5, padding=2, bias=False),
            nn.BatchNorm2d(output_dim),
            nn.ELU(True),
        )

        self.down3_1 = nn.Sequential(
            nn.Conv2d(input_dim, input_dim, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(input_dim),
            nn.ELU(True),
        )
        self.down3_2 = nn.Sequential(
            nn.Conv2d(input_dim, output_dim, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(output_dim),
            nn.ELU(True),
        )

        self.conv1 = nn.Sequential(
            nn.Conv2d(input_dim, output_dim, kernel_size=1, bias=False), nn.BatchNorm2d(output_dim), nn.ELU(True),
        )

    def forward(self, x):
        # x shape: 512, 16, 16
        x_glob = self.glob(x)  # 256, 1, 1
        x_glob = F.upsample(x_glob, scale_factor=16, mode="bilinear", align_corners=True)  # 256, 16, 16

        d2 = self.down2_1(x)  # 512, 8, 8
        d3 = self.down3_1(d2)  # 512, 4, 4

        d2 = self.down2_2(d2)  # 256, 8, 8
        d3 = self.down3_2(d3)  # 256, 4, 4

        d3 = F.upsample(d3, scale_factor=2, mode="bilinear", align_corners=True)  # 256, 8, 8
        d2 = d2 + d3

        d2 = F.upsample(d2, scale_factor=2, mode="bilinear", align_corners=True)  # 256, 16, 16
        x = self.conv1(x)  # 256, 16, 16
        x = x * d2

        x = x + x_glob

        return x


def conv3x3(input_dim, output_dim, rate=1):
    return nn.Sequential(
        nn.Conv2d(input_dim, output_dim, kernel_size=3, dilation=rate, padding=rate, bias=False,),
        nn.BatchNorm2d(output_dim),
        nn.ELU(True),
    )


class SpatialAttention2d(nn.Module):
    def __init__(self, channel):
        super(SpatialAttention2d, self).__init__()
        self.squeeze = nn.Conv2d(channel, 1, kernel_size=1, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        z = self.squeeze(x)
        z = self.sigmoid(z)
        return x * z


class GAB(nn.Module):
    def __init__(self, input_dim, reduction=4):
        super(GAB, self).__init__()
        self.global_avgpool = nn.AdaptiveAvgPool2d(1)
        self.conv1 = nn.Conv2d(input_dim, input_dim // reduction, kernel_size=1, stride=1)
        self.conv2 = nn.Conv2d(input_dim // reduction, input_dim, kernel_size=1, stride=1)
        self.relu = nn.ReLU(inplace=True)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        z = self.global_avgpool(x)
        z = self.relu(self.conv1(z))
        z = self.sigmoid(self.conv2(z))
        return x * z


class Decoder(nn.Module):
    def __init__(self, in_channels, channels, out_channels):
        super(Decoder, self).__init__()
        self.conv1 = conv3x3(in_channels, channels)
        self.conv2 = conv3x3(channels, out_channels)
        self.s_att = SpatialAttention2d(out_channels)
        self.c_att = GAB(out_channels, 16)

    def forward(self, x, e=None):
        x = F.upsample(input=x, scale_factor=2, mode="bilinear", align_corners=True)
        if e is not None:
            x = torch.cat([x, e], 1)
        x = self.conv1(x)
        x = self.conv2(x)
        s = self.s_att(x)
        c = self.c_att(x)
        output = s + c
        return output


class Decoderv2(nn.Module):
    def __init__(self, up_in, x_in, n_out):
        super(Decoderv2, self).__init__()
        up_out = x_out = n_out // 2
        self.x_conv = nn.Conv2d(x_in, x_out, 1, bias=False)
        self.tr_conv = nn.ConvTranspose2d(up_in, up_out, 2, stride=2)
        self.bn = nn.BatchNorm2d(n_out)
        self.relu = nn.ReLU(True)
        self.s_att = SpatialAttention2d(n_out)
        self.c_att = GAB(n_out, 16)

    def forward(self, up_p, x_p):
        up_p = self.tr_conv(up_p)
        x_p = self.x_conv(x_p)

        cat_p = torch.cat([up_p, x_p], 1)
        cat_p = self.relu(self.bn(cat_p))
        s = self.s_att(cat_p)
        c = self.c_att(cat_p)
        return s + c


class SCse(nn.Module):
    def __init__(self, dim):
        super(SCse, self).__init__()
        self.satt = SpatialAttention2d(dim)
        self.catt = GAB(dim)

    def forward(self, x):
        return self.satt(x) + self.catt(x)


# stage1 model
class Res34Unetv4(nn.Module):
    def __init__(self, n_classes=1):
        super(Res34Unetv4, self).__init__()
        self.resnet = torchvision.models.resnet34(True)

        self.conv1 = nn.Sequential(self.resnet.conv1, self.resnet.bn1, self.resnet.relu)

        self.encode2 = nn.Sequential(self.resnet.layer1, SCse(64))
        self.encode3 = nn.Sequential(self.resnet.layer2, SCse(128))
        self.encode4 = nn.Sequential(self.resnet.layer3, SCse(256))
        self.encode5 = nn.Sequential(self.resnet.layer4, SCse(512))

        self.center = nn.Sequential(FPAv2(512, 256), nn.MaxPool2d(2, 2))

        self.decode5 = Decoderv2(256, 512, 64)
        self.decode4 = Decoderv2(64, 256, 64)
        self.decode3 = Decoderv2(64, 128, 64)
        self.decode2 = Decoderv2(64, 64, 64)
        self.decode1 = Decoder(64, 32, 64)

        self.logit = nn.Sequential(
            nn.Conv2d(320, 64, kernel_size=3, padding=1),
            nn.ELU(True),
            nn.Conv2d(64, n_classes, kernel_size=1, bias=False),
        )

    def forward(self, x):
        # x: (batch_size, 3, 256, 256)

        x = self.conv1(x)  # 64, 128, 128
        e2 = self.encode2(x)  # 64, 128, 128
        e3 = self.encode3(e2)  # 128, 64, 64
        e4 = self.encode4(e3)  # 256, 32, 32
        e5 = self.encode5(e4)  # 512, 16, 16

        f = self.center(e5)  # 256, 8, 8

        d5 = self.decode5(f, e5)  # 64, 16, 16
        d4 = self.decode4(d5, e4)  # 64, 32, 32
        d3 = self.decode3(d4, e3)  # 64, 64, 64
        d2 = self.decode2(d3, e2)  # 64, 128, 128
        d1 = self.decode1(d2)  # 64, 256, 256

        f = torch.cat(
            (
                d1,
                F.upsample(d2, scale_factor=2, mode="bilinear", align_corners=True),
                F.upsample(d3, scale_factor=4, mode="bilinear", align_corners=True),
                F.upsample(d4, scale_factor=8, mode="bilinear", align_corners=True),
                F.upsample(d5, scale_factor=16, mode="bilinear", align_corners=True),
            ),
            1,
        )  # 320, 256, 256

        logit = self.logit(f)  # 1, 256, 256

        return logit


# stage2 model
class Res34Unetv3(nn.Module):
    def __init__(self):
        super(Res34Unetv3, self).__init__()
        self.resnet = torchvision.models.resnet34(True)

        self.conv1 = nn.Sequential(self.resnet.conv1, self.resnet.bn1, self.resnet.relu)

        self.encode2 = nn.Sequential(self.resnet.layer1, SCse(64))
        self.encode3 = nn.Sequential(self.resnet.layer2, SCse(128))
        self.encode4 = nn.Sequential(self.resnet.layer3, SCse(256))
        self.encode5 = nn.Sequential(self.resnet.layer4, SCse(512))

        self.center = nn.Sequential(FPAv2(512, 256), nn.MaxPool2d(2, 2))

        self.decode5 = Decoderv2(256, 512, 64)
        self.decode4 = Decoderv2(64, 256, 64)
        self.decode3 = Decoderv2(64, 128, 64)
        self.decode2 = Decoderv2(64, 64, 64)
        self.decode1 = Decoder(64, 32, 64)

        self.dropout2d = nn.Dropout2d(0.4)
        self.dropout = nn.Dropout(0.4)

        self.fuse_pixel = conv3x3(320, 64)
        self.logit_pixel = nn.Conv2d(64, 1, kernel_size=1, bias=False)

        self.fuse_image = nn.Sequential(nn.Linear(512, 64), nn.ELU(True))
        self.logit_image = nn.Sequential(nn.Linear(64, 1), nn.Sigmoid())
        self.logit = nn.Sequential(
            nn.Conv2d(128, 64, kernel_size=3, padding=1, bias=False),
            nn.ELU(True),
            nn.Conv2d(64, 1, kernel_size=1, bias=False),
        )

    def forward(self, x):
        # x: (batch_size, 3, 256, 256)
        batch_size, c, h, w = x.shape

        x = self.conv1(x)  # 64, 128, 128
        e2 = self.encode2(x)  # 64, 128, 128
        e3 = self.encode3(e2)  # 128, 64, 64
        e4 = self.encode4(e3)  # 256, 32, 32
        e5 = self.encode5(e4)  # 512, 16, 16

        e = F.adaptive_avg_pool2d(e5, output_size=1).view(batch_size, -1)  # 512
        e = self.dropout(e)

        f = self.center(e5)  # 256, 8, 8

        d5 = self.decode5(f, e5)  # 64, 16, 16
        d4 = self.decode4(d5, e4)  # 64, 32, 32
        d3 = self.decode3(d4, e3)  # 64, 64, 64
        d2 = self.decode2(d3, e2)  # 64, 128, 128
        d1 = self.decode1(d2)  # 64, 256, 256

        f = torch.cat(
            (
                d1,
                F.upsample(d2, scale_factor=2, mode="bilinear", align_corners=True),
                F.upsample(d3, scale_factor=4, mode="bilinear", align_corners=True),
                F.upsample(d4, scale_factor=8, mode="bilinear", align_corners=True),
                F.upsample(d5, scale_factor=16, mode="bilinear", align_corners=True),
            ),
            1,
        )  # 320, 256, 256
        f = self.dropout2d(f)

        # segmentation process
        fuse_pixel = self.fuse_pixel(f)  # 64, 256, 256
        logit_pixel = self.logit_pixel(fuse_pixel)  # 1, 256, 256

        # classification process
        fuse_image = self.fuse_image(e)  # 64
        logit_image = self.logit_image(fuse_image)  # 1

        # combine segmentation and classification
        fuse = torch.cat(
            [
                fuse_pixel,
                F.upsample(
                    fuse_image.view(batch_size, -1, 1, 1), scale_factor=256, mode="bilinear", align_corners=True,
                ),
            ],
            1,
        )  # 128, 256, 256
        logit = self.logit(fuse)  # 1, 256, 256

        return logit, logit_pixel, logit_image.view(-1)


# stage3 model
class Res34Unetv5(nn.Module):
    def __init__(self):
        super(Res34Unetv5, self).__init__()
        self.resnet = torchvision.models.resnet34(True)

        self.conv1 = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=3, padding=1, bias=False), self.resnet.bn1, self.resnet.relu,
        )

        self.encode2 = nn.Sequential(self.resnet.layer1, SCse(64))
        self.encode3 = nn.Sequential(self.resnet.layer2, SCse(128))
        self.encode4 = nn.Sequential(self.resnet.layer3, SCse(256))
        self.encode5 = nn.Sequential(self.resnet.layer4, SCse(512))

        self.center = nn.Sequential(FPAv2(512, 256), nn.MaxPool2d(2, 2))

        self.decode5 = Decoderv2(256, 512, 64)
        self.decode4 = Decoderv2(64, 256, 64)
        self.decode3 = Decoderv2(64, 128, 64)
        self.decode2 = Decoderv2(64, 64, 64)

        self.logit = nn.Sequential(
            nn.Conv2d(256, 32, kernel_size=3, padding=1), nn.ELU(True), nn.Conv2d(32, 1, kernel_size=1, bias=False),
        )

    def forward(self, x):
        # x: batch_size, 3, 128, 128
        x = self.conv1(x)  # 64, 128, 128
        e2 = self.encode2(x)  # 64, 128, 128
        e3 = self.encode3(e2)  # 128, 64, 64
        e4 = self.encode4(e3)  # 256, 32, 32
        e5 = self.encode5(e4)  # 512, 16, 16

        f = self.center(e5)  # 256, 8, 8

        d5 = self.decode5(f, e5)  # 64, 16, 16
        d4 = self.decode4(d5, e4)  # 64, 32, 32
        d3 = self.decode3(d4, e3)  # 64, 64, 64
        d2 = self.decode2(d3, e2)  # 64, 128, 128

        f = torch.cat(
            (
                d2,
                F.upsample(d3, scale_factor=2, mode="bilinear", align_corners=True),
                F.upsample(d4, scale_factor=4, mode="bilinear", align_corners=True),
                F.upsample(d5, scale_factor=8, mode="bilinear", align_corners=True),
            ),
            1,
        )  # 256, 128, 128

        f = F.dropout2d(f, p=0.4)
        logit = self.logit(f)  # 1, 128, 128

        return logit


def get_seg_model(cfg, **kwargs):
    assert (
        cfg.MODEL.IN_CHANNELS == 3
    ), f"SEResnet Unet deconvnet is not implemented to accept {cfg.MODEL.IN_CHANNELS} channels. Please only pass 3 for cfg.MODEL.IN_CHANNELS"
    model = Res34Unetv4(n_classes=cfg.DATASET.NUM_CLASSES)
    return model
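The `Res34Unetv4.forward` above builds a hypercolumn: the five 64-channel decoder outputs are upsampled to the input resolution and concatenated along the channel axis, which is why `self.logit` takes 320 input channels. A minimal shape sketch (hypothetical tensors, and `F.interpolate` standing in for the older `F.upsample` alias):

```python
import torch
import torch.nn.functional as F

# Stand-in decoder outputs at the resolutions the model produces
# (scaled down here to keep the example light).
d1 = torch.randn(1, 64, 32, 32)
d2 = torch.randn(1, 64, 16, 16)
d3 = torch.randn(1, 64, 8, 8)
d4 = torch.randn(1, 64, 4, 4)
d5 = torch.randn(1, 64, 2, 2)

# Upsample everything to d1's resolution and stack along channels.
f = torch.cat(
    (
        d1,
        F.interpolate(d2, scale_factor=2, mode="bilinear", align_corners=True),
        F.interpolate(d3, scale_factor=4, mode="bilinear", align_corners=True),
        F.interpolate(d4, scale_factor=8, mode="bilinear", align_corners=True),
        F.interpolate(d5, scale_factor=16, mode="bilinear", align_corners=True),
    ),
    1,
)
assert f.shape == (1, 320, 32, 32)  # 5 x 64 = 320 channels
```

Concatenating features from every decoder depth gives the final 1x1/3x3 head access to both coarse context and fine localization, which is the usual motivation for hypercolumn heads in these U-Net variants.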
@@ -0,0 +1,307 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import torch.nn as nn


class section_deconvnet(nn.Module):
    def __init__(self, n_classes=4, learned_billinear=False):
        super(section_deconvnet, self).__init__()
        self.learned_billinear = learned_billinear
        self.n_classes = n_classes
        self.unpool = nn.MaxUnpool2d(2, stride=2)
        self.conv_block1 = nn.Sequential(
            # conv1_1
            nn.Conv2d(1, 64, 3, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv1_2
            nn.Conv2d(64, 64, 3, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool1
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_1

        # 48*48

        self.conv_block2 = nn.Sequential(
            # conv2_1
            nn.Conv2d(64, 128, 3, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv2_2
            nn.Conv2d(128, 128, 3, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool2
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_2

        # 24*24

        self.conv_block3 = nn.Sequential(
            # conv3_1
            nn.Conv2d(128, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv3_2
            nn.Conv2d(256, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv3_3
            nn.Conv2d(256, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool3
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_3

        # 12*12

        self.conv_block4 = nn.Sequential(
            # conv4_1
            nn.Conv2d(256, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv4_2
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv4_3
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool4
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_4

        # 6*6

        self.conv_block5 = nn.Sequential(
            # conv5_1
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv5_2
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv5_3
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool5
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_5

        # 3*3

        self.conv_block6 = nn.Sequential(
            # fc6
            nn.Conv2d(512, 4096, 3),
            # set the filter size and no padding to make output into 1*1
            nn.BatchNorm2d(4096, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        # 1*1

        self.conv_block7 = nn.Sequential(
            # fc7
            nn.Conv2d(4096, 4096, 1),
            # set the filter size to make output into 1*1
            nn.BatchNorm2d(4096, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.deconv_block8 = nn.Sequential(
            # fc6-deconv
            nn.ConvTranspose2d(4096, 512, 3, stride=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        # 3*3

        self.unpool_block9 = nn.Sequential(
            # unpool5
            nn.MaxUnpool2d(2, stride=2),
        )
        # usage: unpool(output, indices)

        # 6*6

        self.deconv_block10 = nn.Sequential(
            # deconv5_1
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv5_2
|
||||
nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv5_3
|
||||
nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
)
|
||||
|
||||
self.unpool_block11 = nn.Sequential(
|
||||
# unpool4
|
||||
nn.MaxUnpool2d(2, stride=2),
|
||||
)
|
||||
|
||||
# 12*12
|
||||
|
||||
self.deconv_block12 = nn.Sequential(
|
||||
# deconv4_1
|
||||
nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv4_2
|
||||
nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv4_3
|
||||
nn.ConvTranspose2d(512, 256, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
)
|
||||
|
||||
self.unpool_block13 = nn.Sequential(
|
||||
# unpool3
|
||||
nn.MaxUnpool2d(2, stride=2),
|
||||
)
|
||||
|
||||
# 24*24
|
||||
|
||||
self.deconv_block14 = nn.Sequential(
|
||||
# deconv3_1
|
||||
nn.ConvTranspose2d(256, 256, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv3_2
|
||||
nn.ConvTranspose2d(256, 256, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv3_3
|
||||
nn.ConvTranspose2d(256, 128, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
)
|
||||
|
||||
self.unpool_block15 = nn.Sequential(
|
||||
# unpool2
|
||||
nn.MaxUnpool2d(2, stride=2),
|
||||
)
|
||||
|
||||
# 48*48
|
||||
|
||||
self.deconv_block16 = nn.Sequential(
|
||||
# deconv2_1
|
||||
nn.ConvTranspose2d(128, 128, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv2_2
|
||||
nn.ConvTranspose2d(128, 64, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
)
|
||||
|
||||
self.unpool_block17 = nn.Sequential(
|
||||
# unpool1
|
||||
nn.MaxUnpool2d(2, stride=2),
|
||||
)
|
||||
|
||||
# 96*96
|
||||
|
||||
self.deconv_block18 = nn.Sequential(
|
||||
# deconv1_1
|
||||
nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
# deconv1_2
|
||||
nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),
|
||||
nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
|
||||
nn.ReLU(inplace=True),
|
||||
)
|
||||
|
||||
self.seg_score19 = nn.Sequential(
|
||||
# seg-score
|
||||
nn.Conv2d(64, self.n_classes, 1),
|
||||
)
|
||||
|
||||
if self.learned_billinear:
|
||||
raise NotImplementedError
|
||||
|
||||
def forward(self, x):
|
||||
size0 = x.size()
|
||||
conv1, indices1 = self.conv_block1(x)
|
||||
size1 = conv1.size()
|
||||
conv2, indices2 = self.conv_block2(conv1)
|
||||
size2 = conv2.size()
|
||||
conv3, indices3 = self.conv_block3(conv2)
|
||||
size3 = conv3.size()
|
||||
conv4, indices4 = self.conv_block4(conv3)
|
||||
size4 = conv4.size()
|
||||
conv5, indices5 = self.conv_block5(conv4)
|
||||
|
||||
conv6 = self.conv_block6(conv5)
|
||||
conv7 = self.conv_block7(conv6)
|
||||
conv8 = self.deconv_block8(conv7)
|
||||
conv9 = self.unpool(conv8, indices5, output_size=size4)
|
||||
conv10 = self.deconv_block10(conv9)
|
||||
conv11 = self.unpool(conv10, indices4, output_size=size3)
|
||||
conv12 = self.deconv_block12(conv11)
|
||||
conv13 = self.unpool(conv12, indices3, output_size=size2)
|
||||
conv14 = self.deconv_block14(conv13)
|
||||
conv15 = self.unpool(conv14, indices2, output_size=size1)
|
||||
conv16 = self.deconv_block16(conv15)
|
||||
conv17 = self.unpool(conv16, indices1, output_size=size0)
|
||||
conv18 = self.deconv_block18(conv17)
|
||||
out = self.seg_score19(conv18)
|
||||
|
||||
return out
|
||||
|
||||
def init_vgg16_params(self, vgg16, copy_fc8=True):
|
||||
blocks = [
|
||||
self.conv_block1,
|
||||
self.conv_block2,
|
||||
self.conv_block3,
|
||||
self.conv_block4,
|
||||
self.conv_block5,
|
||||
]
|
||||
|
||||
ranges = [[0, 4], [5, 9], [10, 16], [17, 23], [24, 29]]
|
||||
features = list(vgg16.features.children())
|
||||
i_layer = 0
|
||||
# copy convolutional filters from vgg16
|
||||
for idx, conv_block in enumerate(blocks):
|
||||
for l1, l2 in zip(features[ranges[idx][0] : ranges[idx][1]], conv_block):
|
||||
if isinstance(l1, nn.Conv2d) and isinstance(l2, nn.Conv2d):
|
||||
if i_layer == 0:
|
||||
l2.weight.data = (
|
||||
(l1.weight.data[:, 0, :, :] + l1.weight.data[:, 1, :, :] + l1.weight.data[:, 2, :, :]) / 3.0
|
||||
).view(l2.weight.size())
|
||||
l2.bias.data = l1.bias.data
|
||||
i_layer = i_layer + 1
|
||||
else:
|
||||
assert l1.weight.size() == l2.weight.size()
|
||||
assert l1.bias.size() == l2.bias.size()
|
||||
l2.weight.data = l1.weight.data
|
||||
l2.bias.data = l1.bias.data
|
||||
i_layer = i_layer + 1
|
||||
|
||||
|
||||
def get_seg_model(cfg, **kwargs):
|
||||
assert (
|
||||
cfg.MODEL.IN_CHANNELS == 1
|
||||
), f"Section deconvnet is not implemented to accept {cfg.MODEL.IN_CHANNELS} channels. Please only pass 1 for cfg.MODEL.IN_CHANNELS"
|
||||
model = section_deconvnet(n_classes=cfg.DATASET.NUM_CLASSES)
|
||||
return model
|
|
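The `# it returns outputs and pool_indices_1` comments above rely on a PyTorch detail: `nn.Sequential` simply returns whatever its last module returns, so ending a block with `MaxPool2d(..., return_indices=True)` yields an `(output, indices)` tuple. A minimal standalone sketch (channel sizes shrunk for illustration; not the model's actual dimensions):

```python
import torch
import torch.nn as nn

# A block shaped like conv_block1, shrunk to 2 channels.
block = nn.Sequential(
    nn.Conv2d(1, 2, 3, padding=1),
    nn.ReLU(inplace=True),
    nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
)

x = torch.randn(1, 1, 8, 8)
# The pool is the last module, so Sequential returns its (output, indices) tuple.
out, indices = block(x)
print(tuple(out.shape))      # spatial dims halved: (1, 2, 4, 4)
print(tuple(indices.shape))  # one index per pooled element: (1, 2, 4, 4)
```

This is why each `conv_blockN` call in `forward` is unpacked into two values, while the fully-convolutional blocks 6 and 7 (which end in `ReLU`) return a single tensor.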
@ -0,0 +1,307 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import torch.nn as nn


class section_deconvnet_skip(nn.Module):
    def __init__(self, n_classes=4, learned_billinear=False):
        super(section_deconvnet_skip, self).__init__()
        self.learned_billinear = learned_billinear
        self.n_classes = n_classes
        self.unpool = nn.MaxUnpool2d(2, stride=2)
        self.conv_block1 = nn.Sequential(
            # conv1_1
            nn.Conv2d(1, 64, 3, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv1_2
            nn.Conv2d(64, 64, 3, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool1
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_1

        # 48*48

        self.conv_block2 = nn.Sequential(
            # conv2_1
            nn.Conv2d(64, 128, 3, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv2_2
            nn.Conv2d(128, 128, 3, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool2
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_2

        # 24*24

        self.conv_block3 = nn.Sequential(
            # conv3_1
            nn.Conv2d(128, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv3_2
            nn.Conv2d(256, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv3_3
            nn.Conv2d(256, 256, 3, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool3
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_3

        # 12*12

        self.conv_block4 = nn.Sequential(
            # conv4_1
            nn.Conv2d(256, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv4_2
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv4_3
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool4
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_4

        # 6*6

        self.conv_block5 = nn.Sequential(
            # conv5_1
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv5_2
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # conv5_3
            nn.Conv2d(512, 512, 3, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # pool5
            nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True),
        )
        # it returns outputs and pool_indices_5

        # 3*3

        self.conv_block6 = nn.Sequential(
            # fc6
            nn.Conv2d(512, 4096, 3),
            # filter size 3 and no padding shrink the 3*3 input to 1*1
            nn.BatchNorm2d(4096, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        # 1*1

        self.conv_block7 = nn.Sequential(
            # fc7
            nn.Conv2d(4096, 4096, 1),
            # 1*1 filter keeps the output at 1*1
            nn.BatchNorm2d(4096, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.deconv_block8 = nn.Sequential(
            # fc6-deconv
            nn.ConvTranspose2d(4096, 512, 3, stride=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        # 3*3

        self.unpool_block9 = nn.Sequential(
            # unpool5
            nn.MaxUnpool2d(2, stride=2),
        )
        # usage: unpool(output, indices)

        # 6*6

        self.deconv_block10 = nn.Sequential(
            # deconv5_1
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv5_2
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv5_3
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.unpool_block11 = nn.Sequential(
            # unpool4
            nn.MaxUnpool2d(2, stride=2),
        )

        # 12*12

        self.deconv_block12 = nn.Sequential(
            # deconv4_1
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv4_2
            nn.ConvTranspose2d(512, 512, 3, stride=1, padding=1),
            nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv4_3
            nn.ConvTranspose2d(512, 256, 3, stride=1, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.unpool_block13 = nn.Sequential(
            # unpool3
            nn.MaxUnpool2d(2, stride=2),
        )

        # 24*24

        self.deconv_block14 = nn.Sequential(
            # deconv3_1
            nn.ConvTranspose2d(256, 256, 3, stride=1, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv3_2
            nn.ConvTranspose2d(256, 256, 3, stride=1, padding=1),
            nn.BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv3_3
            nn.ConvTranspose2d(256, 128, 3, stride=1, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.unpool_block15 = nn.Sequential(
            # unpool2
            nn.MaxUnpool2d(2, stride=2),
        )

        # 48*48

        self.deconv_block16 = nn.Sequential(
            # deconv2_1
            nn.ConvTranspose2d(128, 128, 3, stride=1, padding=1),
            nn.BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv2_2
            nn.ConvTranspose2d(128, 64, 3, stride=1, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.unpool_block17 = nn.Sequential(
            # unpool1
            nn.MaxUnpool2d(2, stride=2),
        )

        # 96*96

        self.deconv_block18 = nn.Sequential(
            # deconv1_1
            nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
            # deconv1_2
            nn.ConvTranspose2d(64, 64, 3, stride=1, padding=1),
            nn.BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True),
            nn.ReLU(inplace=True),
        )

        self.seg_score19 = nn.Sequential(
            # seg-score
            nn.Conv2d(64, self.n_classes, 1),
        )

        if self.learned_billinear:
            raise NotImplementedError

    def forward(self, x):
        size0 = x.size()
        conv1, indices1 = self.conv_block1(x)
        size1 = conv1.size()
        conv2, indices2 = self.conv_block2(conv1)
        size2 = conv2.size()
        conv3, indices3 = self.conv_block3(conv2)
        size3 = conv3.size()
        conv4, indices4 = self.conv_block4(conv3)
        size4 = conv4.size()
        conv5, indices5 = self.conv_block5(conv4)

        conv6 = self.conv_block6(conv5)
        conv7 = self.conv_block7(conv6)
        conv8 = self.deconv_block8(conv7) + conv5
        conv9 = self.unpool(conv8, indices5, output_size=size4)
        conv10 = self.deconv_block10(conv9) + conv4
        conv11 = self.unpool(conv10, indices4, output_size=size3)
        conv12 = self.deconv_block12(conv11) + conv3
        conv13 = self.unpool(conv12, indices3, output_size=size2)
        conv14 = self.deconv_block14(conv13) + conv2
        conv15 = self.unpool(conv14, indices2, output_size=size1)
        conv16 = self.deconv_block16(conv15) + conv1
        conv17 = self.unpool(conv16, indices1, output_size=size0)
        conv18 = self.deconv_block18(conv17)
        out = self.seg_score19(conv18)

        return out

    def init_vgg16_params(self, vgg16, copy_fc8=True):
        blocks = [
            self.conv_block1,
            self.conv_block2,
            self.conv_block3,
            self.conv_block4,
            self.conv_block5,
        ]

        ranges = [[0, 4], [5, 9], [10, 16], [17, 23], [24, 29]]
        features = list(vgg16.features.children())
        i_layer = 0
        # copy convolutional filters from vgg16
        for idx, conv_block in enumerate(blocks):
            for l1, l2 in zip(features[ranges[idx][0] : ranges[idx][1]], conv_block):
                if isinstance(l1, nn.Conv2d) and isinstance(l2, nn.Conv2d):
                    if i_layer == 0:
                        # first layer: average the RGB filters down to one input channel
                        l2.weight.data = (
                            (l1.weight.data[:, 0, :, :] + l1.weight.data[:, 1, :, :] + l1.weight.data[:, 2, :, :]) / 3.0
                        ).view(l2.weight.size())
                        l2.bias.data = l1.bias.data
                        i_layer = i_layer + 1
                    else:
                        assert l1.weight.size() == l2.weight.size()
                        assert l1.bias.size() == l2.bias.size()
                        l2.weight.data = l1.weight.data
                        l2.bias.data = l1.bias.data
                        i_layer = i_layer + 1


def get_seg_model(cfg, **kwargs):
    assert (
        cfg.MODEL.IN_CHANNELS == 1
    ), f"Section deconvnet is not implemented to accept {cfg.MODEL.IN_CHANNELS} channels. Please only pass 1 for cfg.MODEL.IN_CHANNELS"
    model = section_deconvnet_skip(n_classes=cfg.DATASET.NUM_CLASSES)
    return model

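Both deconvnet variants undo each encoder pool with `MaxUnpool2d`, passing `output_size=` so the unpooled tensor matches the pre-pool shape exactly even when `ceil_mode=True` made the pooled size round up. A minimal standalone sketch of that pairing (illustrative sizes only):

```python
import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2, return_indices=True, ceil_mode=True)
unpool = nn.MaxUnpool2d(2, stride=2)

x = torch.randn(1, 1, 5, 5)  # odd size: ceil_mode pools 5 -> 3
y, indices = pool(x)
# Without output_size, unpooling 3 would give 6; output_size restores the
# exact pre-pool shape, mirroring `self.unpool(conv8, indices5, output_size=size4)`.
restored = unpool(y, indices, output_size=x.size())
print(tuple(restored.shape))  # (1, 1, 5, 5)
```

The saved `sizeN` values in `forward` exist purely to feed this `output_size` argument on the way back up the decoder.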
@ -0,0 +1,446 @@
# ------------------------------------------------------------------------------
# Copyright (c) Microsoft
# Licensed under the MIT License.
# Written by Ke Sun (sunk@mail.ustc.edu.cn)
# ------------------------------------------------------------------------------
"""HRNET for segmentation taken from https://github.com/HRNet/HRNet-Semantic-Segmentation
pytorch-v1.1 branch
hash: 06142dc1c7026e256a7561c3e875b06622b5670f
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import logging
import os

import numpy as np
import torch
import torch._utils
import torch.nn as nn
import torch.nn.functional as F

BatchNorm2d = nn.BatchNorm2d
BN_MOMENTUM = 0.1
logger = logging.getLogger(__name__)


def conv3x3(in_planes, out_planes, stride=1):
    """3x3 convolution with padding"""
    return nn.Conv2d(in_planes, out_planes, kernel_size=3, stride=stride, padding=1, bias=False)


class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(BasicBlock, self).__init__()
        self.conv1 = conv3x3(inplanes, planes, stride)
        self.bn1 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.relu = nn.ReLU(inplace=True)
        self.conv2 = conv3x3(planes, planes)
        self.bn2 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out


class Bottleneck(nn.Module):
    expansion = 4

    def __init__(self, inplanes, planes, stride=1, downsample=None):
        super(Bottleneck, self).__init__()
        self.conv1 = nn.Conv2d(inplanes, planes, kernel_size=1, bias=False)
        self.bn1 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn2 = BatchNorm2d(planes, momentum=BN_MOMENTUM)
        self.conv3 = nn.Conv2d(planes, planes * self.expansion, kernel_size=1, bias=False)
        self.bn3 = BatchNorm2d(planes * self.expansion, momentum=BN_MOMENTUM)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        residual = x

        out = self.conv1(x)
        out = self.bn1(out)
        out = self.relu(out)

        out = self.conv2(out)
        out = self.bn2(out)
        out = self.relu(out)

        out = self.conv3(out)
        out = self.bn3(out)

        if self.downsample is not None:
            residual = self.downsample(x)

        out += residual
        out = self.relu(out)

        return out


class HighResolutionModule(nn.Module):
    def __init__(
        self, num_branches, blocks, num_blocks, num_inchannels, num_channels, fuse_method, multi_scale_output=True,
    ):
        super(HighResolutionModule, self).__init__()
        self._check_branches(num_branches, blocks, num_blocks, num_inchannels, num_channels)

        self.num_inchannels = num_inchannels
        self.fuse_method = fuse_method
        self.num_branches = num_branches

        self.multi_scale_output = multi_scale_output

        self.branches = self._make_branches(num_branches, blocks, num_blocks, num_channels)
        self.fuse_layers = self._make_fuse_layers()
        self.relu = nn.ReLU(inplace=True)

    def _check_branches(self, num_branches, blocks, num_blocks, num_inchannels, num_channels):
        if num_branches != len(num_blocks):
            error_msg = "NUM_BRANCHES({}) <> NUM_BLOCKS({})".format(num_branches, len(num_blocks))
            logger.error(error_msg)
            raise ValueError(error_msg)

        if num_branches != len(num_channels):
            error_msg = "NUM_BRANCHES({}) <> NUM_CHANNELS({})".format(num_branches, len(num_channels))
            logger.error(error_msg)
            raise ValueError(error_msg)

        if num_branches != len(num_inchannels):
            error_msg = "NUM_BRANCHES({}) <> NUM_INCHANNELS({})".format(num_branches, len(num_inchannels))
            logger.error(error_msg)
            raise ValueError(error_msg)

    def _make_one_branch(self, branch_index, block, num_blocks, num_channels, stride=1):
        downsample = None
        if stride != 1 or self.num_inchannels[branch_index] != num_channels[branch_index] * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(
                    self.num_inchannels[branch_index],
                    num_channels[branch_index] * block.expansion,
                    kernel_size=1,
                    stride=stride,
                    bias=False,
                ),
                BatchNorm2d(num_channels[branch_index] * block.expansion, momentum=BN_MOMENTUM),
            )

        layers = []
        layers.append(block(self.num_inchannels[branch_index], num_channels[branch_index], stride, downsample,))
        self.num_inchannels[branch_index] = num_channels[branch_index] * block.expansion
        for i in range(1, num_blocks[branch_index]):
            layers.append(block(self.num_inchannels[branch_index], num_channels[branch_index]))

        return nn.Sequential(*layers)

    def _make_branches(self, num_branches, block, num_blocks, num_channels):
        branches = []

        for i in range(num_branches):
            branches.append(self._make_one_branch(i, block, num_blocks, num_channels))

        return nn.ModuleList(branches)

    def _make_fuse_layers(self):
        if self.num_branches == 1:
            return None

        num_branches = self.num_branches
        num_inchannels = self.num_inchannels
        fuse_layers = []
        for i in range(num_branches if self.multi_scale_output else 1):
            fuse_layer = []
            for j in range(num_branches):
                if j > i:
                    fuse_layer.append(
                        nn.Sequential(
                            nn.Conv2d(num_inchannels[j], num_inchannels[i], 1, 1, 0, bias=False,),
                            BatchNorm2d(num_inchannels[i], momentum=BN_MOMENTUM),
                        )
                    )
                elif j == i:
                    fuse_layer.append(None)
                else:
                    conv3x3s = []
                    for k in range(i - j):
                        if k == i - j - 1:
                            num_outchannels_conv3x3 = num_inchannels[i]
                            conv3x3s.append(
                                nn.Sequential(
                                    nn.Conv2d(num_inchannels[j], num_outchannels_conv3x3, 3, 2, 1, bias=False,),
                                    BatchNorm2d(num_outchannels_conv3x3, momentum=BN_MOMENTUM),
                                )
                            )
                        else:
                            num_outchannels_conv3x3 = num_inchannels[j]
                            conv3x3s.append(
                                nn.Sequential(
                                    nn.Conv2d(num_inchannels[j], num_outchannels_conv3x3, 3, 2, 1, bias=False,),
                                    BatchNorm2d(num_outchannels_conv3x3, momentum=BN_MOMENTUM),
                                    nn.ReLU(inplace=True),
                                )
                            )
                    fuse_layer.append(nn.Sequential(*conv3x3s))
            fuse_layers.append(nn.ModuleList(fuse_layer))

        return nn.ModuleList(fuse_layers)

    def get_num_inchannels(self):
        return self.num_inchannels

    def forward(self, x):
        if self.num_branches == 1:
            return [self.branches[0](x[0])]

        for i in range(self.num_branches):
            x[i] = self.branches[i](x[i])

        x_fuse = []
        for i in range(len(self.fuse_layers)):
            y = x[0] if i == 0 else self.fuse_layers[i][0](x[0])
            for j in range(1, self.num_branches):
                if i == j:
                    y = y + x[j]
                elif j > i:
                    width_output = x[i].shape[-1]
                    height_output = x[i].shape[-2]
                    y = y + F.interpolate(
                        self.fuse_layers[i][j](x[j]), size=[height_output, width_output], mode="bilinear",
                    )
                else:
                    y = y + self.fuse_layers[i][j](x[j])
            x_fuse.append(self.relu(y))

        return x_fuse


blocks_dict = {"BASIC": BasicBlock, "BOTTLENECK": Bottleneck}


class HighResolutionNet(nn.Module):
    def __init__(self, config, **kwargs):
        extra = config.MODEL.EXTRA
        super(HighResolutionNet, self).__init__()

        # stem net
        self.conv1 = nn.Conv2d(config.MODEL.IN_CHANNELS, 64, kernel_size=3, stride=2, padding=1, bias=False)
        self.bn1 = BatchNorm2d(64, momentum=BN_MOMENTUM)
        self.conv2 = nn.Conv2d(64, 64, kernel_size=3, stride=2, padding=1, bias=False)
        self.bn2 = BatchNorm2d(64, momentum=BN_MOMENTUM)
        self.relu = nn.ReLU(inplace=True)

        self.layer1 = self._make_layer(Bottleneck, 64, 64, 4)

        self.stage2_cfg = extra["STAGE2"]
        num_channels = self.stage2_cfg["NUM_CHANNELS"]
        block = blocks_dict[self.stage2_cfg["BLOCK"]]
        num_channels = [num_channels[i] * block.expansion for i in range(len(num_channels))]
        self.transition1 = self._make_transition_layer([256], num_channels)
        self.stage2, pre_stage_channels = self._make_stage(self.stage2_cfg, num_channels)

        self.stage3_cfg = extra["STAGE3"]
        num_channels = self.stage3_cfg["NUM_CHANNELS"]
        block = blocks_dict[self.stage3_cfg["BLOCK"]]
        num_channels = [num_channels[i] * block.expansion for i in range(len(num_channels))]
        self.transition2 = self._make_transition_layer(pre_stage_channels, num_channels)
        self.stage3, pre_stage_channels = self._make_stage(self.stage3_cfg, num_channels)

        self.stage4_cfg = extra["STAGE4"]
        num_channels = self.stage4_cfg["NUM_CHANNELS"]
        block = blocks_dict[self.stage4_cfg["BLOCK"]]
        num_channels = [num_channels[i] * block.expansion for i in range(len(num_channels))]
        self.transition3 = self._make_transition_layer(pre_stage_channels, num_channels)
        self.stage4, pre_stage_channels = self._make_stage(self.stage4_cfg, num_channels, multi_scale_output=True)

        last_inp_channels = int(np.sum(pre_stage_channels))

        self.last_layer = nn.Sequential(
            nn.Conv2d(
                in_channels=last_inp_channels, out_channels=last_inp_channels, kernel_size=1, stride=1, padding=0,
            ),
            BatchNorm2d(last_inp_channels, momentum=BN_MOMENTUM),
            nn.ReLU(inplace=True),
            nn.Conv2d(
                in_channels=last_inp_channels,
                out_channels=config.DATASET.NUM_CLASSES,
                kernel_size=extra.FINAL_CONV_KERNEL,
                stride=1,
                padding=1 if extra.FINAL_CONV_KERNEL == 3 else 0,
            ),
        )

    def _make_transition_layer(self, num_channels_pre_layer, num_channels_cur_layer):
        num_branches_cur = len(num_channels_cur_layer)
        num_branches_pre = len(num_channels_pre_layer)

        transition_layers = []
        for i in range(num_branches_cur):
            if i < num_branches_pre:
                if num_channels_cur_layer[i] != num_channels_pre_layer[i]:
                    transition_layers.append(
                        nn.Sequential(
                            nn.Conv2d(num_channels_pre_layer[i], num_channels_cur_layer[i], 3, 1, 1, bias=False,),
                            BatchNorm2d(num_channels_cur_layer[i], momentum=BN_MOMENTUM),
                            nn.ReLU(inplace=True),
                        )
                    )
                else:
                    transition_layers.append(None)
            else:
                conv3x3s = []
                for j in range(i + 1 - num_branches_pre):
                    inchannels = num_channels_pre_layer[-1]
                    outchannels = num_channels_cur_layer[i] if j == i - num_branches_pre else inchannels
                    conv3x3s.append(
                        nn.Sequential(
                            nn.Conv2d(inchannels, outchannels, 3, 2, 1, bias=False),
                            BatchNorm2d(outchannels, momentum=BN_MOMENTUM),
                            nn.ReLU(inplace=True),
                        )
                    )
                transition_layers.append(nn.Sequential(*conv3x3s))

        return nn.ModuleList(transition_layers)

    def _make_layer(self, block, inplanes, planes, blocks, stride=1):
        downsample = None
        if stride != 1 or inplanes != planes * block.expansion:
            downsample = nn.Sequential(
                nn.Conv2d(inplanes, planes * block.expansion, kernel_size=1, stride=stride, bias=False,),
                BatchNorm2d(planes * block.expansion, momentum=BN_MOMENTUM),
            )

        layers = []
        layers.append(block(inplanes, planes, stride, downsample))
        inplanes = planes * block.expansion
        for i in range(1, blocks):
            layers.append(block(inplanes, planes))

        return nn.Sequential(*layers)

    def _make_stage(self, layer_config, num_inchannels, multi_scale_output=True):
        num_modules = layer_config["NUM_MODULES"]
        num_branches = layer_config["NUM_BRANCHES"]
        num_blocks = layer_config["NUM_BLOCKS"]
        num_channels = layer_config["NUM_CHANNELS"]
        block = blocks_dict[layer_config["BLOCK"]]
        fuse_method = layer_config["FUSE_METHOD"]

        modules = []
        for i in range(num_modules):
            # multi_scale_output is only used in the last module
            if not multi_scale_output and i == num_modules - 1:
                reset_multi_scale_output = False
            else:
                reset_multi_scale_output = True
            modules.append(
                HighResolutionModule(
                    num_branches,
                    block,
                    num_blocks,
                    num_inchannels,
                    num_channels,
                    fuse_method,
                    reset_multi_scale_output,
                )
            )
            num_inchannels = modules[-1].get_num_inchannels()

        return nn.Sequential(*modules), num_inchannels

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.conv2(x)
        x = self.bn2(x)
        x = self.relu(x)
        x = self.layer1(x)

        x_list = []
        for i in range(self.stage2_cfg["NUM_BRANCHES"]):
            if self.transition1[i] is not None:
                x_list.append(self.transition1[i](x))
            else:
                x_list.append(x)
        y_list = self.stage2(x_list)

        x_list = []
        for i in range(self.stage3_cfg["NUM_BRANCHES"]):
            if self.transition2[i] is not None:
                x_list.append(self.transition2[i](y_list[-1]))
            else:
                x_list.append(y_list[i])
        y_list = self.stage3(x_list)

        x_list = []
        for i in range(self.stage4_cfg["NUM_BRANCHES"]):
            if self.transition3[i] is not None:
                x_list.append(self.transition3[i](y_list[-1]))
            else:
                x_list.append(y_list[i])
        x = self.stage4(x_list)

        # Upsampling
        x0_h, x0_w = x[0].size(2), x[0].size(3)
        x1 = F.interpolate(x[1], size=(x0_h, x0_w), mode="bilinear")
        x2 = F.interpolate(x[2], size=(x0_h, x0_w), mode="bilinear")
        x3 = F.interpolate(x[3], size=(x0_h, x0_w), mode="bilinear")

        x = torch.cat([x[0], x1, x2, x3], 1)

        x = self.last_layer(x)

        return x

    def init_weights(
        self, pretrained="",
    ):
        logger.info("=> init weights from normal distribution")
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.normal_(m.weight, std=0.001)
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.constant_(m.weight, 1)
                nn.init.constant_(m.bias, 0)
        if os.path.isfile(pretrained):
            pretrained_dict = torch.load(pretrained)
            logger.info("=> loading pretrained model {}".format(pretrained))
            model_dict = self.state_dict()
            pretrained_dict = {k: v for k, v in pretrained_dict.items() if k in model_dict.keys()}
            # for k, _ in pretrained_dict.items():
            #     logger.info(
            #         '=> loading {} pretrained model {}'.format(k, pretrained))
            model_dict.update(pretrained_dict)
            self.load_state_dict(model_dict)


def get_seg_model(cfg, **kwargs):
    model = HighResolutionNet(cfg, **kwargs)
|
||||
model.init_weights(cfg.MODEL.PRETRAINED)
|
||||
|
||||
return model
|
|
@ -0,0 +1,116 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

"""Taken from https://github.com/milesial/Pytorch-UNet"""
import torch
import torch.nn as nn
import torch.nn.functional as F


class double_conv(nn.Module):
    """(conv => BN => ReLU) * 2"""

    def __init__(self, in_ch, out_ch):
        super(double_conv, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        x = self.conv(x)
        return x


class inconv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(inconv, self).__init__()
        self.conv = double_conv(in_ch, out_ch)

    def forward(self, x):
        x = self.conv(x)
        return x


class down(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(down, self).__init__()
        self.mpconv = nn.Sequential(nn.MaxPool2d(2), double_conv(in_ch, out_ch))

    def forward(self, x):
        x = self.mpconv(x)
        return x


class up(nn.Module):
    def __init__(self, in_ch, out_ch, bilinear=True):
        super(up, self).__init__()

        if bilinear:
            self.up = nn.Upsample(scale_factor=2, mode="bilinear", align_corners=True)
        else:
            self.up = nn.ConvTranspose2d(in_ch // 2, in_ch // 2, 2, stride=2)

        self.conv = double_conv(in_ch, out_ch)

    def forward(self, x1, x2):
        x1 = self.up(x1)

        # input is CHW
        diffY = x2.size()[2] - x1.size()[2]
        diffX = x2.size()[3] - x1.size()[3]

        x1 = F.pad(x1, (diffX // 2, diffX - diffX // 2, diffY // 2, diffY - diffY // 2))

        x = torch.cat([x2, x1], dim=1)
        x = self.conv(x)
        return x


class outconv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super(outconv, self).__init__()
        self.conv = nn.Conv2d(in_ch, out_ch, 1)

    def forward(self, x):
        x = self.conv(x)
        return x


class UNet(nn.Module):
    def __init__(self, n_channels, n_classes):
        super(UNet, self).__init__()
        self.inc = inconv(n_channels, 64)
        self.down1 = down(64, 128)
        self.down2 = down(128, 256)
        self.down3 = down(256, 512)
        self.down4 = down(512, 512)
        self.up1 = up(1024, 256)
        self.up2 = up(512, 128)
        self.up3 = up(256, 64)
        self.up4 = up(128, 64)
        self.outc = outconv(64, n_classes)

    def forward(self, x):
        x1 = self.inc(x)
        x2 = self.down1(x1)
        x3 = self.down2(x2)
        x4 = self.down3(x3)
        x5 = self.down4(x4)
        x = self.up1(x5, x4)
        x = self.up2(x, x3)
        x = self.up3(x, x2)
        x = self.up4(x, x1)
        x = self.outc(x)
        return x


def get_seg_model(cfg, **kwargs):
    model = UNet(cfg.MODEL.IN_CHANNELS, cfg.DATASET.NUM_CLASSES)
    return model
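The padding in `up.forward` centers the upsampled map inside the skip connection when their sizes differ by an odd amount. The split can be checked with plain arithmetic; `centering_pad` is a hypothetical helper mirroring the `diffX // 2, diffX - diffX // 2` expressions above:

```python
def centering_pad(target, current):
    # Split the size difference so the smaller feature map is centered in the
    # larger one, as in F.pad(x1, (diffX // 2, diffX - diffX // 2, ...)).
    diff = target - current
    return diff // 2, diff - diff // 2

# a width-53 upsampled map padded to match a width-56 skip connection
left, right = centering_pad(56, 53)
```

An odd difference puts the extra pixel on the right/bottom side, so the two pads always sum to the full difference.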
@ -0,0 +1,103 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import torch.nn as nn


class conv2DBatchNorm(nn.Module):
    def __init__(self, in_channels, n_filters, k_size, stride, padding, bias=True, dilation=1):
        super(conv2DBatchNorm, self).__init__()

        conv_mod = nn.Conv2d(
            int(in_channels),
            int(n_filters),
            kernel_size=k_size,
            padding=padding,
            stride=stride,
            bias=bias,
            dilation=dilation,
        )

        self.cb_unit = nn.Sequential(conv_mod, nn.BatchNorm2d(int(n_filters)))

    def forward(self, inputs):
        outputs = self.cb_unit(inputs)
        return outputs


class deconv2DBatchNorm(nn.Module):
    def __init__(self, in_channels, n_filters, k_size, stride, padding, bias=True):
        super(deconv2DBatchNorm, self).__init__()

        self.dcb_unit = nn.Sequential(
            nn.ConvTranspose2d(
                int(in_channels), int(n_filters), kernel_size=k_size, padding=padding, stride=stride, bias=bias
            ),
            nn.BatchNorm2d(int(n_filters)),
        )

    def forward(self, inputs):
        outputs = self.dcb_unit(inputs)
        return outputs


class conv2DBatchNormRelu(nn.Module):
    def __init__(self, in_channels, n_filters, k_size, stride, padding, bias=True, dilation=1):
        super(conv2DBatchNormRelu, self).__init__()

        conv_mod = nn.Conv2d(
            int(in_channels),
            int(n_filters),
            kernel_size=k_size,
            padding=padding,
            stride=stride,
            bias=bias,
            dilation=dilation,
        )

        self.cbr_unit = nn.Sequential(conv_mod, nn.BatchNorm2d(int(n_filters)), nn.ReLU(inplace=True))

    def forward(self, inputs):
        outputs = self.cbr_unit(inputs)
        return outputs


class deconv2DBatchNormRelu(nn.Module):
    def __init__(self, in_channels, n_filters, k_size, stride, padding, bias=True):
        super(deconv2DBatchNormRelu, self).__init__()

        self.dcbr_unit = nn.Sequential(
            nn.ConvTranspose2d(
                int(in_channels), int(n_filters), kernel_size=k_size, padding=padding, stride=stride, bias=bias
            ),
            nn.BatchNorm2d(int(n_filters)),
            nn.ReLU(inplace=True),
        )

    def forward(self, inputs):
        outputs = self.dcbr_unit(inputs)
        return outputs
@ -0,0 +1,119 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import torch

from ignite.engine.engine import Engine
from toolz import curry
from torch.nn import functional as F


def _upscale_model_output(y_pred, y):
    ph, pw = y_pred.size(2), y_pred.size(3)
    h, w = y.size(2), y.size(3)
    if ph != h or pw != w:
        # F.interpolate replaces the deprecated F.upsample
        y_pred = F.interpolate(input=y_pred, size=(h, w), mode="bilinear")
    return y_pred


def create_supervised_trainer(
    model,
    optimizer,
    loss_fn,
    prepare_batch,
    device=None,
    non_blocking=False,
    output_transform=lambda x, y, y_pred, loss: {"loss": loss.item()},
):
    """Factory function for creating a trainer for supervised segmentation models.

    Args:
        model (`torch.nn.Module`): the model to train.
        optimizer (`torch.optim.Optimizer`): the optimizer to use.
        loss_fn (torch.nn loss function): the loss function to use.
        prepare_batch (callable): function that receives `batch`, `device`, `non_blocking` and outputs
            a tuple of tensors `(batch_x, batch_y, patch_id, patch_locations)`.
        device (str, optional): device type specification (default: None).
            Applies to both model and batches.
        non_blocking (bool, optional): if True and this copy is between CPU and GPU, the copy may occur
            asynchronously with respect to the host. For other cases, this argument has no effect.
        output_transform (callable, optional): function that receives `x`, `y`, `y_pred`, `loss` and returns
            the value to be assigned to engine's state.output after each iteration. Default is a dict
            containing `loss.item()`.

    Note: `engine.state.output` for this engine is defined by the `output_transform` parameter and is the
    loss of the processed batch by default.

    Returns:
        Engine: a trainer engine with supervised update function.
    """
    if device:
        model.to(device)

    def _update(engine, batch):
        model.train()
        optimizer.zero_grad()
        x, y, ids, patch_locations = prepare_batch(batch, device=device, non_blocking=non_blocking)
        y_pred = model(x)
        # the model may predict at a lower resolution than the labels
        y_pred = _upscale_model_output(y_pred, y)
        loss = loss_fn(y_pred.squeeze(1), y.squeeze(1))
        loss.backward()
        optimizer.step()
        return output_transform(x, y, y_pred, loss)

    return Engine(_update)


@curry
def val_transform(x, y, y_pred, ids, patch_locations):
    return {
        "image": x,
        "y_pred": y_pred.detach(),
        "mask": y.detach(),
        "ids": ids,
        "patch_locations": patch_locations,
    }


def create_supervised_evaluator(
    model, prepare_batch, metrics=None, device=None, non_blocking=False, output_transform=val_transform,
):
    """Factory function for creating an evaluator for supervised segmentation models.

    Args:
        model (`torch.nn.Module`): the model to evaluate.
        prepare_batch (callable): function that receives `batch`, `device`, `non_blocking` and outputs
            a tuple of tensors `(batch_x, batch_y, patch_id, patch_locations)`.
        metrics (dict of str - :class:`~ignite.metrics.Metric`): a map of metric names to Metrics.
        device (str, optional): device type specification (default: None).
            Applies to both model and batches.
        non_blocking (bool, optional): if True and this copy is between CPU and GPU, the copy may occur
            asynchronously with respect to the host. For other cases, this argument has no effect.
        output_transform (callable, optional): function that receives `x`, `y`, `y_pred`, `ids`,
            `patch_locations` and returns the value to be assigned to engine's state.output after each
            iteration. Default is `val_transform`, which returns a dict of inputs, predictions, masks and
            patch metadata. If you change it you should use `output_transform` in metrics.

    Note: `engine.state.output` for this engine is defined by the `output_transform` parameter.

    Returns:
        Engine: an evaluator engine with supervised inference function.
    """
    metrics = metrics or {}

    if device:
        model.to(device)

    def _inference(engine, batch):
        model.eval()
        with torch.no_grad():
            x, y, ids, patch_locations = prepare_batch(batch, device=device, non_blocking=non_blocking)
            y_pred = model(x)
            y_pred = _upscale_model_output(y_pred, x)
            return output_transform(x, y, y_pred, ids, patch_locations)

    engine = Engine(_inference)

    for name, metric in metrics.items():
        metric.attach(engine, name)

    return engine
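Both factories follow the same pattern: wrap a per-batch function in an ignite `Engine`, which calls it on every batch and stores whatever `output_transform` returns in `engine.state.output`. The pattern can be sketched without ignite or torch; `MiniEngine` and `make_trainer` are hypothetical stand-ins, not the real API:

```python
class MiniEngine:
    """Minimal stand-in for ignite's Engine: run an update function over batches."""

    def __init__(self, update_fn):
        self.update_fn = update_fn
        self.output = None

    def run(self, data):
        for batch in data:
            # state.output always holds the last update's return value
            self.output = self.update_fn(self, batch)
        return self.output


def make_trainer(step, output_transform=lambda loss: {"loss": loss}):
    # step plays the role of forward/backward/optimize in the real trainer
    def _update(engine, batch):
        loss = step(batch)
        return output_transform(loss)

    return MiniEngine(_update)


trainer = make_trainer(step=lambda batch: sum(batch) * 0.1)
result = trainer.run([[1, 2], [3, 4]])
```

The real `create_supervised_trainer` differs only in what `_update` does per batch (prepare, forward, upscale, loss, backward, step), not in this control flow.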
@ -0,0 +1,39 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import numpy as np
from deepseismic_interpretation.dutchf3.data import decode_segmap
from os import path
from PIL import Image
from toolz import pipe


def _chw_to_hwc(image_array_numpy):
    return np.moveaxis(image_array_numpy, 0, -1)


def save_images(pred_dict, output_dir, num_classes, colours, extra_identifier=""):
    for image_id in pred_dict:
        save_image(
            pred_dict[image_id].unsqueeze(0).cpu().numpy(),
            image_id,
            output_dir,
            num_classes,
            colours,
            extra_identifier=extra_identifier,
        )


def save_image(image_numpy_array, image_id, output_dir, num_classes, colours, extra_identifier=""):
    """Save a segmentation map as an image.

    Args:
        image_numpy_array (numpy.ndarray): numpy array that represents an image
        image_id: identifier used to name the output file
        output_dir (str): directory in which to save the image
        num_classes (int): number of segmentation classes
        colours (list): label colours used by decode_segmap
        extra_identifier (str, optional): suffix appended to the filename. Defaults to "".
    """
    im_array = decode_segmap(image_numpy_array, n_classes=num_classes, label_colours=colours)
    im = pipe((im_array * 255).astype(np.uint8).squeeze(), _chw_to_hwc, Image.fromarray)
    # the original formatted the builtin `id` into the filename; pass the
    # identifier in explicitly instead
    filename = path.join(output_dir, f"{image_id}_{extra_identifier}.png")
    im.save(filename)
@ -0,0 +1,19 @@
import os
import logging
import logging.config


def load_log_configuration(log_config_file):
    """Loads the logging configuration from the given configuration file."""
    if not os.path.exists(log_config_file) or not os.path.isfile(log_config_file):
        msg = "{} configuration file does not exist!".format(log_config_file)
        logging.getLogger(__name__).error(msg)
        raise ValueError(msg)
    try:
        logging.config.fileConfig(log_config_file, disable_existing_loggers=False)
        logging.getLogger(__name__).info("%s configuration file was loaded.", log_config_file)
    except Exception as e:
        logging.getLogger(__name__).error("Failed to load configuration from %s!", log_config_file)
        logging.getLogger(__name__).debug(str(e), exc_info=True)
        raise
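`logging.config.fileConfig` expects an INI-style file with `loggers`, `handlers` and `formatters` sections. A minimal config of the kind `load_log_configuration` would consume, written to a temporary file and loaded with the stdlib alone:

```python
import logging
import logging.config
import os
import tempfile

# Minimal fileConfig-style INI: one root logger, one console handler
CONFIG = """
[loggers]
keys=root

[handlers]
keys=console

[formatters]
keys=simple

[logger_root]
level=INFO
handlers=console

[handler_console]
class=StreamHandler
level=INFO
formatter=simple
args=(sys.stderr,)

[formatter_simple]
format=%(levelname)s:%(name)s:%(message)s
"""

with tempfile.NamedTemporaryFile("w", suffix=".ini", delete=False) as f:
    f.write(CONFIG)
    config_path = f.name

logging.config.fileConfig(config_path, disable_existing_loggers=False)
root_level = logging.getLogger().level
os.unlink(config_path)
```

`disable_existing_loggers=False` matters here: without it, loggers created before the call (e.g. module-level `logging.getLogger(__name__)`) would be silenced.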
@ -0,0 +1,9 @@
numpy>=1.16.4
toolz>=0.9.0
pandas>=0.24.2
ignite>=1.1.0
scikit_learn>=0.21.3
tensorboardX>=1.8
torch>=1.2.0
torchvision>=0.4.0
tqdm>=4.33.0
@ -0,0 +1,54 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.
# /* spell-checker: disable */
import os

try:
    from setuptools import setup, find_packages
except ImportError:
    # distutils does not provide find_packages, so fall back to the package name
    from distutils.core import setup

    def find_packages():
        return [NAME]


# Package meta-data.
NAME = "cv_lib"
DESCRIPTION = "A library for computer vision"
URL = ""
EMAIL = "msalvaris@users.noreply.github.com"
AUTHOR = "AUTHORS.md"
LICENSE = ""
LONG_DESCRIPTION = DESCRIPTION


with open("requirements.txt") as f:
    requirements = f.read().splitlines()


here = os.path.abspath(os.path.dirname(__file__))

# Load the package's __version__.py module as a dictionary.
about = {}
with open(os.path.join(here, NAME, "__version__.py")) as f:
    exec(f.read(), about)


setup(
    name=NAME,
    version=about["__version__"],
    url=URL,
    license=LICENSE,
    author=AUTHOR,
    author_email=EMAIL,
    description=DESCRIPTION,
    long_description=LONG_DESCRIPTION,
    scripts=[],
    packages=find_packages(),
    include_package_data=True,
    install_requires=requirements,
    classifiers=[
        "Development Status :: 3 - Alpha",
        "Intended Audience :: Developers",
        "Intended Audience :: Science/Research",
        "Operating System :: POSIX",
        "Operating System :: POSIX :: Linux",
        "Programming Language :: Python :: 3.6",
    ],
)
@ -0,0 +1,126 @@
import torch
import numpy as np
from pytest import approx

from ignite.metrics import ConfusionMatrix, MetricsLambda

from cv_lib.segmentation.metrics import class_accuracy, mean_class_accuracy


# source repo:
# https://github.com/pytorch/ignite/blob/master/tests/ignite/metrics/test_confusion_matrix.py
def _get_y_true_y_pred():
    # Generate an image with labels 0 (background), 1, 2
    # 3 classes:
    y_true = np.zeros((30, 30), dtype=int)
    y_true[1:11, 1:11] = 1
    y_true[15:25, 15:25] = 2

    y_pred = np.zeros((30, 30), dtype=int)
    y_pred[20:30, 1:11] = 1
    y_pred[20:30, 20:30] = 2
    return y_true, y_pred


# source repo:
# https://github.com/pytorch/ignite/blob/master/tests/ignite/metrics/test_confusion_matrix.py
def _compute_th_y_true_y_logits(y_true, y_pred):
    # Create torch.tensor from numpy
    th_y_true = torch.from_numpy(y_true).unsqueeze(0)
    # Create logits torch.tensor:
    num_classes = max(np.max(y_true), np.max(y_pred)) + 1
    y_probas = np.ones((num_classes,) + y_true.shape) * -10
    for i in range(num_classes):
        y_probas[i, (y_pred == i)] = 720
    th_y_logits = torch.from_numpy(y_probas).unsqueeze(0)
    return th_y_true, th_y_logits


# Dependency metrics do not get updated automatically, so we need to retrieve
# and update the confusion matrix manually
def _get_cm(metriclambda):
    metrics = list(metriclambda.args)
    while metrics:
        metric = metrics[0]
        if isinstance(metric, ConfusionMatrix):
            return metric
        elif isinstance(metric, MetricsLambda):
            metrics.extend(metric.args)
        del metrics[0]


def test_class_accuracy():
    y_true, y_pred = _get_y_true_y_pred()

    ## Perfect prediction
    th_y_true, th_y_logits = _compute_th_y_true_y_logits(y_true, y_true)
    # Update metric
    output = (th_y_logits, th_y_true)
    acc_metric = class_accuracy(num_classes=3)
    acc_metric.update(output)

    # Retrieve and update confusion matrix
    metric_cm = _get_cm(acc_metric)
    # assert confusion matrix exists and is all zeroes
    assert metric_cm is not None
    assert torch.min(metric_cm.confusion_matrix) == 0.0 and torch.max(metric_cm.confusion_matrix) == 0.0
    metric_cm.update(output)

    # Expected result
    true_res = [1.0, 1.0, 1.0]
    res = acc_metric.compute().numpy()
    assert np.all(res == true_res), "Result {} vs. expected values {}".format(res, true_res)

    ## Imperfect prediction
    th_y_true, th_y_logits = _compute_th_y_true_y_logits(y_true, y_pred)
    # Update metric
    output = (th_y_logits, th_y_true)
    acc_metric = class_accuracy(num_classes=3)
    acc_metric.update(output)

    # Retrieve and update confusion matrix
    metric_cm = _get_cm(acc_metric)
    assert metric_cm is not None
    assert torch.min(metric_cm.confusion_matrix) == 0.0 and torch.max(metric_cm.confusion_matrix) == 0.0
    metric_cm.update(output)

    # Expected result
    true_res = [0.75, 0.0, 0.25]
    res = acc_metric.compute().numpy()
    assert np.all(res == true_res), "Result {} vs. expected values {}".format(res, true_res)


def test_mean_class_accuracy():
    y_true, y_pred = _get_y_true_y_pred()

    ## Perfect prediction
    th_y_true, th_y_logits = _compute_th_y_true_y_logits(y_true, y_true)
    # Update metric
    output = (th_y_logits, th_y_true)
    acc_metric = mean_class_accuracy(num_classes=3)
    acc_metric.update(output)

    # Retrieve and update confusion matrix
    metric_cm = _get_cm(acc_metric)
    metric_cm.update(output)

    # Expected result
    true_res = 1.0
    res = acc_metric.compute().numpy()
    assert res == approx(true_res), "Result {} vs. expected value {}".format(res, true_res)

    ## Imperfect prediction
    th_y_true, th_y_logits = _compute_th_y_true_y_logits(y_true, y_pred)
    # Update metric
    output = (th_y_logits, th_y_true)
    acc_metric = mean_class_accuracy(num_classes=3)
    acc_metric.update(output)

    # Retrieve and update confusion matrix
    metric_cm = _get_cm(acc_metric)
    metric_cm.update(output)

    # Expected result
    true_res = 1 / 3
    res = acc_metric.compute().numpy()
    assert res == approx(true_res), "Result {} vs. expected value {}".format(res, true_res)
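The per-class accuracy these tests exercise reduces to confusion-matrix arithmetic: the diagonal entry of each row divided by that row's total (the number of true pixels of the class). A stdlib sketch with a hand-made confusion matrix as the assumed input; `class_accuracy_from_cm` is illustrative, not the library function:

```python
def class_accuracy_from_cm(cm):
    # cm[i][j] = count of true class i predicted as class j;
    # per-class accuracy = diagonal / row total (0.0 for empty classes)
    accs = []
    for i, row in enumerate(cm):
        total = sum(row)
        accs.append(row[i] / total if total else 0.0)
    return accs


# 3-class example: class 0 mostly right, class 1 never predicted correctly
cm = [
    [75, 15, 10],
    [50, 0, 50],
    [30, 45, 25],
]
accs = class_accuracy_from_cm(cm)
```

Mean class accuracy is then just the average of these values, which is why an all-correct prediction gives `[1.0, 1.0, 1.0]` per class and `1.0` overall.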
@ -1,3 +0,0 @@
from . import cli, forward, velocity

__all__ = ["cli", "forward", "velocity"]
@ -1,21 +0,0 @@
from functools import partial

import click

from . import forward, velocity

click.option = partial(click.option, show_default=True)


@click.group()
@click.pass_context
def cli(ctx):
    ctx.ensure_object(dict)


cli.add_command(forward.fwd)
cli.add_command(velocity.vp)


def main():
    cli(obj={})
@ -1,123 +0,0 @@
from functools import partial

import click
import h5py
import numpy as np

from ..forward import Receiver, RickerSource, TimeAxis, VelocityModel

click.option = partial(click.option, show_default=True)


@click.group()
@click.argument("input", type=click.Path())
@click.argument("output", type=click.Path())
@click.option("-d", "--duration", default=1000.0, type=float, help="Simulation duration (in ms)")
@click.option("-dt", default=2.0, type=float, help="Time increment (in ms)")
@click.option("--n-pml", default=10, type=int, help="PML size (in grid points)")
@click.option("--n-receivers", default=11, type=int, help="Number of receivers per horizontal dimension")
@click.option("--space-order", default=2, type=int, help="Space order")
@click.option("--spacing", default=10.0, type=float, help="Spacing between grid points")
@click.pass_context
def fwd(
    ctx,
    dt: float,
    duration: float,
    input: str,
    n_pml: int,
    n_receivers: int,
    output: str,
    space_order: int,
    spacing: float,
):
    """Forward modelling"""
    if dt:
        ctx.obj["dt"] = dt
    ctx.obj["duration"] = duration
    ctx.obj["input_file"] = h5py.File(input, mode="r")
    ctx.obj["n_pml"] = n_pml
    ctx.obj["n_receivers"] = n_receivers
    ctx.obj["output_file"] = h5py.File(output, mode="w")
    ctx.obj["space_order"] = space_order
    ctx.obj["spacing"] = spacing


@fwd.command()
@click.option("-f0", default=0.01, type=float, help="Source peak frequency (in kHz)")
@click.pass_context
def ricker(ctx, f0: float):
    """Ricker source"""
    input_file = ctx.obj["input_file"]
    output_file = ctx.obj["output_file"]
    n = sum(len(x.values()) for x in input_file.values())
    with click.progressbar(length=n) as bar:
        for input_group_name, input_group in input_file.items():
            # use the first dataset in the group to set up the model geometry
            for dataset in input_group.values():
                first_dataset = dataset
                break
            model = VelocityModel(
                shape=first_dataset.shape,
                origin=tuple(0.0 for _ in first_dataset.shape),
                spacing=tuple(ctx.obj["spacing"] for _ in first_dataset.shape),
                vp=first_dataset[()],
                space_order=ctx.obj["space_order"],
                n_pml=ctx.obj["n_pml"],
            )
            time_range = TimeAxis(start=0.0, stop=ctx.obj["duration"], step=ctx.obj["dt"])
            source = RickerSource(
                name="source", grid=model.grid, f0=f0, npoint=1, time_range=time_range,
            )
            # place the source at the centre of the surface
            source.coordinates.data[0, :] = np.array(model.domain_size) * 0.5
            source.coordinates.data[0, -1] = 0.0
            n_receivers = ctx.obj["n_receivers"]
            total_receivers = n_receivers ** (len(model.shape) - 1)
            receivers = Receiver(
                name="receivers", grid=model.grid, npoint=total_receivers, time_range=time_range,
            )
            # distribute receivers evenly along the surface, excluding the edges
            receivers_coords = np.meshgrid(
                *(np.linspace(start=0, stop=s, num=n_receivers + 2)[1:-1] for s in model.domain_size[:-1])
            )
            for d in range(len(receivers_coords)):
                receivers.coordinates.data[:, d] = receivers_coords[d].flatten()
            receivers.coordinates.data[:, -1] = 0.0
            output_group = output_file.create_group(input_group_name)
            for input_dataset_name, vp in input_group.items():
                model.vp = vp[()]
                seismograms = model.solve(source=source, receivers=receivers, time_range=time_range)
                output_group.create_dataset(input_dataset_name, data=seismograms)
                bar.update(1)
@ -1,96 +0,0 @@
from functools import partial
from itertools import islice
from typing import Tuple

import click
import h5py

from ..velocity import RoethTarantolaGenerator

click.option = partial(click.option, show_default=True)


@click.group()
@click.argument("output", type=click.Path())
@click.option("--append/--no-append", default=False, help="Whether to append to output file")
@click.option("-n", default=1, type=int, help="Number of simulations")
@click.option("-nx", default=100, type=int, help="Number of grid points along the first dimension")
@click.option("-ny", default=100, type=int, help="Number of grid points along the second dimension")
@click.option("-nz", type=int, help="Number of grid points along the third dimension")
@click.option("-s", "--seed", default=42, type=int, help="Random seed")
@click.pass_context
def vp(
    ctx, append: bool, n: int, nx: int, ny: int, nz: int, output: str, seed: int,
):
    """Vp simulation"""
    shape = (nx, ny)
    if nz is not None:
        shape += (nz,)
    output_file = h5py.File(output, mode=("a" if append else "w"))
    # number the new group one past the largest existing numeric group
    output_group = output_file.create_group(str(max((int(x) for x in output_file.keys()), default=-1) + 1))
    ctx.obj["n"] = n
    ctx.obj["output_file"] = output_file
    ctx.obj["output_group"] = output_group
    ctx.obj["seed"] = seed
    ctx.obj["shape"] = shape


@vp.command()
@click.option("--n-layers", default=8, type=int, help="Number of layers")
@click.option("--initial-vp", default=(1350.0, 1650.0), type=(float, float), help="Initial Vp (in km/s)")
@click.option(
    "--vp-perturbation", default=(-190.0, 570.0), type=(float, float), help="Per-layer Vp perturbation (in km/s)",
)
@click.pass_context
def rt(
    ctx, initial_vp: Tuple[float, float], n_layers: int, vp_perturbation: Tuple[float, float],
):
    """Röth-Tarantola model"""
    model = RoethTarantolaGenerator(
        shape=ctx.obj["shape"],
        seed=ctx.obj["seed"],
        n_layers=n_layers,
        initial_vp=initial_vp,
        vp_perturbation=vp_perturbation,
    )
    group = ctx.obj["output_group"]
    with click.progressbar(length=ctx.obj["n"]) as bar:
        for i, data in enumerate(islice(model.generate_many(), ctx.obj["n"])):
            group.create_dataset(str(i), data=data, compression="gzip")
            bar.update(1)
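The CLI above passes `n_layers`, `initial_vp` and `vp_perturbation` to `RoethTarantolaGenerator`, whose internals are not shown in this diff. The general layered-velocity idea behind such a generator can be sketched with the stdlib; everything here (the function name, the equal-thickness layering) is an illustrative assumption, not the actual model:

```python
import random


def layered_velocities(n_depth, n_layers, initial_vp, perturbation, seed=42):
    # Hypothetical sketch: split the depth axis into n_layers roughly equal
    # layers, draw a starting velocity from initial_vp, then perturb it
    # layer by layer by a draw from the perturbation range.
    rng = random.Random(seed)
    vp = rng.uniform(*initial_vp)
    profile = []
    boundaries = [round(i * n_depth / n_layers) for i in range(n_layers + 1)]
    for k in range(n_layers):
        thickness = boundaries[k + 1] - boundaries[k]
        profile.extend([vp] * thickness)  # constant velocity within a layer
        vp += rng.uniform(*perturbation)
    return profile


# mirror the CLI defaults: 100 depth samples, 8 layers
profile = layered_velocities(100, 8, (1350.0, 1650.0), (-190.0, 570.0))
```

Fixing the seed, as the `-s/--seed` option does, makes repeated runs reproduce the same profiles.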
@ -1,14 +0,0 @@
from .models import Model, VelocityModel
from .sources import Receiver, RickerSource, WaveletSource
from .time import TimeAxis
from .types import Kernel

__all__ = [
    "Kernel",
    "Model",
    "Receiver",
    "RickerSource",
    "TimeAxis",
    "VelocityModel",
    "WaveletSource",
]
@ -1,162 +0,0 @@
|
|||
from typing import Optional, Tuple, Union
|
||||
|
||||
import numpy as np
|
||||
from devito import (
|
||||
Constant,
|
||||
Eq,
|
||||
Function,
|
||||
Grid,
|
||||
Operator,
|
||||
SubDomain,
|
||||
TimeFunction,
|
||||
logger,
|
||||
solve,
|
||||
)
|
from .sources import PointSource
from .subdomains import PhysicalDomain
from .time import TimeAxis
from .types import Kernel

logger.set_log_level("WARNING")


class Model(object):
    def __init__(
        self,
        shape: Tuple[int, ...],
        origin: Tuple[float, ...],
        spacing: Tuple[float, ...],
        n_pml: Optional[int] = 0,
        dtype: Optional[type] = np.float32,
        subdomains: Optional[Tuple[SubDomain]] = (),
    ):
        shape = tuple(int(x) for x in shape)
        origin = tuple(dtype(x) for x in origin)
        n_pml = int(n_pml)
        subdomains = tuple(subdomains) + (PhysicalDomain(n_pml),)
        shape_pml = tuple(x + 2 * n_pml for x in shape)
        extent_pml = tuple(s * (d - 1) for s, d in zip(spacing, shape_pml))
        origin_pml = tuple(
            dtype(o - s * n_pml) for o, s in zip(origin, spacing)
        )
        self.grid = Grid(
            shape=shape_pml,
            extent=extent_pml,
            origin=origin_pml,
            dtype=dtype,
            subdomains=subdomains,
        )
        self.n_pml = n_pml
        self.pml = Function(name="pml", grid=self.grid)
        pml_data = np.pad(
            np.zeros(shape, dtype=dtype),
            [(n_pml,) * 2 for _ in range(self.pml.ndim)],
            mode="edge",
        )
        pml_coef = 1.5 * np.log(1000.0) / 40.0
        for d in range(self.pml.ndim):
            for i in range(n_pml):
                pos = np.abs((n_pml - i + 1) / n_pml)
                val = pml_coef * (pos - np.sin(2 * np.pi * pos) / (2 * np.pi))
                idx = [slice(0, x) for x in pml_data.shape]
                idx[d] = slice(i, i + 1)
                pml_data[tuple(idx)] += val / self.grid.spacing[d]
                idx[d] = slice(
                    pml_data.shape[d] - i, pml_data.shape[d] - i + 1
                )
                pml_data[tuple(idx)] += val / self.grid.spacing[d]
        pml_data = np.pad(
            pml_data,
            [(i.left, i.right) for i in self.pml._size_halo],
            mode="edge",
        )
        self.pml.data_with_halo[:] = pml_data
        self.shape = shape

    @property
    def domain_size(self) -> Tuple[float, ...]:
        return tuple((d - 1) * s for d, s in zip(self.shape, self.spacing))

    @property
    def dtype(self) -> type:
        return self.grid.dtype

    @property
    def spacing(self):
        return self.grid.spacing

    @property
    def spacing_map(self):
        return self.grid.spacing_map

    @property
    def time_spacing(self):
        return self.grid.stepping_dim.spacing

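The PML loop above builds a one-dimensional damping profile and accumulates it along each edge of the grid. A minimal standalone sketch of that profile in pure NumPy (outside Devito; the `n_pml` and `spacing` values here are illustrative only):

```python
import numpy as np

n_pml = 10       # number of absorbing cells on one edge (illustrative)
spacing = 10.0   # grid spacing in that dimension (illustrative)

# Same damping expression as in Model.__init__ above.
pml_coef = 1.5 * np.log(1000.0) / 40.0
profile = np.zeros(n_pml)
for i in range(n_pml):
    pos = np.abs((n_pml - i + 1) / n_pml)
    profile[i] = pml_coef * (pos - np.sin(2 * np.pi * pos) / (2 * np.pi)) / spacing

# Damping is strongest at the outer boundary (i = 0) and decays toward the interior.
assert profile[0] > profile[-1] >= 0.0
```
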
class VelocityModel(Model):
    def __init__(
        self,
        shape: Tuple[int, ...],
        origin: Tuple[float, ...],
        spacing: Tuple[float, ...],
        vp: Union[float, np.ndarray],
        space_order: Optional[int] = None,
        n_pml: Optional[int] = 0,
        dtype: Optional[type] = np.float32,
        subdomains: Optional[Tuple[SubDomain]] = (),
    ):
        super().__init__(shape, origin, spacing, n_pml, dtype, subdomains)
        if isinstance(vp, np.ndarray):
            assert space_order is not None
            self.m = Function(
                name="m", grid=self.grid, space_order=int(space_order)
            )
        else:
            self.m = Constant(name="m", value=1.0 / float(vp) ** 2.0)
        self.vp = vp

    @property
    def vp(self) -> Union[float, np.ndarray]:
        return self._vp

    @vp.setter
    def vp(self, vp: Union[float, np.ndarray]) -> None:
        self._vp = vp
        if isinstance(vp, np.ndarray):
            pad_widths = [
                (self.n_pml + i.left, self.n_pml + i.right)
                for i in self.m._size_halo
            ]
            self.m.data_with_halo[:] = np.pad(
                1.0 / self.vp ** 2.0, pad_widths, mode="edge"
            )
        else:
            self.m.data = 1.0 / float(vp) ** 2.0

    def solve(
        self,
        source: PointSource,
        receivers: PointSource,
        time_range: TimeAxis,
        space_order: Optional[int] = 4,
        kernel: Optional[Kernel] = Kernel.OT2,
    ) -> np.ndarray:
        assert isinstance(kernel, Kernel)
        u = TimeFunction(
            name="u", grid=self.grid, time_order=2, space_order=space_order
        )
        H = u.laplace
        if kernel is Kernel.OT4:
            H += self.time_spacing ** 2 / 12 * u.laplace2(1 / self.m)
        eq = Eq(
            u.forward, solve(self.m * u.dt2 - H + self.pml * u.dt, u.forward)
        )
        src_term = source.inject(
            field=u.forward, expr=source * self.time_spacing ** 2 / self.m
        )
        rec_term = receivers.interpolate(expr=u)
        op = Operator([eq] + src_term + rec_term, subs=self.spacing_map)
        op(time=time_range.num - 1, dt=time_range.step)
        return receivers.data
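The `vp` setter stores the squared slowness m = 1/vp² on the padded grid, replicating the edge values of the velocity model into the absorbing layer. A standalone sketch of that padding step in pure NumPy (the 1-D `vp` array and `n_pml` value are illustrative only):

```python
import numpy as np

vp = np.array([1.5, 2.0, 2.5], dtype=np.float32)  # velocities in km/s (illustrative)
n_pml = 2                                         # absorbing cells per edge (illustrative)

# Squared slowness, extended into the PML region by edge replication,
# as in the VelocityModel.vp setter above.
m = np.pad(1.0 / vp ** 2.0, (n_pml, n_pml), mode="edge")

assert m.shape == (vp.size + 2 * n_pml,)
assert m[0] == m[n_pml]                 # edge value copied into the pad
assert np.isclose(m[0], 1.0 / 1.5 ** 2)
```
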
@ -1,132 +0,0 @@
from typing import Optional

import numpy as np
import sympy
from devito.types import Dimension, SparseTimeFunction
from devito.types.basic import _SymbolCache
from scipy import interpolate

from .time import TimeAxis


class PointSource(SparseTimeFunction):
    def __new__(cls, *args, **kwargs):
        if cls in _SymbolCache:
            options = kwargs.get("options", {})
            obj = sympy.Function.__new__(cls, *args, **options)
            obj._cached_init()
            return obj
        name = kwargs.pop("name")
        grid = kwargs.pop("grid")
        time_range = kwargs.pop("time_range")
        time_order = kwargs.pop("time_order", 2)
        p_dim = kwargs.pop("dimension", Dimension(name="p_%s" % name))
        npoint = kwargs.pop("npoint", None)
        coordinates = kwargs.pop(
            "coordinates", kwargs.pop("coordinates_data", None)
        )
        if npoint is None:
            assert (
                coordinates is not None
            ), "Either `npoint` or `coordinates` must be provided"
            npoint = coordinates.shape[0]
        obj = SparseTimeFunction.__new__(
            cls,
            name=name,
            grid=grid,
            dimensions=(grid.time_dim, p_dim),
            npoint=npoint,
            nt=time_range.num,
            time_order=time_order,
            coordinates=coordinates,
            **kwargs
        )
        obj._time_range = time_range
        data = kwargs.get("data")
        if data is not None:
            obj.data[:] = data
        return obj

    @property
    def time_range(self) -> TimeAxis:
        return self._time_range

    @property
    def time_values(self) -> np.ndarray:
        return self._time_range.time_values

    def resample(
        self,
        dt: Optional[float] = None,
        num: Optional[int] = None,
        rtol: Optional[float] = 1.0e-5,
        order: Optional[int] = 3,
    ):
        assert (dt is not None) ^ (
            num is not None
        ), "Exactly one of `dt` or `num` must be provided"
        start = self._time_range.start
        stop = self._time_range.stop
        dt0 = self._time_range.step
        if dt is not None:
            new_time_range = TimeAxis(start=start, stop=stop, step=dt)
        else:
            new_time_range = TimeAxis(start=start, stop=stop, num=num)
            dt = new_time_range.step
        if np.isclose(dt0, dt, rtol=rtol):
            return self
        n_traces = self.data.shape[1]
        new_traces = np.zeros(
            (new_time_range.num, n_traces), dtype=self.data.dtype
        )
        for j in range(n_traces):
            tck = interpolate.splrep(
                self._time_range.time_values, self.data[:, j], k=order
            )
            new_traces[:, j] = interpolate.splev(
                new_time_range.time_values, tck
            )
        return PointSource(
            name=self.name,
            grid=self.grid,
            time_range=new_time_range,
            coordinates=self.coordinates.data,
            data=new_traces,
        )

    _pickle_kwargs = SparseTimeFunction._pickle_kwargs + ["time_range"]
    _pickle_kwargs.remove("nt")  # Inferred from time_range


class Receiver(PointSource):
    pass

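`resample` fits an interpolating spline per trace with SciPy's `splrep`/`splev` and evaluates it on the new time axis. A self-contained sketch of that per-trace step (the toy trace, sampling rates, and frequency are illustrative only):

```python
import numpy as np
from scipy import interpolate

# A toy trace sampled at dt0 = 4 ms, resampled to dt = 2 ms (values illustrative).
t_old = np.arange(0.0, 100.0 + 4.0, 4.0)
trace = np.sin(2 * np.pi * 0.03 * t_old)
t_new = np.arange(0.0, 100.0 + 2.0, 2.0)

# Cubic spline fit and evaluation, as in PointSource.resample above.
tck = interpolate.splrep(t_old, trace, k=3)
resampled = interpolate.splev(t_new, tck)

# The original samples are reproduced (splrep interpolates when no smoothing is set).
assert resampled.shape == t_new.shape
assert np.allclose(resampled[::2], trace, atol=1e-3)
```
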
class WaveletSource(PointSource):
    def __new__(cls, *args, **kwargs):
        if cls in _SymbolCache:
            options = kwargs.get("options", {})
            obj = sympy.Function.__new__(cls, *args, **options)
            obj._cached_init()
            return obj
        npoint = kwargs.pop("npoint", 1)
        obj = PointSource.__new__(cls, npoint=npoint, **kwargs)
        obj.f0 = kwargs.get("f0")
        for p in range(npoint):
            obj.data[:, p] = obj.wavelet(obj.f0, obj.time_values)
        return obj

    def __init__(self, *args, **kwargs):
        if not self._cached():
            super(WaveletSource, self).__init__(*args, **kwargs)

    def wavelet(self, f0: float, t: np.ndarray) -> np.ndarray:
        raise NotImplementedError

    _pickle_kwargs = PointSource._pickle_kwargs + ["f0"]


class RickerSource(WaveletSource):
    def wavelet(self, f0: float, t: np.ndarray) -> np.ndarray:
        r = np.pi * f0 * (t - 1.0 / f0)
        return (1.0 - 2.0 * r ** 2.0) * np.exp(-r ** 2.0)
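The Ricker wavelet above peaks at amplitude 1.0 exactly one period delay after time zero, i.e. at t = 1/f0. A quick numerical check of those properties (the frequency and time axis are illustrative only):

```python
import numpy as np

def ricker(f0, t):
    # Same expression as RickerSource.wavelet above.
    r = np.pi * f0 * (t - 1.0 / f0)
    return (1.0 - 2.0 * r ** 2.0) * np.exp(-r ** 2.0)

f0 = 0.010                            # peak frequency in kHz (10 Hz), illustrative
t = np.linspace(0.0, 300.0, 3001)     # time axis in ms
w = ricker(f0, t)

# Maximum amplitude 1.0 occurs at t = 1/f0 (here ~100 ms).
assert np.isclose(w.max(), 1.0)
assert np.isclose(t[np.argmax(w)], 1.0 / f0)
```
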
@ -1,16 +0,0 @@
from typing import Dict, Iterable, Tuple

from devito import Dimension, SubDomain


class PhysicalDomain(SubDomain):
    name = "physical_domain"

    def __init__(self, n_pml: int):
        super().__init__()
        self.n_pml = n_pml

    def define(
        self, dimensions: Iterable[Dimension]
    ) -> Dict[Dimension, Tuple[str, int, int]]:
        return {d: ("middle", self.n_pml, self.n_pml) for d in dimensions}
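`define` marks the inner `("middle", n_pml, n_pml)` region of each dimension as the physical domain, leaving `n_pml` cells on either side for the absorbing layer. The returned mapping has this shape (strings stand in for Devito `Dimension` objects here, for illustration only):

```python
n_pml = 10
dimensions = ("x", "y")  # stand-ins for devito Dimension objects (illustrative)

# Same dict comprehension as PhysicalDomain.define above.
mapping = {d: ("middle", n_pml, n_pml) for d in dimensions}

assert mapping["x"] == ("middle", 10, 10)
assert set(mapping) == {"x", "y"}
```
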
@ -1,34 +0,0 @@
from typing import Optional

import numpy as np


class TimeAxis(object):
    def __init__(
        self,
        start: Optional[float] = None,
        stop: Optional[float] = None,
        num: Optional[int] = None,
        step: Optional[float] = None,
        dtype: Optional[type] = np.float32,
    ):
        if start is None:
            start = step * (1 - num) + stop
        elif stop is None:
            stop = step * (num - 1) + start
        elif num is None:
            num = int(np.ceil((stop - start + step) / step))
            stop = step * (num - 1) + start
        elif step is None:
            step = (stop - start) / (num - 1)
        else:
            raise ValueError
        self.start = start
        self.stop = stop
        self.num = num
        self.step = step
        self.dtype = dtype

    @property
    def time_values(self) -> np.ndarray:
        return np.linspace(self.start, self.stop, self.num, dtype=self.dtype)
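`TimeAxis` takes exactly three of `start`, `stop`, `num`, and `step` and derives the fourth; when `num` is derived from `step`, `stop` is then adjusted upward so the axis covers a whole number of steps. A standalone check of that arithmetic (the values are illustrative only):

```python
import numpy as np

start, stop, step = 0.0, 1001.0, 4.0  # times in ms (illustrative)

# Branch taken in TimeAxis.__init__ when `num` is None:
num = int(np.ceil((stop - start + step) / step))
stop_adjusted = step * (num - 1) + start

assert num == 252
assert stop_adjusted == 1004.0   # rounded up to a whole number of steps
assert stop_adjusted >= stop
```
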
@ -1,6 +0,0 @@
from enum import Enum, auto


class Kernel(Enum):
    OT2 = auto()
    OT4 = auto()

@ -1,4 +0,0 @@
from .generator import Generator
from .roeth_tarantola import RoethTarantolaGenerator

__all__ = ["Generator", "RoethTarantolaGenerator"]
@ -1,22 +0,0 @@
from typing import Iterator, Optional, Tuple

import numpy as np


class Generator(object):
    def __init__(
        self,
        shape: Tuple[int, ...],
        dtype: Optional[type] = np.float32,
        seed: Optional[int] = None,
    ):
        self.shape = shape
        self.dtype = dtype
        self._prng = np.random.RandomState(seed)

    def generate(self) -> np.ndarray:
        raise NotImplementedError

    def generate_many(self) -> Iterator[np.ndarray]:
        while True:
            yield self.generate()
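`Generator` is an abstract base: subclasses override `generate`, and `generate_many` then yields fresh models indefinitely. A minimal hypothetical subclass mirroring that interface (the `ConstantGenerator` class is illustrative and not part of the package):

```python
from itertools import islice
from typing import Optional, Tuple

import numpy as np


class ConstantGenerator:
    """Hypothetical generator mirroring the Generator interface above."""

    def __init__(self, shape: Tuple[int, ...], value: float = 2.0,
                 dtype: Optional[type] = np.float32, seed: Optional[int] = None):
        self.shape = shape
        self.dtype = dtype
        self.value = value
        self._prng = np.random.RandomState(seed)

    def generate(self) -> np.ndarray:
        # A trivial override: a uniform velocity model.
        return np.full(self.shape, self.value, dtype=self.dtype)

    def generate_many(self):
        while True:
            yield self.generate()


gen = ConstantGenerator(shape=(4, 4))
models = list(islice(gen.generate_many(), 3))  # take 3 from the infinite stream
assert len(models) == 3 and models[0].shape == (4, 4)
```
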
@ -1,41 +0,0 @@
from typing import Optional, Tuple

import numpy as np

from .generator import Generator


class RoethTarantolaGenerator(Generator):
    def __init__(
        self,
        shape: Tuple[int, ...],
        dtype: Optional[type] = np.float32,
        seed: Optional[int] = None,
        depth_dim: Optional[int] = -1,
        n_layers: Optional[int] = 8,
        initial_vp: Optional[Tuple[float, float]] = (1.35, 1.65),
        vp_perturbation: Optional[Tuple[float, float]] = (-0.19, 0.57),
    ):
        super().__init__(shape, dtype, seed)
        self.depth_dim = depth_dim
        self.n_layers = n_layers
        self.initial_vp = initial_vp
        self.vp_perturbation = vp_perturbation

    def generate(self) -> np.ndarray:
        vp = np.zeros(self.shape, dtype=self.dtype)
        dim = self.depth_dim
        layer_idx = np.round(
            np.linspace(0, self.shape[dim], self.n_layers + 1)
        ).astype(int)  # np.int is deprecated; use the builtin
        vp_idx = [slice(0, x) for x in vp.shape]
        layer_vp = None
        for i in range(self.n_layers):
            vp_idx[dim] = slice(layer_idx[i], layer_idx[i + 1])
            layer_vp = (
                self._prng.uniform(*self.initial_vp)
                if layer_vp is None
                else layer_vp + self._prng.uniform(*self.vp_perturbation)
            )
            vp[tuple(vp_idx)] = layer_vp
        return vp
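The generator fills the depth axis layer by layer: boundaries come from `np.round(np.linspace(...))`, the first layer velocity is drawn uniformly from `initial_vp`, and each subsequent layer offsets the previous one by a uniform draw from `vp_perturbation`. A standalone sketch of that fill for a single 1-D depth column (pure NumPy; the seed and sizes are illustrative only):

```python
import numpy as np

prng = np.random.RandomState(42)
depth, n_layers = 100, 8
initial_vp, vp_perturbation = (1.35, 1.65), (-0.19, 0.57)

layer_idx = np.round(np.linspace(0, depth, n_layers + 1)).astype(int)
vp = np.zeros(depth, dtype=np.float32)
layer_vp = None
for i in range(n_layers):
    # First layer drawn from initial_vp; later layers perturb the previous one.
    layer_vp = (prng.uniform(*initial_vp) if layer_vp is None
                else layer_vp + prng.uniform(*vp_perturbation))
    vp[layer_idx[i]:layer_idx[i + 1]] = layer_vp

assert vp.shape == (100,)
assert len(np.unique(vp)) == n_layers            # piecewise-constant layers
assert initial_vp[0] <= vp[0] <= initial_vp[1]   # first layer within initial range
```
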
@ -0,0 +1,6 @@
# Documentation

To set up the documentation, first install the dependencies of the full environment by following [SETUP.md](../SETUP.md).

TODO: add more text
@ -0,0 +1,38 @@
name: seismic-interpretation
channels:
  - conda-forge
  - pytorch
dependencies:
  - python=3.6.7
  - pip
  - pytorch==1.3.1
  - cudatoolkit==10.1.243
  - jupyter
  - ipykernel
  - torchvision==0.4.2
  - pandas==0.25.3
  - opencv==4.1.2
  - scikit-learn==0.21.3
  - tensorflow==2.0
  - opt-einsum>=2.3.2
  - tqdm==4.39.0
  - itkwidgets==0.23.1
  - pytest
  - papermill>=1.0.1
  - pip:
    - segyio==1.8.8
    - pytorch-ignite==0.3.0.dev20191105 # pre-release until stable available
    - fire==0.2.1
    - toolz==0.10.0
    - tabulate==0.8.2
    - Jinja2==2.10.3
    - gitpython==3.0.5
    - tensorboard==2.0.1
    - tensorboardx==1.9
    - invoke==1.3.0
    - yacs==0.1.6
    - albumentations==0.4.3
    - black
    - pylint
    - scipy==1.1.0
    - jupytext==1.3.0
@ -0,0 +1,51 @@
define PROJECT_HELP_MSG
Makefile to control project aml_dist
Usage:
	help		show this message
	build		build docker image to use as control plane
	bash		run bash inside running docker container
	stop		stop running docker container
endef
export PROJECT_HELP_MSG
PWD:=$(shell pwd)
PORT:=9999
TBOARD_PORT:=6006
IMAGE_NAME:=ignite_image
NAME:=ignite_container # Name of running container
DATA:=/mnt

BASEDIR:=$(shell dirname $(shell dirname ${PWD}))

local_code_volume:=-v $(BASEDIR):/workspace
volumes:=-v $(DATA):/data \
	-v ${HOME}/.bash_history:/root/.bash_history


help:
	echo "$$PROJECT_HELP_MSG" | less

build:
	docker build -t $(IMAGE_NAME) -f dockerfile .

run:
	# Start docker running as daemon
	docker run $(local_code_volume) $(volumes) $(setup_environment_file) \
	--shm-size="4g" \
	--runtime=nvidia \
	--name $(NAME) \
	-d \
	-v /var/run/docker.sock:/var/run/docker.sock \
	-e HIST_FILE=/root/.bash_history \
	-it $(IMAGE_NAME)

	docker exec -it $(NAME) bash


bash:
	docker exec -it $(NAME) bash

stop:
	docker stop $(NAME)
	docker rm $(NAME)

.PHONY: help build run bash stop
@ -0,0 +1,16 @@
FROM pytorch/pytorch:nightly-devel-cuda10.0-cudnn7

RUN apt-get update && apt-get install -y --no-install-recommends \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender-dev

RUN git clone https://github.com/NVIDIA/apex && \
    cd apex && \
    pip install -v --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./

RUN pip install toolz pytorch-ignite torchvision pandas opencv-python fire tensorboardx scikit-learn yacs

WORKDIR /workspace
CMD /bin/bash
@ -0,0 +1,56 @@
define PROJECT_HELP_MSG
Makefile to control project aml_dist
Usage:
	help		show this message
	build		build docker image to use as control plane
	bash		run bash inside running docker container
	stop		stop running docker container
endef
export PROJECT_HELP_MSG
PWD:=$(shell pwd)
PORT:=9999
TBOARD_PORT:=6006
IMAGE_NAME:=horovod_image
NAME:=horovod_container # Name of running container
DATA:=/mnt

BASEDIR:=$(shell dirname $(shell dirname $(shell dirname ${PWD})))
REPODIR:=$(shell dirname ${BASEDIR})

local_code_volume:=-v $(BASEDIR):/workspace
volumes:=-v $(DATA):/data \
	-v ${HOME}/.bash_history:/root/.bash_history

help:
	echo "$$PROJECT_HELP_MSG" | less

build:
	docker build -t $(IMAGE_NAME) -f dockerfile ${REPODIR}

run:
	@echo ${BASEDIR}
	# Start docker running as daemon
	docker run $(local_code_volume) $(volumes) $(setup_environment_file) \
	--privileged \
	--shm-size="4g" \
	--runtime=nvidia \
	--name $(NAME) \
	-d \
	-v /var/run/docker.sock:/var/run/docker.sock \
	-e HIST_FILE=/root/.bash_history \
	-it $(IMAGE_NAME)

	docker exec -it $(NAME) bash


run-horovod:
	docker exec -it $(NAME) mpirun -np 2 -bind-to none -map-by slot -x NCCL_DEBUG=INFO -x LD_LIBRARY_PATH -x PATH -mca pml ob1 -mca btl ^openib python train_horovod.py

bash:
	docker exec -it $(NAME) bash

stop:
	docker stop $(NAME)
	docker rm $(NAME)

.PHONY: help build run bash stop
@ -0,0 +1,130 @@
FROM nvidia/cuda:10.0-devel-ubuntu18.04
# Based on default horovod image

ENV PYTORCH_VERSION=1.1.0
ENV TORCHVISION_VERSION=0.3.0
ENV CUDNN_VERSION=7.6.0.64-1+cuda10.0
ENV NCCL_VERSION=2.4.7-1+cuda10.0

# Python 2.7 or 3.6 is supported by Ubuntu Bionic out of the box
ARG python=3.6
ENV PYTHON_VERSION=${python}

# Set default shell to /bin/bash
SHELL ["/bin/bash", "-cu"]

# We need gcc-4.9 to build plugins for TensorFlow & PyTorch, which is only available in Ubuntu Xenial
RUN echo deb http://archive.ubuntu.com/ubuntu xenial main universe | tee -a /etc/apt/sources.list
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y --no-install-recommends --allow-change-held-packages --allow-downgrades \
    build-essential \
    cmake \
    gcc-4.9 \
    g++-4.9 \
    gcc-4.9-base \
    software-properties-common \
    git \
    curl \
    wget \
    ca-certificates \
    libcudnn7=${CUDNN_VERSION} \
    libnccl2=${NCCL_VERSION} \
    libnccl-dev=${NCCL_VERSION} \
    libjpeg-dev \
    libpng-dev \
    python${PYTHON_VERSION} \
    python${PYTHON_VERSION}-dev \
    librdmacm1 \
    libibverbs1 \
    ibverbs-utils \
    ibutils \
    net-tools \
    ibverbs-providers \
    libglib2.0-0 \
    libsm6 \
    libxext6 \
    libxrender-dev

RUN if [[ "${PYTHON_VERSION}" == "3.6" ]]; then \
    apt-get install -y python${PYTHON_VERSION}-distutils; \
    fi
RUN ln -s /usr/bin/python${PYTHON_VERSION} /usr/bin/python

RUN curl -O https://bootstrap.pypa.io/get-pip.py && \
    python get-pip.py && \
    rm get-pip.py

# Install PyTorch
RUN pip install future typing
RUN pip install numpy
RUN pip install https://download.pytorch.org/whl/cu100/torch-${PYTORCH_VERSION}-$(python -c "import wheel.pep425tags as w; print('-'.join(w.get_supported()[0]))").whl \
    https://download.pytorch.org/whl/cu100/torchvision-${TORCHVISION_VERSION}-$(python -c "import wheel.pep425tags as w; print('-'.join(w.get_supported()[0]))").whl
RUN pip install --no-cache-dir torchvision h5py toolz pytorch-ignite pandas opencv-python fire tensorboardx scikit-learn tqdm yacs albumentations gitpython
COPY ComputerVision_fork/contrib /contrib
RUN pip install -e /contrib
COPY DeepSeismic /DeepSeismic
RUN pip install -e DeepSeismic/interpretation

# Install Open MPI
RUN mkdir /tmp/openmpi && \
    cd /tmp/openmpi && \
    wget https://www.open-mpi.org/software/ompi/v4.0/downloads/openmpi-4.0.0.tar.gz && \
    tar zxf openmpi-4.0.0.tar.gz && \
    cd openmpi-4.0.0 && \
    ./configure --enable-orterun-prefix-by-default && \
    make -j $(nproc) all && \
    make install && \
    ldconfig && \
    rm -rf /tmp/openmpi

# Pin GCC to 4.9 (priority 200) to compile correctly against TensorFlow, PyTorch, and MXNet.
# Back up the existing GCC installation as priority 100, so that it can be recovered later.
RUN update-alternatives --install /usr/bin/gcc gcc $(readlink -f $(which gcc)) 100 && \
    update-alternatives --install /usr/bin/x86_64-linux-gnu-gcc x86_64-linux-gnu-gcc $(readlink -f $(which gcc)) 100 && \
    update-alternatives --install /usr/bin/g++ g++ $(readlink -f $(which g++)) 100 && \
    update-alternatives --install /usr/bin/x86_64-linux-gnu-g++ x86_64-linux-gnu-g++ $(readlink -f $(which g++)) 100
RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.9 200 && \
    update-alternatives --install /usr/bin/x86_64-linux-gnu-gcc x86_64-linux-gnu-gcc /usr/bin/gcc-4.9 200 && \
    update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-4.9 200 && \
    update-alternatives --install /usr/bin/x86_64-linux-gnu-g++ x86_64-linux-gnu-g++ /usr/bin/g++-4.9 200

# Install Horovod, temporarily using CUDA stubs
RUN ldconfig /usr/local/cuda/targets/x86_64-linux/lib/stubs && \
    HOROVOD_GPU_ALLREDUCE=NCCL HOROVOD_WITH_PYTORCH=1 pip install --no-cache-dir horovod && \
    ldconfig

# Remove GCC pinning
RUN update-alternatives --remove gcc /usr/bin/gcc-4.9 && \
    update-alternatives --remove x86_64-linux-gnu-gcc /usr/bin/gcc-4.9 && \
    update-alternatives --remove g++ /usr/bin/g++-4.9 && \
    update-alternatives --remove x86_64-linux-gnu-g++ /usr/bin/g++-4.9

# Create a wrapper for OpenMPI to allow running as root by default
RUN mv /usr/local/bin/mpirun /usr/local/bin/mpirun.real && \
    echo '#!/bin/bash' > /usr/local/bin/mpirun && \
    echo 'mpirun.real --allow-run-as-root "$@"' >> /usr/local/bin/mpirun && \
    chmod a+x /usr/local/bin/mpirun

# Configure OpenMPI with good defaults:
# --bind-to none --map-by slot --mca btl_tcp_if_exclude lo,docker0
RUN echo "hwloc_base_binding_policy = none" >> /usr/local/etc/openmpi-mca-params.conf && \
    echo "rmaps_base_mapping_policy = slot" >> /usr/local/etc/openmpi-mca-params.conf
# echo "btl_tcp_if_exclude = lo,docker0" >> /usr/local/etc/openmpi-mca-params.conf

# Set default NCCL parameters
RUN echo NCCL_DEBUG=INFO >> /etc/nccl.conf && \
    echo NCCL_SOCKET_IFNAME=^docker0 >> /etc/nccl.conf

# Install OpenSSH for MPI to communicate between containers
RUN apt-get install -y --no-install-recommends openssh-client openssh-server && \
    mkdir -p /var/run/sshd

# Allow OpenSSH to talk to containers without asking for confirmation
RUN cat /etc/ssh/ssh_config | grep -v StrictHostKeyChecking > /etc/ssh/ssh_config.new && \
    echo "    StrictHostKeyChecking no" >> /etc/ssh/ssh_config.new && \
    mv /etc/ssh/ssh_config.new /etc/ssh/ssh_config

WORKDIR /workspace
CMD /bin/bash
@ -0,0 +1 @@
Description of examples

(File diff not shown because one or more lines are too long.)

@ -0,0 +1,654 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Copyright (c) Microsoft Corporation.\n",
"\n",
"Licensed under the MIT License."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# HRNet training and validation on numpy dataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this notebook, we demonstrate how to train an HRNet model for facies prediction using the [Penobscot](https://zenodo.org/record/1341774#.XepaaUB2vOg) dataset. The Penobscot 3D seismic dataset was acquired on the Scotian Shelf, offshore Nova Scotia, Canada. Please refer to the top-level [README.md](../../../README.md) file to download and prepare this dataset for the experiments.\n",
"\n",
"The data expected in this notebook needs to be in the form of two 3D numpy arrays. One array will contain the seismic information, the other the mask. The network will be trained to take a 2D patch of data from the seismic block and learn to predict the 2D mask patch associated with it."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Environment setup\n",
"\n",
"To set up the conda environment, please follow the instructions in the top-level [README.md](../../../README.md) file.\n",
"\n",
"__Note__: To register the conda environment in Jupyter, run:\n",
"`python -m ipykernel install --user --name envname`\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Library imports"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import logging\n",
"import logging.config\n",
"from os import path\n",
"\n",
"import cv2\n",
"import numpy as np\n",
"import yacs.config\n",
"import torch\n",
"from albumentations import Compose, HorizontalFlip, Normalize, PadIfNeeded, Resize\n",
"from cv_lib.utils import load_log_configuration\n",
"from cv_lib.event_handlers import (\n",
"    SnapshotHandler,\n",
"    logging_handlers,\n",
"    tensorboard_handlers,\n",
")\n",
"from cv_lib.event_handlers.logging_handlers import Evaluator\n",
"from cv_lib.event_handlers.tensorboard_handlers import (\n",
"    create_image_writer,\n",
"    create_summary_writer,\n",
")\n",
"from cv_lib.segmentation import models, extract_metric_from\n",
"from cv_lib.segmentation.metrics import (\n",
"    pixelwise_accuracy,\n",
"    class_accuracy,\n",
"    mean_class_accuracy,\n",
"    class_iou,\n",
"    mean_iou,\n",
")\n",
"from cv_lib.segmentation.dutchf3.utils import (\n",
"    current_datetime,\n",
"    generate_path,\n",
"    np_to_tb,\n",
")\n",
"from cv_lib.segmentation.penobscot.engine import (\n",
"    create_supervised_evaluator,\n",
"    create_supervised_trainer,\n",
")\n",
"from deepseismic_interpretation.penobscot.data import PenobscotInlinePatchDataset\n",
"from deepseismic_interpretation.dutchf3.data import decode_segmap\n",
"from ignite.contrib.handlers import CosineAnnealingScheduler\n",
"from ignite.engine import Events\n",
"from ignite.metrics import Loss\n",
"from ignite.utils import convert_tensor\n",
"from toolz import compose\n",
"from torch.utils import data\n",
"from itkwidgets import view\n",
"from utilities import plot_aline\n",
"from toolz import take\n",
"\n",
"\n",
"mask_value = 255\n",
"_SEG_COLOURS = np.asarray(\n",
"    [[241, 238, 246], [208, 209, 230], [166, 189, 219], [116, 169, 207], [54, 144, 192], [5, 112, 176], [3, 78, 123]]\n",
")\n",
"\n",
"# experiment configuration file\n",
"CONFIG_FILE = \"./configs/hrnet.yaml\""
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def _prepare_batch(batch, device=None, non_blocking=False):\n",
"    x, y, ids, patch_locations = batch\n",
"    return (\n",
"        convert_tensor(x, device=device, non_blocking=non_blocking),\n",
"        convert_tensor(y, device=device, non_blocking=non_blocking),\n",
"        ids,\n",
"        patch_locations,\n",
"    )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Experiment configuration file\n",
"We use configuration files to specify experiment configuration, such as hyperparameters used in training and evaluation, as well as other experiment settings. We provide several configuration files for this notebook, under `./configs`, mainly differing in the DNN architecture used for defining the model.\n",
"\n",
"Modify the `CONFIG_FILE` variable above if you would like to run the experiment using a different configuration file."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open(CONFIG_FILE, \"rt\") as f_read:\n",
"    config = yacs.config.load_cfg(f_read)\n",
"\n",
"print(f'Configuration loaded. Please check that the DATASET.ROOT:{config.DATASET.ROOT} points to your data location.')\n",
"print(f'To modify any of the options, please edit the configuration file {CONFIG_FILE} and reload. \\n')\n",
"print(config)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Parameters"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"parameters"
]
},
"outputs": [],
"source": [
"# The number of datapoints you want to run in training or validation per batch \n",
"# Setting to None will run whole dataset\n",
"# useful for integration tests with a setting of something like 3\n",
"# Use only if you want to check things are running and don't want to run\n",
"# through whole dataset\n",
"max_iterations = None \n",
"# The number of epochs to run in training\n",
"max_epochs = config.TRAIN.END_EPOCH \n",
"max_snapshots = config.TRAIN.SNAPSHOTS\n",
"dataset_root = config.DATASET.ROOT"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load Dataset"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"from toolz import pipe\n",
"import glob\n",
"from PIL import Image"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"image_dir = os.path.join(dataset_root, \"inlines\")\n",
"mask_dir = os.path.join(dataset_root, \"masks\")\n",
"\n",
"image_iter = pipe(os.path.join(image_dir, \"*.tiff\"), glob.iglob,)\n",
"\n",
"_open_to_array = compose(np.array, Image.open)\n",
"\n",
"\n",
"def open_image_mask(image_path):\n",
"    return pipe(image_path, _open_to_array)\n",
"\n",
"\n",
"def _mask_filename(imagepath):\n",
"    file_part = os.path.splitext(os.path.split(imagepath)[-1].strip())[0]\n",
"    return os.path.join(mask_dir, file_part + \"_mask.png\")\n",
"\n",
"\n",
"image_list = sorted(list(image_iter))\n",
"image_list_array = [_open_to_array(i) for i in image_list]\n",
"mask_list_array = [pipe(i, _mask_filename, _open_to_array) for i in image_list]\n",
"mask = np.stack(mask_list_array, axis=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's visualize the dataset."
]
},
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"view(mask, slicing_planes=True)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Let's view slices of the data along inline and crossline directions."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"idx = 100\n",
|
||||
"x_in = image_list_array[idx]\n",
|
||||
"x_inl = mask_list_array[idx]\n",
|
||||
"\n",
|
||||
"plot_aline(x_in, x_inl, xlabel=\"inline\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Model training"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Setup logging\n",
|
||||
"load_log_configuration(config.LOG_CONFIG)\n",
|
||||
"logger = logging.getLogger(__name__)\n",
|
||||
"logger.debug(config.WORKERS)\n",
|
||||
"scheduler_step = max_epochs // max_snapshots\n",
|
||||
"torch.backends.cudnn.benchmark = config.CUDNN.BENCHMARK\n",
|
||||
"\n",
|
||||
"torch.manual_seed(config.SEED)\n",
|
||||
"if torch.cuda.is_available():\n",
|
||||
" torch.cuda.manual_seed_all(config.SEED)\n",
|
||||
"np.random.seed(seed=config.SEED)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Set up data augmentation\n",
|
||||
"\n",
|
||||
"Let's define our data augmentation pipeline, which includes basic transformations, such as _data normalization, resizing, and padding_ if necessary.\n",
|
||||
"The padding is carried out twice because, if we split an inline or crossline slice into multiple patches, some of those patches will fall at the edge of the slice and may not contain a full patch worth of data. To compensate for this and keep all patches in a batch the same size (a requirement), we need to pad them.\n",
"So our basic augmentation is:\n",
"- Normalize\n",
"- Pad if needed to initial size\n",
"- Resize to a larger size\n",
"- Pad further if necessary"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Setup Augmentations\n",
"basic_aug = Compose(\n",
"    [\n",
"        Normalize(mean=(config.TRAIN.MEAN,), std=(config.TRAIN.STD,), max_pixel_value=config.TRAIN.MAX,),\n",
"        PadIfNeeded(\n",
"            min_height=config.TRAIN.PATCH_SIZE,\n",
"            min_width=config.TRAIN.PATCH_SIZE,\n",
"            border_mode=cv2.BORDER_CONSTANT,\n",
"            always_apply=True,\n",
"            mask_value=mask_value,\n",
"            value=0,\n",
"        ),\n",
"        Resize(config.TRAIN.AUGMENTATIONS.RESIZE.HEIGHT, config.TRAIN.AUGMENTATIONS.RESIZE.WIDTH, always_apply=True,),\n",
"        PadIfNeeded(\n",
"            min_height=config.TRAIN.AUGMENTATIONS.PAD.HEIGHT,\n",
"            min_width=config.TRAIN.AUGMENTATIONS.PAD.WIDTH,\n",
"            border_mode=cv2.BORDER_CONSTANT,\n",
"            always_apply=True,\n",
"            mask_value=mask_value,\n",
"            value=0,\n",
"        ),\n",
"    ]\n",
")\n",
"if config.TRAIN.AUGMENTATION:\n",
"    train_aug = Compose([basic_aug, HorizontalFlip(p=0.5)])\n",
"    val_aug = basic_aug\n",
"else:\n",
"    train_aug = val_aug = basic_aug"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Load the data"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"For training the model, we will use a patch-based approach. Rather than using entire sections (crosslines or inlines) of the data, we extract a large number of small patches from the sections and use those patches as our data. This allows us to generate a larger set of images for training, and is also a more feasible approach for large seismic volumes.\n",
"\n",
"We are using a custom patch data loader from our __`deepseismic_interpretation`__ library for generating and loading patches from seismic section data."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"lines_to_next_cell": 2
},
"outputs": [],
"source": [
"train_set = PenobscotInlinePatchDataset(\n",
"    dataset_root,\n",
"    config.TRAIN.PATCH_SIZE,\n",
"    config.TRAIN.STRIDE,\n",
"    split=\"train\",\n",
"    transforms=train_aug,\n",
"    n_channels=config.MODEL.IN_CHANNELS,\n",
"    complete_patches_only=config.TRAIN.COMPLETE_PATCHES_ONLY,\n",
")\n",
"\n",
"val_set = PenobscotInlinePatchDataset(\n",
"    dataset_root,\n",
"    config.TRAIN.PATCH_SIZE,\n",
"    config.TRAIN.STRIDE,\n",
"    split=\"val\",\n",
"    transforms=val_aug,\n",
"    n_channels=config.MODEL.IN_CHANNELS,\n",
"    complete_patches_only=config.VALIDATION.COMPLETE_PATCHES_ONLY,\n",
")\n",
"\n",
"logger.info(train_set)\n",
"logger.info(val_set)\n",
"\n",
"n_classes = train_set.n_classes\n",
"train_loader = data.DataLoader(\n",
"    train_set, batch_size=config.TRAIN.BATCH_SIZE_PER_GPU, num_workers=config.WORKERS, shuffle=True,\n",
")\n",
"\n",
"val_loader = data.DataLoader(val_set, batch_size=config.VALIDATION.BATCH_SIZE_PER_GPU, num_workers=config.WORKERS,)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up model training\n",
"Next, let's define a model to train, an optimization algorithm, and a loss function.\n",
"\n",
"Note that the model is loaded from our __`cv_lib`__ library, using the name of the model as specified in the configuration file. To load a different model, either change the `MODEL.NAME` field in the configuration file, or create a new one corresponding to the model you wish to train."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model = getattr(models, config.MODEL.NAME).get_seg_model(config)\n",
"\n",
"device = \"cpu\"\n",
"if torch.cuda.is_available():\n",
"    device = \"cuda\"\n",
"model = model.to(device)  # Send to GPU\n",
"\n",
"optimizer = torch.optim.SGD(\n",
"    model.parameters(), lr=config.TRAIN.MAX_LR, momentum=config.TRAIN.MOMENTUM, weight_decay=config.TRAIN.WEIGHT_DECAY,\n",
")\n",
"\n",
"output_dir = generate_path(config.OUTPUT_DIR, config.MODEL.NAME, current_datetime(),)\n",
"summary_writer = create_summary_writer(log_dir=path.join(output_dir, config.LOG_DIR))\n",
"snapshot_duration = scheduler_step * len(train_loader)\n",
"scheduler = CosineAnnealingScheduler(optimizer, \"lr\", config.TRAIN.MAX_LR, config.TRAIN.MIN_LR, snapshot_duration)\n",
"\n",
"criterion = torch.nn.CrossEntropyLoss(ignore_index=mask_value, reduction=\"mean\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Training the model\n",
"We use the [ignite](https://pytorch.org/ignite/index.html) framework to create training and validation loops in our codebase. Ignite provides an easy way to create compact training/validation loops without too much boilerplate code.\n",
"\n",
"In this notebook, we demonstrate the use of ignite on the training loop only. We create a training engine `trainer` that loops multiple times over the training dataset and updates model parameters. In addition, we attach various handlers to the trainer through its event system, which lets us interact with the engine at each step of the run, for example when the trainer is started or completed, or when an epoch starts or ends.\n",
"\n",
|
||||
"In the cell below, we use event handlers to add the following events to the training loop:\n",
|
||||
"- log training output\n",
|
||||
"- log and schedule learning rate and\n",
|
||||
"- periodically save model to disk."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"trainer = create_supervised_trainer(model, optimizer, criterion, _prepare_batch, device=device)\n",
|
||||
"\n",
|
||||
"trainer.add_event_handler(Events.ITERATION_STARTED, scheduler)\n",
|
||||
"\n",
|
||||
"trainer.add_event_handler(\n",
|
||||
" Events.ITERATION_COMPLETED, logging_handlers.log_training_output(log_interval=config.PRINT_FREQ),\n",
|
||||
")\n",
|
||||
"trainer.add_event_handler(Events.EPOCH_STARTED, logging_handlers.log_lr(optimizer))\n",
|
||||
"trainer.add_event_handler(\n",
|
||||
" Events.EPOCH_STARTED, tensorboard_handlers.log_lr(summary_writer, optimizer, \"epoch\"),\n",
|
||||
")\n",
|
||||
"trainer.add_event_handler(\n",
|
||||
" Events.ITERATION_COMPLETED, tensorboard_handlers.log_training_output(summary_writer),\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def _select_pred_and_mask(model_out_dict):\n",
|
||||
" return (model_out_dict[\"y_pred\"].squeeze(), model_out_dict[\"mask\"].squeeze())\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"evaluator = create_supervised_evaluator(\n",
|
||||
" model,\n",
|
||||
" _prepare_batch,\n",
|
||||
" metrics={\n",
|
||||
" \"pixacc\": pixelwise_accuracy(n_classes, output_transform=_select_pred_and_mask),\n",
|
||||
" \"nll\": Loss(criterion, output_transform=_select_pred_and_mask),\n",
|
||||
" \"cacc\": class_accuracy(n_classes, output_transform=_select_pred_and_mask),\n",
|
||||
" \"mca\": mean_class_accuracy(n_classes, output_transform=_select_pred_and_mask),\n",
|
||||
" \"ciou\": class_iou(n_classes, output_transform=_select_pred_and_mask),\n",
|
||||
" \"mIoU\": mean_iou(n_classes, output_transform=_select_pred_and_mask),\n",
|
||||
" },\n",
|
||||
" device=device,\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"if max_iterations is not None:\n",
|
||||
" val_loader = take(max_iterations, val_loader)\n",
|
||||
"\n",
|
||||
"# Set the validation run to start on the epoch completion of the training run\n",
|
||||
"trainer.add_event_handler(Events.EPOCH_COMPLETED, Evaluator(evaluator, val_loader))\n",
|
||||
"\n",
|
||||
"evaluator.add_event_handler(\n",
|
||||
" Events.EPOCH_COMPLETED,\n",
|
||||
" logging_handlers.log_metrics(\n",
|
||||
" \"Validation results\",\n",
|
||||
" metrics_dict={\n",
|
||||
" \"nll\": \"Avg loss :\",\n",
|
||||
" \"pixacc\": \"Pixelwise Accuracy :\",\n",
|
||||
" \"mca\": \"Avg Class Accuracy :\",\n",
|
||||
" \"mIoU\": \"Avg Class IoU :\",\n",
|
||||
" },\n",
|
||||
" ),\n",
|
||||
")\n",
|
||||
"evaluator.add_event_handler(\n",
|
||||
" Events.EPOCH_COMPLETED,\n",
|
||||
" tensorboard_handlers.log_metrics(\n",
|
||||
" summary_writer,\n",
|
||||
" trainer,\n",
|
||||
" \"epoch\",\n",
|
||||
" metrics_dict={\n",
|
||||
" \"mIoU\": \"Validation/mIoU\",\n",
|
||||
" \"nll\": \"Validation/Loss\",\n",
|
||||
" \"mca\": \"Validation/MCA\",\n",
|
||||
" \"pixacc\": \"Validation/Pixel_Acc\",\n",
|
||||
" },\n",
|
||||
" ),\n",
|
||||
")\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def _select_max(pred_tensor):\n",
|
||||
" return pred_tensor.max(1)[1]\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def _tensor_to_numpy(pred_tensor):\n",
|
||||
" return pred_tensor.squeeze().cpu().numpy()\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"transform_func = compose(np_to_tb, decode_segmap(n_classes=n_classes, label_colours=_SEG_COLOURS), _tensor_to_numpy,)\n",
|
||||
"\n",
|
||||
"transform_pred = compose(transform_func, _select_max)\n",
|
||||
"\n",
|
||||
"evaluator.add_event_handler(\n",
|
||||
" Events.EPOCH_COMPLETED, create_image_writer(summary_writer, \"Validation/Image\", \"image\"),\n",
|
||||
")\n",
|
||||
"evaluator.add_event_handler(\n",
|
||||
" Events.EPOCH_COMPLETED,\n",
|
||||
" create_image_writer(summary_writer, \"Validation/Mask\", \"mask\", transform_func=transform_func),\n",
|
||||
")\n",
|
||||
"evaluator.add_event_handler(\n",
|
||||
" Events.EPOCH_COMPLETED,\n",
|
||||
" create_image_writer(summary_writer, \"Validation/Pred\", \"y_pred\", transform_func=transform_pred),\n",
|
||||
")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Checkpointing\n",
|
||||
"Below we define the function that will save the best performing models based on mean IoU."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def snapshot_function():\n",
|
||||
" return (trainer.state.iteration % snapshot_duration) == 0\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"checkpoint_handler = SnapshotHandler(\n",
|
||||
" path.join(output_dir, config.TRAIN.MODEL_DIR), config.MODEL.NAME, extract_metric_from(\"mIoU\"), snapshot_function,\n",
|
||||
")\n",
|
||||
"evaluator.add_event_handler(Events.EPOCH_COMPLETED, checkpoint_handler, {\"model\": model})"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Start the training engine run."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"if max_iterations is not None:\n",
|
||||
" train_loader = take(max_iterations, train_loader)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"logger.info(\"Starting training\")\n",
|
||||
"trainer.run(train_loader, max_epochs=max_epochs)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Tensorboard\n",
|
||||
"Using TensorBoard to monitor runs can be quite enlightening. Just ensure that the appropriate port is open on the VM so you can access it. Below is the command for running TensorBoard in your notebook; you can just as easily view it in a separate browser window by pointing the browser at the appropriate host and port."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if max_epochs > 1:\n",
"    %load_ext tensorboard"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"if max_epochs > 1:\n",
"    %tensorboard --logdir outputs --port 6007 --host 0.0.0.0"
]
}
],
"metadata": {
"celltoolbar": "Tags",
"kernelspec": {
"display_name": "seismic-interpretation",
"language": "python",
"name": "seismic-interpretation"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.7"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
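The snapshot cadence used in the notebook follows from two lines: `scheduler_step = max_epochs // max_snapshots` and `snapshot_duration = scheduler_step * len(train_loader)`, with `snapshot_function` returning true whenever the global iteration count hits a multiple of `snapshot_duration`. A standalone sketch of that arithmetic, using hypothetical values for `max_epochs`, `max_snapshots`, and the loader length (not taken from any config):

```python
# Standalone sketch of the snapshot-cadence arithmetic from the notebook
# above; max_epochs, max_snapshots and batches_per_epoch are hypothetical
# example values, not taken from any config file.
max_epochs = 300
max_snapshots = 5
batches_per_epoch = 120  # stands in for len(train_loader)

scheduler_step = max_epochs // max_snapshots            # epochs per snapshot cycle
snapshot_duration = scheduler_step * batches_per_epoch  # iterations per cycle

# A snapshot fires whenever the global iteration count is a multiple of
# snapshot_duration (cf. snapshot_function in the notebook).
snapshot_iters = [
    i
    for i in range(1, max_epochs * batches_per_epoch + 1)
    if i % snapshot_duration == 0
]
print(len(snapshot_iters))  # one snapshot per cycle, max_snapshots in total
```

With these numbers the run lasts 36,000 iterations and a snapshot is taken every 7,200, giving exactly `max_snapshots` checkpoints spread evenly over training.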
@ -0,0 +1,109 @@
CUDNN:
  BENCHMARK: true
  DETERMINISTIC: false
  ENABLED: true
GPUS: (0,)
OUTPUT_DIR: 'outputs'
LOG_DIR: 'log'
WORKERS: 4
PRINT_FREQ: 50

LOG_CONFIG: logging.conf
SEED: 2019


DATASET:
  NUM_CLASSES: 7
  ROOT: /mnt/penobscot
  CLASS_WEIGHTS: [0.02630481, 0.05448931, 0.0811898, 0.01866496, 0.15868563, 0.0875993, 0.5730662]
  INLINE_HEIGHT: 1501
  INLINE_WIDTH: 481


MODEL:
  NAME: seg_hrnet
  IN_CHANNELS: 3
  PRETRAINED: '/data/hrnet_pretrained/image_classification/hrnetv2_w48_imagenet_pretrained.pth'
  EXTRA:
    FINAL_CONV_KERNEL: 1
    STAGE2:
      NUM_MODULES: 1
      NUM_BRANCHES: 2
      BLOCK: BASIC
      NUM_BLOCKS:
        - 4
        - 4
      NUM_CHANNELS:
        - 48
        - 96
      FUSE_METHOD: SUM
    STAGE3:
      NUM_MODULES: 4
      NUM_BRANCHES: 3
      BLOCK: BASIC
      NUM_BLOCKS:
        - 4
        - 4
        - 4
      NUM_CHANNELS:
        - 48
        - 96
        - 192
      FUSE_METHOD: SUM
    STAGE4:
      NUM_MODULES: 3
      NUM_BRANCHES: 4
      BLOCK: BASIC
      NUM_BLOCKS:
        - 4
        - 4
        - 4
        - 4
      NUM_CHANNELS:
        - 48
        - 96
        - 192
        - 384
      FUSE_METHOD: SUM

TRAIN:
  COMPLETE_PATCHES_ONLY: True
  BATCH_SIZE_PER_GPU: 32
  BEGIN_EPOCH: 0
  END_EPOCH: 300
  MIN_LR: 0.0001
  MAX_LR: 0.02
  MOMENTUM: 0.9
  WEIGHT_DECAY: 0.0001
  SNAPSHOTS: 5
  AUGMENTATION: True
  DEPTH: "none" # Options are none, patch and section
  STRIDE: 64
  PATCH_SIZE: 128
  AUGMENTATIONS:
    RESIZE:
      HEIGHT: 256
      WIDTH: 256
    PAD:
      HEIGHT: 256
      WIDTH: 256
  MEAN: [-0.0001777, 0.49, -0.0000688] # First value is for images, second for depth and then combination of both
  STD: [0.14076, 0.2717, 0.06286]
  MAX: 1
  MODEL_DIR: "models"


VALIDATION:
  BATCH_SIZE_PER_GPU: 128
  COMPLETE_PATCHES_ONLY: True

TEST:
  COMPLETE_PATCHES_ONLY: False
  MODEL_PATH: "/data/home/mat/repos/DeepSeismic/experiments/segmentation/penobscot/local/output/penobscot/437970c875226e7e39c8109c0de8d21c5e5d6e3b/seg_hrnet/Sep25_144942/models/seg_hrnet_running_model_28.pth"
  AUGMENTATIONS:
    RESIZE:
      HEIGHT: 256
      WIDTH: 256
    PAD:
      HEIGHT: 256
      WIDTH: 256
@ -0,0 +1,59 @@
CUDNN:
  BENCHMARK: true
  DETERMINISTIC: false
  ENABLED: true
GPUS: (0,)
OUTPUT_DIR: 'output'
LOG_DIR: 'log'
WORKERS: 4
PRINT_FREQ: 50
LOG_CONFIG: logging.conf
SEED: 2019

DATASET:
  NUM_CLASSES: 6
  ROOT: /data/dutchf3
  CLASS_WEIGHTS: [0.7151, 0.8811, 0.5156, 0.9346, 0.9683, 0.9852]

MODEL:
  NAME: patch_deconvnet_skip
  IN_CHANNELS: 1


TRAIN:
  BATCH_SIZE_PER_GPU: 64
  BEGIN_EPOCH: 0
  END_EPOCH: 100
  MIN_LR: 0.001
  MAX_LR: 0.02
  MOMENTUM: 0.9
  WEIGHT_DECAY: 0.0001
  SNAPSHOTS: 5
  AUGMENTATION: True
  DEPTH: "none" # Options are none, patch and section
  STRIDE: 50
  PATCH_SIZE: 99
  AUGMENTATIONS:
    RESIZE:
      HEIGHT: 99
      WIDTH: 99
    PAD:
      HEIGHT: 99
      WIDTH: 99
  MEAN: 0.0009997 # 0.0009996710808862074
  STD: 0.20977 # 0.20976548783479299
  MODEL_DIR: "models"

VALIDATION:
  BATCH_SIZE_PER_GPU: 512

TEST:
  MODEL_PATH: '/data/home/mat/repos/DeepSeismic/examples/interpretation/notebooks/output/models/model_patch_deconvnet_skip_2.pth'
  TEST_STRIDE: 10
  SPLIT: 'test1' # Can be both, test1, test2
  INLINE: True
  CROSSLINE: True
  POST_PROCESSING:
    SIZE: 99
    CROP_PIXELS: 0 # Number of pixels to crop top, bottom, left and right
@ -0,0 +1,59 @@
# UNet configuration

CUDNN:
  BENCHMARK: true
  DETERMINISTIC: false
  ENABLED: true
GPUS: (0,)
OUTPUT_DIR: 'output'
LOG_DIR: 'log'
WORKERS: 4
PRINT_FREQ: 50
LOG_CONFIG: logging.conf
SEED: 2019


DATASET:
  NUM_CLASSES: 6
  ROOT: '/data/dutchf3'
  CLASS_WEIGHTS: [0.7151, 0.8811, 0.5156, 0.9346, 0.9683, 0.9852]

MODEL:
  NAME: resnet_unet
  IN_CHANNELS: 3


TRAIN:
  BATCH_SIZE_PER_GPU: 16
  BEGIN_EPOCH: 0
  END_EPOCH: 10
  MIN_LR: 0.001
  MAX_LR: 0.02
  MOMENTUM: 0.9
  WEIGHT_DECAY: 0.0001
  SNAPSHOTS: 5
  AUGMENTATION: True
  DEPTH: "section" # Options are none, patch and section
  STRIDE: 50
  PATCH_SIZE: 100
  AUGMENTATIONS:
    RESIZE:
      HEIGHT: 200
      WIDTH: 200
    PAD:
      HEIGHT: 256
      WIDTH: 256
  MEAN: 0.0009997 # 0.0009996710808862074
  STD: 0.20977 # 0.20976548783479299
  MODEL_DIR: "models"

TEST:
  MODEL_PATH: ""
  TEST_STRIDE: 10
  SPLIT: 'Both' # Can be Both, Test1, Test2
  INLINE: True
  CROSSLINE: True
  POST_PROCESSING:
    SIZE: 128
    CROP_PIXELS: 14 # Number of pixels to crop top, bottom, left and right
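The configuration files above are consumed by the training scripts as nested, attribute-style objects (the notebook reads values such as `config.TRAIN.MAX_LR` and `config.TRAIN.AUGMENTATIONS.PAD.HEIGHT`). As a minimal, stdlib-only sketch of how a parsed config maps to that access pattern — the dict literal below is a hypothetical fragment standing in for what a YAML parser would return, and `to_namespace` is an illustrative helper, not the project's actual config loader:

```python
# Minimal sketch of attribute-style access to a nested config, as used in
# the notebook (config.TRAIN.MAX_LR). The dict literal stands in for what
# a YAML parser would return for a fragment of the files above; the
# to_namespace helper is hypothetical, not the project's config loader.
from types import SimpleNamespace


def to_namespace(obj):
    """Recursively convert nested dicts into attribute-accessible objects."""
    if isinstance(obj, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in obj.items()})
    return obj


parsed = {
    "TRAIN": {
        "BATCH_SIZE_PER_GPU": 16,
        "MAX_LR": 0.02,
        "AUGMENTATIONS": {"PAD": {"HEIGHT": 256, "WIDTH": 256}},
    }
}

config = to_namespace(parsed)
print(config.TRAIN.MAX_LR)                    # 0.02
print(config.TRAIN.AUGMENTATIONS.PAD.HEIGHT)  # 256
```

Real projects often use a dedicated config library for this (with freezing and defaults merging on top); the sketch only shows why nested YAML sections surface as dotted attribute paths in the notebook code.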