Commit Graph

2633 Commits

Author SHA1 Message Date
Gustavo Rosa 6bb8ba3864 fix(root): Fixes direct depency so its publishable to PyPI. 2023-09-15 11:07:30 -07:00
Gustavo de Rosa 0f55ccb4df
Merge pull request #245 from tosemml/patch-1
Refactoring
2023-09-15 10:59:42 -07:00
piero2c 35038708e4
fix(param): exclude_cls param 2023-09-13 15:40:04 -07:00
Dom acf7a846b5
Use list comp 2023-08-29 13:24:25 -07:00
Caio Mendes 95d6e19a15 Move xformers to extra as it requires compilation 2023-07-26 18:30:01 -07:00
Chris Lovett 6e2a20d123
print error message so it doesn't silently fail. (#244) 2023-05-11 10:23:04 -07:00
Chris Lovett 060b0e0d6f add demo video link. 2023-05-11 10:21:57 -07:00
Yoganand Rajasekaran 0a695c6deb
QAT support for facial landmark detection task (#240)
Added QAT support for facial landmark detection task.
Includes quantizable model with ability to skip layers for quantization to avoid accuracy drop.
2023-05-09 14:01:49 -07:00
Chris Lovett b33b162902
plot the effect of full training. (#243) 2023-05-08 15:53:45 -07:00
Chris Lovett f807260cf4
Add script to do full training of final set of models to face segmentation task (#238)
* Add a --test option that runs only the data prep step to test the environment is working.
* force train.py to grab the lock on the row (removing rare failure case).
* Fix snpe kubernetes scaling using the anti-node affinity pattern.
* Publish new docker image.
* Add mlflow integration to train.py.
* Add script that does full training pipeline for final pareto models.
* switch to bokeh so I can get nice tooltips on each dot in the scatter plot.
* add axis titles.
* Add device F1 scoring to train_pareto
* Add more to readmes.
* add image
* Add helper script to do final F1 scoring on Qualcomm devices.
* fix lint errors.
* fix bugs
* rev environment version.
* fix lint error
* rename snp_test script and fix bugs
* add iteration 20
* fix bug
* Add gif animations
* Fix bugs in snp_test
* fix bugs - snp_test needs to reset the .dlc files.
* make loop.sh executable
* only reset the models we are actually going to test.
* add final snpe f1 score chart
* Improve calc_pareto_frontier helper
* Show final dots that fell off pareto as gray.
* full training is complete, this is the final results.
2023-05-05 20:40:26 -07:00
wchen-github bff0b3eb49
Merge pull request #242 from microsoft/tasks_factial_lardmark_detection_replace_graph
Missing a commit
2023-05-05 10:53:40 -07:00
Wei-ge Chen e4aed275ce Missing a commit 2023-05-05 09:35:28 -07:00
wchen-github 74854eb5a3
Merge pull request #241 from microsoft:tasks_factial_lardmark_detection_replace_graph
Replaced partial training plot.
2023-05-05 09:32:47 -07:00
Wei-ge Chen 80545be5e3 Replaced partial training plot.
Renamed full training plot file name for consistency.
2023-05-04 16:11:17 -07:00
wchen-github 2b88a3cf65
Merge pull request #239 from microsoft/main_fix_setup
Fix merging error
2023-04-28 16:26:48 -07:00
Wei-ge Chen b45d5f301b Fix merging error 2023-04-28 16:21:20 -07:00
wchen-github 32becbe47a
Merge pull request #237 from microsoft/task_facial_landmark_detection_autoformatted
Autoformatted with black. Fixed a few warnings from flake8.
2023-04-28 13:39:59 -07:00
Wei-ge Chen 0c2ebec20b Fixed more flake8 warnings 2023-04-28 11:22:15 -07:00
Wei-ge Chen b5d00d3098 Autoformatted with black 2023-04-28 10:44:19 -07:00
wchen-github fb9ce81b49
Merge pull request #235 from microsoft/task_facial_landmark_detection
Task_facial_landmark_detection
2023-04-28 10:15:20 -07:00
Gustavo de Rosa bd8684979b
fix(tasks): Fixes issue #227. 2023-04-28 09:01:45 -03:00
Wei-ge Chen f9eeb0140c Missed in previous commit 2023-04-27 14:08:28 -07:00
Chris Lovett 93b8ab75d7
Finalize the AML test run (#236)
* Add a --test option that runs only the data prep step to test the environment is working.

* force train.py to grab the lock on the row (removing rare failure case).

* Fix snpe kubernetes scaling using the anti-node affinity pattern.
Publish new docker image.
Add mlflow integration to train.py.

* Add script that does full training pipeline for final pareto models.

* add iteration 7

* add iteration 9

* switch to bokeh so I can get nice tooltips on each dot in the scatter plot.

* add axis titles.

* Add device F1 scoring to train_pareto
Add more to readmes.

* add image

* Add helper script to do final F1 scoring on Qualcomm devices.

* fix lint errors.

* fix bugs
2023-04-27 13:03:08 -07:00
Wei-ge Chen efc4b5b01f Resolve comments for merge PR 2023-04-27 12:35:03 -07:00
Chris Lovett 5376cdbf04 Merge branch 'main' into task_facial_landmark_detection 2023-04-26 14:56:54 -07:00
Wei-ge Chen 2673902b15 Further clean up 2023-04-26 11:29:00 -07:00
Wei-ge Chen 7b334a2a4f Further clean up 2023-04-26 11:28:23 -07:00
Wei-ge Chen 8e89402d40 Remove hard coded paths. Clean up more. 2023-04-25 16:00:15 -07:00
Wei-ge Chen 11145b86f3 Missed this file 2023-04-25 15:49:29 -07:00
Wei-ge Chen 2694b1e82c More clean up + full training results 2023-04-25 15:48:13 -07:00
Wei-ge Chen c3af4e4f1f Fixed arg name for archid 2023-04-25 10:50:19 -07:00
Wei-ge Chen c3b9c61ec3 Patch up the previous commit 2023-04-25 10:12:33 -07:00
Wei-ge Chen d01ff3a1fc Now have the 1st round of full training result 2023-04-25 10:12:06 -07:00
Wei-ge Chen de2c1cf3c7 Update the CSV file 2023-04-24 22:11:22 -07:00
Chris Lovett 4a3cf62a77
robustify the error case a bit more so search jobs can continue when a small number of training jobs fail. (#234)
* remove old notebook

* store onnx latency
allow aml partial training with no snapdragon mode

* fix docker file now that aml branch is merged.

* fix bug in reset
add notebook

* add link to notebook.

* Add an on_start_iteration callback so that user can track which models came from which iterations.

* fix conda file.

* new version

* robustify the error case a bit more so search jobs can continue when a small number of training jobs fail.

* re-use onnx latency numbers.
2023-04-24 17:53:25 -07:00
Wei-ge Chen 212a1d654f Fixed typo 2023-04-24 16:04:20 -07:00
Wei-ge Chen c4529b4a00 Fixed typo. 2023-04-24 16:01:23 -07:00
Wei-ge Chen f9e9287c5a Remove unused code 2023-04-24 15:56:21 -07:00
Wei-ge Chen 3c289d01ec Add script to train all pareto models 2023-04-24 15:48:29 -07:00
Wei-ge Chen a9db06abc6 Enable full training on candidate models 2023-04-24 12:28:34 -07:00
Wei-ge Chen 89a913981f Full search is finished 2023-04-24 09:07:32 -07:00
Chris Lovett 5410e8bdd1
fix conda file (#233)
* remove old notebook

* store onnx latency
allow aml partial training with no snapdragon mode

* fix docker file now that aml branch is merged.

* fix bug in reset
add notebook

* add link to notebook.

* Add an on_start_iteration callback so that user can track which models came from which iterations.

* fix conda file.
2023-04-23 13:45:40 -07:00
Chris Lovett 0165b708b7
Add an on_start_iteration callback on Searcher so that user can track which models came from which iterations (#232)
* remove old notebook

* store onnx latency
allow aml partial training with no snapdragon mode

* fix docker file now that aml branch is merged.

* fix bug in reset
add notebook

* add link to notebook.

* Add an on_start_iteration callback so that user can track which models came from which iterations.
2023-04-23 00:33:05 -07:00
dependabot[bot] 02b03d8c42
Bump protobuf in /tasks/face_segmentation/aml/docker/quantizer (#228)
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.20 to 3.20.2.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.20.0...v3.20.2)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-04-23 00:32:44 -07:00
Gustavo de Rosa ec72afd749
Merge pull request #212 from sgunasekar/patch-1
Ignore errors in loading tokenizer from_cache
2023-04-22 12:58:16 -03:00
Chris Lovett fc031eb447
remove old notebook (#231)
* remove old notebook

* store onnx latency
allow aml partial training with no snapdragon mode

* fix docker file now that aml branch is merged.

* fix bug in reset
add notebook

* add link to notebook.
2023-04-22 03:23:38 -07:00
Suriya Gunasekar 808756cb96
Update fast_hf_dataset_provider.py 2023-04-21 19:45:48 -07:00
Wei-ge Chen 8352dd39f0 Further clean up. Add copyright statements. 2023-04-21 12:24:43 -07:00
Chris Lovett aae38db1a5
Add Azure ML running to the face segmentation task. (#217)
* add code owners

* initial commit, beginnings of AML version of face synthetics search pipeline.

* Add download_and_extract_zip
Add download capability to FaceSyntheticsDataset
Fix face segmentation data prep script.

* fix bugs

* cleanup launch.json

* cleanup launch.json
add download capability to FaceSyntheticsDataset
add download_and_extract_zip helper

* fix file count test

* work in progress

* work in progress

* unify snpe status table and aml training table.

* fix experiment referencing

* fix experiment referencing

* work in progress

* fix complete status

* fix bugs

* fix bug

* fix metric key, we have 2, one for remote snpe, and another for aml training pipelines.

* pass seed through to the search.py script.

* fix use of AzureMLOnBehalfOfCredential

* fix bugs

* fix bugs

* publish new image

* fix bugs

* fix bugs

* fix bug

* maerge

* revert

* new version

* fix bugs

* rename the top level folder from 'snpe' to 'aml' and move all AML code into this folder except the top level entry point 'aml.py'
make the keys returned from the JobCompletionMonitor wait method configurable
Rename AmlPartialTrainingEvaluator and make it restartable.
Turn off save_pareto_model_weights
Remove redundant copy of JobCompletionMonitor

* rev the versions.

* updates to readme information.

* only inference testing targets are 'cpu' and 'snp', trigger the aml partial training by a different key in the config file.

* add iteration info

* new version.

* fix ordering of results from AmlPartialTrainingEvaluator

* change AML batch size default to 64 for faster training
don't store MODEL_STORAGE_CONNECTION_STRING

* Fix bug in merge_status_entity, add more unit test coverage

* new version

* Store training time in status table.

* improve diagram.

* save iteration in status table.

* pick up new version of archai to fix randomness bug in the EvolutionParetoSearch so that these search jobs are restartable.
2023-04-21 10:47:58 -07:00
Wei-ge Chen 424d37e791 Finished clean up this file 2023-04-21 09:54:27 -07:00