Commit Graph

26 Commits

Author SHA1 Message Date
Chris Lovett f807260cf4
Add script to do full training of final set of models to face segmentation task (#238)
* Add a --test option that runs only the data prep step to test the environment is working.
* force train.py to grab the lock on the row (removing rare failure case).
* Fix snpe kubernetes scaling using the anti-node affinity pattern.
* Publish new docker image.
* Add mlflow integration to train.py.
* Add script that does full training pipeline for final pareto models.
* switch to bokeh so I can get nice tooltips on each dot in the scatter plot.
* add axis titles.
* Add device F1 scoring to train_pareto
* Add more to readmes.
* add image
* Add helper script to do final F1 scoring on Qualcomm devices.
* fix lint errors.
* fix bugs
* rev environment version.
* fix lint error
* rename snp_test script and fix bugs
* add iteration 20
* fix bug
* Add gif animations
* Fix bugs in snp_test
* fix bugs - snp_test needs to reset the .dlc files.
* make loop.sh executable
* only reset the models we are actually going to test.
* add final snpe f1 score chart
* Improve calc_pareto_frontier helper
* Show final dots that fell off pareto as gray.
* full training is complete, this is the final results.
2023-05-05 20:40:26 -07:00
Chris Lovett aae38db1a5
Add Azure ML running to the face segmentation task. (#217)
* add code owners

* initial commit, beginnings of AML version of face synthetics search pipeline.

* Add download_and_extract_zip
Add download capability to FaceSyntheticsDataset
Fix face segmentation data prep script.

* fix bugs

* cleanup launch.json

* cleanup launch.json
add download capability to FaceSyntheticsDataset
add download_and_extract_zip helper

* fix file count test

* work in progress

* work in progress

* unify snpe status table and aml training table.

* fix experiment referencing

* fix experiment referencing

* work in progress

* fix complete status

* fix bugs

* fix bug

* fix metric key, we have 2, one for remote snpe, and another for aml training pipelines.

* pass seed through to the search.py script.

* fix use of AzureMLOnBehalfOfCredential

* fix bugs

* fix bugs

* publish new image

* fix bugs

* fix bugs

* fix bug

* merge

* revert

* new version

* fix bugs

* rename the top level folder from 'snpe' to 'aml' and move all AML code into this folder except the top level entry point 'aml.py'
make the keys returned from the JobCompletionMonitor wait method configurable
Rename AmlPartialTrainingEvaluator and make it restartable.
Turn off save_pareto_model_weights
Remove redundant copy of JobCompletionMonitor

* rev the versions.

* updates to readme information.

* only inference testing targets are 'cpu' and 'snp', trigger the aml partial training by a different key in the config file.

* add iteration info

* new version.

* fix ordering of results from AmlPartialTrainingEvaluator

* change AML batch size default to 64 for faster training
don't store MODEL_STORAGE_CONNECTION_STRING

* Fix bug in merge_status_entity, add more unit test coverage

* new version

* Store training time in status table.

* improve diagram.

* save iteration in status table.

* pick up new version of archai to fix randomness bug in the EvolutionParetoSearch so that these search jobs are restartable.
2023-04-21 10:47:58 -07:00
Chris Lovett 28288861f1 add more info on docker image publish process for SNPE quantizer. 2023-04-07 16:58:58 -07:00
Chris Lovett 6b5c489fb2 remove launch.json 2023-04-06 14:14:00 -07:00
Chris Lovett 10bd7024a8 Remove MODEL_STORAGE_CONNECTION_STRING from docker image and move it to the kubernetes deployment script. 2023-04-05 11:55:47 -07:00
Chris Lovett 0b30fa121b remove dead code, move CI pipeline to min python 3.8 which is needed by pytorch-lightning. 2023-04-04 19:30:15 -07:00
Chris Lovett 5aa824c465 Merge branch 'clovett/snpe' into task_segmentation 2023-04-04 10:49:38 -07:00
Chris Lovett 56b64ab9fa Merge branch 'task_segmentation' of github.com:microsoft/archai into task_segmentation 2023-04-04 10:26:45 -07:00
Chris Lovett 851614ef81 gitignore output folders 2023-04-04 10:26:11 -07:00
Chris Lovett bf573d6992 remove output folders 2023-04-03 19:07:32 -07:00
Chris Lovett 5d516c3fe3
add tutorial on multi node search on azure (#195)
add tutorial on multi node search on azure using azure-ai-ml 1.5.0
2023-03-24 16:30:05 -07:00
Chris Lovett 934eea67d3 ignore .azureml folders. 2023-02-15 12:08:54 -08:00
Gustavo Rosa 59e52cca07 chore(root): Adds .db extension to .gitignore. 2023-01-20 16:30:21 -03:00
Gustavo Rosa c059c391cd patch(root): Patches missing/changed files according to master. 2022-12-16 18:51:26 -03:00
Debadeepta Dey 69f162ec61 Reverted changes to gitignore. 2022-12-16 16:53:11 -03:00
Debadeepta Dey 984482d595 Added natsbench compiler code to repo. 2022-12-16 16:53:11 -03:00
Gustavo Rosa fdc9d9296e fix(root): Improves gitignore files. 2022-12-16 16:51:36 -03:00
Gustavo de Rosa 43e500a01a chore(gitgnore): Ignores .onnx files. 2022-12-16 16:45:50 -03:00
Debadeepta Dey 06b2ed29e9 3D pareto-frontier search is now being tested. Runs through. More testing needed. 2022-12-16 16:41:15 -03:00
Gustavo de Rosa 6116cf1c76 chore(archai): Adds log files to gitignore list and removes unused script. 2022-12-16 16:41:08 -03:00
Chris Lovett e85b7016a8 initial commit of qualcomm device code 2022-12-16 16:26:46 -03:00
Gustavo Rosa 1dab9295bf chore(archai): Adds updated files. 2022-12-16 16:26:45 -03:00
Gustavo Rosa 4d1d93d08b fix(root): Changes ptignore to amltignore and removes unused init on root folder. 2022-12-16 16:26:44 -03:00
Shital Shah 5613d2f597 added simple cifar resnet test code 2022-12-16 16:25:56 -03:00
Ubuntu 8410ae3023 Making changes such that petridish distributed eval can train all the models needed. 2022-12-16 16:18:53 -03:00
Shital Shah af1d639c6e initial 2020-05-18 03:11:07 -07:00