* Add a --test option that runs only the data prep step, to verify that the environment is working.
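A flag like the `--test` option above can be wired in with a few lines of argparse. This is a minimal sketch of the pattern; the actual option handling in train.py may differ:

```python
import argparse

def build_parser():
    # Sketch of a training entry point's CLI; names are illustrative.
    parser = argparse.ArgumentParser(description="training entry point (sketch)")
    parser.add_argument(
        "--test", action="store_true",
        help="run only the data prep step to verify the environment is working")
    return parser

args = build_parser().parse_args(["--test"])
if args.test:
    # In --test mode we would run data prep and exit before training.
    print("data prep only")
```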
* force train.py to grab the lock on the row (removing a rare failure case).
* Fix snpe kubernetes scaling using the anti-node affinity pattern.
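The anti-node affinity pattern referenced above is presumably standard Kubernetes pod anti-affinity, which prevents replicas from piling onto one node. A hedged sketch of such a manifest; the deployment name, labels, and image here are hypothetical, not the repo's actual config:

```yaml
# Hypothetical fragment: spread worker pods across nodes so replicas
# never co-schedule on the same host.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: snpe-quantizer
spec:
  replicas: 4
  selector:
    matchLabels:
      app: snpe-quantizer
  template:
    metadata:
      labels:
        app: snpe-quantizer
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: app
                    operator: In
                    values: ["snpe-quantizer"]
              topologyKey: kubernetes.io/hostname
      containers:
        - name: snpe-quantizer
          image: snpe-quantizer:latest   # placeholder image name
```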
* Publish new docker image.
* Add mlflow integration to train.py.
* Add script that does full training pipeline for final pareto models.
* switch to bokeh so I can get nice tooltips on each dot in the scatter plot.
* add axis titles.
* Add device F1 scoring to train_pareto
* Add more to readmes.
* add image
* Add helper script to do final F1 scoring on Qualcomm devices.
* fix lint errors.
* fix bugs
* rev environment version.
* fix lint error
* rename snp_test script and fix bugs
* add iteration 20
* fix bug
* Add gif animations
* Fix bugs in snp_test
* fix bugs - snp_test needs to reset the .dlc files.
* make loop.sh executable
* only reset the models we are actually going to test.
* add final snpe f1 score chart
* Improve calc_pareto_frontier helper
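The repo's `calc_pareto_frontier` helper is not shown here; this is a minimal sketch of the computation such a helper typically performs, assuming two objectives (latency to minimize, F1 to maximize):

```python
from typing import List, Tuple

def calc_pareto_frontier(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the Pareto-optimal subset of (latency, f1) points:
    lower latency is better, higher F1 is better. (Illustrative sketch,
    not the repo's actual helper.)"""
    # Sort by latency ascending, breaking ties with the best F1 first.
    ordered = sorted(points, key=lambda p: (p[0], -p[1]))
    frontier = []
    best_f1 = float("-inf")
    for latency, f1 in ordered:
        # A point is on the frontier if no faster point matches or beats its F1.
        if f1 > best_f1:
            frontier.append((latency, f1))
            best_f1 = f1
    return frontier
```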
* Show final dots that fell off pareto as gray.
* full training is complete; these are the final results.
* add iteration 7
* add iteration 9
* remove old notebook
* store onnx latency
* allow aml partial training with no snapdragon mode
* fix docker file now that aml branch is merged.
* fix bug in reset
* add notebook
* add link to notebook.
* Add an on_start_iteration callback so that user can track which models came from which iterations.
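An `on_start_iteration` callback like the one above can be wired into a search loop roughly as follows. The class name and loop body are hypothetical stand-ins, not the repo's actual search code:

```python
class EvolutionSearch:
    """Minimal sketch of a search loop exposing an on_start_iteration
    callback so callers can track which models came from which iteration."""

    def __init__(self, num_iterations, on_start_iteration=None):
        self.num_iterations = num_iterations
        self.on_start_iteration = on_start_iteration

    def run(self):
        for i in range(self.num_iterations):
            if self.on_start_iteration is not None:
                # Lets the caller record which models belong to iteration i.
                self.on_start_iteration(i)
            # ... sample, train, and score models for iteration i ...

seen = []
EvolutionSearch(3, on_start_iteration=seen.append).run()
# seen is now [0, 1, 2]
```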
* fix conda file.
* new version
* robustify the error case a bit more so search jobs can continue when a small number of training jobs fail.
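"Continue when a small number of training jobs fail" usually means bounding the tolerated failure count. A hypothetical helper illustrating that pattern (not the repo's actual code):

```python
def run_training_jobs(jobs, max_failures=2):
    """Run callables, tolerating up to max_failures exceptions so a few
    failed partial-training jobs do not abort the whole search iteration."""
    results, failures = [], []
    for job in jobs:
        try:
            results.append(job())
        except Exception as exc:
            failures.append(exc)
            if len(failures) > max_failures:
                # Too many failures suggests a systematic problem: give up.
                raise RuntimeError(f"{len(failures)} training jobs failed") from exc
    return results
```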
* re-use onnx latency numbers.
* add code owners
* initial commit, beginnings of AML version of face synthetics search pipeline.
* Add download_and_extract_zip
* Add download capability to FaceSyntheticsDataset
* Fix face segmentation data prep script.
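A `download_and_extract_zip` helper like the one added above is typically a small stdlib wrapper. This sketch assumes the signature `(url, target_dir)`; the repo's actual helper may differ:

```python
import os
import tempfile
import urllib.request
import zipfile

def download_and_extract_zip(url, target_dir):
    """Download a zip archive from `url` and unpack it into `target_dir`
    (created if missing). Illustrative sketch, not the repo's code."""
    os.makedirs(target_dir, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(suffix=".zip")
    os.close(fd)
    try:
        # Fetch to a temp file, then extract and clean up.
        urllib.request.urlretrieve(url, tmp_path)
        with zipfile.ZipFile(tmp_path) as zf:
            zf.extractall(target_dir)
    finally:
        os.unlink(tmp_path)
```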
* fix bugs
* cleanup launch.json
* cleanup launch.json
* add download capability to FaceSyntheticsDataset
* add download_and_extract_zip helper
* fix file count test
* work in progress
* work in progress
* unify snpe status table and aml training table.
* fix experiment referencing
* fix experiment referencing
* work in progress
* fix complete status
* fix bugs
* fix bug
* fix metric key, we have 2, one for remote snpe, and another for aml training pipelines.
* pass seed through to the search.py script.
* fix use of AzureMLOnBehalfOfCredential
* fix bugs
* fix bugs
* publish new image
* fix bugs
* fix bugs
* fix bug
* merge
* revert
* new version
* fix bugs
* rename the top level folder from 'snpe' to 'aml' and move all AML code into this folder except the top level entry point 'aml.py'
* make the keys returned from the JobCompletionMonitor wait method configurable
* Rename AmlPartialTrainingEvaluator and make it restartable.
* Turn off save_pareto_model_weights
* Remove redundant copy of JobCompletionMonitor
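"Configurable keys" on the JobCompletionMonitor wait method could look like the sketch below; the backend object and its `poll` method are hypothetical stand-ins for whatever the monitor actually queries:

```python
class JobCompletionMonitor:
    """Sketch: wait() filters each finished job's metrics down to a
    configurable list of keys instead of a hard-coded set."""

    def __init__(self, backend, keys=None):
        self.backend = backend                 # assumed: poll(job_ids) -> {id: metrics}
        self.keys = keys or ["val_acc", "latency"]

    def wait(self, job_ids):
        results = self.backend.poll(job_ids)   # assume this blocks until jobs finish
        # Return only the configured metric keys for each job.
        return {jid: {k: m[k] for k in self.keys if k in m}
                for jid, m in results.items()}
```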
* rev the versions.
* updates to readme information.
* the only inference testing targets are 'cpu' and 'snp'; trigger the aml partial training by a different key in the config file.
* add iteration info
* new version.
* fix ordering of results from AmlPartialTrainingEvaluator
* change AML batch size default to 64 for faster training
* don't store MODEL_STORAGE_CONNECTION_STRING
* Fix bug in merge_status_entity, add more unit test coverage
* new version
* Store training time in status table.
* improve diagram.
* save iteration in status table.
* pick up new version of archai to fix randomness bug in the EvolutionParetoSearch so that these search jobs are restartable.