* Revision of getting started guide up to Batch scoring. Also new diagram and fix to ARM template to remove region restrictions.
* Detail on Batch scoring for Getting Started and an additional debug message in the copy step to ease diagnosing issues
* Tweaked text and added a NOQA for message
* Clarified/Fixed getting started instructions for WebApp/AppService deployment
Co-authored-by: Joao Pedro Martins <joaopedro.martins@microsoft.com>
* Change AKS deployment configuration
Deployment config changed from 1CPU/4GB to 0.5CPU/2GB so it fits an AKS cluster created with default parameters
* Update custom_model.md
Co-authored-by: João Pedro Martins <lokijota@users.noreply.github.com>
TRAIN_SCRIPT_PATH value updated from 'training/train.py' to 'training/train_aml.py'.
This is aligned with /.pipelines/diabetes_regression-variables-template.yml.
* development_setup.md update
development_setup.md updated to use install_requirements.sh.
See #158:
> Use conda rather than pip packages when possible (as recommended in AML docs).
> Dev environment is hence also constrained to conda (no more pip install -r requirements.txt).
* Content of install_requirements.sh deleted
* build_train_pipeline.py filename fixed
* docs
* fix pipeline status badge and tf naming uniqueness
* add a note about how to change the name of the pipeline
* extra clarification on workspace connection
Recently the step that gets the model version was given a name. References in subsequent steps must also be updated to use the step name as a prefix. Without that update, the MODEL_VERSION variable caused failures in each CD deployment step.
This did not show up in CI because the MODEL_VERSION var is hard-coded in the variable group. We should also remove that.
A bug surfaced where first-time evaluation of a model fails because the
Model constructor throws if the model does not exist.
Looking deeper, we see that most calls to get_model expect a possible
None response and check at the call site. Unfortunately we get the same
WebserviceException class for a model not being found as we do for a REST
error or similar.
This change is a stopgap mitigation to restore compatibility with the
existing callers. As a compromise, the model-version-dependent code path
continues to swallow exceptions.
In a future follow-up we should settle on a convention and allow version
checks to propagate failure while still letting the caller handle a
service exception.
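A minimal sketch of the stopgap described above, not the actual patch. To run without the Azure ML SDK it uses a stand-in WebserviceException class (assumed to mirror azureml.exceptions.WebserviceException), and get_model_or_none and load_model are hypothetical names introduced only for illustration:

```python
from typing import Callable, Optional


class WebserviceException(Exception):
    """Stand-in for the SDK exception (assumption, so the sketch is self-contained)."""


def get_model_or_none(load_model: Callable[[], object]) -> Optional[object]:
    """Return the loaded model, or None if the lookup raises WebserviceException.

    load_model is a hypothetical zero-argument callable wrapping the
    version-specific Model lookup.
    """
    try:
        return load_model()
    except WebserviceException:
        # Compromise: "model not found" and a REST failure raise the same
        # exception class, so both are reported to the caller as None.
        return None
```

Callers that already check for None keep working, at the cost of hiding genuine service errors until a convention is agreed.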
- Tied SDK version to 1.2.x as with conda_dependencies.yml
- Lock versions to point updates
- Kept the rest of the deps manually specified to keep image size small and minimize regressions
Azure deprecated their top-level meta-package, which led to a deprecation error. We don't actually need this top-level package.
I took the opportunity to clean up the conda deps using the dependency tree.
- Trimmed the package list
- Explicitly list pip to avoid conda warning
- Use azureml-defaults for WebApp dependencies
- Lock azureml-sdk and azureml-defaults versions
- Add comments for dependencies
* Add vars template to canary pipeline
* Enable ACR authentication on AKS using a service principal
- Upgrade helm version to 3.1.1
- Remove ACR secret from the abtest-model deployment