* Add readme template to contributing.md
* Update exclusion list
* Add more paths to exclusion file
* Add debugging statements
* Wrap in try/except
* Wrap in try/except
* Run black to format
* Add sample to exclusions list
* ci: Fix computation of repo_root in check-readme.py
For a script run directly, __file__ can be a relative path in Python 3.5-3.8,
but it is always absolute in Python>=3.9
This difference caused us to calculate the wrong path for the
root of the repo:
Path.parent returns '.' once you reach the top of a relative
path (i.e. Path('./README.md').parent.parent == Path('.'))
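A minimal sketch of a version-independent way to compute the root, assuming the script sits one directory below the repo root. The `find_repo_root` name, the `levels_up` parameter, and the `scripts/` location are illustrative and not taken from check-readme.py. Resolving the path before walking up avoids the `Path('.')` trap:

```python
from pathlib import Path


def find_repo_root(script_path: str, levels_up: int = 1) -> Path:
    """Return the directory `levels_up` levels above the script's directory.

    `levels_up` is a hypothetical parameter; set it to the script's real depth.
    """
    # resolve() makes the path absolute first, so .parent never collapses to '.'
    script = Path(script_path).resolve()
    return script.parents[levels_up]


# The trap with a relative path: walking up past the top just yields '.'
assert Path("./README.md").parent.parent == Path(".")

# With resolve(), the result is an absolute directory on every Python version
repo_root = find_repo_root("scripts/check-readme.py")
print(repo_root.is_absolute())
```

Inside the script itself, the call would presumably be `find_repo_root(__file__)`.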
* refactor: Rename working_directory to repo_root
* enable create new ws for oai
* set env var
* export env var
* setting location
* setting location
* removed acr access
* removed all unnecessary verifications
* changed dataset format
* oaiv2 sdk example
* modified workflows and notebooks
* removed all cell outputs
* removed old files
* changed cron param
* reverted cron param
* black format
* Added cli examples for new finetune pipeline component (#2711)
* Added cli examples for new finetune pipeline component
* Added screenshots for cli examples
* added and renamed cli workflows
* directory fixes
* Vvatsalya/fix cli oai v2 workflow (#2717)
* setting location as ncus
* set in setup-cli step
* new init and setup script for oai v2
* correcting syntax for init sh
* fix init oai v2 script
* fix
* fix 1
* fix 2
* change training dataset name in cli oai v2 example
* sdk workflow region change
* use oai as suffix in ws name
* added oai v1 and v2 in readme for cli and sdk
* adding install reqs for sdk
---------
Co-authored-by: Vishal Vatsalya <98515131+vvatsalya@users.noreply.github.com>
Co-authored-by: Vishal Vatsalya <vvatsalya@microsoft.com>
Co-authored-by: Ayush Mishra <61145377+novaturient95@users.noreply.github.com>
* refactor: Stop invoking jq in infra scripts
* refactor: Remove some commented out code from infra
* refactor: Don't install jq
Ubuntu runners come with it pre-installed:
4fe7f6bc86/images/linux/Ubuntu2204-Readme.md
* refactor: Replace `az command | jq '.QUERY'` with `az command --query QUERY`
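As an illustration of this replacement, here is a small Python sketch that builds such an invocation. `--query` (a JMESPath expression) and `--output` are global Azure CLI arguments; the helper name `az_query_args`, the resource group name, and the example command are assumptions made for illustration, not code from the infra scripts.

```python
from typing import List


def az_query_args(command: List[str], query: str) -> List[str]:
    """Build an az invocation that filters the command output with --query.

    Stands in for the pattern `az ... | jq -r '.QUERY'`. JMESPath queries
    have no leading dot, unlike jq filters.
    """
    # --output tsv prints a bare value, similar to what `jq -r` would give
    return ["az", *command, "--query", query, "--output", "tsv"]


# Hypothetical example: fetch only the resource group's id as plain text
args = az_query_args(["group", "show", "--name", "my-rg"], "id")
print(" ".join(args))

# Running it requires the Azure CLI and a logged-in session:
# import subprocess
# subprocess.run(args, check=True, capture_output=True, text=True)
```

Letting `az` do the filtering removes the dependency on a separate JSON tool for simple lookups.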
* refactor: Collapse `jq | jq | jq` into a single jq invocation
* refactor: Do not install xmlstarlet
Seems to be entirely unused
* refactor: Don't install uuid-runtime and remove install_packages function
* Clean old resource group as well
* Added escaping in the regular expression for "AzureML Metrics Writer (preview)" so that more role assignments are cleaned
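Why the escaping matters can be shown with a short Python sketch. This only illustrates the regex behaviour; the cleanup script itself may use a different tool and syntax. Unescaped parentheses are treated as a group, so the pattern never matches the literal role name:

```python
import re

role_name = "AzureML Metrics Writer (preview)"

# Unescaped, the parentheses form a capture group instead of matching literally
unescaped = re.compile(role_name)
# re.escape backslash-escapes the metacharacters so the name matches as written
escaped = re.compile(re.escape(role_name))

print(unescaped.search(role_name))            # None: the pattern expects no parentheses
print(escaped.search(role_name) is not None)  # True
```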
* delete registry after creation
* always run deletion
* init env
* set working dir
* add sdk_helper
* add sub_id
* edit for debugging
* try debugging
* try deletion inside the notebook
* delete in the same step
* print registry name for debugging
* remove deletion for testing.
* change everything back for final pr
* modify format
* make deletion inside the ipynb
* try sdk deletion
* try sdk
* use verified api.
* change format
* Update registry-create.ipynb
* Add role assignment cleanup
* Add verbose logging for troubleshooting the exit code 100 issue
* bypass the error for grub-efi-amd64-signed
* Adding the redirections back
* add deepspeed example
* fix mlflow logging in train.py
* move examples to cli folder
* move to jobs, add job.yaml files
* generate workflow file
* add comment in train.py
* add latest tag to env
* try different way of starting run
* set up job as pipeline
* move to pipelines folder
* add temp yaml file to pass validation
* modify readme.py to support deepspeed
* move generate-yml script
* recreate workflow files.
* move generated key location
* change environment
* change env for both examples
* fix env
* try v100 computes
* create nd40 cluster
* add compute create line
* change max number of nodes
* change data directory
* change command to block
* move data files into src
* remove unused data thing
* change data file type
* fix mlflow get run fail
* change command style in job
* change location of job
* change generate key path
* change code: path
* rename results folder
* change back key path
* change max number of nd40 nodes
* try moving autotuning folder into src
* change output file name
* try overwriting output
* add dockerfile for custom env
* change env in yaml
* update README and move autotune example
* move dockerfile to parent dir
* Remove cluster after use even if it is in the failed state.
* Fix readme.py and add the config
* Rollback minor changes
* Re generate config with the new script
* Add documentation
* Reformat readme.py
* Fix typo
* Fix workflow files
* Adding checks for k8 extension and check for attaching the compute to workspace
* Adding a note on the top of all the workflows about regenerating using python script
* Update header in the workflows
* Adding the template for account and container
* Update the resource names for November to recreate resources
* Create and cleanup registry
* Create and cleanup registry
* Create and cleanup registry
* Create and cleanup registry
* Create and cleanup registry
* nyc_taxi_data_regression-run-sample
* Add rg in az resource delete
* Adding reusable bootstrapping script to configure infra resources
* Fixing tput error
* Disable color coding for the output
* Removing the tput call
* Adjusting the directory structure of infra scripts
* Adjusting the path of the bootstrapping script
* Rerunning 2 workflows only
* Addressing the indentation issue
* Fixing the path of the init script
* logging the path
* Adding workspace path
* adding helper script call
* Removing typo from the file name
* Adding source call
* Refactoring the Compute creation script
* Updating single-step workflows
* Updating CLI script to use bootstrapping script
* Updating all of the cli workflows
* Updating all of the CLI and SDK workflows
* Allowing endpoint workflows to continue to use old subscription
* Updating all of the CLI and SDK workflows for cli configuration
* Refactoring logging
* Refactoring some of the existing scripts
* Updating the dataset call
* Updating pipeline workflows for extracting the correct config
* Updating the azure config
* Adding helper scripts for resource provisioning
* Refactoring samples to use the new subscription
* fix permissions for running sh files
* fix permissions for running sh files
* Ensuring the tools are installed
* Add copy data script
* Updating the config generation in SDK workflows
* Updating the config generation directory
* Updating the config generation directory one subdirectory level
* Extracting the folder name from the path for creating config
* Adding logging
* Adding logging for pwd
* Adding config.json
* Adding generic scripts for granting permissions on rg
* Updating the workflows by adding the cluster info
* Adding the new dataset
* Adding concurrency and updating the schedules of the jobs
* Updating the workflows to use concurrency
* cancel a currently running workflow from the same PR, branch or tag
* Updating workflows for concurrency
* Logging the information about workflow names
* Changing the group name to variable
* Printing the notebooks content
* Print the content of the notebooks
* Merging changes from main
* Adding pre-commit-hook
* Updating the task names
* Reverting the set-up repo changes
* Addressing the formatting of the files
* Addressing the formatting of the files
* Updating the moe scripts to fetch the registry name
* Updating the new workflows
* Adding new workflows and fixing the formatting of the files
* Updating the replacements
* Updating the ACR details
* Updating bootstrapping script
* Updating the ACR
* Fixing the typo in notebook
* Adding the script to create vnet/subnet
* Updating the workflows
* Updating the identity creation with appropriate roles
* Archive the model while running endpoint workflows and create vnets
* Adding bootstrapping workflow
* Removing managed online endpoints VS workflow
* Continue on error for bootstrap workflow
* Adding the path in the workflow
* Debug the bootstrapping workflow
* Adding call to bootstrap mltable
* Adding calls to az ml commands
* Adding calls to az ml commands by removing data calls
* Adding the replacements for InteractiveBrowserCredential
* Reformatting the notebook
* Updating the bootstrap workflow
* Updating the bootstrap workflow with less logging
* Updating the bootstrap workflow with less logging
* Updating the bootstrap workflows
* Adding more bootstrapping steps
* Removing newly added workflows
* Install tools as part of bootstrap step
* Updating the script to grant permissions on identity
* Updating the ACR Details
* Importing data for automl jobs
* Reformatting the files and regenerating all files
* Updating the SDK package version
* Updating bootstrapping script name
* fix pipeline sample 4b
* fix pipeline sample 4b
* Reformatting and updating the folder path in bootstrapping workflow
* Update sdk path in bootstrapping workflow
* Regenerate the workflows
* Merging changes from main and regenerating the workflows
* Adding owner for the tags
* update image cli to run prepare_data.py first (#1699)
Co-authored-by: Riti Sharma <risha@microsoft.com>
* Regenerating the workflows after adding data prep step
* add automl in the check for prepare_data
* update ymls
* Updating the script to reinstall ML extension
* Regenerating the workflows for automl to address identity installation issue
* Regenerating the workflows for automl to address identity installation issue and reformatting
* Installing SDK required for running prepare dataset
* Updating the SDK call for automl workflows
* Fix failed CLI pipeline(add restriction on azureml-mlflow) (#1703)
* add restriction
* fix
* update
* directly use expected version
* update
* update
* remove deleted wf
* Updating ARC script
* Updating the readme file
* Regenerate the workflows
* Updating arc scripts
* Adding macro for account name and container
* Calling update dataset script
* Regenerating workflows and addressing formatting
* Merging changes from main after sdk release
* Address the vulnerabilities in the VMSS agents pool
* Merging changes from main for new workflows
* Update SDK v2 in setup file
* update cli cifar distributed
* init (#1786)
* Updating registry env variable
* Automated resource provisioning - registries (#1812)
* init
* enable registry check
* init REGISTRY_NAME
* changed env var names and import
* try to fix checks
* ran readme.py; changed registry yml location; modified helper
* fixing syntax
* [mldesigner] Modify sdk/1b to adopt newest mldesigner 0.1.0b7 (#1794)
* update
* update
* update
* update
* update
* update
* change registry name for testing
* fixing syntax
* fixing syntax
* change registry name for testing
* change registry name for testing
* Don't run samples for any checkin (#1799)
* Don't run samples for any checkin
* Reformat with black
* change registry name for testing
* change registry name back
* replace version
* Batch Endpoints scenarios (#1777)
* batch scenarios
remove model
* black
* comments
* typo
* typos
* formatting
* build
* batch.sh fix
* modified model deployment
* fixing workflow
* Fix model path in Local Debug in VSCode (#1679)
* Resolve model path issue and add to readme ignore
* Update readme
* Revert readme
* Update readme
* Fix environment
* Remove stray line
* Change to default azure credential
* remove workflow
* [mldesigner] Fix mldesigner version in sample (#1797)
* update
* update
* update
* update
* update
* update
* ran readme.py, edited readme.py, edited workflow
* added log for debugging
* fixing env var
* added environment to deploy.yml
* revert deploy.yml
* resolved conflicts
* revert changes in v1
Co-authored-by: Korin <0mza987@gmail.com>
Co-authored-by: Bala P V <33712765+balapv@users.noreply.github.com>
Co-authored-by: Facundo Santiago <fasantia@microsoft.com>
Co-authored-by: Alex Wallace <80542152+xanwal@users.noreply.github.com>
* Updating the bootstrapping script
* Update sdk_helpers.sh
fixing the registry creation bug
* modified ensure_registry
* Updating the resource names to use November month resources
Co-authored-by: lochen <cloga0216@gmail.com>
Co-authored-by: sharma-riti <52715641+sharma-riti@users.noreply.github.com>
Co-authored-by: Riti Sharma <risha@microsoft.com>
Co-authored-by: Han Wang <phoenix.seek@gmail.com>
Co-authored-by: Ayush Mishra <61145377+novaturient95@users.noreply.github.com>
Co-authored-by: Douglas Xiao <xiake@microsoft.com>
Co-authored-by: Komnus丶Q <40655746+quchuyuan@users.noreply.github.com>
Co-authored-by: Korin <0mza987@gmail.com>
Co-authored-by: Bala P V <33712765+balapv@users.noreply.github.com>
Co-authored-by: Facundo Santiago <fasantia@microsoft.com>
Co-authored-by: Alex Wallace <80542152+xanwal@users.noreply.github.com>
Co-authored-by: Chuyuan Qu <chuyuanqu@microsoft.com>