mirror of https://github.com/microsoft/AzureTRE.git
MLFlow Additional Config (#1569)
* Moved inline scripts to template files
* Added conda-forge nexus repo
* Added conda-forge to conda config
* Added mlflow doc mapping
* Added conda-forge to conda config
* Documented MLFlow use with conda-forge
* Added missing link
* Amended links
* Documented proxied repos
* Added conda.anaconda to fqdns list
* Formatting fix
* Fix linting issues
* Fixed incorrect var
* Fixed inconsistency
* Fixed linting issues
* Added md linting rule config
* Linting fix
* Fixed linting
* markdownlint change to reflect lint github action
* Fixed config file name

Co-authored-by: Jamie D <daltskin@hotmail.com>
Co-authored-by: Ross Smith <ross-p-smith@users.noreply.github.com>
This commit is contained in:
Parent
05ab18f2cd
Commit
8bbe054510
@@ -144,15 +144,12 @@ runs:
 TEST_ACCOUNT_CLIENT_ID: "${{ inputs.TEST_ACCOUNT_CLIENT_ID }}"
 TEST_ACCOUNT_CLIENT_SECRET: "${{ inputs.TEST_ACCOUNT_CLIENT_SECRET }}"
 ACR_NAME: ${{ inputs.ACR_NAME }}
-TRE_URL:
-"https://${{inputs.TRE_ID}}.${{inputs.LOCATION}}.cloudapp.azure.com"
+TRE_URL: "https://${{inputs.TRE_ID}}.${{inputs.LOCATION}}.cloudapp.azure.com"
 TRE_ID: "${{ inputs.TRE_ID }}"
 TF_VAR_tre_id: "${{ inputs.TRE_ID }}"
-TF_VAR_terraform_state_container_name:
-"${{ inputs.TF_VAR_terraform_state_container_name }}"
+TF_VAR_terraform_state_container_name: "${{ inputs.TF_VAR_terraform_state_container_name }}"
 TF_VAR_mgmt_resource_group_name: "${{ inputs.TF_VAR_mgmt_resource_group_name }}"
-TF_VAR_mgmt_storage_account_name:
-"${{ inputs.TF_VAR_mgmt_storage_account_name }}"
+TF_VAR_mgmt_storage_account_name: "${{ inputs.TF_VAR_mgmt_storage_account_name }}"
 TF_VAR_core_address_space: "${{ inputs.TF_VAR_core_address_space }}"
 TF_VAR_tre_address_space: "${{ inputs.TF_VAR_tre_address_space }}"
 TF_VAR_swagger_ui_client_id: "${{ inputs.TF_VAR_swagger_ui_client_id }}"
@@ -7,7 +7,6 @@ on:
 # https://docs.github.com/en/developers/webhooks-and-events/webhooks/webhook-events-and-payloads#issue_comment
 
 jobs:
-
 pr_comment:
 name: PR comment
 environment: CICD
@@ -220,7 +219,6 @@ jobs:
 devops/scripts/destroy_env_no_terraform.sh --core-tre-rg ${RG_NAME}
 gh pr comment ${PR_NUMBER} --repo $REPO --body "Test environment destroy complete"
 
-
 run_test:
 # Run the tests with the re-usable workflow
 needs: [pr_comment]
@@ -260,4 +258,3 @@ jobs:
 TRE_ID: ${{ format('tre{0}', needs.pr_comment.outputs.refid) }}
 CI_CACHE_ACR_NAME: ${{ secrets.ACR_NAME }}
 TF_LOG: ${{ secrets.TF_LOG }}
-
@@ -0,0 +1,16 @@
+{
+"MD004": false,
+"MD007": {
+"indent": 2
+},
+"MD013": {
+"line_length": 400
+},
+"MD026": {
+"punctuation": ".,;:!。,;:"
+},
+"MD029": false,
+"MD033": false,
+"MD036": false,
+"blank_lines": false
+}
@@ -12,12 +12,12 @@ Nexus will be deployed as part of the main TRE terraform deployment. A configura
 
 1. Fetch the Nexus generated password from storage account.
 2. Reset the default password and set a new one.
-3. Store the new password in Key Vault under 'nexus-<TRE_ID>-admin-password'
+3. Store the new password in Key Vault under `nexus-<TRE_ID>-admin-password`
 4. Create an anonymous default PyPI proxy repository
 
 ## Setup and usage
 
-1. A TRE Administrator can access Nexus though the admin jumpbox provisioned as part of the TRE deployment. The credentials for the jumpbox are located in the KeyVault under "vm-<tre-id>-jumpbox-admin-credentials"
+1. A TRE Administrator can access Nexus though the admin jumpbox provisioned as part of the TRE deployment. The credentials for the jumpbox are located in the KeyVault under `vm-<tre-id>-jumpbox-admin-credentials`
 2. A researcher can access Nexus from within the workspace by using the internal Nexus URL of: [https://nexus-<TRE_ID>.azurewebsites.net/](https://nexus-<TRE_ID>.azurewebsites.net/)
 3. To fetch Python packages from the PyPI proxy, a researcher can use pip install while specifying the proxy server:
 
@@ -34,4 +34,24 @@ Notice that since Nexus Shared Service is running on an App Service, the outgoin
 | --- | --- |
 | AzureActiveDirectory | Authorize the signed in user against Azure Active Directory. |
 | AzureContainerRegistry | Pull the Nexus container image, as it is located in Azure Container Registry. |
-| pypi.org | Enables Nexus to "proxy" python packages to use inside of workspaces |
+| pypi.org | Enables Nexus to "proxy" python packages to use inside of workspaces. |
+| repo.anaconda.com | Enables Nexus to "proxy" conda packages to use inside of workspaces. |
+| conda.anaconda.org | Enables Nexus to "proxy" additional conda packages to use inside of workspaces such as conda-forge. |
+| *.docker.com | Enables Nexus to "proxy" docker repos to use inside of workspaces. |
+| *.docker.io | Enables Nexus to "proxy" docker repos to use inside of workspaces. |
+| archive.ubuntu.com | Enables Nexus to "proxy" apt packages to use inside of workspaces. |
+| security.ubuntu.com | Enables Nexus to "proxy" apt packages to use inside of workspaces. |
+
+## Current Repos
+
+| Name | Type | Source URI | Nexus URI | Usage |
+| --- | --- | --- | --- | --- |
+| PiPy | PiPy | [https://pypi.org/] | `https://nexus-<TRE_ID>.azurewebsites.net/repository/pypi/` | Allow use of pip commands. |
+| Apt PiPy | Apt | [https://pypi.org/] | `https://nexus-<TRE_ID>.azurewebsites.net/repository/apt-pypi/` | Install pip via apt on Linux systems. |
+| Conda | conda | [https://repo.anaconda.com/pkgs/main/] | `https://nexus-<TRE_ID>.azurewebsites.net/repository/conda/` | Configure conda to have access to default conda packages. |
+| Conda-Forge | conda | [https://conda.anaconda.org/conda-forge/] | `https://nexus-<TRE_ID>.azurewebsites.net/repository/conda-forge/` | Configure conda to have access to conda-forge packages. |
+| Docker | apt | [https://download.docker.com/linux/ubuntu/] | `https://nexus-<TRE_ID>.azurewebsites.net/repository/docker/` | Install Docker via apt on Linux systems.|
+| Docker GPG | raw | [https://download.docker.com/linux/ubuntu/] | `https://nexus-<TRE_ID>.azurewebsites.net/repository/docker-public-key/` | Provide public key to sign apt source for above Docker apt. |
+| Docker Hub | docker | [https://registry-1.docker.io] | `https://nexus-<TRE_ID>.azurewebsites.net/repository/docker-hub/` | Provide docker access to public images repo. |
+| Ubuntu Packages | apt | [http://archive.ubuntu.com/ubuntu/] | `https://nexus-<TRE_ID>.azurewebsites.net/repository/ubuntu/` | Provide access to Ubuntu apt packages on Ubuntu systems. |
+| Ubuntu Security Packages | apt | [http://security.ubuntu.com/ubuntu/] | `https://nexus-<TRE_ID>.azurewebsites.net/repository/ubuntu-security/` | Provide access to Ubuntu Security apt packages on Ubuntu systems. |
@@ -4,7 +4,7 @@ See: <https://www.mlflow.org>
 
 ## Prerequisites
 
-- [A base workspace deployed](../workspaces/base.md)
+- [A base workspace deployed](https://microsoft.github.io/AzureTRE/tre-templates/workspaces/base/)
 
 - The MLflow workspace service container image needs building and pushing:
 
@@ -12,7 +12,8 @@ See: <https://www.mlflow.org>
 
 ## MLflow Workspace VM Configuration
 
-Each MLflow server deployment creates a PowerShell (for Windows) and a shell script (for Linux) with the same name as the MLflow server, in the shared storage mounted on the researcher VMs. These scripts will configure the researcher VMs (by installing the required packages and setting up the environment variables) to communicate with the MLflow tracking server.
+Each MLflow server deployment creates a PowerShell (for Windows) and a shell script (for Linux) with the same name as the MLflow server, in the shared storage mounted on the researcher VMs.
+These scripts will configure the researcher VMs (by installing the required packages and setting up the environment variables) to communicate with the MLflow tracking server.
 
 !!! note
 Please ensure that [nexus reposiory](https://microsoft.github.io/AzureTRE/tre-admins/setup-instructions/configuring-shared-services/) is configured before running the above scripts.
@@ -26,3 +27,26 @@ remote_server_uri = "https://xxxxxxx.azurewebsites.net/"
 
 mlflow.set_tracking_uri(remote_server_uri)
 ```
+
+## Using with Conda-Forge
+
+If working with Conda-Forge you need to ensure the user resource you are using is configured correctly and using the channels available via the [Nexus repository](https://microsoft.github.io/AzureTRE/azure-tre-overview/shared-services/nexus/).
+If the user resource you have deployed used one of the pre-existing Guacamole user resource templates and has conda installed by default, conda will already be configured to use the correct channels via Nexus.
+If not and conda has been manually deployed on the user resource, the following script can be used to configure conda:
+
+```shell
+conda config --add channels ${nexus_proxy_url}/repository/conda/ --system
+conda config --add channels ${nexus_proxy_url}/repository/conda-forge/ --system
+conda config --remove channels defaults --system
+conda config --set channel_alias ${nexus_proxy_url}/repository/conda/ --system
+```
+
+### conda.yml
+
+When using a `conda.yml` file to configure your MLFlow environment it is required to specify the channels to use.
+As the traditional channels (conda-forge, defaults etc) have been replaced with Nexus channels, you must ensure that the Nexus channels are being specified here instead.
+To retireve these channels, run `conda config --show channels` once conda has been configured to use Nexus.
+
+!!! note
+When logging models using sklearn, an optional parameter `conda_env` can be passed as either JSON or YML. If this is not passed a default `conda.yml` will be generate for the model, targeting the channel `conda-forge` causing any subsequent environments created using the model to fail.
+See the official documentation [here](https://www.mlflow.org/docs/latest/python_api/mlflow.sklearn.html) for the full details.
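Editorial illustration (not part of the commit): the `conda_env` note in the MLflow documentation hunk above could be applied roughly as follows. This is a minimal sketch, assuming the Nexus repository paths `/repository/conda/` and `/repository/conda-forge/` shown in the diff. The helper name `nexus_conda_env`, the environment name `mlflow-env`, and the TRE ID `mytre` in the URL are hypothetical and only serve the example.

```python
# Hypothetical helper (not from the commit): builds a conda environment spec
# whose channels point at the Nexus proxy instead of conda-forge/defaults.
def nexus_conda_env(nexus_proxy_url, python_version="3.8", pip_deps=None):
    base = nexus_proxy_url.rstrip("/")
    env = {
        "name": "mlflow-env",  # assumed name, any valid conda env name works
        "channels": [
            f"{base}/repository/conda/",
            f"{base}/repository/conda-forge/",
        ],
        "dependencies": [f"python={python_version}", "pip"],
    }
    if pip_deps:
        # pip packages are nested under a "pip" key, as in a conda.yml file
        env["dependencies"].append({"pip": list(pip_deps)})
    return env


if __name__ == "__main__":
    # "mytre" stands in for a real TRE ID
    env = nexus_conda_env(
        "https://nexus-mytre.azurewebsites.net",
        pip_deps=["mlflow==1.24.0", "scikit-learn"],
    )
    print(env["channels"])
    # With mlflow installed, the dict can be passed when logging a model:
    # mlflow.sklearn.log_model(model, "model", conda_env=env)
```

Passing the spec explicitly avoids the default `conda.yml` that targets the public `conda-forge` channel, which the note says causes environments created from the model to fail.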
@@ -79,6 +79,7 @@ nav:
 - Gitea: 'tre-templates/workspace-services/gitea.md'
 - Guacamole: 'tre-templates/workspace-services/guacamole.md'
 - InnerEye: 'tre-templates/workspace-services/inner-eye.md'
+- MLFlow: 'tre-templates/workspace-services/mlflow.md'
 - User Resources:
 - Guacamole Windows VM: 'tre-templates/user-resources/guacamole-windows-vm.md'
 - Guacamole Linux VM: 'tre-templates/user-resources/guacamole-linux-vm.md'
@@ -0,0 +1,32 @@
+{
+"name": "conda-forge",
+"online": true,
+"storage": {
+"blobStoreName": "default",
+"strictContentTypeValidation": true,
+"write_policy": "ALLOW"
+},
+"proxy": {
+"remoteUrl": "https://conda.anaconda.org/conda-forge/",
+"contentMaxAge": 1440,
+"metadataMaxAge": 1440
+},
+"negativeCache": {
+"enabled": true,
+"timeToLive": 1440
+},
+"httpClient": {
+"blocked": false,
+"autoBlock": true,
+"connection": {
+"retries": 0,
+"userAgentSuffix": "string",
+"timeout": 60,
+"enableCircularRedirects": false,
+"enableCookies": false,
+"useTrustStore": false
+}
+},
+"baseType": "conda",
+"repoType": "proxy"
+}
@@ -17,7 +17,7 @@ variable "nexus_storage_limit" {
 variable "nexus_allowed_fqdns" {
 type = string
 description = "comma seperated string of allowed FQDNs for Nexus"
-default = "*pypi.org,files.pythonhosted.org,security.ubuntu.com,archive.ubuntu.com,repo.anaconda.com,*.docker.com,*.docker.io"
+default = "*pypi.org,files.pythonhosted.org,security.ubuntu.com,archive.ubuntu.com,repo.anaconda.com,*.docker.com,*.docker.io,conda.anaconda.org"
 }
 
 variable "nexus_properties_path" {
@@ -78,8 +78,7 @@ install:
 tre_id: "{{ bundle.parameters.tre_id }}"
 id: "{{ bundle.parameters.id }}"
 mgmt_acr_name: "{{ bundle.parameters.mgmt_acr_name }}"
-mgmt_resource_group_name:
-"{{ bundle.parameters.mgmt_resource_group_name }}"
+mgmt_resource_group_name: "{{ bundle.parameters.mgmt_resource_group_name }}"
 openid_client_id: "{{ bundle.parameters.openid_client_id }}"
 openid_client_secret: "{{ bundle.parameters.openid_client_secret }}"
 openid_authority: "{{ bundle.parameters.openid_authority }}"
@@ -105,8 +104,7 @@ uninstall:
 tre_id: "{{ bundle.parameters.tre_id }}"
 id: "{{ bundle.parameters.id }}"
 mgmt_acr_name: "{{ bundle.parameters.mgmt_acr_name }}"
-mgmt_resource_group_name:
-"{{ bundle.parameters.mgmt_resource_group_name }}"
+mgmt_resource_group_name: "{{ bundle.parameters.mgmt_resource_group_name }}"
 openid_client_id: "{{ bundle.parameters.openid_client_id }}"
 openid_client_secret: "{{ bundle.parameters.openid_client_secret }}"
 openid_authority: "{{ bundle.parameters.openid_authority }}"
@@ -1,6 +1,6 @@
 ---
 name: tre-service-guacamole-linuxvm
-version: 0.1.8
+version: 0.1.9
 description: "An Azure TRE User Resource Template for Guacamole (Linux)"
 registry: azuretre
 dockerfile: Dockerfile.tmpl
@@ -4,15 +4,15 @@ apt:
 primary:
 - arches:
 - default
-uri: '${nexus_proxy_url}/repository/ubuntu/'
+uri: "${nexus_proxy_url}/repository/ubuntu/"
 
 security:
 - arches:
 - default
-uri: '${nexus_proxy_url}/repository/ubuntu-security/'
+uri: "${nexus_proxy_url}/repository/ubuntu-security/"
 sources_list: |
-deb [trusted=yes] $PRIMARY $RELEASE main restricted universe multiverse
-deb [trusted=yes] $PRIMARY $RELEASE-updates main restricted universe multiverse
-deb [trusted=yes] $SECURITY $RELEASE main restricted universe multiverse
-deb [trusted=yes] ${nexus_proxy_url}/repository/apt-pypi/ $RELEASE main restricted universe multiverse
-deb [signed-by=/etc/apt/trusted.gpg.d/docker-archive-keyring.gpg] ${nexus_proxy_url}/repository/docker/ $RELEASE stable
+deb [trusted=yes] $PRIMARY $RELEASE main restricted universe multiverse
+deb [trusted=yes] $PRIMARY $RELEASE-updates main restricted universe multiverse
+deb [trusted=yes] $SECURITY $RELEASE main restricted universe multiverse
+deb [trusted=yes] ${nexus_proxy_url}/repository/apt-pypi/ $RELEASE main restricted universe multiverse
+deb [signed-by=/etc/apt/trusted.gpg.d/docker-archive-keyring.gpg] ${nexus_proxy_url}/repository/docker/ $RELEASE stable
@@ -1,6 +1,6 @@
 #!/bin/bash
 
-# Remove apt sources not included in sources.list file
+# Remove apt sources not included in sources.list file
 sudo rm /etc/apt/sources.list.d/*
 
 # Update apt packages from configured Nexus sources
@@ -79,6 +79,7 @@ if [ ${conda_config} -eq 1 ]; then
 export PATH="/anaconda/bin":$PATH
 export PATH="/anaconda/envs/py38_default/bin":$PATH
 conda config --add channels ${nexus_proxy_url}/repository/conda/ --system
+conda config --add channels ${nexus_proxy_url}/repository/conda-forge/ --system
 conda config --remove channels defaults --system
 conda config --set channel_alias ${nexus_proxy_url}/repository/conda/ --system
 fi
@@ -1,6 +1,6 @@
 ---
 name: tre-service-guacamole-windowsvm
-version: 0.1.6
+version: 0.1.7
 description: "An Azure TRE User Resource Template for Guacamole (Windows 10)"
 registry: azuretre
 dockerfile: Dockerfile.tmpl
@@ -28,6 +28,7 @@ $Utf8NoBomEncoding = New-Object System.Text.UTF8Encoding $False
 if( ${CondaConfig} -eq 1 )
 {
 conda config --add channels ${nexus_proxy_url}/repository/conda/ --system
+conda config --add channels ${nexus_proxy_url}/repository/conda-forge/ --system
 conda config --remove channels defaults --system
 conda config --set channel_alias ${nexus_proxy_url}/repository/conda/ --system
-}
+}
@@ -0,0 +1,4 @@
+export AZURE_STORAGE_CONNECTION_STRING="${MLFlow_Connection_String}"
+pip install mlflow==1.24.0
+pip install azure-storage-blob==12.10.0
+pip install azure-identity==1.8.0
@@ -0,0 +1,4 @@
+[Environment]::SetEnvironmentVariable("AZURE_STORAGE_CONNECTION_STRING", "${MLFlow_Connection_String}", "Machine")
+pip install mlflow==1.24.0
+pip install azure-storage-blob==12.10.0
+pip install azure-identity==1.8.0
@@ -3,13 +3,27 @@ data "azurerm_storage_share" "shared_storage" {
 storage_account_name = local.storage_name
 }
 
+data "template_file" "mlflow-windows-config" {
+template = file("${path.module}/../mlflow-vm-config/windows/template_config.ps1")
+vars = {
+MLFlow_Connection_String = data.azurerm_storage_account.mlflow.primary_connection_string
+}
+}
+
+data "template_file" "mlflow-linux-config" {
+template = file("${path.module}/../mlflow-vm-config/linux/template_config.sh")
+vars = {
+MLFlow_Connection_String = data.azurerm_storage_account.mlflow.primary_connection_string
+}
+}
+
 resource "local_file" "mlflow-windows-config" {
-content = "[Environment]::SetEnvironmentVariable('AZURE_STORAGE_CONNECTION_STRING', '${data.azurerm_storage_account.mlflow.primary_connection_string}', 'Machine')\npip install mlflow==1.24.0\npip install azure-storage-blob==12.10.0\n"
+content = data.template_file.mlflow-windows-config.rendered
 filename = "${path.module}/../mlflow-vm-config/windows/config.ps1"
 }
 
 resource "local_file" "mlflow-linux-config" {
-content = "export AZURE_STORAGE_CONNECTION_STRING=\"${data.azurerm_storage_account.mlflow.primary_connection_string}\"\npip install mlflow==1.24.0\npip install azure-storage-blob==12.10.0\n"
+content = data.template_file.mlflow-linux-config.rendered
 filename = "${path.module}/../mlflow-vm-config/linux/config.sh"
 }
 