Pushing image to datastore and deleting after inference (#4)

* Pushing image to datastore

And immediately deleting 'local' copy (from the snapshot)

* Deleting image file after bootstrap script runs

Not tested yet.

* two spurious newlines

* mypy spotted missing colon

* better defaults

* Fixing two mypy errors

* Test prints

Using AzureML Studio to work out if the snapshot is affected by the deletion

* Saves image to datastore

* Including GUID in datastore path

* WiP downloading images from datastore in bootstrap script

* Overwriting image data zip

* tidy up

* Returning GUID and tidy-up

* Added unit test of image data zip overwrite

* Fixing API call to submit with ignored return

* Swapping temp image folder name to config

* Fixing documentation

* Plumbing for datastore name

* Swapping from default to named datastore

* Fixing main.yml build env

* Decreasing indentation by breaking loop

https://github.com/microsoft/InnerEye-Inference/pull/4#discussion_r617584432

* pylint not used and commented out line deleted

* Fixing flake8 warnings

* Swapping imports back to single line

After Anton's query on the PR I checked the .flake8 file and our maximum line length is 160 and
so I do not need to break up these lines.

* Hard coding not-secret settings in workflow

As per Anton's request:
https://github.com/microsoft/InnerEye-Inference/pull/4#discussion_r618144366

* /= consistency

* removing 2 debug print statements

* removed brackets

* swapping to run.wait_for_completion

* swapping to writetext

* removing unnecessary initialization

* Rationalising 'run' method: paths, arguments, and comments

* removing debug print lines

* Reverting line lengths

I had assumed a max line length of 100, but it is 160 in the .flake8 configuration file so
I have reverted the changes I had made to not-too-long lines, and fixed a few others for
legibility.

* Swapping to required=True for params

* Typo and unnecessary Path() spotted by Jonathan

* Changing parameter to underscores from dashes

To bring them into line with the rest of the InnerEye projects. They needed
to be different when the unknown args were passed straight on to score.py
but that is not how the arguments flow through anymore.

* Line length fix in comment to kick off license/cla

The 'license/cla' check in the build pipeline has stalled and there
is no 'retry' button so I am hoping to kick it into action again
with an almost vacuous commit!
Tim Regan 2021-04-26 15:31:23 +01:00 committed by GitHub
Parent 4f9d1fad61
Commit d487fb93c9
No known key found for this signature
GPG key ID: 4AEE18F83AFDEB23
10 changed files with 172 additions and 93 deletions

10
.github/workflows/main.yml vendored
View file

@@ -32,13 +32,15 @@ jobs:
env:
CUSTOMCONNSTR_AZUREML_SERVICE_PRINCIPAL_SECRET: ${{ secrets.CUSTOMCONNSTR_AZUREML_SERVICE_PRINCIPAL_SECRET }}
CUSTOMCONNSTR_API_AUTH_SECRET: ${{ secrets.CUSTOMCONNSTR_API_AUTH_SECRET }}
CLUSTER: ${{ secrets.CLUSTER }}
WORKSPACE_NAME: ${{ secrets.WORKSPACE_NAME }}
EXPERIMENT_NAME: ${{ secrets.EXPERIMENT_NAME }}
RESOURCE_GROUP: ${{ secrets.RESOURCE_GROUP }}
CLUSTER: "training-nc12"
WORKSPACE_NAME: "InnerEye-DeepLearning"
EXPERIMENT_NAME: "api_inference"
RESOURCE_GROUP: "InnerEye-DeepLearning"
SUBSCRIPTION_ID: ${{ secrets.SUBSCRIPTION_ID }}
APPLICATION_ID: ${{ secrets.APPLICATION_ID }}
TENANT_ID: ${{ secrets.TENANT_ID }}
DATASTORE_NAME: "inferencetestimagestore"
IMAGE_DATA_FOLDER: "temp-image-store"
run: |
conda activate inference
pytest --cov=./ --cov-report=html

View file

@@ -38,6 +38,8 @@ export RESOURCE_GROUP=
export SUBSCRIPTION_ID=
export APPLICATION_ID=
export TENANT_ID=
export DATASTORE_NAME=
export IMAGE_DATA_FOLDER=
```
Run with `source set_environment.sh`
@@ -59,6 +61,10 @@ If you would like to reproduce the automatic deployment of the service for testi
* `az ad sp create-for-rbac --name "<name>" --role contributor --scope /subscriptions/<subs>/resourceGroups/InnerEyeInference --sdk-auth`
* The previous command will return a json object with the content for the variable `secrets.AZURE_CREDENTIALS` in .github/workflows/deploy.yml
## Images
During inference the image data zip file is copied to the IMAGE_DATA_FOLDER in the AzureML workspace's DATASTORE_NAME datastore. At the end of inference the copied image data zip file is overwritten with a simple line of text. At present we cannot delete these. If you would like these overwritten files removed from your datastore you can [add a policy](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-lifecycle-management-concepts?tabs=azure-portal) to delete items from the datastore after a period of time. We recommend 7 days.
## Help and Bug Reporting
1. [Guidelines for how to report a bug.](./docs/BugReporting.md)
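The Images section above recommends a lifecycle management policy to clean up the overwritten image data zip files. A minimal sketch of such a rule follows, assuming the datastore is backed by a blob container (the `<container>` placeholder and the rule name are hypothetical; `temp-image-store` is the default IMAGE_DATA_FOLDER hard-coded in main.yml above). The resulting JSON can be applied through the Azure portal page linked in the README.

```python
# Hedged sketch: a blob lifecycle rule that deletes overwritten image zips after 7 days.
# "<container>" stands for the blob container backing the datastore, and the rule name
# is hypothetical. Write the JSON out and apply it via the Azure portal (or CLI/SDK).
import json

lifecycle_policy = {
    "rules": [
        {
            "name": "delete-overwritten-inference-images",  # hypothetical rule name
            "enabled": True,
            "type": "Lifecycle",
            "definition": {
                "filters": {
                    "blobTypes": ["blockBlob"],
                    # Only blobs under the temporary image folder are affected.
                    "prefixMatch": ["<container>/temp-image-store"],
                },
                "actions": {
                    # Delete 7 days after last modification, as the README recommends.
                    "baseBlob": {"delete": {"daysAfterModificationGreaterThan": 7}},
                },
            },
        }
    ]
}

with open("lifecycle_policy.json", "w") as policy_file:
    json.dump(lifecycle_policy, policy_file, indent=2)
```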

View file

@@ -3,24 +3,24 @@
# Licensed under the MIT License (MIT). See LICENSE in the repo root for license information.
# ------------------------------------------------------------------------------------------
from pathlib import Path
import random
import shutil
import tempfile
import time
import zipfile
from pathlib import Path
from typing import Any, Optional
from unittest import mock
import zipfile
from azureml._restclient.constants import RunStatus
from azureml.core import Experiment, Model, Workspace
from azureml.core import Experiment, Model, Workspace, Datastore
from azureml.exceptions import WebserviceException
from flask import Response
from pydicom import dcmread
from app import app, HTTP_STATUS_CODE, ERROR_EXTRA_DETAILS
from configure import API_AUTH_SECRET, API_AUTH_SECRET_HEADER_NAME
from submit_for_inference import DEFAULT_RESULT_IMAGE_NAME
from app import ERROR_EXTRA_DETAILS, HTTP_STATUS_CODE, RUNNING_OR_POST_PROCESSING, app
from configure import API_AUTH_SECRET, API_AUTH_SECRET_HEADER_NAME, get_azure_config
from download_model_and_run_scoring import DELETED_IMAGE_DATA_NOTIFICATION
from submit_for_inference import DEFAULT_RESULT_IMAGE_NAME, IMAGEDATA_FILE_NAME, SubmitForInferenceConfig, submit_for_inference
# Timeout, in seconds, for Azure runs, 20 minutes.
TIMEOUT_IN_SECONDS = 20 * 60
@@ -32,6 +32,8 @@ TEST_DATA_DIR: Path = THIS_DIR / "TestData"
# Test reference series.
TestDicomVolumeLocation: Path = TEST_DATA_DIR / "HN"
PASSTHROUGH_MODEL_ID = "PassThroughModel:4"
def assert_response_error_type(response: Response, status_code: HTTP_STATUS_CODE,
extra_details: Optional[ERROR_EXTRA_DETAILS] = None) -> None:
@@ -142,7 +144,7 @@ def test_model_start_authenticated_valid_model_id() -> None:
# Patch the method Experiment.submit to prevent the AzureML experiment actually running.
with mock.patch.object(Experiment, 'submit', return_value=run_mock):
with app.test_client() as client:
response = client.post("/v1/model/start/PassThroughModel:4",
response = client.post(f"/v1/model/start/{PASSTHROUGH_MODEL_ID}",
headers={API_AUTH_SECRET_HEADER_NAME: API_AUTH_SECRET})
assert response.status_code == HTTP_STATUS_CODE.CREATED.value
assert response.content_type == 'text/plain'
@@ -280,12 +282,12 @@ def submit_for_inference_and_wait(model_id: str, data: bytes) -> Any:
def test_submit_for_inference_end_to_end() -> None:
"""
Test that submitting a zipped DICOM series to model PassThroughModel:4 returns
Test that submitting a zipped DICOM series to model PASSTHROUGH_MODEL_ID returns
the expected DICOM-RT format.
"""
image_data = create_zipped_dicom_series()
assert len(image_data) > 0
response = submit_for_inference_and_wait("PassThroughModel:4", image_data)
response = submit_for_inference_and_wait(PASSTHROUGH_MODEL_ID, image_data)
assert response.content_type == 'application/zip'
assert response.status_code == HTTP_STATUS_CODE.OK.value
# Create a scratch directory
@@ -313,7 +315,7 @@ def test_submit_for_inference_end_to_end() -> None:
# Check the modality
assert ds.Modality == 'RTSTRUCT'
assert ds.Manufacturer == 'Default_Manufacturer'
assert ds.SoftwareVersions == 'PassThroughModel:4'
assert ds.SoftwareVersions == PASSTHROUGH_MODEL_ID
# Check the structure names
expected_structure_names = ["SpinalCord", "Lung_R", "Lung_L", "Heart", "Esophagus"]
assert len(ds.StructureSetROISequence) == len(expected_structure_names)
@@ -325,6 +327,7 @@ def test_submit_for_inference_end_to_end() -> None:
assert len(ds.ROIContourSequence) == len(expected_structure_names)
for i, item in enumerate(expected_structure_names):
assert ds.ROIContourSequence[i].ReferencedROINumber == i + 1
# Download image data zip, which should now have been overwritten
def test_submit_for_inference_bad_image_file() -> None:
@@ -335,6 +338,34 @@ def test_submit_for_inference_bad_image_file() -> None:
"""
# Get a random 1Kb
image_data = bytes([random.randint(0, 255) for _ in range(0, 1024)])
response = submit_for_inference_and_wait("PassThroughModel:4", image_data)
response = submit_for_inference_and_wait(PASSTHROUGH_MODEL_ID, image_data)
assert_response_error_type(response, HTTP_STATUS_CODE.BAD_REQUEST,
ERROR_EXTRA_DETAILS.INVALID_ZIP_FILE)
def test_submit_for_inference_image_data_deletion() -> None:
"""
Test that the image data zip is overwritten after the inference runs
"""
image_data = create_zipped_dicom_series()
azure_config = get_azure_config()
workspace = azure_config.get_workspace()
config = SubmitForInferenceConfig(
model_id=PASSTHROUGH_MODEL_ID,
image_data=image_data,
experiment_name=azure_config.experiment_name)
run_id, datastore_image_path = submit_for_inference(config, workspace, azure_config)
run = workspace.get_run(run_id)
run.wait_for_completion()
image_datastore = Datastore(workspace, azure_config.datastore_name)
with tempfile.TemporaryDirectory() as temp_dir:
image_datastore.download(
target_path=temp_dir,
prefix=datastore_image_path,
overwrite=False,
show_progress=False)
temp_dir_path = Path(temp_dir)
image_data_zip_path = temp_dir_path / datastore_image_path / IMAGEDATA_FILE_NAME
with image_data_zip_path.open() as image_data_file:
first_line = image_data_file.readline().strip()
assert first_line == DELETED_IMAGE_DATA_NOTIFICATION

5
app.py
View file

@@ -133,9 +133,8 @@ def start_model(model_id: str, workspace: Workspace, azure_config: AzureConfig)
try:
image_data: bytes = request.stream.read()
logging.info(f'Starting {model_id}')
config = SubmitForInferenceConfig(model_id=model_id, image_data=image_data,
experiment_name=azure_config.experiment_name)
run_id = submit_for_inference(config, workspace, azure_config)
config = SubmitForInferenceConfig(model_id=model_id, image_data=image_data, experiment_name=azure_config.experiment_name)
run_id, _ = submit_for_inference(config, workspace, azure_config)
response = make_response(run_id, HTTP_STATUS_CODE.CREATED.value)
response.headers.set('Content-Type', 'text/plain')
return response

View file

@@ -21,6 +21,8 @@ class AzureConfig:
cluster: str # The name of the GPU cluster inside the AzureML workspace, that should execute the job.
experiment_name: str
service_principal_secret: str
datastore_name: str # The name of the data store for temp image storage.
image_data_folder: str # The folder name in the data store for temp image storage.
_workspace: Optional[Workspace] = None # The cached workspace object
@staticmethod

View file

@@ -11,3 +11,5 @@ RESOURCE_GROUP = "RESOURCE_GROUP"
SUBSCRIPTION_ID = "SUBSCRIPTION_ID"
APPLICATION_ID = "APPLICATION_ID"
TENANT_ID = "TENANT_ID"
DATASTORE_NAME = "DATASTORE_NAME"
IMAGE_DATA_FOLDER = "IMAGE_DATA_FOLDER"

View file

@@ -7,8 +7,10 @@ from azureml.core import Workspace
from injector import singleton, Binder
from azure_config import AzureConfig
from configuration_constants import API_AUTH_SECRET_ENVIRONMENT_VARIABLE, CLUSTER, WORKSPACE_NAME, EXPERIMENT_NAME, \
RESOURCE_GROUP, SUBSCRIPTION_ID, APPLICATION_ID, TENANT_ID, AZUREML_SERVICE_PRINCIPAL_SECRET_ENVIRONMENT_VARIABLE
from configuration_constants import (API_AUTH_SECRET_ENVIRONMENT_VARIABLE, CLUSTER, WORKSPACE_NAME,
EXPERIMENT_NAME, RESOURCE_GROUP, SUBSCRIPTION_ID,
APPLICATION_ID, TENANT_ID, IMAGE_DATA_FOLDER, DATASTORE_NAME,
AZUREML_SERVICE_PRINCIPAL_SECRET_ENVIRONMENT_VARIABLE)
PROJECT_SECRETS_FILE = Path(__file__).resolve().parent / Path("set_environment.sh")
@@ -62,4 +64,6 @@ def get_azure_config() -> AzureConfig:
application_id=get_environment_variable(APPLICATION_ID),
service_principal_secret=get_environment_variable(
AZUREML_SERVICE_PRINCIPAL_SECRET_ENVIRONMENT_VARIABLE),
tenant_id=get_environment_variable(TENANT_ID))
tenant_id=get_environment_variable(TENANT_ID),
datastore_name=get_environment_variable(DATASTORE_NAME),
image_data_folder=get_environment_variable(IMAGE_DATA_FOLDER))
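For context, a minimal usage sketch of the updated configuration (an assumption-laden example, not part of this diff): it supposes `get_environment_variable` reads from the process environment as `set_environment.sh` suggests, uses the non-secret defaults hard-coded in main.yml above, and requires the remaining variables listed in set_environment.sh to be set as well.

```python
# Hedged usage sketch of get_azure_config with the two new variables.
# Values are the non-secret defaults from main.yml; the secrets and the other
# variables listed in set_environment.sh are assumed to be set already.
import os

from azureml.core import Datastore

from configure import get_azure_config

os.environ["DATASTORE_NAME"] = "inferencetestimagestore"
os.environ["IMAGE_DATA_FOLDER"] = "temp-image-store"

azure_config = get_azure_config()
workspace = azure_config.get_workspace()
# The named datastore used for temporary image storage during inference.
image_datastore = Datastore(workspace, azure_config.datastore_name)
print(f"Temporary images go to '{azure_config.image_data_folder}' in '{image_datastore.name}'")
```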

View file

@@ -8,9 +8,12 @@ import os
import subprocess
import sys
from pathlib import Path
from typing import Dict, List, Tuple
from typing import Dict, List, Tuple, Any
from azureml.core import Model, Run
from azureml.core import Model, Run, Datastore
DELETED_IMAGE_DATA_NOTIFICATION = "image data deleted"
def spawn_and_monitor_subprocess(process: str, args: List[str], env: Dict[str, str]) -> Tuple[int, List[str]]:
@@ -19,8 +22,7 @@ def spawn_and_monitor_subprocess(process: str, args: List[str], env: Dict[str, s
:param process: The name or path of the process to spawn.
:param args: The args to the process.
:param env: The environment variables for the process (default is the environment variables of the parent).
:return: Return code after the process has finished, and the list of lines that were written to stdout by the
subprocess.
:return: Return code after the process has finished, and the list of lines that were written to stdout by the subprocess.
"""
p = subprocess.Popen(
[process] + args,
@@ -41,47 +43,76 @@ def spawn_and_monitor_subprocess(process: str, args: List[str], env: Dict[str, s
def run() -> None:
"""
Downloads a model from AzureML, and starts the score script (usually score.py) in the root folder of the model.
Downloading the model is only supported if the present code is running inside of AzureML. When running outside
of AzureML, the model must have been downloaded beforehand into the folder given by the model-folder argument.
The script is executed with the current Python interpreter.
If the model requires a specific Conda environment to run in, the caller of this script needs to ensure
that this has been set up correctly (taking the environment.yml file stored in the model).
All arguments that are not recognized by the present code will be passed through to `score.py` unmodified.
Example arguments:
download_model_and_run_scoring.py --model-id=Foo:1 score.py --foo=1 --bar
This would attempt to download version 1 of model Foo, and then start the script score.py in the model's root
folder. Arguments --foo and --bar are passed through to score.py
This script is run in an AzureML experiment which was submitted by submit_for_inference.
It downloads a model from AzureML, and starts the score script (usually score.py) which is in the root
folder of the model. The image data zip is downloaded from the AzureML datastore where it was copied
by submit_for_inference. Once scoring is completed the image data zip is overwritten with some simple
text in lieu of there being a delete method in the AzureML datastore API. This ensures that the run does
not retain images.
"""
parser = argparse.ArgumentParser(description='Execute code inside of an AzureML model')
# Use argument names with dashes here. The rest of the codebase uses _ as the separator, meaning that there
# can't be a clash of names with arguments that are passed through to score.py
parser.add_argument('--model-folder', dest='model_folder', action='store', type=str)
parser.add_argument('--model-id', dest='model_id', action='store', type=str)
known_args, unknown_args = parser.parse_known_args()
model_folder = known_args.model_folder or "."
if known_args.model_id:
current_run = Run.get_context()
if not hasattr(current_run, 'experiment'):
raise ValueError("The model-id argument can only be used inside AzureML. Please drop the argument, and "
"supply the downloaded model in the model-folder.")
workspace = current_run.experiment.workspace
model = Model(workspace=workspace, id=known_args.model_id)
# Download the model from AzureML into a sub-folder of model_folder
model_folder = str(Path(model.download(model_folder)).absolute())
parser.add_argument('--model_id', dest='model_id', action='store', type=str, required=True,
help='AzureML model ID')
parser.add_argument('--script_name', dest='script_name', action='store', type=str, required=True,
help='Name of the script in the model that will produce the image scores')
parser.add_argument('--datastore_name', dest='datastore_name', action='store', type=str, required=True,
help='Name of the datastore where the image data zip has been copied')
parser.add_argument('--datastore_image_path', dest='datastore_image_path', action='store', type=str, required=True,
help='Path to the image data zip copied to the datastore')
known_args, _ = parser.parse_known_args()
current_run = Run.get_context()
if not hasattr(current_run, 'experiment'):
raise ValueError("This script must run in an AzureML experiment")
workspace = current_run.experiment.workspace
model = Model(workspace=workspace, id=known_args.model_id)
# Download the model from AzureML
here = Path.cwd().absolute()
model_path = Path(model.download(here)).absolute()
# Download the image data zip from the named datastore where it was copied by submit_for_inference
# We copy it to a data store, rather than using the AzureML experiment's snapshot, so that we can
# overwrite it after the inference and thus not retain image data.
image_datastore = Datastore(workspace, known_args.datastore_name)
prefix = str(Path(known_args.datastore_image_path).parent)
image_datastore.download(target_path=here, prefix=prefix, overwrite=False, show_progress=False)
downloaded_image_path = here / known_args.datastore_image_path
env = dict(os.environ.items())
# Work around https://github.com/pytorch/pytorch/issues/37377
env['MKL_SERVICE_FORCE_INTEL'] = '1'
# The model should include all necessary code, hence point the Python path to its root folder.
env['PYTHONPATH'] = model_folder
if not unknown_args:
raise ValueError("No arguments specified for starting the scoring script.")
score_script = Path(model_folder) / unknown_args[0]
score_args = [str(score_script), *unknown_args[1:]]
env['PYTHONPATH'] = str(model_path)
score_script = model_path / known_args.script_name
score_args = [
str(score_script),
'--data_folder', str(here / Path(known_args.datastore_image_path).parent),
'--image_files', str(downloaded_image_path),
'--model_id', known_args.model_id,
'--use_dicom', 'True']
if not score_script.exists():
raise ValueError(f"The specified entry script {score_args[0]} does not exist in {model_folder}")
print(f"Starting Python with these arguments: {' '.join(score_args)}")
code, stdout = spawn_and_monitor_subprocess(process=sys.executable, args=score_args, env=env)
raise ValueError(
f"The specified entry script {known_args.script_name} does not exist in {model_path}")
print(f"Starting Python with these arguments: {score_args}")
try:
code, stdout = spawn_and_monitor_subprocess(process=sys.executable, args=score_args, env=env)
finally:
# Delete image data zip locally
downloaded_image_path.unlink()
# Overwrite image data zip in datastore. The datastore API does not (yet) include deletion
# and so we overwrite the image data zip with a short piece of text instead. Later these
# overwritten image data zip files can be erased; we recommend using a blobstore lifecycle
# management policy to delete them after a period of time, e.g. seven days.
downloaded_image_path.write_text(DELETED_IMAGE_DATA_NOTIFICATION)
image_datastore.upload_files(files=[str(downloaded_image_path)], target_path=prefix, overwrite=True, show_progress=False)
# Delete the overwritten image data zip locally
downloaded_image_path.unlink()
if code != 0:
print(f"Python terminated with exit code {code}. Stdout: {os.linesep.join(stdout)}")
sys.exit(code)
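The overwrite-and-upload step above exists because the AzureML Datastore API has no delete call. Purely as a hedged alternative sketch (not part of this diff): with direct credentials for the storage account behind the datastore, the blob could be deleted outright using the azure-storage-blob SDK; the connection string and container name here are assumptions.

```python
# Hedged alternative sketch: delete the copied image data zip straight from blob storage,
# bypassing the Datastore API. This needs storage-account credentials that the AzureML
# run does not normally hold; connection string and container name are assumptions.
from azure.storage.blob import BlobServiceClient


def delete_image_blob(connection_string: str, container_name: str, blob_path: str) -> None:
    """Delete a blob such as 'temp-image-store/<uuid>/imagedata.zip' from the container."""
    service_client = BlobServiceClient.from_connection_string(connection_string)
    blob_client = service_client.get_blob_client(container=container_name, blob=blob_path)
    blob_client.delete_blob()
```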

View file

@@ -1,6 +1,6 @@
from dataclasses import dataclass, field
from pathlib import Path
from typing import Optional, Dict, List
from typing import List
@dataclass

View file

@@ -7,10 +7,12 @@ import logging
import os
import shutil
import tempfile
import uuid
from pathlib import Path
from typing import Tuple
from attr import dataclass
from azureml.core import Experiment, Model, ScriptRunConfig, Environment
from azureml.core import Experiment, Model, ScriptRunConfig, Environment, Datastore
from azureml.core.runconfig import RunConfiguration
from azureml.core.workspace import WORKSPACE_DEFAULT_BLOB_STORE_NAME, Workspace
@@ -24,6 +26,8 @@ SCORE_SCRIPT = "score.py"
RUN_SCORING_SCRIPT = "download_model_and_run_scoring.py"
# The property in the model registry that holds the name of the Python environment
PYTHON_ENVIRONMENT_NAME = "python_environment_name"
IMAGEDATA_FILE_NAME = "imagedata.zip"
@dataclass
class SubmitForInferenceConfig:
@@ -34,6 +38,7 @@ class SubmitForInferenceConfig:
image_data: bytes
experiment_name: str
def create_run_config(azure_config: AzureConfig,
source_config: SourceConfig,
environment_name: str) -> ScriptRunConfig:
@@ -48,14 +53,11 @@ def create_run_config(azure_config: AzureConfig,
"""
# AzureML seems to sometimes expect the entry script path in Linux format, hence convert to posix path
entry_script_relative_path = source_config.entry_script.relative_to(source_config.root_folder).as_posix()
logging.info(f"Entry script {entry_script_relative_path} ({source_config.entry_script} relative to "
f"source directory {source_config.root_folder})")
logging.info(f"Entry script {entry_script_relative_path} ({source_config.entry_script} "
f"relative to source directory {source_config.root_folder})")
max_run_duration = None
workspace = azure_config.get_workspace()
run_config = RunConfiguration(
script=entry_script_relative_path,
arguments=source_config.script_params,
)
run_config = RunConfiguration(script=entry_script_relative_path, arguments=source_config.script_params)
env = Environment.get(azure_config.get_workspace(), name=environment_name, version=ENVIRONMENT_VERSION)
logging.info(f"Using existing Python environment '{env.name}'.")
run_config.environment = env
@@ -63,58 +65,58 @@ def create_run_config(azure_config: AzureConfig,
run_config.max_run_duration_seconds = max_run_duration
# Use blob storage for storing the source, rather than the FileShares section of the storage account.
run_config.source_directory_data_store = workspace.datastores.get(WORKSPACE_DEFAULT_BLOB_STORE_NAME).name
script_run_config = ScriptRunConfig(
source_directory=str(source_config.root_folder),
run_config=run_config,
)
script_run_config = ScriptRunConfig(source_directory=str(source_config.root_folder), run_config=run_config)
return script_run_config
def submit_for_inference(args: SubmitForInferenceConfig, workspace: Workspace, azure_config: AzureConfig) -> str:
def submit_for_inference(args: SubmitForInferenceConfig, workspace: Workspace, azure_config: AzureConfig) -> Tuple[str, str]:
"""
Create and submit an inference to AzureML, and optionally download the resulting segmentation.
:param args: configuration, see SubmitForInferenceConfig
:param workspace: Azure ML workspace.
:param azure_config: An object with all necessary information for accessing Azure.
:return: Azure Run Id.
:return: Azure Run Id (and the target path on the datastore, including the uuid, for a unit
test to ensure that the image data zip is overwritten after inference)
"""
logging.info("Identifying model")
model = Model(workspace=workspace, id=args.model_id)
model_id = model.id
logging.info(f"Identified model {model_id}")
source_directory = tempfile.TemporaryDirectory()
source_directory_path = Path(source_directory.name)
logging.info(f"Building inference run submission in {source_directory_path}")
image_folder = source_directory_path / DEFAULT_DATA_FOLDER
image_folder.mkdir(parents=True, exist_ok=True)
image_path = image_folder / "imagedata.zip"
image_path = image_folder / IMAGEDATA_FILE_NAME
image_path.write_bytes(args.image_data)
# Retrieve the name of the Python environment that the training run used. This environment should have been
# registered. If no such environment exists, it will be re-create from the Conda files provided.
image_datastore = Datastore(workspace, azure_config.datastore_name)
target_path = f"{azure_config.image_data_folder}/{str(uuid.uuid4())}"
image_datastore.upload_files(files=[str(image_path)], target_path=target_path, overwrite=False, show_progress=False)
image_path.unlink()
# Retrieve the name of the Python environment that the training run used. This environment
# should have been registered. If no such environment exists, it will be re-created from the
# Conda files provided.
python_environment_name = model.tags.get(PYTHON_ENVIRONMENT_NAME, "")
if python_environment_name == "":
raise ValueError(f"Model ID: {model_id} does not contain an environment tag {PYTHON_ENVIRONMENT_NAME}")
raise ValueError(
f"Model ID: {model_id} does not contain an environment tag {PYTHON_ENVIRONMENT_NAME}")
# Copy the scoring script from the repository. This will start the model download from Azure, and invoke the
# scoring script.
# Copy the scoring script from the repository. This will start the model download from Azure,
# and invoke the scoring script.
entry_script = source_directory_path / Path(RUN_SCORING_SCRIPT).name
current_file_path = Path(os.path.dirname(os.path.realpath(__file__)))
shutil.copyfile(current_file_path / str(RUN_SCORING_SCRIPT),
str(entry_script))
shutil.copyfile(current_file_path / str(RUN_SCORING_SCRIPT), str(entry_script))
source_config = SourceConfig(
root_folder=source_directory_path,
entry_script=entry_script,
script_params=["--model-folder", ".",
"--model-id", model_id,
SCORE_SCRIPT,
# The data folder must be relative to the root folder of the AzureML job. test_image_files
# is then just the file relative to the data_folder
"--data_folder", image_path.parent.name,
"--image_files", image_path.name,
"--use_dicom", "True",
"--model_id", model_id],
)
script_params=["--model_id", model_id,
"--script_name", SCORE_SCRIPT,
"--datastore_name", azure_config.datastore_name,
"--datastore_image_path", str(Path(target_path) / IMAGEDATA_FILE_NAME)])
run_config = create_run_config(azure_config, source_config, environment_name=python_environment_name)
exp = Experiment(workspace=workspace, name=args.experiment_name)
run = exp.submit(run_config)
@@ -122,4 +124,4 @@ def submit_for_inference(args: SubmitForInferenceConfig, workspace: Workspace, a
logging.info(f"Run URL: {run.get_portal_url()}")
source_directory.cleanup()
logging.info(f"Deleted submission directory {source_directory_path}")
return run.id
return run.id, target_path