Pushing image to datastore and deleting after inference (#4)
* Pushing image to datastore and immediately deleting 'local' copy (from the snapshot)
* Deleting image file after bootstrap script runs. Not tested yet.
* two spurious newlines
* mypy spotted missing colon
* better defaults
* Fixing two mypy errors
* Test prints. Using AzureML Studio to work out if the snapshot is affected by the deletion
* Saves image to datastore
* Including GUID in datastore path
* WiP downloading images from datastore in bootstrap script
* Overwriting image data zip
* tidy up
* Returning GUID and tidy-up
* Added unit test of image data zip overwrite
* Fixing API call to submit with ignored return
* Swapping temp image folder name to config
* Fixing documentation
* Plumbing for datastore name
* Swapping from default to named datastore
* Fixing main.yml build env
* Decreasing indentation by breaking loop https://github.com/microsoft/InnerEye-Inference/pull/4#discussion_r617584432
* pylint not used and commented out line deleted
* Fixing flake8 warnings
* Swapping imports back to single line. After Anton's query on the PR I checked the .flake8 file and our maximum line length is 160, so I do not need to break up these lines.
* Hard coding not-secret settings in workflow. As per Anton's request: https://github.com/microsoft/InnerEye-Inference/pull/4#discussion_r618144366
* /= consistency
* removing 2 debug print statements
* removed brackets
* swapping to run.wait_for_completion
* swapping to write_text
* removing unnecessary initialization
* Rationalising 'run' method: paths, arguments, and comments
* removing debug print lines
* Reverting line lengths. I had assumed a max line length of 100, but it is 160 in the .flake8 configuration file, so I have reverted the changes I had made to not-too-long lines and fixed a few others for legibility.
* Swapping to required=True for params
* Typo and unnecessary Path() spotted by Jonathan
* Changing parameters to underscores from dashes, to bring them into line with the rest of the InnerEye projects. They needed to be different when the unknown args were passed straight on to score.py, but that is not how the arguments flow through anymore.
* Line length fix in comment to kick off license/cla. The 'license/cla' check in the build pipeline has stalled and there is no 'retry' button, so I am hoping to kick it into action again with an almost vacuous commit!
This commit is contained in:
Parent
4f9d1fad61
Commit
d487fb93c9
|
@ -32,13 +32,15 @@ jobs:
|
|||
env:
|
||||
CUSTOMCONNSTR_AZUREML_SERVICE_PRINCIPAL_SECRET: ${{ secrets.CUSTOMCONNSTR_AZUREML_SERVICE_PRINCIPAL_SECRET }}
|
||||
CUSTOMCONNSTR_API_AUTH_SECRET: ${{ secrets.CUSTOMCONNSTR_API_AUTH_SECRET }}
|
||||
CLUSTER: ${{ secrets.CLUSTER }}
|
||||
WORKSPACE_NAME: ${{ secrets.WORKSPACE_NAME }}
|
||||
EXPERIMENT_NAME: ${{ secrets.EXPERIMENT_NAME }}
|
||||
RESOURCE_GROUP: ${{ secrets.RESOURCE_GROUP }}
|
||||
CLUSTER: "training-nc12"
|
||||
WORKSPACE_NAME: "InnerEye-DeepLearning"
|
||||
EXPERIMENT_NAME: "api_inference"
|
||||
RESOURCE_GROUP: "InnerEye-DeepLearning"
|
||||
SUBSCRIPTION_ID: ${{ secrets.SUBSCRIPTION_ID }}
|
||||
APPLICATION_ID: ${{ secrets.APPLICATION_ID }}
|
||||
TENANT_ID: ${{ secrets.TENANT_ID }}
|
||||
DATASTORE_NAME: "inferencetestimagestore"
|
||||
IMAGE_DATA_FOLDER: "temp-image-store"
|
||||
run: |
|
||||
conda activate inference
|
||||
pytest --cov=./ --cov-report=html
|
||||
|
|
|
@ -38,6 +38,8 @@ export RESOURCE_GROUP=
|
|||
export SUBSCRIPTION_ID=
|
||||
export APPLICATION_ID=
|
||||
export TENANT_ID=
|
||||
export DATASTORE_NAME=
|
||||
export IMAGE_DATA_FOLDER=
|
||||
```
|
||||
|
||||
Run with `source set_environment.sh`
|
||||
|
@ -59,6 +61,10 @@ If you would like to reproduce the automatic deployment of the service for testi
|
|||
* `az ad sp create-for-rbac --name "<name>" --role contributor --scope /subscriptions/<subs>/resourceGroups/InnerEyeInference --sdk-auth`
|
||||
* The previous command will return a JSON object with the content for the variable `secrets.AZURE_CREDENTIALS` used in `.github/workflows/deploy.yml`
|
||||
|
||||
## Images
|
||||
|
||||
During inference the image data zip file is copied to the IMAGE_DATA_FOLDER in the AzureML workspace's DATASTORE_NAME datastore. At the end of inference the copied image data zip file is overwritten with a single line of text. At present the overwritten files cannot be deleted through the AzureML datastore API. To have them removed from your datastore, you can [add a lifecycle management policy](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-lifecycle-management-concepts?tabs=azure-portal) that deletes items from the datastore after a period of time. We recommend 7 days.
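As a rough illustration of such a policy, the sketch below builds a lifecycle management rule as a Python dictionary and prints it as JSON. The rule name `delete-overwritten-image-zips` and the container name `azureml-blobstore` are hypothetical placeholders, not values from this repository; the folder matches the IMAGE_DATA_FOLDER value used in the workflow. The field names follow the Azure Storage lifecycle policy schema to the best of our knowledge, so please check them against the linked documentation before applying anything.

```python
import json
from typing import Any, Dict


def build_lifecycle_policy(container: str, folder: str, days: int = 7) -> Dict[str, Any]:
    """Build a lifecycle rule that deletes blobs under container/folder after `days` days."""
    return {
        "rules": [
            {
                "enabled": True,
                "name": "delete-overwritten-image-zips",  # hypothetical rule name
                "type": "Lifecycle",
                "definition": {
                    "filters": {
                        "blobTypes": ["blockBlob"],
                        # The prefix starts with the container name.
                        "prefixMatch": [f"{container}/{folder}"],
                    },
                    "actions": {
                        "baseBlob": {
                            "delete": {"daysAfterModificationGreaterThan": days}
                        }
                    },
                },
            }
        ]
    }


if __name__ == "__main__":
    # "azureml-blobstore" is a placeholder container name, not taken from this repository.
    policy = build_lifecycle_policy("azureml-blobstore", "temp-image-store")
    print(json.dumps(policy, indent=2))
```

The printed JSON could then be pasted into the storage account's lifecycle management settings; the 7-day default mirrors the recommendation above.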
|
||||
|
||||
## Help and Bug Reporting
|
||||
|
||||
1. [Guidelines for how to report bug.](./docs/BugReporting.md)
|
||||
|
|
|
@ -3,24 +3,24 @@
|
|||
# Licensed under the MIT License (MIT). See LICENSE in the repo root for license information.
|
||||
# ------------------------------------------------------------------------------------------
|
||||
|
||||
from pathlib import Path
|
||||
import random
|
||||
import shutil
|
||||
import tempfile
|
||||
import time
|
||||
import zipfile
|
||||
from pathlib import Path
|
||||
from typing import Any, Optional
|
||||
from unittest import mock
|
||||
import zipfile
|
||||
|
||||
from azureml._restclient.constants import RunStatus
|
||||
from azureml.core import Experiment, Model, Workspace
|
||||
from azureml.core import Experiment, Model, Workspace, Datastore
|
||||
from azureml.exceptions import WebserviceException
|
||||
from flask import Response
|
||||
from pydicom import dcmread
|
||||
|
||||
from app import app, HTTP_STATUS_CODE, ERROR_EXTRA_DETAILS
|
||||
from configure import API_AUTH_SECRET, API_AUTH_SECRET_HEADER_NAME
|
||||
from submit_for_inference import DEFAULT_RESULT_IMAGE_NAME
|
||||
from app import ERROR_EXTRA_DETAILS, HTTP_STATUS_CODE, RUNNING_OR_POST_PROCESSING, app
|
||||
from configure import API_AUTH_SECRET, API_AUTH_SECRET_HEADER_NAME, get_azure_config
|
||||
from download_model_and_run_scoring import DELETED_IMAGE_DATA_NOTIFICATION
|
||||
from submit_for_inference import DEFAULT_RESULT_IMAGE_NAME, IMAGEDATA_FILE_NAME, SubmitForInferenceConfig, submit_for_inference
|
||||
|
||||
# Timeout, in seconds, for Azure runs, 20 minutes.
|
||||
TIMEOUT_IN_SECONDS = 20 * 60
|
||||
|
@ -32,6 +32,8 @@ TEST_DATA_DIR: Path = THIS_DIR / "TestData"
|
|||
# Test reference series.
|
||||
TestDicomVolumeLocation: Path = TEST_DATA_DIR / "HN"
|
||||
|
||||
PASSTHROUGH_MODEL_ID = "PassThroughModel:4"
|
||||
|
||||
|
||||
def assert_response_error_type(response: Response, status_code: HTTP_STATUS_CODE,
|
||||
extra_details: Optional[ERROR_EXTRA_DETAILS] = None) -> None:
|
||||
|
@ -142,7 +144,7 @@ def test_model_start_authenticated_valid_model_id() -> None:
|
|||
# Patch the method Experiment.submit to prevent the AzureML experiment actually running.
|
||||
with mock.patch.object(Experiment, 'submit', return_value=run_mock):
|
||||
with app.test_client() as client:
|
||||
response = client.post("/v1/model/start/PassThroughModel:4",
|
||||
response = client.post(f"/v1/model/start/{PASSTHROUGH_MODEL_ID}",
|
||||
headers={API_AUTH_SECRET_HEADER_NAME: API_AUTH_SECRET})
|
||||
assert response.status_code == HTTP_STATUS_CODE.CREATED.value
|
||||
assert response.content_type == 'text/plain'
|
||||
|
@ -280,12 +282,12 @@ def submit_for_inference_and_wait(model_id: str, data: bytes) -> Any:
|
|||
|
||||
def test_submit_for_inference_end_to_end() -> None:
|
||||
"""
|
||||
Test that submitting a zipped DICOM series to model PassThroughModel:4 returns
|
||||
Test that submitting a zipped DICOM series to model PASSTHROUGH_MODEL_ID returns
|
||||
the expected DICOM-RT format.
|
||||
"""
|
||||
image_data = create_zipped_dicom_series()
|
||||
assert len(image_data) > 0
|
||||
response = submit_for_inference_and_wait("PassThroughModel:4", image_data)
|
||||
response = submit_for_inference_and_wait(PASSTHROUGH_MODEL_ID, image_data)
|
||||
assert response.content_type == 'application/zip'
|
||||
assert response.status_code == HTTP_STATUS_CODE.OK.value
|
||||
# Create a scratch directory
|
||||
|
@ -313,7 +315,7 @@ def test_submit_for_inference_end_to_end() -> None:
|
|||
# Check the modality
|
||||
assert ds.Modality == 'RTSTRUCT'
|
||||
assert ds.Manufacturer == 'Default_Manufacturer'
|
||||
assert ds.SoftwareVersions == 'PassThroughModel:4'
|
||||
assert ds.SoftwareVersions == PASSTHROUGH_MODEL_ID
|
||||
# Check the structure names
|
||||
expected_structure_names = ["SpinalCord", "Lung_R", "Lung_L", "Heart", "Esophagus"]
|
||||
assert len(ds.StructureSetROISequence) == len(expected_structure_names)
|
||||
|
@ -325,6 +327,7 @@ def test_submit_for_inference_end_to_end() -> None:
|
|||
assert len(ds.ROIContourSequence) == len(expected_structure_names)
|
||||
for i, item in enumerate(expected_structure_names):
|
||||
assert ds.ROIContourSequence[i].ReferencedROINumber == i + 1
|
||||
# Download image data zip, which should now have been overwritten
|
||||
|
||||
|
||||
def test_submit_for_inference_bad_image_file() -> None:
|
||||
|
@ -335,6 +338,34 @@ def test_submit_for_inference_bad_image_file() -> None:
|
|||
"""
|
||||
# Get a random 1Kb
|
||||
image_data = bytes([random.randint(0, 255) for _ in range(0, 1024)])
|
||||
response = submit_for_inference_and_wait("PassThroughModel:4", image_data)
|
||||
response = submit_for_inference_and_wait(PASSTHROUGH_MODEL_ID, image_data)
|
||||
assert_response_error_type(response, HTTP_STATUS_CODE.BAD_REQUEST,
|
||||
ERROR_EXTRA_DETAILS.INVALID_ZIP_FILE)
|
||||
|
||||
|
||||
def test_submit_for_inference_image_data_deletion() -> None:
|
||||
"""
|
||||
Test that the image data zip is overwritten after the inference runs
|
||||
"""
|
||||
image_data = create_zipped_dicom_series()
|
||||
azure_config = get_azure_config()
|
||||
workspace = azure_config.get_workspace()
|
||||
config = SubmitForInferenceConfig(
|
||||
model_id=PASSTHROUGH_MODEL_ID,
|
||||
image_data=image_data,
|
||||
experiment_name=azure_config.experiment_name)
|
||||
run_id, datastore_image_path = submit_for_inference(config, workspace, azure_config)
|
||||
run = workspace.get_run(run_id)
|
||||
run.wait_for_completion()
|
||||
image_datastore = Datastore(workspace, azure_config.datastore_name)
|
||||
with tempfile.TemporaryDirectory() as temp_dir:
|
||||
image_datastore.download(
|
||||
target_path=temp_dir,
|
||||
prefix=datastore_image_path,
|
||||
overwrite=False,
|
||||
show_progress=False)
|
||||
temp_dir_path = Path(temp_dir)
|
||||
image_data_zip_path = temp_dir_path / datastore_image_path / IMAGEDATA_FILE_NAME
|
||||
with image_data_zip_path.open() as image_data_file:
|
||||
first_line = image_data_file.readline().strip()
|
||||
assert first_line == DELETED_IMAGE_DATA_NOTIFICATION
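The check made by this test can be tried out without an AzureML workspace. The following is a minimal local stand-in, assuming the overwrite is a plain text write over the zip file; the helper names `first_line_of` and `simulate_overwrite_and_check` are hypothetical, and the two constants are copied here only so that the sketch is self-contained.

```python
import tempfile
from pathlib import Path

# Copied from the diff so that the sketch runs on its own.
DELETED_IMAGE_DATA_NOTIFICATION = "image data deleted"
IMAGEDATA_FILE_NAME = "imagedata.zip"


def first_line_of(path: Path) -> str:
    """Return the first line of a text file, stripped of surrounding whitespace."""
    with path.open() as text_file:
        return text_file.readline().strip()


def simulate_overwrite_and_check() -> bool:
    """Write fake image bytes, overwrite them with the notification, then verify."""
    with tempfile.TemporaryDirectory() as temp_dir:
        image_path = Path(temp_dir) / IMAGEDATA_FILE_NAME
        # Stand-in for real image data; this is not a valid zip archive.
        image_path.write_bytes(b"PK\x03\x04 not really a zip")
        # write_text truncates the file, so none of the original bytes survive.
        image_path.write_text(DELETED_IMAGE_DATA_NOTIFICATION)
        return first_line_of(image_path) == DELETED_IMAGE_DATA_NOTIFICATION
```

The real test differs in that the overwritten file is first downloaded from the named datastore into the temporary directory.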
|
||||
|
|
5
app.py
|
@ -133,9 +133,8 @@ def start_model(model_id: str, workspace: Workspace, azure_config: AzureConfig)
|
|||
try:
|
||||
image_data: bytes = request.stream.read()
|
||||
logging.info(f'Starting {model_id}')
|
||||
config = SubmitForInferenceConfig(model_id=model_id, image_data=image_data,
|
||||
experiment_name=azure_config.experiment_name)
|
||||
run_id = submit_for_inference(config, workspace, azure_config)
|
||||
config = SubmitForInferenceConfig(model_id=model_id, image_data=image_data, experiment_name=azure_config.experiment_name)
|
||||
run_id, _ = submit_for_inference(config, workspace, azure_config)
|
||||
response = make_response(run_id, HTTP_STATUS_CODE.CREATED.value)
|
||||
response.headers.set('Content-Type', 'text/plain')
|
||||
return response
|
||||
|
|
|
@ -21,6 +21,8 @@ class AzureConfig:
|
|||
cluster: str # The name of the GPU cluster inside the AzureML workspace, that should execute the job.
|
||||
experiment_name: str
|
||||
service_principal_secret: str
|
||||
datastore_name: str # The name of the datastore used for temp image storage.
|
||||
image_data_folder: str # The folder name in the data store for temp image storage.
|
||||
_workspace: Optional[Workspace] = None # The cached workspace object
|
||||
|
||||
@staticmethod
|
||||
|
|
|
@ -11,3 +11,5 @@ RESOURCE_GROUP = "RESOURCE_GROUP"
|
|||
SUBSCRIPTION_ID = "SUBSCRIPTION_ID"
|
||||
APPLICATION_ID = "APPLICATION_ID"
|
||||
TENANT_ID = "TENANT_ID"
|
||||
DATASTORE_NAME = "DATASTORE_NAME"
|
||||
IMAGE_DATA_FOLDER = "IMAGE_DATA_FOLDER"
|
||||
|
|
10
configure.py
|
@ -7,8 +7,10 @@ from azureml.core import Workspace
|
|||
from injector import singleton, Binder
|
||||
|
||||
from azure_config import AzureConfig
|
||||
from configuration_constants import API_AUTH_SECRET_ENVIRONMENT_VARIABLE, CLUSTER, WORKSPACE_NAME, EXPERIMENT_NAME, \
|
||||
RESOURCE_GROUP, SUBSCRIPTION_ID, APPLICATION_ID, TENANT_ID, AZUREML_SERVICE_PRINCIPAL_SECRET_ENVIRONMENT_VARIABLE
|
||||
from configuration_constants import (API_AUTH_SECRET_ENVIRONMENT_VARIABLE, CLUSTER, WORKSPACE_NAME,
|
||||
EXPERIMENT_NAME, RESOURCE_GROUP, SUBSCRIPTION_ID,
|
||||
APPLICATION_ID, TENANT_ID, IMAGE_DATA_FOLDER, DATASTORE_NAME,
|
||||
AZUREML_SERVICE_PRINCIPAL_SECRET_ENVIRONMENT_VARIABLE)
|
||||
|
||||
PROJECT_SECRETS_FILE = Path(__file__).resolve().parent / Path("set_environment.sh")
|
||||
|
||||
|
@ -62,4 +64,6 @@ def get_azure_config() -> AzureConfig:
|
|||
application_id=get_environment_variable(APPLICATION_ID),
|
||||
service_principal_secret=get_environment_variable(
|
||||
AZUREML_SERVICE_PRINCIPAL_SECRET_ENVIRONMENT_VARIABLE),
|
||||
tenant_id=get_environment_variable(TENANT_ID))
|
||||
tenant_id=get_environment_variable(TENANT_ID),
|
||||
datastore_name=get_environment_variable(DATASTORE_NAME),
|
||||
image_data_folder=get_environment_variable(IMAGE_DATA_FOLDER))
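The two new settings are read from environment variables like the existing ones. The body of `get_environment_variable` is not shown in this diff, so the sketch below is only an assumption about how required settings could be validated; `read_required_settings` is a hypothetical helper, not part of the repository.

```python
import os
from typing import Dict, Mapping, Optional

# The two environment variables introduced by this change.
REQUIRED_SETTINGS = ("DATASTORE_NAME", "IMAGE_DATA_FOLDER")


def read_required_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return the required settings, raising ValueError if any is missing or empty."""
    source = os.environ if env is None else env
    missing = [name for name in REQUIRED_SETTINGS if not source.get(name)]
    if missing:
        raise ValueError(f"Missing environment variables: {', '.join(missing)}")
    return {name: source[name] for name in REQUIRED_SETTINGS}
```

Passing a mapping explicitly, rather than always reading `os.environ`, keeps the helper easy to exercise in a unit test.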
|
||||
|
|
|
@ -8,9 +8,12 @@ import os
|
|||
import subprocess
|
||||
import sys
|
||||
from pathlib import Path
|
||||
from typing import Dict, List, Tuple
|
||||
from typing import Dict, List, Tuple, Any
|
||||
|
||||
from azureml.core import Model, Run
|
||||
from azureml.core import Model, Run, Datastore
|
||||
|
||||
|
||||
DELETED_IMAGE_DATA_NOTIFICATION = "image data deleted"
|
||||
|
||||
|
||||
def spawn_and_monitor_subprocess(process: str, args: List[str], env: Dict[str, str]) -> Tuple[int, List[str]]:
|
||||
|
@ -19,8 +22,7 @@ def spawn_and_monitor_subprocess(process: str, args: List[str], env: Dict[str, s
|
|||
:param process: The name or path of the process to spawn.
|
||||
:param args: The args to the process.
|
||||
:param env: The environment variables for the process (default is the environment variables of the parent).
|
||||
:return: Return code after the process has finished, and the list of lines that were written to stdout by the
|
||||
subprocess.
|
||||
:return: Return code after the process has finished, and the list of lines that were written to stdout by the subprocess.
|
||||
"""
|
||||
p = subprocess.Popen(
|
||||
[process] + args,
|
||||
|
@ -41,47 +43,76 @@ def spawn_and_monitor_subprocess(process: str, args: List[str], env: Dict[str, s
|
|||
|
||||
def run() -> None:
|
||||
"""
|
||||
Downloads a model from AzureML, and starts the score script (usually score.py) in the root folder of the model.
|
||||
Downloading the model is only supported if the present code is running inside of AzureML. When running outside
|
||||
of AzureML, the model must have been downloaded beforehand into the folder given by the model-folder argument.
|
||||
The script is executed with the current Python interpreter.
|
||||
If the model requires a specific Conda environment to run in, the caller of this script needs to ensure
|
||||
that this has been set up correctly (taking the environment.yml file stored in the model).
|
||||
All arguments that are not recognized by the present code will be passed through to `score.py` unmodified.
|
||||
Example arguments:
|
||||
download_model_and_run_scoring.py --model-id=Foo:1 score.py --foo=1 --bar
|
||||
This would attempt to download version 1 of model Foo, and then start the script score.py in the model's root
|
||||
folder. Arguments --foo and --bar are passed through to score.py
|
||||
This script is run in an AzureML experiment which was submitted by submit_for_inference.
|
||||
|
||||
It downloads a model from AzureML, and starts the score script (usually score.py) which is in the root
|
||||
folder of the model. The image data zip is downloaded from the AzureML datastore where it was copied
|
||||
by submit_for_inference. Once scoring is completed the image data zip is overwritten with some simple
|
||||
text in lieu of a delete method in the AzureML datastore API. This ensures that the run does
|
||||
not retain images.
|
||||
"""
|
||||
parser = argparse.ArgumentParser(description='Execute code inside of an AzureML model')
|
||||
# Use argument names with dashes here. The rest of the codebase uses _ as the separator, meaning that there
|
||||
# can't be a clash of names with arguments that are passed through to score.py
|
||||
parser.add_argument('--model-folder', dest='model_folder', action='store', type=str)
|
||||
parser.add_argument('--model-id', dest='model_id', action='store', type=str)
|
||||
known_args, unknown_args = parser.parse_known_args()
|
||||
model_folder = known_args.model_folder or "."
|
||||
if known_args.model_id:
|
||||
current_run = Run.get_context()
|
||||
if not hasattr(current_run, 'experiment'):
|
||||
raise ValueError("The model-id argument can only be used inside AzureML. Please drop the argument, and "
|
||||
"supply the downloaded model in the model-folder.")
|
||||
workspace = current_run.experiment.workspace
|
||||
model = Model(workspace=workspace, id=known_args.model_id)
|
||||
# Download the model from AzureML into a sub-folder of model_folder
|
||||
model_folder = str(Path(model.download(model_folder)).absolute())
|
||||
parser.add_argument('--model_id', dest='model_id', action='store', type=str, required=True,
|
||||
help='AzureML model ID')
|
||||
parser.add_argument('--script_name', dest='script_name', action='store', type=str, required=True,
|
||||
help='Name of the script in the model that will produce the image scores')
|
||||
parser.add_argument('--datastore_name', dest='datastore_name', action='store', type=str, required=True,
|
||||
help='Name of the datastore where the image data zip has been copied')
|
||||
parser.add_argument('--datastore_image_path', dest='datastore_image_path', action='store', type=str, required=True,
|
||||
help='Path to the image data zip copied to the datastore')
|
||||
known_args, _ = parser.parse_known_args()
|
||||
|
||||
current_run = Run.get_context()
|
||||
if not hasattr(current_run, 'experiment'):
|
||||
raise ValueError("This script must run in an AzureML experiment")
|
||||
|
||||
workspace = current_run.experiment.workspace
|
||||
model = Model(workspace=workspace, id=known_args.model_id)
|
||||
|
||||
# Download the model from AzureML
|
||||
here = Path.cwd().absolute()
|
||||
model_path = Path(model.download(here)).absolute()
|
||||
|
||||
# Download the image data zip from the named datastore where it was copied by submit_for_inference.
|
||||
# We copy it to a data store, rather than using the AzureML experiment's snapshot, so that we can
|
||||
# overwrite it after the inference and thus not retain image data.
|
||||
image_datastore = Datastore(workspace, known_args.datastore_name)
|
||||
prefix = str(Path(known_args.datastore_image_path).parent)
|
||||
image_datastore.download(target_path=here, prefix=prefix, overwrite=False, show_progress=False)
|
||||
downloaded_image_path = here / known_args.datastore_image_path
|
||||
|
||||
env = dict(os.environ.items())
|
||||
# Work around https://github.com/pytorch/pytorch/issues/37377
|
||||
env['MKL_SERVICE_FORCE_INTEL'] = '1'
|
||||
# The model should include all necessary code, hence point the Python path to its root folder.
|
||||
env['PYTHONPATH'] = model_folder
|
||||
if not unknown_args:
|
||||
raise ValueError("No arguments specified for starting the scoring script.")
|
||||
score_script = Path(model_folder) / unknown_args[0]
|
||||
score_args = [str(score_script), *unknown_args[1:]]
|
||||
env['PYTHONPATH'] = str(model_path)
|
||||
|
||||
score_script = model_path / known_args.script_name
|
||||
score_args = [
|
||||
str(score_script),
|
||||
'--data_folder', str(here / Path(known_args.datastore_image_path).parent),
|
||||
'--image_files', str(downloaded_image_path),
|
||||
'--model_id', known_args.model_id,
|
||||
'--use_dicom', 'True']
|
||||
|
||||
if not score_script.exists():
|
||||
raise ValueError(f"The specified entry script {score_args[0]} does not exist in {model_folder}")
|
||||
print(f"Starting Python with these arguments: {' '.join(score_args)}")
|
||||
code, stdout = spawn_and_monitor_subprocess(process=sys.executable, args=score_args, env=env)
|
||||
raise ValueError(
|
||||
f"The specified entry script {known_args.script_name} does not exist in {model_path}")
|
||||
|
||||
print(f"Starting Python with these arguments: {score_args}")
|
||||
try:
|
||||
code, stdout = spawn_and_monitor_subprocess(process=sys.executable, args=score_args, env=env)
|
||||
finally:
|
||||
# Delete image data zip locally
|
||||
downloaded_image_path.unlink()
|
||||
# Overwrite image data zip in datastore. The datastore API does not (yet) include deletion
|
||||
# and so we overwrite the image data zip with a short piece of text instead. Later these
|
||||
# overwritten image data zip files can be erased; we recommend using a blobstore lifecycle
|
||||
# management policy to delete them after a period of time, e.g. seven days.
|
||||
downloaded_image_path.write_text(DELETED_IMAGE_DATA_NOTIFICATION)
|
||||
image_datastore.upload_files(files=[str(downloaded_image_path)], target_path=prefix, overwrite=True, show_progress=False)
|
||||
# Delete the overwritten image data zip locally
|
||||
downloaded_image_path.unlink()
|
||||
if code != 0:
|
||||
print(f"Python terminated with exit code {code}. Stdout: {os.linesep.join(stdout)}")
|
||||
sys.exit(code)
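The `try`/`finally` arrangement in `run` can be sketched in isolation. In this minimal version the `score` and `upload` callables are hypothetical stand-ins for the scoring subprocess and for `Datastore.upload_files(..., overwrite=True)`, and the local file is overwritten directly rather than unlinked first; otherwise it follows the order of operations in the diff.

```python
import tempfile
from pathlib import Path
from typing import Callable, List

DELETED_IMAGE_DATA_NOTIFICATION = "image data deleted"


def score_then_scrub(image_path: Path,
                     score: Callable[[Path], int],
                     upload: Callable[[Path], None]) -> int:
    """Run `score` on the image, then always replace the stored copy with a notice."""
    try:
        return score(image_path)
    finally:
        # This block runs whether scoring returned normally or raised.
        image_path.write_text(DELETED_IMAGE_DATA_NOTIFICATION)
        upload(image_path)   # hypothetical stand-in for the datastore upload
        image_path.unlink()  # remove the local copy last


if __name__ == "__main__":
    uploaded: List[str] = []
    with tempfile.TemporaryDirectory() as temp_dir:
        path = Path(temp_dir) / "imagedata.zip"
        path.write_bytes(b"fake image bytes")
        exit_code = score_then_scrub(path, lambda p: 0, lambda p: uploaded.append(p.read_text()))
        print(exit_code, uploaded, path.exists())
```

Putting the overwrite in `finally` means the image data is scrubbed even when the scoring script fails, which is the property the commit is after.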
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
from dataclasses import dataclass, field
|
||||
from pathlib import Path
|
||||
from typing import Optional, Dict, List
|
||||
from typing import List
|
||||
|
||||
|
||||
@dataclass
|
||||
|
|
|
@ -7,10 +7,12 @@ import logging
|
|||
import os
|
||||
import shutil
|
||||
import tempfile
|
||||
import uuid
|
||||
from pathlib import Path
|
||||
from typing import Tuple
|
||||
|
||||
from attr import dataclass
|
||||
from azureml.core import Experiment, Model, ScriptRunConfig, Environment
|
||||
from azureml.core import Experiment, Model, ScriptRunConfig, Environment, Datastore
|
||||
from azureml.core.runconfig import RunConfiguration
|
||||
from azureml.core.workspace import WORKSPACE_DEFAULT_BLOB_STORE_NAME, Workspace
|
||||
|
||||
|
@ -24,6 +26,8 @@ SCORE_SCRIPT = "score.py"
|
|||
RUN_SCORING_SCRIPT = "download_model_and_run_scoring.py"
|
||||
# The property in the model registry that holds the name of the Python environment
|
||||
PYTHON_ENVIRONMENT_NAME = "python_environment_name"
|
||||
IMAGEDATA_FILE_NAME = "imagedata.zip"
|
||||
|
||||
|
||||
@dataclass
|
||||
class SubmitForInferenceConfig:
|
||||
|
@ -34,6 +38,7 @@ class SubmitForInferenceConfig:
|
|||
image_data: bytes
|
||||
experiment_name: str
|
||||
|
||||
|
||||
def create_run_config(azure_config: AzureConfig,
|
||||
source_config: SourceConfig,
|
||||
environment_name: str) -> ScriptRunConfig:
|
||||
|
@ -48,14 +53,11 @@ def create_run_config(azure_config: AzureConfig,
|
|||
"""
|
||||
# AzureML seems to sometimes expect the entry script path in Linux format, hence convert to posix path
|
||||
entry_script_relative_path = source_config.entry_script.relative_to(source_config.root_folder).as_posix()
|
||||
logging.info(f"Entry script {entry_script_relative_path} ({source_config.entry_script} relative to "
|
||||
f"source directory {source_config.root_folder})")
|
||||
logging.info(f"Entry script {entry_script_relative_path} ({source_config.entry_script} "
|
||||
f"relative to source directory {source_config.root_folder})")
|
||||
max_run_duration = None
|
||||
workspace = azure_config.get_workspace()
|
||||
run_config = RunConfiguration(
|
||||
script=entry_script_relative_path,
|
||||
arguments=source_config.script_params,
|
||||
)
|
||||
run_config = RunConfiguration(script=entry_script_relative_path, arguments=source_config.script_params)
|
||||
env = Environment.get(azure_config.get_workspace(), name=environment_name, version=ENVIRONMENT_VERSION)
|
||||
logging.info(f"Using existing Python environment '{env.name}'.")
|
||||
run_config.environment = env
|
||||
|
@ -63,58 +65,58 @@ def create_run_config(azure_config: AzureConfig,
|
|||
run_config.max_run_duration_seconds = max_run_duration
|
||||
# Use blob storage for storing the source, rather than the FileShares section of the storage account.
|
||||
run_config.source_directory_data_store = workspace.datastores.get(WORKSPACE_DEFAULT_BLOB_STORE_NAME).name
|
||||
script_run_config = ScriptRunConfig(
|
||||
source_directory=str(source_config.root_folder),
|
||||
run_config=run_config,
|
||||
)
|
||||
script_run_config = ScriptRunConfig(source_directory=str(source_config.root_folder), run_config=run_config)
|
||||
return script_run_config
|
||||
|
||||
|
||||
def submit_for_inference(args: SubmitForInferenceConfig, workspace: Workspace, azure_config: AzureConfig) -> str:
|
||||
def submit_for_inference(args: SubmitForInferenceConfig, workspace: Workspace, azure_config: AzureConfig) -> Tuple[str, str]:
|
||||
"""
|
||||
Create and submit an inference to AzureML, and optionally download the resulting segmentation.
|
||||
:param args: configuration, see SubmitForInferenceConfig
|
||||
:param workspace: Azure ML workspace.
|
||||
:param azure_config: An object with all necessary information for accessing Azure.
|
||||
:return: Azure Run Id.
|
||||
:return: Azure Run Id (and the target path on the datastore, including the uuid, for a unit
|
||||
test to ensure that the image data zip is overwritten after inference)
|
||||
"""
|
||||
logging.info("Identifying model")
|
||||
model = Model(workspace=workspace, id=args.model_id)
|
||||
model_id = model.id
|
||||
logging.info(f"Identified model {model_id}")
|
||||
|
||||
source_directory = tempfile.TemporaryDirectory()
|
||||
source_directory_path = Path(source_directory.name)
|
||||
logging.info(f"Building inference run submission in {source_directory_path}")
|
||||
|
||||
image_folder = source_directory_path / DEFAULT_DATA_FOLDER
|
||||
image_folder.mkdir(parents=True, exist_ok=True)
|
||||
image_path = image_folder / "imagedata.zip"
|
||||
image_path = image_folder / IMAGEDATA_FILE_NAME
|
||||
image_path.write_bytes(args.image_data)
|
||||
|
||||
# Retrieve the name of the Python environment that the training run used. This environment should have been
|
||||
# registered. If no such environment exists, it will be re-created from the Conda files provided.
|
||||
image_datastore = Datastore(workspace, azure_config.datastore_name)
|
||||
target_path = f"{azure_config.image_data_folder}/{str(uuid.uuid4())}"
|
||||
image_datastore.upload_files(files=[str(image_path)], target_path=target_path, overwrite=False, show_progress=False)
|
||||
image_path.unlink()
|
||||
|
||||
# Retrieve the name of the Python environment that the training run used. This environment
|
||||
# should have been registered. If no such environment exists, it will be re-created from the
|
||||
# Conda files provided.
|
||||
python_environment_name = model.tags.get(PYTHON_ENVIRONMENT_NAME, "")
|
||||
if python_environment_name == "":
|
||||
raise ValueError(f"Model ID: {model_id} does not contain an environment tag {PYTHON_ENVIRONMENT_NAME}")
|
||||
raise ValueError(
|
||||
f"Model ID: {model_id} does not contain an environment tag {PYTHON_ENVIRONMENT_NAME}")
|
||||
|
||||
# Copy the scoring script from the repository. This will start the model download from Azure, and invoke the
|
||||
# scoring script.
|
||||
# Copy the scoring script from the repository. This will start the model download from Azure,
|
||||
# and invoke the scoring script.
|
||||
entry_script = source_directory_path / Path(RUN_SCORING_SCRIPT).name
|
||||
current_file_path = Path(os.path.dirname(os.path.realpath(__file__)))
|
||||
shutil.copyfile(current_file_path / str(RUN_SCORING_SCRIPT),
|
||||
str(entry_script))
|
||||
shutil.copyfile(current_file_path / str(RUN_SCORING_SCRIPT), str(entry_script))
|
||||
source_config = SourceConfig(
|
||||
root_folder=source_directory_path,
|
||||
entry_script=entry_script,
|
||||
script_params=["--model-folder", ".",
|
||||
"--model-id", model_id,
|
||||
SCORE_SCRIPT,
|
||||
# The data folder must be relative to the root folder of the AzureML job. test_image_files
|
||||
# is then just the file relative to the data_folder
|
||||
"--data_folder", image_path.parent.name,
|
||||
"--image_files", image_path.name,
|
||||
"--use_dicom", "True",
|
||||
"--model_id", model_id],
|
||||
)
|
||||
script_params=["--model_id", model_id,
|
||||
"--script_name", SCORE_SCRIPT,
|
||||
"--datastore_name", azure_config.datastore_name,
|
||||
"--datastore_image_path", str(Path(target_path) / IMAGEDATA_FILE_NAME)])
|
||||
run_config = create_run_config(azure_config, source_config, environment_name=python_environment_name)
|
||||
exp = Experiment(workspace=workspace, name=args.experiment_name)
|
||||
run = exp.submit(run_config)
|
||||
|
@ -122,4 +124,4 @@ def submit_for_inference(args: SubmitForInferenceConfig, workspace: Workspace, a
|
|||
logging.info(f"Run URL: {run.get_portal_url()}")
|
||||
source_directory.cleanup()
|
||||
logging.info(f"Deleted submission directory {source_directory_path}")
|
||||
return run.id
|
||||
return run.id, target_path
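The datastore path and the script arguments assembled in `submit_for_inference` can be sketched without AzureML. The helper names `build_image_paths` and `build_script_params` are hypothetical. Note one deliberate difference from the diff: this sketch uses `PurePosixPath`, an assumption made here so that the path keeps forward slashes on every platform, whereas the diff uses `Path`.

```python
import uuid
from pathlib import PurePosixPath
from typing import List, Tuple

IMAGEDATA_FILE_NAME = "imagedata.zip"
SCORE_SCRIPT = "score.py"


def build_image_paths(image_data_folder: str) -> Tuple[str, str]:
    """Return (target folder, zip path) under a fresh GUID so that uploads never collide."""
    target_path = PurePosixPath(image_data_folder) / str(uuid.uuid4())
    return str(target_path), str(target_path / IMAGEDATA_FILE_NAME)


def build_script_params(model_id: str, datastore_name: str, datastore_image_path: str) -> List[str]:
    """Assemble the arguments passed to download_model_and_run_scoring.py."""
    return ["--model_id", model_id,
            "--script_name", SCORE_SCRIPT,
            "--datastore_name", datastore_name,
            "--datastore_image_path", datastore_image_path]


if __name__ == "__main__":
    target, image = build_image_paths("temp-image-store")
    print(build_script_params("PassThroughModel:4", "inferencetestimagestore", image))
```

Returning the target folder alongside the run ID is what lets the unit test find and download the overwritten zip afterwards.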
|
||||
|
|