Move workspace creation to the beginning of the notebook
Parent
0c79d68381
Commit
9e6c4680fe
|
@@ -96,9 +96,7 @@
|
|||
" * 1.3. [Preprocess for GenSen Model](#1.3-Preprocess-for-GenSen-Model) \n",
|
||||
" * 1.4. [Upload to Azure Blob Storage](#1.4-Upload-to-Azure-Blob-Storage) \n",
|
||||
"2. [Train GenSen Model with Distributed Pytorch with Horovod on AzureML](#2-Train-GenSen-Model-with-Distributed-Pytorch-with-Horovod-on-AzureML) \n",
|
||||
" * 2.1. [Initialization](#2.1-Initialization) \n",
|
||||
" * 2.1.1 [Initialize Workspace](#2.1.1-Initialize-Workspace) \n",
|
||||
" * 2.1.2 [Create or Attach Existing AmlCompute](#2.1.2-Create-or-Attach-Existing-AmlCompute) \n",
|
||||
" * 2.1 [Create or Attach Existing AmlCompute](#2.1-Create-or-Attach-Existing-AmlCompute) \n",
|
||||
" * 2.2. [Access to a Project Directory](#2.2-Access-to-a-Project-Directory) \n",
|
||||
" * 2.3. [Train Model on the Remote Compute](#2.3-Train-Model-on-the-Remote-Compute) \n",
|
||||
" * 2.3.1 [Prepare Training Script](#2.3.1-Prepare-Training-Script) \n",
|
||||
|
@@ -129,7 +127,9 @@
|
|||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {},
|
||||
"metadata": {
|
||||
"scrolled": true
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
|
@@ -165,6 +165,60 @@
|
|||
"print(\"Pandas version: {}\".format(pd.__version__))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**Initialize Workspace**\n",
|
||||
"\n",
|
||||
"Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. For instructions on how to do this, see [here](README.md). `Workspace.from_config()` creates a workspace object from the details stored in `config.json`."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Performing interactive authentication. Please follow the instructions on the terminal.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"WARNING - Note, we have launched a browser for you to login. For old experience with device code, use \"az login --use-device-code\"\n",
|
||||
"WARNING - You have logged in. Now let us find all the subscriptions to which you have access...\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Interactive authentication successfully completed.\n",
|
||||
"Workspace name: MAIDAPTest\n",
|
||||
"Azure region: eastus2\n",
|
||||
"Subscription id: 15ae9cb6-95c1-483d-a0e3-b1a1a3b06324\n",
|
||||
"Resource group: nlprg\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"ws = azureml_utils.get_or_create_workspace(\n",
|
||||
" subscription_id=\"<SUBSCRIPTION_ID>\",\n",
|
||||
" resource_group=\"<RESOURCE_GROUP>\",\n",
|
||||
" workspace_name=\"<WORKSPACE_NAME>\",\n",
|
||||
" workspace_region=\"<WORKSPACE_REGION>\"\n",
|
||||
")\n",
|
||||
"print('Workspace name: ' + ws.name, \n",
|
||||
" 'Azure region: ' + ws.location, \n",
|
||||
" 'Subscription id: ' + ws.subscription_id, \n",
|
||||
" 'Resource group: ' + ws.resource_group, sep='\\n')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
|
@@ -217,7 +271,7 @@
|
|||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 64,
|
||||
"execution_count": 31,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
|
@@ -563,24 +617,32 @@
|
|||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 74,
|
||||
"execution_count": 32,
|
||||
"metadata": {},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import azureml.data\n",
|
||||
"from azureml.data.azure_storage_datastore import AzureFileDatastore\n",
|
||||
"\n",
|
||||
"data_folder = os.path.join(BASE_DATA_PATH, \"clean\", \"snli_1.0\")"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 33,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"AzureFile maidaptest3334372853 azureml-filestore-792de9d4-7d0a-464c-b40a-58584f23f5ec $AZUREML_DATAREFERENCE_liqungensen ..\\..\\data\\clean\\snli_1.0\n"
|
||||
"AzureFile maidaptest3334372853 azureml-filestore-792de9d4-7d0a-464c-b40a-58584f23f5ec $AZUREML_DATAREFERENCE_liqungensen\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import azureml.data\n",
|
||||
"from azureml.data.azure_storage_datastore import AzureFileDatastore\n",
|
||||
"\n",
|
||||
"data_folder = os.path.join(BASE_DATA_PATH, \"clean\", \"snli_1.0\")\n",
|
||||
"ds = ws.get_default_datastore()\n",
|
||||
"print(ds.datastore_type, ds.account_name, ds.container_name, ds.as_mount(), data_folder)"
|
||||
"print(ds.datastore_type, ds.account_name, ds.container_name, ds.as_mount())"
|
||||
]
|
||||
},
|
||||
{
|
||||
|
@@ -675,69 +737,7 @@
|
|||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 2.1 Initialization\n",
|
||||
"In this section, we will initialize a workspace and create an AmlCompute for training."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### 2.1.1 Initialize Workspace\n",
|
||||
"\n",
|
||||
"Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. For instructions on how to do this, see [here](README.md). `Workspace.from_config()` creates a workspace object from the details stored in `config.json`."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Performing interactive authentication. Please follow the instructions on the terminal.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"WARNING - Note, we have launched a browser for you to login. For old experience with device code, use \"az login --use-device-code\"\n",
|
||||
"WARNING - You have logged in. Now let us find all the subscriptions to which you have access...\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Interactive authentication successfully completed.\n",
|
||||
"Workspace name: MAIDAPTest\n",
|
||||
"Azure region: eastus2\n",
|
||||
"Subscription id: 15ae9cb6-95c1-483d-a0e3-b1a1a3b06324\n",
|
||||
"Resource group: nlprg\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"ws = azureml_utils.get_or_create_workspace(\n",
|
||||
" subscription_id=\"<SUBSCRIPTION_ID>\",\n",
|
||||
" resource_group=\"<RESOURCE_GROUP>\",\n",
|
||||
" workspace_name=\"<WORKSPACE_NAME>\",\n",
|
||||
" workspace_region=\"<WORKSPACE_REGION>\"\n",
|
||||
")\n",
|
||||
"print('Workspace name: ' + ws.name, \n",
|
||||
" 'Azure region: ' + ws.location, \n",
|
||||
" 'Subscription id: ' + ws.subscription_id, \n",
|
||||
" 'Resource group: ' + ws.resource_group, sep='\\n')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### 2.1.2 Create or Attach Existing AmlCompute\n",
|
||||
"## 2.1 Create or Attach Existing AmlCompute\n",
|
||||
"You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for training your model. In this tutorial, we use Azure ML managed compute ([AmlCompute](https://docs.microsoft.com/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute)) for our remote training compute resource. Specifically, the code below creates a `STANDARD_NC6` GPU cluster that autoscales from `0` to `4` nodes.\n",
|
||||
"\n",
|
||||
"**Creation of AmlCompute takes approximately 5 minutes.** If the AmlCompute with that name is already in your workspace, this code will skip the creation process.\n",
|
||||
|
|
|
@@ -0,0 +1,45 @@
|
|||
{
|
||||
"training": {
|
||||
"optimizer": "adam",
|
||||
"clip_c": 1,
|
||||
"lrate": 0.0001,
|
||||
"batch_size": 48,
|
||||
"n_gpus": 1,
|
||||
"stop_patience": 2
|
||||
},
|
||||
"management": {
|
||||
"monitor_loss": 480,
|
||||
"print_samples": 12800,
|
||||
"checkpoint_freq": 480000,
|
||||
"eval_freq": 9600
|
||||
},
|
||||
"data": {"paths": [
|
||||
{
|
||||
"train_src": "data/processed/snli_1.0_train.txt.s1.tok",
|
||||
"train_trg": "data/processed/snli_1.0_train.txt.s2.tok",
|
||||
"val_src": "data/processed/snli_1.0_dev.txt.s1.tok",
|
||||
"val_trg": "data/processed/snli_1.0_dev.txt.s1.tok",
|
||||
"taskname": "snli"
|
||||
}
|
||||
],
|
||||
"max_src_length": 90,
|
||||
"max_trg_length": 90,
|
||||
"task": "multi-seq2seq-nli",
|
||||
"save_dir": "data/models/example",
|
||||
"nli_train": "data/processed/snli_1.0_train.txt.clean.noblank",
|
||||
"nli_dev": "data/processed/snli_1.0_dev.txt.clean.noblank",
|
||||
"nli_test": "data/processed/snli_1.0_test.txt.clean.noblank"
|
||||
},
|
||||
"model": {
|
||||
"dim_src": 2048,
|
||||
"dim_trg": 2048,
|
||||
"dim_word_src": 512,
|
||||
"dim_word_trg": 512,
|
||||
"n_words_src": 80000,
|
||||
"n_words_trg": 30000,
|
||||
"n_layers_src": 1,
|
||||
"bidirectional": true,
|
||||
"layernorm": false,
|
||||
"dropout": 0.8
|
||||
}
|
||||
}
|
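In the config added above, `val_trg` and `val_src` both point at `snli_1.0_dev.txt.s1.tok`, whereas the training pair uses `.s1.tok` and `.s2.tok`. Whether that is intentional cannot be determined from this commit. The following is a minimal sketch of a sanity check that would surface such a pairing before training. The helper name `find_identical_pairs` is hypothetical and not part of the repository, and the embedded JSON is a trimmed copy of the `data.paths` entry from the file above.

```python
import json

# Trimmed copy of the "data" -> "paths" entry from the config in this commit.
CONFIG_TEXT = """
{
  "data": {"paths": [
    {
      "train_src": "data/processed/snli_1.0_train.txt.s1.tok",
      "train_trg": "data/processed/snli_1.0_train.txt.s2.tok",
      "val_src": "data/processed/snli_1.0_dev.txt.s1.tok",
      "val_trg": "data/processed/snli_1.0_dev.txt.s1.tok",
      "taskname": "snli"
    }
  ]}
}
"""


def find_identical_pairs(config):
    """Hypothetical helper: list (taskname, split) pairs whose source and
    target paths refer to the same file."""
    problems = []
    for task in config["data"]["paths"]:
        for split in ("train", "val"):
            if task[f"{split}_src"] == task[f"{split}_trg"]:
                problems.append((task["taskname"], split))
    return problems


config = json.loads(CONFIG_TEXT)
problems = find_identical_pairs(config)
for taskname, split in problems:
    print(f"task '{taskname}': {split}_src and {split}_trg are the same file")
```

If the shared validation path turns out to be deliberate, the check can simply be skipped; it only reports the pairing and does not modify the config.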