* get started script

* add account_setup.sh

* fix typo

* fix typo

* support shared key

* fix typos

* retrieve batch/storage account keys

* fix bug

* typo

* fix if logic

* fix typo

* bug

* retrieve batch primary key

* storage list_keys

* storage keys

* storage key exceptin

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key\

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key

* storage key\

* storage key

* storage key

* storage key

* delete resource group

* print

* print

* print

* exit from delete rg

* delete resource group

* aad auth

* resource group name

* add docs for get started script

* update python script location

* update doc

* update doc

* update doc

* fix credential setting names

* fix credential setting names

* address review feedback
This commit is contained in:
zfengms 2018-04-27 11:43:36 -07:00 committed by GitHub
Parent 0fbfd4cb07
Commit e1a3c14cdf
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
5 changed files with 687 additions and 14 deletions

576
account_setup.py Normal file

@ -0,0 +1,576 @@
'''
doAzureParallel Getting Started script
'''
import sys
import threading
import time
import uuid
import json
import string
import random
from azure.common import credentials
from azure.graphrbac import GraphRbacManagementClient
from azure.graphrbac.models import ApplicationCreateParameters, PasswordCredential, ServicePrincipalCreateParameters
from azure.graphrbac.models.graph_error import GraphErrorException
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.mgmt.batch import BatchManagementClient
from azure.mgmt.batch.models import AutoStorageBaseProperties, BatchAccountCreateParameters
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import AddressSpace, Subnet, VirtualNetwork
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import Kind, Sku, SkuName, StorageAccountCreateParameters
from azure.mgmt.subscription import SubscriptionClient
from datetime import datetime, timezone
from msrestazure.azure_cloud import AZURE_PUBLIC_CLOUD
from msrestazure.azure_exceptions import CloudError
class AccountSetupError(Exception):
    pass


class DefaultSettings():
    authentication = "sharedkey"
    resource_group = 'doazp'
    storage_account = 'doazpstorage'
    batch_account = 'doazpbatch'
    application_name = 'doazpapp'
    application_credential_name = 'doazpappcredential'
    service_principal = 'doazpsp'
    region = 'westus'
def create_resource_group(credentials, subscription_id, **kwargs):
    """
    Create a resource group
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param subscription_id: str
    :param **resource_group: str
    :param **region: str
    """
    resource_client = ResourceManagementClient(credentials, subscription_id)
    resource_client.resource_groups.list()
    for i in range(3):
        try:
            resource_group = resource_client.resource_groups.create_or_update(
                resource_group_name=kwargs.get("resource_group", DefaultSettings.resource_group),
                parameters={
                    'location': kwargs.get("region", DefaultSettings.region),
                }
            )
            # stop retrying as soon as the resource group has been created
            return resource_group.id
        except CloudError as e:
            if i == 2:
                raise AccountSetupError(
                    "Unable to create resource group in region {}".format(kwargs.get("region", DefaultSettings.region)))
            print(e.message)
            print("Please try again.")
            # ask again for the region, which is the value that was rejected
            kwargs["region"] = prompt_with_default("Azure Region", DefaultSettings.region)
def delete_resource_group(credentials, subscription_id, resource_group):
    """
    Delete a resource group
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param subscription_id: str
    :param resource_group: str
    """
    resource_client = ResourceManagementClient(credentials, subscription_id)
    resource_client.resource_groups.list()
    delete_async_operation = resource_client.resource_groups.delete(
        resource_group_name=resource_group
    )
    delete_async_operation.wait()


def create_storage_account(credentials, subscription_id, **kwargs):
    """
    Create a Storage account
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param subscription_id: str
    :param **resource_group: str
    :param **storage_account: str
    :param **region: str
    """
    storage_management_client = StorageManagementClient(credentials, subscription_id)
    storage_account = storage_management_client.storage_accounts.create(
        resource_group_name=kwargs.get("resource_group", DefaultSettings.resource_group),
        account_name=kwargs.get("storage_account", DefaultSettings.storage_account),
        parameters=StorageAccountCreateParameters(
            sku=Sku(SkuName.standard_lrs),
            kind=Kind.storage,
            location=kwargs.get('region', DefaultSettings.region)
        )
    )
    return storage_account.result().id
def storage_account_get_keys(credentials, subscription_id, **kwargs):
    """
    Get the primary Storage account key
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param subscription_id: str
    :param **resource_group: str
    :param **storage_account: str
    """
    storage_management_client = StorageManagementClient(credentials, subscription_id)
    storage_account_keys = storage_management_client.storage_accounts.list_keys(
        resource_group_name=kwargs.get("resource_group", DefaultSettings.resource_group),
        account_name=kwargs.get("storage_account", DefaultSettings.storage_account)
    )
    return storage_account_keys.keys[0].value


def storage_account_get_endpoint_suffix(credentials, subscription_id, **kwargs):
    """
    Get the Storage account endpoint suffix
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param subscription_id: str
    :param **resource_group: str
    :param **storage_account: str
    """
    storage_management_client = StorageManagementClient(credentials, subscription_id)
    storage_account = storage_management_client.storage_accounts.get_properties(
        resource_group_name=kwargs.get("resource_group", DefaultSettings.resource_group),
        account_name=kwargs.get("storage_account", DefaultSettings.storage_account)
    )
    # convert https://accountname.blob.core.windows.net/ to core.windows.net
    endpoint_suffix = storage_account.primary_endpoints.blob.split(".blob.")[1].split("/")[0]
    return endpoint_suffix
def create_batch_account(credentials, subscription_id, **kwargs):
    """
    Create a Batch account
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param subscription_id: str
    :param **resource_group: str
    :param **batch_account: str
    :param **region: str
    :param **storage_account_id: str (required)
    """
    batch_management_client = BatchManagementClient(credentials, subscription_id)
    batch_account = batch_management_client.batch_account.create(
        resource_group_name=kwargs.get("resource_group", DefaultSettings.resource_group),
        account_name=kwargs.get("batch_account", DefaultSettings.batch_account),
        parameters=BatchAccountCreateParameters(
            location=kwargs.get('region', DefaultSettings.region),
            auto_storage=AutoStorageBaseProperties(
                # there is no sensible default for a storage account resource id,
                # so fail with a KeyError if the caller did not supply one
                storage_account_id=kwargs['storage_account_id']
            )
        )
    )
    return batch_account.result().id
def batch_account_get_keys(credentials, subscription_id, **kwargs):
    """
    get Batch account keys
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param subscription_id: str
    :param **resource_group: str
    :param **batch_account: str
    """
    batch_management_client = BatchManagementClient(credentials, subscription_id)
    batch_account_keys = batch_management_client.batch_account.get_keys(
        resource_group_name=kwargs.get("resource_group", DefaultSettings.resource_group),
        account_name=kwargs.get("batch_account", DefaultSettings.batch_account)
    )
    return batch_account_keys.primary


def batch_account_get_url(credentials, subscription_id, **kwargs):
    """
    get Batch account url
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param subscription_id: str
    :param **resource_group: str
    :param **batch_account: str
    """
    batch_management_client = BatchManagementClient(credentials, subscription_id)
    batch_account = batch_management_client.batch_account.get(
        resource_group_name=kwargs.get("resource_group", DefaultSettings.resource_group),
        account_name=kwargs.get("batch_account", DefaultSettings.batch_account)
    )
    return "https://" + batch_account.account_endpoint
def create_vnet(credentials, subscription_id, **kwargs):
    """
    Create a virtual network with a single subnet
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param subscription_id: str
    :param **resource_group: str
    :param **virtual_network_name: str
    :param **subnet_name: str
    :param **region: str
    """
    network_client = NetworkManagementClient(credentials, subscription_id)
    resource_group_name = kwargs.get("resource_group", DefaultSettings.resource_group)
    virtual_network_name = kwargs.get("virtual_network_name", DefaultSettings.virtual_network_name)
    subnet_name = kwargs.get("subnet_name", DefaultSettings.subnet_name)

    # get vnet, and subnet if they exist
    virtual_network = subnet = None
    try:
        virtual_network = network_client.virtual_networks.get(
            resource_group_name=resource_group_name,
            virtual_network_name=virtual_network_name,
        )
    except CloudError:
        # the virtual network does not exist yet; it is created below
        pass

    if virtual_network:
        confirmation_prompt = "A virtual network with the same name ({}) was found. \n"\
                              "Please note that the existing address space and subnets may be changed or destroyed. \n"\
                              "Do you want to use this virtual network? (y/n): ".format(virtual_network_name)
        deny_error = AccountSetupError("Virtual network already exists, not recreating.")
        unrecognized_input_error = AccountSetupError("Input not recognized.")
        prompt_for_confirmation(confirmation_prompt, deny_error, unrecognized_input_error)

    virtual_network = network_client.virtual_networks.create_or_update(
        resource_group_name=resource_group_name,
        virtual_network_name=virtual_network_name,
        parameters=VirtualNetwork(
            location=kwargs.get("region", DefaultSettings.region),
            address_space=AddressSpace(["10.0.0.0/24"])
        )
    )
    virtual_network = virtual_network.result()
    subnet = network_client.subnets.create_or_update(
        resource_group_name=resource_group_name,
        virtual_network_name=virtual_network_name,
        subnet_name=subnet_name,
        subnet_parameters=Subnet(
            address_prefix='10.0.0.0/24'
        )
    )
    return subnet.result().id
def create_aad_user(credentials, tenant_id, **kwargs):
    """
    Create an AAD application and service principal
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param tenant_id: str
    :param **application_name: str
    """
    graph_rbac_client = GraphRbacManagementClient(
        credentials,
        tenant_id,
        base_url=AZURE_PUBLIC_CLOUD.endpoints.active_directory_graph_resource_id
    )
    application_credential = uuid.uuid4()
    try:
        display_name = kwargs.get("application_name", DefaultSettings.application_name)
        application = graph_rbac_client.applications.create(
            parameters=ApplicationCreateParameters(
                available_to_other_tenants=False,
                identifier_uris=["http://{}.com".format(display_name)],
                display_name=display_name,
                password_credentials=[
                    PasswordCredential(
                        end_date=datetime(2299, 12, 31, 0, 0, 0, 0, tzinfo=timezone.utc),
                        value=application_credential,
                        key_id=uuid.uuid4()
                    )
                ]
            )
        )
        service_principal = graph_rbac_client.service_principals.create(
            ServicePrincipalCreateParameters(
                app_id=application.app_id,
                account_enabled=True
            )
        )
    except GraphErrorException as e:
        if e.inner_exception.code == "Request_BadRequest":
            application = next(graph_rbac_client.applications.list(
                filter="identifierUris/any(c:c eq 'http://{}.com')".format(display_name)))
            confirmation_prompt = "Previously created application with name {} found. "\
                                  "Would you like to use it? (y/n): ".format(application.display_name)
            prompt_for_confirmation(confirmation_prompt, e, ValueError("Response not recognized. Please try again."))
            service_principal = next(graph_rbac_client.service_principals.list(
                filter="appId eq '{}'".format(application.app_id)))
        else:
            raise e
    return application.app_id, service_principal.object_id, str(application_credential)
def create_role_assignment(credentials, subscription_id, scope, principal_id):
    """
    Gives service principal contributor role authorization on scope
    :param credentials: msrestazure.azure_active_directory.AdalAuthentication
    :param subscription_id: str
    :param scope: str
    :param principal_id: str
    """
    authorization_client = AuthorizationManagementClient(credentials, subscription_id)
    role_name = 'Contributor'
    roles = list(authorization_client.role_definitions.list(
        scope,
        filter="roleName eq '{}'".format(role_name)
    ))
    contributor_role = roles[0]
    for i in range(10):
        try:
            authorization_client.role_assignments.create(
                scope,
                uuid.uuid4(),
                {
                    'role_definition_id': contributor_role.id,
                    'principal_id': principal_id
                }
            )
            break
        except CloudError:
            # the service principal may not be visible yet, so retry;
            # re-raise once the last of the 10 attempts has failed
            if i == 9:
                raise
            time.sleep(1)
def format_secrets(**kwargs):
    '''
    Returns the secrets for the created resources to be placed in credentials.json
    The following form is returned:

    For Azure Active Directory Authentication:
        "servicePrincipal": {
            "tenantId": "<AAD Directory ID>",
            "clientId": "<AAD App Application ID>",
            "credential": "<AAD App Password>",
            "batchAccountResourceId": "</batch/account/resource/id>",
            "storageAccountResourceId": "</storage/account/resource/id>",
            "storageEndpointSuffix": "<storage account endpoint suffix>"
        }

    For Shared Key Authentication:
        "batchAccount": {
            "name": "<batch account name>",
            "key": "<batch account key>",
            "url": "https://batchaccount.region.batch.azure.com"
        },
        "storageAccount": {
            "name": "<storage account name>",
            "key": "<storage account key>",
            "endpointSuffix": "core.windows.net"
        }
    '''
    return json.dumps(kwargs, indent=4)
def prompt_for_confirmation(prompt, deny_error, unrecognized_input_error):
    """
    Prompt user for confirmation, 'y' for confirm, 'n' for deny
    :param prompt: str
    :param deny_error: Exception
    :param unrecognized_input_error: Exception
    :return None if prompt successful, else raises error
    """
    confirmation = input(prompt).lower()
    for i in range(3):
        if confirmation == "n":
            raise deny_error
        elif confirmation == "y":
            break
        elif confirmation != "y" and i == 2:
            raise unrecognized_input_error
        confirmation = input("Please input 'y' or 'n': ").lower()


def prompt_with_default(key, value):
    user_value = input("{0} [{1}]: ".format(key, value))
    if user_value != "":
        return user_value
    else:
        return value


def prompt_tenant_selection(tenant_ids):
    print("Multiple tenants detected. Please input the ID of the tenant you wish to use.")
    print("Tenants:", ", ".join(tenant_ids))
    given_tenant_id = input("Please input the ID of the tenant you wish to use: ")
    for i in range(3):
        if given_tenant_id in tenant_ids:
            return given_tenant_id
        if i != 2:
            given_tenant_id = input("Input not recognized, please try again: ")
    raise AccountSetupError("Tenant selection not recognized after 3 attempts.")
class Spinner:
    busy = False
    delay = 0.1

    @staticmethod
    def spinning_cursor():
        while 1:
            for cursor in '|/-\\':
                yield cursor

    def __init__(self, delay=None):
        self.spinner_generator = self.spinning_cursor()
        if delay and float(delay):
            self.delay = delay

    def __enter__(self):
        return self.start()

    def __exit__(self, exc_type, exc_val, exc_tb):
        return self.stop()

    def spinner_task(self):
        while self.busy:
            sys.stdout.write(next(self.spinner_generator))
            sys.stdout.flush()
            time.sleep(self.delay)
            sys.stdout.write('\b')
            sys.stdout.flush()

    def start(self):
        self.busy = True
        threading.Thread(target=self.spinner_task, daemon=True).start()

    def stop(self):
        self.busy = False
        time.sleep(self.delay)
if __name__ == "__main__":
    print("\nGetting credentials.")
    # get credentials and tenant_id
    creds, subscription_id = credentials.get_azure_cli_credentials()
    subscription_client = SubscriptionClient(creds)
    tenant_ids = [tenant.id for tenant in subscription_client.tenants.list()]
    if len(tenant_ids) != 1:
        tenant_id = prompt_tenant_selection(tenant_ids)
    else:
        tenant_id = tenant_ids[0]

    if len(sys.argv) > 1 and sys.argv[1] == "deleteresourcegroup":
        resource_group = input("Resource Group Name: ")
        with Spinner():
            delete_resource_group(creds, subscription_id, resource_group)
        print("Deleted resource group.")
        sys.exit()

    if len(sys.argv) > 1 and sys.argv[1] == "serviceprincipal":
        authentication = "serviceprincipal"
    else:
        authentication = DefaultSettings.authentication

    print("Input the desired names and values for your Azure resources. "\
          "Default values are provided in the brackets. "\
          "Hit enter to use default.")

    # append a random suffix so the default account names are unlikely to collide
    chars = string.ascii_lowercase
    suffix = "".join(random.choice(chars) for i in range(4))
    DefaultSettings.storage_account += suffix
    DefaultSettings.batch_account += suffix

    if authentication == DefaultSettings.authentication:
        kwargs = {
            "region": prompt_with_default("Azure Region", DefaultSettings.region),
            "resource_group": prompt_with_default("Resource Group Name", DefaultSettings.resource_group),
            "storage_account": prompt_with_default("Storage Account Name", DefaultSettings.storage_account),
            "batch_account": prompt_with_default("Batch Account Name", DefaultSettings.batch_account)
        }
    else:
        kwargs = {
            "region": prompt_with_default("Azure Region", DefaultSettings.region),
            "resource_group": prompt_with_default("Resource Group Name", DefaultSettings.resource_group),
            "storage_account": prompt_with_default("Storage Account Name", DefaultSettings.storage_account),
            "batch_account": prompt_with_default("Batch Account Name", DefaultSettings.batch_account),
            # "virtual_network_name": prompt_with_default("Virtual Network Name", DefaultSettings.virtual_network_name),
            # "subnet_name": prompt_with_default("Subnet Name", DefaultSettings.subnet_name),
            "application_name": prompt_with_default("Active Directory Application Name", DefaultSettings.application_name),
            "application_credential_name": prompt_with_default("Active Directory Application Credential Name", DefaultSettings.resource_group),
            "service_principal": prompt_with_default("Service Principal Name", DefaultSettings.service_principal)
        }

    print("Creating the Azure resources.")

    # create resource group
    with Spinner():
        resource_group_id = create_resource_group(creds, subscription_id, **kwargs)
        kwargs["resource_group_id"] = resource_group_id
    print("Created Resource Group.")

    # create storage account
    with Spinner():
        storage_account_id = create_storage_account(creds, subscription_id, **kwargs)
        kwargs["storage_account_id"] = storage_account_id
    print("Created Storage Account.")

    # retrieve storage account endpoint suffix
    with Spinner():
        storage_account_endpoint_suffix = storage_account_get_endpoint_suffix(creds, subscription_id, **kwargs)
        kwargs["storage_account_endpoint_suffix"] = storage_account_endpoint_suffix
    print("Retrieved Storage Account endpoint suffix.")

    # create batch account
    with Spinner():
        batch_account_id = create_batch_account(creds, subscription_id, **kwargs)
    print("Created Batch Account.")

    if authentication == DefaultSettings.authentication:
        # retrieve batch account key
        with Spinner():
            batch_account_key = batch_account_get_keys(creds, subscription_id, **kwargs)
            kwargs["batch_account_key"] = batch_account_key
        print("Retrieved Batch Account key.")

        # retrieve batch account url
        with Spinner():
            batch_account_url = batch_account_get_url(creds, subscription_id, **kwargs)
            kwargs["batch_account_url"] = batch_account_url
        print("Retrieved Batch Account url.")

        # retrieve storage account key
        with Spinner():
            storage_account_keys = storage_account_get_keys(creds, subscription_id, **kwargs)
            kwargs["storage_account_key"] = storage_account_keys
        print("Retrieved Storage Account key.")

        secrets = format_secrets(
            **{
                "batchAccount": {
                    "name": kwargs["batch_account"],
                    "key": kwargs["batch_account_key"],
                    "url": kwargs["batch_account_url"]
                },
                "storageAccount": {
                    "name": kwargs["storage_account"],
                    "key": kwargs["storage_account_key"],
                    "endpointSuffix": kwargs["storage_account_endpoint_suffix"]
                }
            }
        )
    else:
        # create vnet with a subnet
        # subnet_id = create_vnet(creds, subscription_id)

        # create AAD application and service principal
        with Spinner():
            profile = credentials.get_cli_profile()
            # the subscription id returned here is not needed; it is already known
            aad_cred, _, tenant_id = profile.get_login_credentials(
                resource=AZURE_PUBLIC_CLOUD.endpoints.active_directory_graph_resource_id
            )
            application_id, service_principal_object_id, application_credential = create_aad_user(aad_cred, tenant_id, **kwargs)
        print("Created Azure Active Directory service principal.")

        with Spinner():
            create_role_assignment(creds, subscription_id, resource_group_id, service_principal_object_id)
        print("Configured permissions.")

        secrets = format_secrets(
            **{
                "servicePrincipal": {
                    "tenantId": tenant_id,
                    "clientId": application_id,
                    "credential": application_credential,
                    "batchAccountResourceId": batch_account_id,
                    "storageAccountResourceId": storage_account_id,
                    "storageEndpointSuffix": kwargs["storage_account_endpoint_suffix"]
                }
            }
        )

    print("\n# Copy the following into your credentials.json file\n{}".format(secrets))

11
account_setup.sh Normal file

@ -0,0 +1,11 @@
#!/bin/bash
echo "Installing dependencies..." &&
pip install --force-reinstall --upgrade --user pyyaml==3.12 azure==3.0.0 azure-cli-core==2.0.30 msrestazure==0.4.25 > /dev/null 2>&1 &&
echo "Finished installing dependencies." &&
echo "Getting account setup script..." &&
wget -q https://raw.githubusercontent.com/Azure/doAzureParallel/master/account_setup.py -O account_setup.py &&
chmod 755 account_setup.py &&
echo "Finished getting account setup script." &&
echo "Running account setup script..." &&
python3 account_setup.py "$@"


@ -0,0 +1,82 @@
# Getting Started Script
The provided account setup script creates and configures all of the required Azure resources.
The script will create and configure the following resources:
- Resource group
- Storage account
- Batch account
- Azure Active Directory application and service principal (only if AAD authentication is used; the default is shared key authentication)
The script outputs all of the information needed to use `doAzureParallel`. Copy the output into the `credentials.json` file created by `doAzureParallel::generateCredentialsConfig()`.
## Usage
### Create credentials
Copy and paste the following into an [Azure Cloud Shell](https://shell.azure.com):
```sh
wget -q https://raw.githubusercontent.com/Azure/doAzureParallel/master/account_setup.sh &&
chmod 755 account_setup.sh &&
/bin/bash account_setup.sh
```
A series of prompts will appear, and you can set the values you desire for each field. Default values appear in brackets `[]` and will be used if no value is provided.
```
Azure Region [westus]:
Resource Group Name [doazp]:
Storage Account Name [doazpstorage]:
Batch Account Name [doazpbatch]:
```
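The bracketed-default behavior can be sketched in Python as follows. This is a minimal illustration modeled on the script's `prompt_with_default` helper, not a copy of it; the `read` parameter is a hypothetical addition so that the function can be exercised without a terminal.

```python
def prompt_with_default(key, default, read=input):
    """Ask for a value and fall back to the default on empty input.

    `read` is a hypothetical injection point (it defaults to the built-in
    input) so the sketch can be run without an interactive terminal.
    """
    answer = read("{0} [{1}]: ".format(key, default))
    return answer if answer != "" else default


if __name__ == "__main__":
    # Simulate a user who presses Enter at the first prompt and types a
    # custom name at the second one.
    replies = iter(["", "myresourcegroup"])

    def fake_read(prompt):
        return next(replies)

    print(prompt_with_default("Azure Region", "westus", read=fake_read))
    print(prompt_with_default("Resource Group Name", "doazp", read=fake_read))
```

Pressing Enter keeps the value in brackets; any other input replaces it.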
The following prompts appear only when you use AAD authentication, which you select by running:
```sh
wget -q https://raw.githubusercontent.com/Azure/doAzureParallel/master/account_setup.sh &&
chmod 755 account_setup.sh &&
/bin/bash account_setup.sh serviceprincipal
```
```
Active Directory Application Name [doazpapp]:
Active Directory Application Credential Name [doazp]:
Service Principal Name [doazpsp]:
```
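Which prompts appear depends on the first argument passed to the script. The sketch below shows that dispatch in isolation, modeled on the argument checks in `account_setup.py`. The function name `select_mode` and the `"delete"` label are hypothetical and exist only for illustration.

```python
import sys


def select_mode(argv):
    """Map the optional first command-line argument to a mode name.

    The accepted argument values mirror those checked by the setup
    script; the returned "delete" label is hypothetical.
    """
    first = argv[1] if len(argv) > 1 else ""
    if first == "deleteresourcegroup":
        return "delete"
    if first == "serviceprincipal":
        return "serviceprincipal"
    # Any other value, or no argument at all, selects shared key auth.
    return "sharedkey"


if __name__ == "__main__":
    print(select_mode(sys.argv))
```

An unrecognized argument falls through to shared key authentication rather than raising an error, which matches the script's default.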
Once the script has finished running you will see the following output:
For Shared Key Authentication (Default):
```
"batchAccount": {
"name": "batchaccountname",
"key": "batch account key",
"url": "https://batchaccountname.region.batch.azure.com"
},
"storageAccount": {
"name": "storageaccountname",
"key": "storage account key",
"endpointSuffix": "core.windows.net"
}
```
For Azure Active Directory Authentication:
```
"servicePrincipal": {
"tenantId": "<AAD Directory ID>",
"clientId": "<AAD App Application ID>",
"credential": "<AAD App Password>",
"batchAccountResourceId": "</batch/account/resource/id>",
"storageAccountResourceId": "</storage/account/resource/id>",
"storageEndpointSuffix": "<storage account endpoint suffix>"
}
```
Copy the entire section to your `credentials.json`. If you do not have a `credentials.json` file, you can create one in your current working directory by running `doAzureParallel::generateCredentialsConfig()`.
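Before pasting the shared key output, it can help to check that no field is missing or empty. The following is a minimal sketch under stated assumptions: the field names are taken from the shared key output shown above, the helper name `missing_shared_key_fields` is hypothetical, and the fragment is wrapped in braces when needed so that it parses as a JSON object.

```python
import json

# Field names taken from the shared key output shown above.
REQUIRED_FIELDS = {
    "batchAccount": ("name", "key", "url"),
    "storageAccount": ("name", "key", "endpointSuffix"),
}


def missing_shared_key_fields(fragment):
    """Return dotted names of required fields that are absent or empty.

    `fragment` is the script output as text. It is wrapped in braces
    when it does not already start with one, so that it parses as JSON.
    """
    text = fragment.strip()
    if not text.startswith("{"):
        text = "{" + text + "}"
    data = json.loads(text)
    missing = []
    for section, fields in REQUIRED_FIELDS.items():
        values = data.get(section, {})
        for field in fields:
            if not values.get(field):
                missing.append("{}.{}".format(section, field))
    return missing


if __name__ == "__main__":
    sample = """
    "batchAccount": {"name": "b", "key": "k", "url": "https://b.westus.batch.azure.com"},
    "storageAccount": {"name": "s", "key": "", "endpointSuffix": "core.windows.net"}
    """
    print(missing_shared_key_fields(sample))
```

An empty list means every expected field is present and non-empty.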
### Delete resource group
Copy and paste the following into an [Azure Cloud Shell](https://shell.azure.com):
```sh
wget -q https://raw.githubusercontent.com/Azure/doAzureParallel/master/account_setup.sh &&
chmod 755 account_setup.sh &&
/bin/bash account_setup.sh deleteresourcegroup
```
The following prompt will appear. Enter the name of the resource group to delete; all resources contained in that resource group will be deleted with it.
```
Resource Group Name:
```


@ -15,7 +15,7 @@ results <- foreach(i = 1:number_of_iterations) %dopar% {
## Chunking Data
-A common scenario would be to chunk your data accross the pool so that your R code is running agaisnt a single chunk. In doAzureParallel, we help you achieve this by iterating through your chunks so that each chunk is mapped to an interation of the distributed *foreach* loop.
+A common scenario would be to chunk your data across the pool so that your R code is running against a single chunk. In doAzureParallel, we help you achieve this by iterating through your chunks so that each chunk is mapped to an iteration of the distributed *foreach* loop.
```R
chunks <- split(<data_set>, 10)
@ -31,7 +31,7 @@ Some workloads may require data pre-loaded into the cluster as soon as the clust
**NOTE** The default setting for storage containers is _private_. You can either use a [SAS](https://docs.microsoft.com/en-us/azure/storage/common/storage-dotnet-shared-access-signature-part-1) to access the resources or [make the container public using the Azure Portal](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-manage-access-to-resources).
-**IMPORTANT** Public storage containers can be ready by anyone who knows the URL. We do not recommend storing any private or sensitive information in public storage containers!
+**IMPORTANT** Public storage containers can be read by anyone who knows the URL. We do not recommend storing any private or sensitive information in public storage containers!
Here's an example that uses data stored in a public location on Azure Blob Storage:


@ -3,49 +3,53 @@ This section will provide information about how Azure works, how best to take ad
1. **Azure Introduction** [(link)](./00-azure-introduction.md)
Using the *Data Science Virtual Machine (DSVM)* & *Azure Batch*
-2. **Virtual Machine Sizes** [(link)](./10-vm-sizes.md)
+2. **Getting Started Script** [(link)](./02-getting-started-script.md)
+Using the *Getting Started Script* to create credentials
+3. **Virtual Machine Sizes** [(link)](./10-vm-sizes.md)
How do you choose the best VM type/size for your workload?
-3. **Autoscale** [(link)](./11-autoscale.md)
+4. **Autoscale** [(link)](./11-autoscale.md)
Automatically scale up/down your cluster to save time and/or money.
-4. **Azure Limitations** [(link)](./12-quota-limitations.md)
+5. **Azure Limitations** [(link)](./12-quota-limitations.md)
Learn about the limitations around the size of your cluster and the number of foreach jobs you can run in Azure.
-4. **Package Management** [(link)](./20-package-management.md)
+6. **Package Management** [(link)](./20-package-management.md)
Best practices for managing your R packages in code. This includes installation at the cluster or job level as well as how to use different package providers.
-5. **Distributing your Data** [(link)](./21-distributing-data.md)
+7. **Distributing your Data** [(link)](./21-distributing-data.md)
Best practices and limitations for working with distributed data.
-6. **Parallelizing on each VM Core** [(link)](./22-parallelizing-cores.md)
+8. **Parallelizing on each VM Core** [(link)](./22-parallelizing-cores.md)
Best practices and limitations for parallelizing your R code to each core in each VM in your pool
-7. **Persistent Storage** [(link)](./23-persistent-storage.md)
+9. **Persistent Storage** [(link)](./23-persistent-storage.md)
Taking advantage of persistent storage for long-running jobs
-8. **Customize Cluster** [(link)](./30-customize-cluster.md)
+10. **Customize Cluster** [(link)](./30-customize-cluster.md)
Setting up your cluster to user's specific needs
-9. **Long Running Job** [(link)](./31-long-running-job.md)
+11. **Long Running Job** [(link)](./31-long-running-job.md)
Best practices for managing long running jobs
-10. **Programmatically generated config** [(link)](./33-programmatically-generate-config.md)
+12. **Programmatically generated config** [(link)](./33-programmatically-generate-config.md)
Generate credentials and cluster config at runtime programmatically
-11. **National Cloud configuration" [(link)](/.34-national-clouds.md)
+13. **National Cloud configuration** [(link)](./34-national-clouds.md)
How to run workload in Azure national clouds