The *doAzureParallel* package is a parallel backend for the widely popular *foreach* package. With *doAzureParallel*, each iteration of the *foreach* loop runs in parallel on an Azure Virtual Machine (VM), allowing users to scale up their R jobs to tens or hundreds of machines.
The *foreach* package supports parallel execution: it can run the iterations of a loop as multiple processes on a registered parallel backend. With just a few lines of code, the *doAzureParallel* package helps you create a cluster in Azure, register it as a parallel backend, and connect it to the *foreach* package.
NOTE: The terms *pool* and *cluster* are used interchangeably throughout this document.
4) Follow the on-screen prompts to create the necessary Azure resources and copy the output into your credentials file. For more information, see [Getting Started Scripts](./docs/02-getting-started-script.md).
Set up your parallel backend with Azure. This is your set of Azure VMs.
```R
# Load the doAzureParallel package.
library(doAzureParallel)
# 1. Generate your credential and cluster configuration files.
generateClusterConfig("cluster.json")
generateCredentialsConfig("credentials.json")
# 2. Fill out your credential config and cluster config files.
# Enter your Azure Batch Account & Azure Storage keys/account-info into your credential config ("credentials.json") and configure your cluster in your cluster config ("cluster.json")
# 3. Set your credentials - you need to give the R session your credentials to interact with Azure
setCredentials("credentials.json")
# 4. Create the pool. This will provision a new pool if yours hasn't already been provisioned.
cluster <- makeCluster("cluster.json")
# 5. Register the pool as your parallel backend
registerDoAzureParallel(cluster)
# 6. Check that your parallel backend has been registered
getDoParWorkers()
```
Run your parallel *foreach* loop with the *%dopar%* operator. The *foreach* function returns the results of your parallel code.
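A minimal sketch of such a loop, assuming the cluster from the setup steps has already been registered with `registerDoAzureParallel(cluster)`. The loop bodies (`sqrt(i)` and `i^2`) are placeholders for your own computation. If no backend is registered, *foreach* runs the loop sequentially and emits a warning, so the snippet can also be tried locally.

```R
library(foreach)

number_of_iterations <- 10

# Each iteration is dispatched to the registered parallel backend.
results <- foreach(i = 1:number_of_iterations) %dopar% {
  sqrt(i)  # placeholder workload
}

# By default, foreach returns a list with one element per iteration.
length(results)

# Use .combine to collapse the results, here into a numeric vector.
squares <- foreach(i = 1:number_of_iterations, .combine = c) %dopar% {
  i^2  # placeholder workload
}
```

Returning a list is the safest default when iterations produce objects of different shapes; `.combine` is convenient when every iteration returns a value of the same type.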
This section will provide information about how Azure works, how best to take advantage of Azure, and best practices when using the doAzureParallel package.
Best practices for managing your R packages in code. This includes installation at the cluster or job level as well as how to use different package providers.
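As a hedged illustration of job-level package handling: *foreach* has a standard `.packages` argument that names the packages the loop body depends on. The sketch below assumes the registered backend makes those packages available on the workers; how doAzureParallel installs them, and how to use other package providers, is covered in the package management documentation. The `tools` package and the file names are stand-ins chosen only for illustration.

```R
library(foreach)

# Hypothetical input: file names whose extensions we want to extract.
files <- c("data.csv", "model.rds", "notes.txt")

# .packages declares the packages each iteration needs on the worker.
extensions <- foreach(f = files, .packages = "tools", .combine = c) %dopar% {
  tools::file_ext(f)
}
```

Declaring dependencies per loop keeps the requirement next to the code that uses it, whereas cluster-level installation suits packages that most jobs on the cluster share.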