Updating website.
This commit is contained in:
Parent
7dd9ec012c
Commit
40c0366c79
|
@ -27,8 +27,7 @@ solution.
|
|||
</div>
|
||||
</div>
|
||||
<div class="col-md-6">
|
||||
If you have deployed a VM through the
|
||||
<a href="http://aka.ms/loanchargeoffsql">Azure AI Gallery</a>, all the steps below have already been performed and your database on that machine has all the resulting tables and stored procedures. Skip to the <a href="Typical.html?platform=cig">Typical Workflow</a> for a description of how these files were first created in R by a Data Scientist and then deployed to SQL stored procedures.
|
||||
If you have deployed a VM using the 'Deploy to Azure' button on the <a href="START_HERE.html">Quick start</a> page, all the steps below have already been performed and your database on that machine has all the resulting tables and stored procedures. Skip to the <a href="Typical.html?platform=cig">Typical Workflow</a> for a description of how these files were first created in R by a Data Scientist and then deployed to SQL stored procedures.
|
||||
</div>
|
||||
</div>
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
This website contains information for multiple versions of a solution. Currently, the three pathways it shows are:
|
||||
|
||||
1. Azure SQL - deployed from Azure AI Gallery
|
||||
1. Azure SQL - deployed from 'Deploy to Azure'
|
||||
2. On-Premises SQL
|
||||
3. HDInsight (in progress; currently commented out)
|
||||
|
||||
|
@ -23,11 +23,11 @@ The third way to configure the website is through a parameter on the url:
|
|||
```
|
||||
https://microsoft.github.io/r-server-loan-chargeoff?path=hdi
|
||||
```
|
||||
The three values that can be specified are: `cig` (Azure AI Gallery), `onp` (On-Prem), `hdi` (HDInsight). Any other values are ignored, which means the site stays in its current configuration.
|
||||
The three values that can be specified are: `cig` ('Deploy to Azure' button), `onp` (On-Prem), `hdi` (HDInsight). Any other values are ignored, which means the site stays in its current configuration.
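The rule above (accept `cig`, `onp`, or `hdi`; ignore anything else and keep the current configuration) can be sketched as a small pure function. This is only an illustration: the site's actual script is not part of this commit, so `resolvePlatform`, `PLATFORMS`, and the `Platform` type are hypothetical names, and the parameter name `path` is taken from the example URL above.

```typescript
// Hypothetical sketch; the site's real script is not shown in this commit.
// The three accepted values come from the README text above.
const PLATFORMS = ["cig", "onp", "hdi"] as const;
type Platform = (typeof PLATFORMS)[number];

// Returns the platform requested by the `path` URL parameter,
// or the current platform when the parameter is missing or unrecognized.
function resolvePlatform(pageUrl: string, current: Platform): Platform {
  const requested = new URL(pageUrl).searchParams.get("path");
  const match = PLATFORMS.find((p) => p === requested);
  return match ?? current;
}
```

Under these assumptions, `resolvePlatform("https://microsoft.github.io/r-server-loan-chargeoff?path=hdi", "cig")` yields `"hdi"`, while an unknown value such as `?path=xyz` leaves the configuration at `"cig"`.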
|
||||
|
||||
## Editing the Website
|
||||
|
||||
When you have some content that pertains to only one of the above solutions, add it to the page in a `<div>`, setting the class to one of the three values: `cig` (Azure AI Gallery), `onp` (On-Prem), `hdi` (HDInsight). The content inside these divs are visible only when the corresponding solution path is chosen. For example, in the code below, only one of the sentences would ever be visible on the website:
|
||||
When you have some content that pertains to only one of the above solutions, add it to the page in a `<div>`, setting the class to one of the three values: `cig` ('Deploy to Azure' button), `onp` (On-Prem), `hdi` (HDInsight). The content inside these divs are visible only when the corresponding solution path is chosen. For example, in the code below, only one of the sentences would ever be visible on the website:
|
||||
|
||||
```html
|
||||
<div class="cig"> This sentence will only appear when the CIG solution has been chosen.</div>
|
||||
|
|
|
@ -8,14 +8,15 @@ title: Quick Start
|
|||
|
||||
There are multiple ways you can try this solution package out for yourself.
|
||||
|
||||
* For **{{ site.cig_text }}**: Visit the [Azure AI Gallery](http://aka.ms/loanchargeoffsql) and use the `Deploy` button. All necessary software will be installed and configured for you as well as the initial deployment of the solution. You will be all set to follow along with the [Typical Workflow](Typical.html?platform=cig) on the deployed virtual machine.
|
||||
## {{ site.cig_text }}
|
||||
* For **{{ site.cig_text }}**: Use the button below to deploy the solution. All necessary software will be installed and configured for you as well as the initial deployment of the solution. You will be all set to follow along with the [Typical Workflow](Typical.html?platform=cig) on the deployed virtual machine.<br/>
|
||||
[![Deploy to Azure (SQL Server)](https://raw.githubusercontent.com/Azure/Azure-CortanaIntelligence-SolutionAuthoringWorkspace/master/docs/images/DeployToAzure.PNG)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2FMicrosoft%2Fr-server-loan-chargeoff%2Fmaster%2FArmTemplates%2Floan-chargeoff_arm.json)<br/>
|
||||
|
||||
|
||||
* For an **On-Prem SQL Server** installation, follow [these instructions](SetupSQL.html?platform=onp) to setup the server.
|
||||
## {{ site.onp_text }}
|
||||
* For an **{{ site.onp_text }}** installation, follow [these instructions](SetupSQL.html?platform=onp) to setup the server.
|
||||
|
||||
* Once the server is configured, you can use the [PowerShell Instructions](Powershell_Instructions.html?platform=onp) for a quick deployment of all tables to your own machine.
|
||||
|
||||
* For **{{ site.hdi_text }}**: Visit the [Azure AI Gallery](http://aka.ms/loanchargeoffhdi) and use the `Deploy` button. All necessary software will be installed and configured for you. You will be all set to follow along with the [Typical Workflow](Typical.html?platform=hdi) using the deployed cluster.
|
||||
|
||||
|
||||
|
||||
## {{ site.hdi_text }}
|
||||
* For **{{ site.hdi_text }}**: Use the button below to deploy the solution. All necessary software will be installed and configured for you. You will be all set to follow along with the [Typical Workflow](Typical.html?platform=hdi) using the deployed cluster.<br/>
|
||||
[![Deploy to Azure (HDInsight)](https://raw.githubusercontent.com/Azure/Azure-CortanaIntelligence-SolutionAuthoringWorkspace/master/docs/images/DeployToAzure.PNG)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2FMicrosoft%2Fr-server-loan-chargeoff%2Fmaster%2FArmTemplates%2Floan-chargeoff_hdi_arm.json)
|
|
@ -22,7 +22,7 @@ solution.
|
|||
<div class="col-md-6">
|
||||
The instructions on this page will help you to add this solution to your on premises SQL Server 2017.
|
||||
<p>
|
||||
If you instead would like to try this solution out on a virtual machine, visit the <a href="{{ site.aka_url }}">Azure AI Gallery</a> and use the Deploy button. All the configuration described below will be done for you, as well as the initial deployment of the solution. </p>
|
||||
If you instead would like to try this solution out on a virtual machine, visit the <a href="START_HERE.md">Quick Start page</a> and use the Deploy button. All the configuration described below will be done for you, as well as the initial deployment of the solution. </p>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
|
@ -32,11 +32,11 @@ solution.
|
|||
The rest of this page assumes you are configuring your on premises SQL Server 2016 or higher for this solution.
|
||||
|
||||
If you need a trial version of SQL Server 2017, see [What's New in SQL Server 2017](https://docs.microsoft.com/en-us/sql/sql-server/what-s-new-in-sql-server-2017) for download or VM options.
|
||||
For more information about SQL server 2017 and R service, please visit: <a href="https://msdn.microsoft.com/en-us/library/mt604847.aspx">https://msdn.microsoft.com/en-us/library/mt604847.asp</a>
|
||||
For more information about SQL server 2017 and ML Services, please visit: <a href="https://docs.microsoft.com/en-us/sql/advanced-analytics/what-s-new-in-sql-server-machine-learning-services">https://docs.microsoft.com/en-us/sql/advanced-analytics/what-s-new-in-sql-server-machine-learning-services</a>
|
||||
|
||||
Complete the steps in the Set up SQL Server R Services (In-Database) Instructions. The set up instructions file can found at <a href="https://msdn.microsoft.com/en-us/library/mt696069.aspx" target="_blank"> https://msdn.microsoft.com/en-us/library/mt696069.aspx</a>
|
||||
Complete the steps in the Set up SQL Server ML Services (In-Database) Instructions. The set up instructions file can found at <a href="https://docs.microsoft.com/en-us/sql/advanced-analytics/install/sql-r-services-windows-install" target="_blank"> https://docs.microsoft.com/en-us/sql/advanced-analytics/install/sql-r-services-windows-install</a>
|
||||
|
||||
* If you are using SQL Server 2016, make sure R Services (In-Database) is installed.
|
||||
* If you are using SQL Server 2016, make sure ML Services (In-Database) is installed.
|
||||
* If you are using SQL Server 2017, make sure Machine Learning Services (In-Database) is installed.
|
||||
|
||||
|
||||
|
|
10
Typical.md
|
@ -53,7 +53,7 @@ If you want to follow along and have *not* run the PowerShell script, you must t
|
|||
|
||||
<div class="cig">
|
||||
<p/><p>
|
||||
This step has already been done on your deployed Azure AI Gallery VM.
|
||||
This step has already been done on your 'Deploy to Azure' VM.
|
||||
</p>
|
||||
</div>
|
||||
|
||||
|
@ -65,7 +65,7 @@ You can perform these steps in your environment by using the instructions <a hr
|
|||
|
||||
<div class="hdi">
|
||||
<p/><p>
|
||||
The cluster has been created and data loaded for you when you used the Deploy button in the <a href="https://aka.ms/loanchargeoffhdi">Azure AI Gallery</a>. <strong>Once you complete the walkthrough, you will want to delete this cluster as it incurs expense whether it is in use or not - see <a href="https://microsoft.github.io/r-server-loan-chargeoff/hdinsight">HDInsight Cluster Maintenance</a> for more details.</strong>
|
||||
The cluster has been created and data loaded for you when you used the Deploy button on the <a href="START_HERE.md">Quick Start page</a>. <strong>Once you complete the walkthrough, you will want to delete this cluster as it incurs expense whether it is in use or not - see <a href="https://microsoft.github.io/r-server-loan-chargeoff/hdinsight">HDInsight Cluster Maintenance</a> for more details.</strong>
|
||||
</p>
|
||||
</div>
|
||||
|
||||
|
@ -75,12 +75,12 @@ The cluster has been created and data loaded for you when you used the Deploy bu
|
|||
## Step 2: Data Prep and Modeling with Debra the Data Scientist
|
||||
-----------------------------------------------------------------
|
||||
|
||||
Now let’s meet Debra, the Data Scientist. Debra’s job is to use loan payment data to predict loan chargeoff risk. <span class="sql">Debra’s preferred language for developing the models is using R and SQL. She uses Microsoft R Services with SQL Server 2017 as it provides the capability to run large datasets and also is not constrained by memory restrictions of Open Source R.</span><span class="hdi">Debra will develop these models using <a href="https://azure.microsoft.com/en-us/services/hdinsight/">HDInsight</a>, the managed cloud Hadoop solution with integration to Microsoft R Server.</span>
|
||||
Now let’s meet Debra, the Data Scientist. Debra’s job is to use loan payment data to predict loan chargeoff risk. <span class="sql">Debra’s preferred language for developing the models is using R and SQL. She uses Microsoft ML Services with SQL Server 2017 as it provides the capability to run large datasets and also is not constrained by memory restrictions of Open Source R.</span><span class="hdi">Debra will develop these models using <a href="https://azure.microsoft.com/en-us/services/hdinsight/">HDInsight</a>, the managed cloud Hadoop solution with integration to Microsoft ML Server.</span>
|
||||
|
||||
After analyzing the data she opted to create multiple models and choose the best one. She will create five machine learning models and compare them, then use the one she likes best to compute a prediction for each loan, and then select the loan with the highest probability of chargeoff.
|
||||
|
||||
<div class="sql">
|
||||
Debra will work on her own machine, using <a href = "https://msdn.microsoft.com/en-us/microsoft-r/install-r-client-windows">R Client</a> to execute these R scripts. <span class="cig">R Client is already installed on the VM.</span> She will also use an IDE to run R.
|
||||
Debra will work on her own machine, using <a href = "https://docs.microsoft.com/en-us/machine-learning-server/r-client/what-is-microsoft-r-client">R Client</a> to execute these R scripts. <span class="cig">R Client is already installed on the VM.</span> She will also use an IDE to run R.
|
||||
</div>
|
||||
|
||||
<div class="cig">
|
||||
|
@ -93,7 +93,7 @@ On your VM, R Tools for Visual Studio is installed. You will however have to ei
|
|||
<p/>
|
||||
<a name="rstudiologin"></a>
|
||||
|
||||
Debra will develop her R scripts in the Open Source Edition of RStudio Server, installed on her cluster's edge node. You can follow along on <a href="https://aka.ms/loanchargeoffhdi">your own cluster deployed by Cortana Analytics Gallery</a>. Access RStudio by using the url of the form: <br/> <code>http://CLUSTERNAME.azurehdinsight.net/rstudio</code>.
|
||||
Debra will develop her R scripts in the Open Source Edition of RStudio Server, installed on her cluster's edge node. You can follow along on <a href="https://aka.ms/loanchargeoffhdi">your own cluster deployed by using the 'Deploy to Azure' button on the <a href="START_HERE.md">Quick Start page</a>. Access RStudio by using the url of the form: <br/> <code>http://CLUSTERNAME.azurehdinsight.net/rstudio</code>.
|
||||
<p/>
|
||||
<div class="alert alert-info" role="alert">
|
||||
When you first visit the url to access RStudio, you will see two different logins. Use the username and password you created when you deployed the HDInsight solution for both of these prompts.
|
||||
|
|
|
@ -1,13 +1,13 @@
|
|||
author:
|
||||
name: Microsoft
|
||||
|
||||
description: Use SQL Server 2017 + R Services to build and deploy a machine learning model which predicts whether a loan will charge off in the next three months.
|
||||
description: Use SQL Server ML Services to build and deploy a machine learning model which predicts whether a loan will charge off in the next three months.
|
||||
|
||||
gems:
|
||||
- jekyll-redirect-from
|
||||
|
||||
cig_text: SQL Server 2017 on Azure
|
||||
onp_text: SQL Server 2017 On-Prem (with MicrosoftML)
|
||||
cig_text: SQL Server VM on Azure
|
||||
onp_text: SQL Server On-Premises (with MicrosoftML)
|
||||
hdi_text: HDInsight Spark
|
||||
db_name: LoanChargeOff_R
|
||||
folder_name: LoanChargeOff
|
||||
|
|
|
@ -2,12 +2,6 @@
|
|||
<p>You can access this dashboard in either of the following ways:</p>
|
||||
<p/>
|
||||
<ul>
|
||||
<li class="sql">
|
||||
<p>Visit the <a href="{{ site.pbix_sqlview_url }}">online version</a>.</p>
|
||||
</li>
|
||||
<li class="hdi">
|
||||
<p>Visit the <a href="{{ site.pbix_hdiview_url }}">online version</a>.</p>
|
||||
</li>
|
||||
<li class="cig">
|
||||
<p>Open the PowerBI file from the <strong>D:\LoanChargeOffSolution\Reports</strong> directory on the deployed VM desktop.</p>
|
||||
</li>
|
||||
|
|
|
@ -8,14 +8,14 @@
|
|||
<div class="sql">
|
||||
Let me introduce you to Danny, the Database Analyst. Danny is the main contact for SQL Server database administration and application integration. Danny was responsible for installing and configuring the SQL Server. He has added a user named with all the necessary permissions to execute R scripts on the server and modify the LoanChargeOff database. This was done through the createuser.sql file.
|
||||
|
||||
This step has already been done on your deployed Azure AI Gallery VM.
|
||||
This step has already been done on the VM you deployed using the 'Deploy to Azure' button on the <a href="START_HERE.html">Quick start</a> page.
|
||||
Alternatively, Danny could also run LoanChargeOff.ps1 to run the end to end workflow that includes setting up of SQL Server user login, import raw data to SQL Server tables, view creation, training and testing and prediction.
|
||||
</div>
|
||||
|
||||
<div class="hdi">
|
||||
Let me introduce you to Ivan, the IT Administrator. Ivan is responsible for implementation as well as ongoing administration of the Hadoop infrastructure at his company, which uses <a href="https://azure.microsoft.com/en-us/solutions/hadoop/">Hadoop in the Azure Cloud</a> from Microsoft.
|
||||
|
||||
Ivan created the <a href="https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-r-server-get-started">HDInsight cluster with R Server</a> for Debra. He also uploaded the data onto the storage account associated with the cluster.
|
||||
Ivan created the <a href="https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-r-server-get-started">HDInsight cluster with ML Server</a> for Debra. He also uploaded the data onto the storage account associated with the cluster.
|
||||
</div>
|
||||
|
||||
|
||||
|
|
|
@ -21,9 +21,9 @@ You can find this script in the <strong>SQLR</strong> directory, and execute it
|
|||
<div class="hdi">
|
||||
<p/>
|
||||
In the steps above, we saw the first way of scoring new data, using <strong>loanchargeoff_scoring.R</strong> script.
|
||||
Debra now creates an analytic web service with <a href="https://msdn.microsoft.com/en-us/microsoft-r/operationalize/about">R Server Operationalization</a> that incorporates these same steps: data processing and scoring.
|
||||
Debra now creates an analytic web service with <a href="https://docs.microsoft.com/en-us/machine-learning-server/what-is-operationalization">ML Server Operationalization</a> that incorporates these same steps: data processing and scoring.
|
||||
<p/>
|
||||
<strong>loanchargeoff_deployment.R</strong> will create a web service and test it on the edge node. If you wish, you can also download the file <strong>loanchargeoff_web_scoring.R</strong> and access the web service on any computer with Microsoft R Server 9.1.0 installed.
|
||||
<strong>loanchargeoff_deployment.R</strong> will create a web service and test it on the edge node. If you wish, you can also download the file <strong>loanchargeoff_web_scoring.R</strong> and access the web service on any computer with Microsoft ML Server installed.
|
||||
<p/>
|
||||
<div class="alert alert-info" role="alert">
|
||||
Before running <strong>loanchargeoff_web_scoring.R</strong> on any computer, you must first connect to edge node from that computer.
|
||||
|
|
|
@ -20,13 +20,13 @@ This loan chargeoff prediction uses a simulated loan history data to predict pro
|
|||
Loan manager is also presented with the trends and analysis of the chargeoff loans by branch locations. Characteristics of the loans that are highly probable of getting chargedoff have lower credit score for example will help loan managers to make a business plan for loan offering in that geographical area.
|
||||
|
||||
<div class="sql">
|
||||
SQL Server R Services brings the compute to the data by allowing R to run on the same computer as the database. It includes a database service that runs outside the SQL Server process and communicates securely with the R runtime.
|
||||
SQL Server ML Services brings the compute to the data by allowing R to run on the same computer as the database. It includes a database service that runs outside the SQL Server process and communicates securely with the R runtime.
|
||||
|
||||
This solution template walks through how to create and clean up a set of simulated data, use various algorithms to train the R models, select the best performant model and perform chargeoff predictions and save the prediction results back to SQL Server. A PowerBI report connects to the prediction result table and show interactive reports with the user on the predictive analytics.
|
||||
</div>
|
||||
|
||||
<div class="hdi">
|
||||
Microsoft R Server on HDInsight Spark clusters provides distributed and scalable machine learning capabilities for big data, leveraging the combined power of R Server and Apache Spark. This solution demonstrates how to develop machine learning models for Loan ChargeOff Prediction (including data processing, feature engineering, training and evaluating models), deploy the models as a web service (on the edge node) and consume the web service remotely with Microsoft R Server on Azure HDInsight Spark clusters.
|
||||
Microsoft ML Server on HDInsight Spark clusters provides distributed and scalable machine learning capabilities for big data, leveraging the combined power of er and Apache Spark. This solution demonstrates how to develop machine learning models for Loan ChargeOff Prediction (including data processing, feature engineering, training and evaluating models), deploy the models as a web service (on the edge node) and consume the web service remotely with Microsoft ML Server on Azure HDInsight Spark clusters.
|
||||
|
||||
The final predictions are saved to a Hive table containing loan chargeoff predictions. This data is then visualized in Power BI.
|
||||
</div>
|
||||
|
|
|
@ -13,7 +13,7 @@ The following is the directory structure for this template:
|
|||
- [**SQLR**](#operationalize-in-sql) This contains T-SQL code to pre-process the datasets, train the models, identify the champion model and provide predictions. It also contains a PowerShell script to automate the entire process, including loading the data into the database (not included in the T-SQL code).
|
||||
- [**HDI**](#hdinsight-solution-on-spark-cluster) This contains the R code to pre-process the datasets, train the models, identify the champion model and provide predictions on a Spark cluster.
|
||||
|
||||
In this template with SQL Server R Services, two versions of the SQL implementation and another version for HDInsight implementation:
|
||||
In this template with SQL Server ML Services, two versions of the SQL implementation and another version for HDInsight implementation:
|
||||
|
||||
1. [**Model Development in R IDE**](#model-development-in-r) . Run the R code in R IDE (e.g., RStudio, R Tools for Visual Studio).
|
||||
2. [**Operationalize in SQL**](#operationalize-in-sql). Run the SQL code in SQL Server using SQLR scripts from SSMS or from the PowerShell script.
|
||||
|
@ -104,7 +104,7 @@ These files are in the **HDI/RSparkCluster** directory.
|
|||
<tr><td>loanchargeoff_main.R </td><td> is used to define the data and directories and then run all of the steps to process data, perform feature engineering, training, and scoring. </td></tr>
|
||||
<tr><td>loanchargeoff_scoring.R</td><td>uses the previously trained model and invokes the steps to process data, perform feature engineering and scoring.</td></tr>
|
||||
<tr><td>loanchargeoff_deployment.R </td><td> create a web service and test it on the edge node </td></tr>
|
||||
<tr><td>loanchargeoff_web_scoring.R </td><td> access the web service on any computer with Microsoft R Server 9.1.0 installed</td></tr>
|
||||
<tr><td>loanchargeoff_web_scoring.R </td><td> access the web service on any computer with Microsoft ML Server installed</td></tr>
|
||||
<tr><td>loan_main.R</td><td> Main R script that executes the rest of the R scripts </td></tr>
|
||||
<tr><td>loan_scoring.R </td><td> Perform loan scoring using the model with the best performance </td></tr>
|
||||
<tr><td>step1_get_training_testing_data.R </td><td> Read input data which contains all the history information for all the loans from HDFS. Extract training/testing data based on process date (paydate) from the input data. </td></tr>
|
||||
|
|
|
@ -29,20 +29,20 @@ solution.
|
|||
</div>
|
||||
<div class="col-md-6">
|
||||
<div class="sql">
|
||||
SQL Server R Services takes advantage of the power of SQL Server and RevoScaleR (Microsoft R Server package) by allowing R to run on the same server as the database. It includes a database service that runs outside the SQL Server process and communicates securely with the R runtime.
|
||||
SQL Server ML Services takes advantage of the power of SQL Server and RevoScaleR (Microsoft ML Server package) by allowing R to run on the same server as the database. It includes a database service that runs outside the SQL Server process and communicates securely with the R runtime.
|
||||
<p></p>
|
||||
This solution package shows how to pre-process data (cleaning and feature engineering), train prediction models, and perform scoring on the SQL Server machine.
|
||||
</div>
|
||||
<div class="hdi">
|
||||
HDInsight is a cloud Spark and Hadoop service for the enterprise. HDInsight is also the only managed cloud Hadoop solution with integration to Microsoft R Server.
|
||||
HDInsight is a cloud Spark and Hadoop service for the enterprise. HDInsight is also the only managed cloud Hadoop solution with integration to Microsoft ML Server.
|
||||
<p></p>
|
||||
This solution shows how to pre-process data (cleaning and feature engineering), train prediction models, and perform scoring on an HDInsight Spark cluster with Microsoft R Server.
|
||||
This solution shows how to pre-process data (cleaning and feature engineering), train prediction models, and perform scoring on an HDInsight Spark cluster with Microsoft ML Server.
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="sql">
|
||||
Data scientists who are testing and developing solutions can work from the convenience of their R IDE on their client machine, while <a href="https://msdn.microsoft.com/en-us/library/mt604885.aspx">setting the computation context to SQL</a> (see <bd>R</bd> folder for code). They can also deploy the completed solutions to SQL Server (2016 or higher) by embedding calls to R in stored procedures (see <strong>SQLR</strong> folder for code). These solutions can then be further automated by the use of SQL Server Integration Services and SQL Server agent: a PowerShell script (.ps1 file) automates the running of the SQL code.
|
||||
Data scientists who are testing and developing solutions can work from the convenience of their R IDE on their client machine, while <a href="https://docs.microsoft.com/en-us/sql/advanced-analytics/r/sql-server-r-services">setting the computation context to SQL</a> (see <bd>R</bd> folder for code). They can also deploy the completed solutions to SQL Server (2016 or higher) by embedding calls to R in stored procedures (see <strong>SQLR</strong> folder for code). These solutions can then be further automated by the use of SQL Server Integration Services and SQL Server agent: a PowerShell script (.ps1 file) automates the running of the SQL code.
|
||||
</div>
|
||||
<div class="hdi">
|
||||
Data scientists who are testing and developing solutions can work from the browser-based Open Source Edition of RStudio Server on the HDInsight Spark cluster edge node, while <a href="https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-r-server-compute-contexts">using a compute context</a> to control whether computation will be performed locally on the edge node, or whether it will be distributed across the nodes in the HDInsight Spark cluster.
|
||||
|
@ -149,7 +149,7 @@ Chargeoff prediction result stores in SQL Server table. The final step is to con
|
|||
</div>
|
||||
<div class="hdi">
|
||||
<h2>Deploy</h2>
|
||||
The script <strong>loanchargeoff_deployment.R </strong> creates and tests a analytic web service. The web service can then be used from another application to score future data. The file <strong>loanchargeoff_web_scoring.R</strong> can be downloaded to invoke this web service locally on any computer with Microsoft R Server 9.1.0 installed.
|
||||
The script <strong>loanchargeoff_deployment.R </strong> creates and tests a analytic web service. The web service can then be used from another application to score future data. The file <strong>loanchargeoff_web_scoring.R</strong> can be downloaded to invoke this web service locally on any computer with Microsoft ML Server installed.
|
||||
<p></p>
|
||||
<div class="alert alert-info" role="alert">
|
||||
Before running <strong>loanchargeoff_web_scoring.R</strong> on any computer, you must first connect to edge node from that computer.
|
||||
|
@ -173,10 +173,10 @@ The final step of this solution visualizes these recommendations.
|
|||
|
||||
The following are required to run the scripts in this solution:
|
||||
<ul>
|
||||
<li>SQL Server (2016 or higher) with Microsoft R Server (version 9.1.0) installed and configured. </li>
|
||||
<li>SQL Server (2016 or higher) with Microsoft ML Server (version 9.1.0) installed and configured. </li>
|
||||
<li>The SQL user name and password, and the user configured properly to execute R scripts in-memory.</li>
|
||||
<li>SQL Database which the user has write permission and execute stored procedures.</li>
|
||||
<li>For more information about SQL server and R service, please visit: <a href="https://msdn.microsoft.com/en-us/library/mt604847.aspx">https://msdn.microsoft.com/en-us/library/mt604847.aspx</a></li>
|
||||
<li>For more information about SQL server and ML Services, please visit: <a href="https://docs.microsoft.com/en-us/sql/advanced-analytics/what-s-new-in-sql-server-machine-learning-services">https://docs.microsoft.com/en-us/sql/advanced-analytics/what-s-new-in-sql-server-machine-learning-services</a></li>
|
||||
</ul>
|
||||
</div>
|
||||
|
||||
|
|
8
dba.md
|
@ -29,14 +29,14 @@ solution.
|
|||
</div>
|
||||
</div>
|
||||
<div class="col-md-6">
|
||||
SQL Server R Services takes advantage of the power of SQL Server and RevoScaleR (Microsoft R Server package) by allowing R to run on the same server as the database. It includes a database service that runs outside the SQL Server process and communicates securely with the R runtime. This allows the application analyst to use the power of SQL Server to build advanced analytics application.
|
||||
SQL Server ML Services takes advantage of the power of SQL Server and RevoScaleR (Microsoft ML Server package) by allowing R to run on the same server as the database. It includes a database service that runs outside the SQL Server process and communicates securely with the R runtime. This allows the application analyst to use the power of SQL Server to build advanced analytics application.
|
||||
</div>
|
||||
</div>
|
||||
<p>
|
||||
With the simulated data and R scripts contained in this solution, application analyst is able to use SQL Server 2017 to evaluate the solution end to end, including the steps needed to deploy machine learning model in SQL Server and consumed by business application. This template deploys a Data Science Virtual Machine (DSVM) that has SQL Server 2017 with Microsoft ML Server installed.
|
||||
</p>
|
||||
|
||||
For businesses that prefer an on-prem solution, the implementation with SQL Server R Services is a great option, which takes advantage of the power of SQL Server and RevoScaleR (Microsoft R Server). In this template, we implemented all steps in SQL stored procedures: data preprocessing, and feature engineering are implemented in pure SQL, while data cleaning, and the model training, scoring and evaluation steps are implemented with SQL stored procedures calling R (Microsoft R Server) code.
|
||||
For businesses that prefer an on-prem solution, the implementation with SQL Server ML Services is a great option, which takes advantage of the power of SQL Server and RevoScaleR (Microsoft ML Server). In this template, we implemented all steps in SQL stored procedures: data preprocessing, and feature engineering are implemented in pure SQL, while data cleaning, and the model training, scoring and evaluation steps are implemented with SQL stored procedures calling R (Microsoft ML Server) code.
|
||||
|
||||
All the steps can be executed on SQL Server client environment (SQL Server Management Studio). We provide a Windows PowerShell script which invokes the SQL scripts and demonstrates the end-to-end modeling process.
|
||||
|
||||
|
@ -45,10 +45,10 @@ All the steps can be executed on SQL Server client environment (SQL Server Manag
|
|||
|
||||
The following are required to run the scripts in this solution:
|
||||
<ul>
|
||||
<li>SQL Server (2016 or higher) with Microsoft R Server (version 9.1.0) installed and configured. </li>
|
||||
<li>SQL Server (2016 or higher) with Microsoft ML Server installed and configured. </li>
|
||||
<li>The SQL user name and password, and the user configured properly to execute R scripts in-memory.</li>
|
||||
<li>SQL Database which the user has write permission and execute stored procedures.</li>
|
||||
<li>For more information about SQL Server and R Services, please visit: <a href="https://msdn.microsoft.com/en-us/library/mt604847.aspx">https://msdn.microsoft.com/en-us/library/mt604847.aspx</a></li>
|
||||
<li>For more information about SQL Server and ML Services, please visit: <a href="https://docs.microsoft.com/en-us/sql/advanced-analytics/what-s-new-in-sql-server-machine-learning-services">https://docs.microsoft.com/en-us/sql/advanced-analytics/what-s-new-in-sql-server-machine-learning-services</a></li>
|
||||
</ul>
|
||||
</ul>
|
||||
|
||||
|
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
layout: default
|
||||
title: Operationalization with R Server
|
||||
title: Operationalization with ML Server
|
||||
---
|
||||
<div class="alert alert-success" role="alert"> This page describes the
|
||||
<strong>
|
||||
|
@ -9,9 +9,9 @@ title: Operationalization with R Server
|
|||
solution.
|
||||
</div>
|
||||
|
||||
## Operationalization with R Server
|
||||
## Operationalization with ML Server
|
||||
---------------------------------------
|
||||
To use R Server Operationalization services from your local computer, you must first connect to the edge node using the steps below.
|
||||
To use ML Server Operationalization services from your local computer, you must first connect to the edge node using the steps below.
|
||||
|
||||
## Connect to Edge Node
|
||||
<ul>
|
||||
|
@ -24,7 +24,7 @@ For instructions on how to use the terminal to connect to your HDInsight Spark c
|
|||
<a href="http://go.microsoft.com/fwlink/p/?LinkID=619886">Azure documentation</a>. The edge node address is of the form <code>sshuser@CLUSTERNAME-ed-ssh.azurehdinsight.net</code>
|
||||
</li>
|
||||
<li>
|
||||
<strong>All platforms:</strong> Your login name and password are the ones you created when you deployed this solution from the <a href="http://aka.ms/loanchargeoffhdi">Azure AI Gallery</a>
|
||||
<strong>All platforms:</strong> Your login name and password are the ones you created when you deployed this solution using the 'Deploy to Azure' button on the <a href="START_HERE.html">Quick start</a> page
|
||||
</li>
|
||||
</ul>
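
The edge node address format described above can be sketched in a few lines of Python. This is only an illustration: the helper names `edge_node_address` and `ssh_command` are hypothetical and not part of the solution, and the default login `sshuser` is taken from the address form shown above. Use the login name you created at deployment.

```python
def edge_node_address(cluster_name: str, user: str = "sshuser") -> str:
    """Build the SSH address of the cluster edge node.

    Hypothetical helper: follows the documented form
    USER@CLUSTERNAME-ed-ssh.azurehdinsight.net.
    """
    if not cluster_name:
        raise ValueError("cluster_name must not be empty")
    return f"{user}@{cluster_name}-ed-ssh.azurehdinsight.net"


def ssh_command(cluster_name: str, user: str = "sshuser") -> list:
    """Return the argument list for an ssh call to the edge node."""
    return ["ssh", edge_node_address(cluster_name, user)]


if __name__ == "__main__":
    # "mycluster" is a placeholder; substitute your own cluster name.
    print(" ".join(ssh_command("mycluster")))
```

Running the resulting command in a terminal prompts for the password you chose when you deployed the solution.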
|
||||
|
||||
|
|
|
@ -12,9 +12,9 @@ solution.
|
|||
## HDInsight Cluster
|
||||
--------------------------------
|
||||
|
||||
An initial cluster was created when you used the `Deploy` button from [Azure AI Gallery](https://aka.ms/loanchargeoffhdi). Along with the cluster, a storage account was created. This is where all data is stored.
|
||||
An initial cluster was created when you used the 'Deploy to Azure' button on the <a href="START_HERE.html">Quick start</a> page. Along with the cluster, a storage account was created. This is where all data is stored.
|
||||
|
||||
When you are finished using the entire solution, you can delete all your resources on the [Azure AI Gallery Deployments](https://start.cortanaintelligence.com/Deployments?type=loanchargeoffhdi) page. On this page, hover over a line to reveal the trash can icon on the far right. Then click the trash can to delete the deployment.
|
||||
When you are finished using the entire solution, you can delete all your resources from Azure.
|
||||
|
||||
If you would like to continue using the solution, you can delete the cluster while keeping the storage account. You can then re-use the storage account later on a new cluster.
|
||||
|
||||
|
@ -49,5 +49,5 @@ Once your cluster is ready, go to RStudio and Import the files by with the <code
|
|||
## Scaling a Cluster
|
||||
You can also [use the Azure portal](https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-administer-use-portal-linux#scale-clusters) to scale your cluster.
|
||||
|
||||
## Scaling Microsoft R Server Operationalization Compute Nodes
|
||||
## Scaling Microsoft ML Server Operationalization Compute Nodes
|
||||
This solution currently uses a single node for Operationalization - the cluster edge node. [View instructions here](https://docs.microsoft.com/en-us/azure/hdinsight/hdinsight-hadoop-r-server-get-started#how-to-scale-microsoft-r-server-operationalization-compute-nodes-on-hdinsight-worker-nodes) to add compute nodes to the Operationalization Server.
|
||||
|
|
14
index.md
|
@ -8,33 +8,33 @@ Are you unable to connect to your Virtual Machine? See this important informatio
|
|||
</div>
|
||||
|
||||
A charged-off loan is a loan that a creditor (usually a lending institution) has declared unlikely to be collected, usually because the debtor is severely delinquent on repayment. Because a high chargeoff rate has a negative impact on a lending institution's year-end financials, lending institutions often monitor loan chargeoff risk very closely to prevent loans from getting charged off.
|
||||
Using Azure HDInsight R Server, a lending institution can leverage machine learning predictive analytics to predict the likelihood of loans getting charged off and run a report on the analytics result stored in HDFS and hive tables.
|
||||
Using Azure HDInsight ML Server, a lending institution can leverage machine learning predictive analytics to predict the likelihood of loans getting charged off and run a report on the analytics result stored in HDFS and hive tables.
|
||||
|
||||
<div class="alert alert-success">
|
||||
<h2>Select the platform you wish to explore:</h2>
|
||||
<form style="margin-left:30px">
|
||||
<label class="radio">
|
||||
<input type="radio" name="optradio" class="rb" value="cig" >{{ site.cig_text }}, deployed from <a href="https://aka.ms/loanchargeoffsql">Azure AI Gallery</a>
|
||||
<input type="radio" name="optradio" class="rb" value="cig" >{{ site.cig_text }}, deployed using the 'Deploy to Azure' button on the <a href="START_HERE.html">Quick start</a> page.
|
||||
</label>
|
||||
<label class="radio">
|
||||
<input type="radio" name="optradio" class="rb" value="onp">{{ site.onp_text }}
|
||||
</label>
|
||||
<label class="radio">
|
||||
<input type="radio" name="optradio" class="rb" value="hdi">{{ site.hdi_text }}, deployed from <a href="https://aka.ms/loanchargeoffhdi">Azure AI Gallery</a>
|
||||
<input type="radio" name="optradio" class="rb" value="hdi">{{ site.hdi_text }}, deployed using the 'Deploy to Azure' button on the <a href="START_HERE.html">Quick start</a> page.
|
||||
</label>
|
||||
</form>
|
||||
</div>
|
||||
<p></p>
|
||||
<div class="cig">
|
||||
On the VM created for you from the <a href="https://aka.ms/loanchargeoffsql">Azure AI Gallery</a>, the SQL Server 2017 database <code>{{ site.db_name}}</code> contains all the data and results of the end-to-end modeling process.
|
||||
On the VM created for you using the 'Deploy to Azure' button on the <a href="START_HERE.html">Quick start</a> page, the SQL Server 2017 database <code>{{ site.db_name}}</code> contains all the data and results of the end-to-end modeling process.
|
||||
</div>
|
||||
|
||||
<div class="onp">
|
||||
For customers who prefer an on-premise solution, the implementation with SQL Server R Services is a great option that takes advantage of the powerful combination of SQL Server and the R language. A Windows PowerShell script to invoke the SQL scripts that execute the end-to-end modeling process is provided for convenience. <b>Note that you may need to upgrade R Services to install at least the earliest version that shipped with MicrosoftML package (9.0.1). Instructions to upgrade R Services on SQL Server 2016 are <a href="https://docs.microsoft.com/en-us/sql/advanced-analytics/r/use-sqlbindr-exe-to-upgrade-an-instance-of-sql-server">here</a></b>.
|
||||
For customers who prefer an on-premise solution, the implementation with SQL Server ML Services is a great option that takes advantage of the powerful combination of SQL Server and the R language. A Windows PowerShell script to invoke the SQL scripts that execute the end-to-end modeling process is provided for convenience. <b>Note that you may need to upgrade ML Services to install at least the earliest version that shipped with MicrosoftML package (9.0.1). Instructions to upgrade ML Services on SQL Server 2016 are <a href="https://docs.microsoft.com/en-us/sql/advanced-analytics/r/use-sqlbindr-exe-to-upgrade-an-instance-of-sql-server">here</a></b>.
|
||||
</div>
|
||||
|
||||
<div class="hdi">
|
||||
This solution shows how to pre-process data (cleaning and feature engineering), train prediction models, and perform scoring on the HDInsight Spark cluster with Microsoft R Server deployed from the <a href="https://aka.ms/loanchargeoffhdi">Azure AI Gallery</a>.
|
||||
This solution shows how to pre-process data (cleaning and feature engineering), train prediction models, and perform scoring on the HDInsight Spark cluster with Microsoft ML Server deployed using the 'Deploy to Azure' button on the <a href="START_HERE.html">Quick start</a> page.
|
||||
<p></p>
|
||||
<strong>HDInsight Spark cluster billing starts once a cluster is created and stops when the cluster is deleted. See <a href="hdinsight.html"> these instructions for important information</a> about deleting a cluster and re-using your files on a new cluster.</strong>
|
||||
|
||||
|
@ -46,7 +46,7 @@ This solution shows how to pre-process data (cleaning and feature engineering),
|
|||
DBAs can take care of the deployment using SQL stored procedures with embedded R code. We show how each of these steps can be executed on a SQL Server client environment such as SQL Server Management Studio.
|
||||
</span>
|
||||
<span class="hdi">
|
||||
Scoring is implemented with <a href="https://msdn.microsoft.com/en-us/microsoft-r/operationalize/about">R Server Operationalization</a>.
|
||||
Scoring is implemented with <a href="https://docs.microsoft.com/en-us/machine-learning-server/what-is-operationalization">ML Server Operationalization</a>.
|
||||
</span>
|
||||
Finally, a Power BI report is used to visualize the deployed results.
|
||||
|
||||
|
|
2
it.md
|
@ -36,7 +36,7 @@ While this solution demonstrates the code with 100,000 loans for developing the
|
|||
|
||||
This solution uses:
|
||||
|
||||
* [R Server for HDInsight](https://azure.microsoft.com/en-us/services/hdinsight/r-server/)
|
||||
* [ML Server for HDInsight](https://azure.microsoft.com/en-us/services/hdinsight/r-server/)
|
||||
|
||||
|
||||
## Cluster Maintenance
|
||||
|
|
4
local.md
|
@ -12,7 +12,7 @@ solution.
|
|||
|
||||
## Setup for Local Code Execution
|
||||
|
||||
You can execute code on your local computer and push the computations to the SQL Server on the VM that was created by the Azure AI Gallery. But first you must perform the following steps.
|
||||
You can execute code on your local computer and push the computations to the SQL Server on the VM that was created by using the 'Deploy to Azure' button on the <a href="START_HERE.html">Quick start</a> page. But first you must perform the following steps.
|
||||
|
||||
## On the VM: Configure VM for Remote Access
|
||||
|
||||
|
@ -41,4 +41,4 @@ This will create a folder **loanchargeoff** containing the full solution package
|
|||
Finally, in the **chargeoff_batch_prediction.R** file, replace `Server=.` in the connection_string with the DNS of the VM followed by ",1433".
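
The replacement described above can be sketched as a small string edit. This is an illustration in Python, not part of the solution: the function name `point_connection_string_at_vm`, the sample connection string, and the DNS name are all hypothetical placeholders. In practice you would make the same change by hand in the R file.

```python
import re


def point_connection_string_at_vm(connection_string: str, vm_dns: str, port: int = 1433) -> str:
    """Replace the local-server entry `Server=.` with `Server=<vm_dns>,<port>`.

    Hypothetical helper; raises if the string has no `Server=.` entry.
    """
    # Match "Server=." only when it is a complete entry (followed by ';' or end of string).
    pattern = re.compile(r"Server=\.(?=;|$)")
    updated, count = pattern.subn(
        lambda _match: f"Server={vm_dns},{port}", connection_string, count=1
    )
    if count == 0:
        raise ValueError("connection string does not contain 'Server=.'")
    return updated


if __name__ == "__main__":
    # Placeholder values for illustration only.
    sample = "Driver=SQL Server;Server=.;Database=MyDatabase;UID=myuser;PWD=mypassword"
    print(point_connection_string_at_vm(sample, "myvm.eastus.cloudapp.azure.com"))
```

Port 1433 is the value named in the step above; everything else in the connection string is left untouched.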
|
||||
|
||||
|
||||
<a href="CIG_Workflow.html#step2">Return to Typical Workflow for Azure AI Gallery Deployment</a>
|
||||
<a href="CIG_Workflow.html#step2">Return to Typical Workflow for 'Deploy to Azure' Deployment</a>
|
|
@ -1,6 +1,6 @@
|
|||
---
|
||||
layout: default
|
||||
title: Using RStudio with R Server
|
||||
title: Using RStudio with ML Server
|
||||
---
|
||||
<div class="alert alert-success" role="alert"> This page describes the
|
||||
<strong>
|
||||
|
@ -25,14 +25,14 @@ RStudio is already installed and configured on the edge node of your cluster. T
|
|||
</div>
|
||||
|
||||
## Set Up RStudio for R Client
|
||||
RStudio needs to use R Server for the code in this solution. Follow the instructions below to set up RStudio to use R Server and/or to verify that you are using the correct version.
|
||||
RStudio needs to use ML Server for the code in this solution. Follow the instructions below to set up RStudio to use ML Server and/or to verify that you are using the correct version.
|
||||
<div class="hdi">(These steps are not necessary for the version on the cluster edge node.)</div>
|
||||
<ol>
|
||||
<li>Launch RStudio.</li>
|
||||
<li> Update the path to R.</li>
|
||||
<ol type="a">
|
||||
<li>From the <code>Tools</code> menu, choose <code>Global Options</code>.</li>
|
||||
<li>In the General tab, update the path to R to point to R Server:</li>
|
||||
<li>In the General tab, update the path to R to point to ML Server:</li>
|
||||
<ul><li>When you install R from the SQL Server 2017 install program, the path is <code>C:\Program Files\Microsoft\MLServer\R_SERVER</code>. </li>
|
||||
</ul>
|
||||
</ol>
|
||||
|
@ -42,4 +42,4 @@ RStudio needs to use R Server for the code in this solution. Follow the instruc
|
|||
|
||||
|
||||
|
||||
<a href="CIG_Workflow.html#step2">Return to Typical Workflow for Azure AI Gallery Deployment</a>
|
||||
<a href="CIG_Workflow.html#step2">Return to Typical Workflow for 'Deploy to Azure' Deployment</a>
|
|
@ -14,10 +14,10 @@ title: Site Map
|
|||
* [Typical Workflow](Typical.html)
|
||||
* [SQL: Setup for Local Code Execution](local.html)
|
||||
* [SQL: Using a Jupyter Notebook](jupyter.html)
|
||||
* [SQL: Using RStudio with R Server](rstudio.html)
|
||||
* [SQL: Using RStudio with ML Server](rstudio.html)
|
||||
* [HDI: Using an HDInsight Spark cluster for Loan ChargeOff Prediction](hdinsight.html)
|
||||
* [HDI: Using XGBoost package in HDInsight Spark Cluster for Loan ChargeOff Prediction](xgboost.html)
|
||||
* [HDI: Configuring Operationalization with R Server](deployr.html)
|
||||
* [HDI: Configuring Operationalization with ML Server](deployr.html)
|
||||
* [Quick Start](START_HERE.html)
|
||||
* [On-Prem: Setup SQL Server 2016 ](SetupSQL.html)
|
||||
* [SQL: PowerShell Instructions](Powershell_Instructions.html)
|
||||
|
|
|
@ -10,6 +10,6 @@ solution.
|
|||
{% include choices.md %}
|
||||
</div>
|
||||
|
||||
## HDInsight Spark Deployment from Azure AI Gallery
|
||||
## HDInsight Spark Deployment
|
||||
|
||||
[Click here](https://aka.ms/loanchargeoffhdi) to access the HDInsight Spark Deployment - if it comes back to this page, this version of the solution has not yet been published in the Azure AI Gallery.
|
||||
Use the corresponding 'Deploy to Azure' button on the <a href="START_HERE.md">Quick Start page</a> to access the HDInsight Spark Deployment.
|