Added RAI Text Classification Scenario, Workflow and config ini updated (#2780)

2023-11-10 01:30:10 +05:30 · 2023-11-10 01:30:10 +05:30 · 461da2c035
--- a/.github/workflows/sdk-responsible-ai-text-responsibleaidashboard-text-classification-financial-news-responsibleaidashboard-text-classification-financial-news.yml
+++ b/.github/workflows/sdk-responsible-ai-text-responsibleaidashboard-text-classification-financial-news-responsibleaidashboard-text-classification-financial-news.yml
@ -0,0 +1,77 @@
+# This code is autogenerated.
+# Code is generated by running custom script: python3 readme.py
+# Any manual changes to this file may cause incorrect behavior.
+# Any manual changes will be overwritten if the code is regenerated.
+
+name: sdk-responsible-ai-text-responsibleaidashboard-text-classification-financial-news-responsibleaidashboard-text-classification-financial-news
+# This file is created by sdk/python/readme.py.
+# Please do not edit directly.
+on:
+  workflow_dispatch:
+  schedule:
+    - cron: "49 0/12 * * *"
+  pull_request:
+    branches:
+      - main
+    paths:
+      - sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/**
+      - .github/workflows/sdk-responsible-ai-text-responsibleaidashboard-text-classification-financial-news-responsibleaidashboard-text-classification-financial-news.yml
+      - sdk/python/dev-requirements.txt
+      - infra/bootstrapping/**
+      - sdk/python/setup.sh
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    steps:
+    - name: check out repo
+      uses: actions/checkout@v2
+    - name: setup python
+      uses: actions/setup-python@v2
+      with:
+        python-version: "3.8"
+    - name: pip install notebook reqs
+      run: pip install -r sdk/python/dev-requirements.txt
+    - name: pip install mlflow reqs
+      run: pip install -r sdk/python/mlflow-requirements.txt
+    - name: azure login
+      uses: azure/login@v1
+      with:
+        creds: ${{secrets.AZUREML_CREDENTIALS}}
+    - name: bootstrap resources
+      run: |
+          echo '${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}';
+          bash bootstrap.sh
+      working-directory: infra/bootstrapping
+      continue-on-error: false
+    - name: setup SDK
+      run: |
+          source "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh";
+          source "${{ github.workspace }}/infra/bootstrapping/init_environment.sh";
+          bash setup.sh
+      working-directory: sdk/python
+      continue-on-error: true
+    - name: setup-cli
+      run: |
+          source "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh";
+          source "${{ github.workspace }}/infra/bootstrapping/init_environment.sh";
+          bash setup.sh
+      working-directory: cli
+      continue-on-error: true
+    - name: run responsible-ai/text/responsibleaidashboard-text-classification-financial-news/responsibleaidashboard-text-classification-financial-news.ipynb
+      run: |
+          source "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh";
+          source "${{ github.workspace }}/infra/bootstrapping/init_environment.sh";
+          bash "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh" generate_workspace_config "../../.azureml/config.json";
+          bash "${{ github.workspace }}/infra/bootstrapping/sdk_helpers.sh" replace_template_values "responsibleaidashboard-text-classification-financial-news.ipynb";
+          [ -f "../../.azureml/config" ] && cat "../../.azureml/config";
+          papermill -k python responsibleaidashboard-text-classification-financial-news.ipynb responsibleaidashboard-text-classification-financial-news.output.ipynb
+      working-directory: sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news
+    - name: upload notebook's working folder as an artifact
+      if: ${{ always() }}
+      uses: actions/upload-artifact@v2
+      with:
+        name: responsibleaidashboard-text-classification-financial-news
+        path: sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news
--- a/sdk/python/notebooks_config.ini
+++ b/sdk/python/notebooks_config.ini
@ -41,3 +41,7 @@ COMPUTE_NAMES = demand-fcst-mm-cluster
 [automl-forecasting-distributed-tcn]
 USE_FORECAST_REQUIREMENTS = 0
 COMPUTE_NAMES = "distributed-tcn-cluster"
+
+[responsibleaidashboard-text-classification-financial-news]
+USE_FORECAST_REQUIREMENTS = 0
+COMPUTE_NAMES = "cpucluster"
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/data_news_classification/test/Financial_news_test_data.parquet
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/data_news_classification/test/Financial_news_test_data.parquet
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/data_news_classification/train/Financial_news_train_data.parquet
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/data_news_classification/train/Financial_news_train_data.parquet
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/dataAnalysis_chartView_negativeWords.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/dataAnalysis_chartView_negativeWords.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/dataAnalysis_chartView_positiveWords.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/dataAnalysis_chartView_positiveWords.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/dataAnalysis_chartView_positiveWordsGreaterThan.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/dataAnalysis_chartView_positiveWordsGreaterThan.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/dataAnalysis_chartView_positiveWordsLessThan.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/dataAnalysis_chartView_positiveWordsLessThan.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/dataAnalysis_featureImportance.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/dataAnalysis_featureImportance.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_featureList.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_featureList.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_heatMap.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_heatMap.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_heatMap_errorCoverage.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_heatMap_errorCoverage.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_prebuiltCohorts.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_prebuiltCohorts.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_saveCohort.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_saveCohort.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_treeMap.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_treeMap.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_treeMap_leftBranch_leftNode.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_treeMap_leftBranch_leftNode.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_treeMap_leftBranch_rightNode.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/errorAnalysis_treeMap_leftBranch_rightNode.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/modelOverview_featureCohorts_positiveWords.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/modelOverview_featureCohorts_positiveWords.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/modelPerf_confusionMatrix.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/modelPerf_confusionMatrix.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/modelPerf_overviewBarchart.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/modelPerf_overviewBarchart.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/modelPerf_overviewGrid.png
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/media/nlp_classification/modelPerf_overviewGrid.png
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/readme.md
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/readme.md
@ -0,0 +1,232 @@
+## Text Classification with traditional NLP
+
+One of the many uses of text classification is to categorize news articles.
+This can be done along one or more of many different dimensions.
+Topic, length, and source are just a few possibilities.
+In this example, we will focus on business news articles related to financial services topics.
+
+
+## Deploy and run the notebook
+This demo relies on the included Jupyter notebook. 
+This notebook does the following: 
+- Loads and transforms data
+- Trains and tests a model
+- Creates a Responsible AI dashboard
+
+Once you load this notebook to your Azure ML studio workspace, follow the steps described in the notebook to create your RAI dashboard. 
+Then return to this page to explore the dashboard.
+
+
+For help finding the RAI dashboard, please review this [information](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-responsible-ai-dashboard).
+
+
+## Synthetic dataset and model 
+
+**Machine learning model** -- The notebook trains a **()**. 
+
+**WARNING: Use of synthetic data**  
+This accelerator was developed using synthetic data to emphasize the importance of data privacy. 
+For this reason, you may find some anomalies in certain metrics or dashboard components.
+These should not distract from the demonstration. 
+The tools and techniques described here remain valid, despite any data shortcomings that may be present. 
+
+Data dictionary 
+
+| Type    | Feature Name    | Feature Description                                |
+|---------|-----------------|----------------------------------------------------|
+| Feature | Article ID      | Unique identifier for each article                 |
+| Feature | Article Heading | The title or headline of the article               |
+| Feature | Article Text    | Text from the main body of the article             |
+| Feature | Category        | The category or class to which the article belongs |
+
+
+Here are descriptions of the different article categories: 
+- Banking and finance - Debt Market: News and updates related to the debt market within the banking and finance industry
+- Stock Market Updates: News and updates about the stock market, including stock prices, market indices, and company earnings
+- Business: News and articles about general business topics, such as corporate strategies, management, and industry trends
+- Real Estate: News and updates related to the real estate industry, including property market trends and housing prices
+- Cryptocurrency: News and information about cryptocurrencies, blockchain technology, and related developments
+- Personal Finance: News and tips related to personal finance management, including budgeting, saving, and investing
+- Financial Regulations: News and updates about financial regulations, compliance requirements, and legal developments
+
+## Debugging the model 
+In data science and software development, 
+the word debugging usually refers to finding and removing errors in a piece of code. 
+With the RAI dashboard, we can debug a machine learning model to improve both its overall performance and the responsible AI aspects of its predictions.
+
+Here is how the Responsible AI dashboard could assist you with debugging this classification model: 
+
+## Pre-built cohorts
+The ability to define groups of data points, or cohorts, within the source data is one of the most useful features of the RAI dashboard. 
+In a classification scenario like this one, it is often helpful to start by using the class labels to define cohorts.  To do this:
+- Click the "+ New cohort" button, near the top of the dashboard
+- Select "True Y" as your filter criteria
+- Use the "Included values" drop-down menu to select only the "Banking and finance - Debt Market" option
+- Click the "Add filter" button
+- Rename the cohort to something useful to you, in the "Dataset cohort name" dialog box
+- Click "Save" 
+
+![Pre-built Cohort](./media/nlp_classification/errorAnalysis_prebuiltCohorts.png)
+
+Now repeat this for each of the news article classes, until you have defined seven pre-built cohorts. 
+
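The manual steps above amount to splitting the data by its true label. As a rough illustration only (the dashboard builds cohorts through its UI, and the records and the `true_y` key below are hypothetical stand-ins for the "True Y" filter), the same grouping can be sketched in plain Python:

```python
from collections import defaultdict

# Hypothetical records; "true_y" stands in for the dashboard's "True Y" label.
articles = [
    {"id": 1, "true_y": "Real Estate"},
    {"id": 2, "true_y": "Cryptocurrency"},
    {"id": 3, "true_y": "Real Estate"},
    {"id": 4, "true_y": "Business"},
]


def cohorts_by_label(records, label_key="true_y"):
    """Group records into one cohort per distinct class label."""
    cohorts = defaultdict(list)
    for record in records:
        cohorts[record[label_key]].append(record)
    return dict(cohorts)


cohorts = cohorts_by_label(articles)
for label, members in sorted(cohorts.items()):
    print(label, len(members))
```

With the full dataset, this grouping would yield one cohort for each of the seven categories listed earlier.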
+
+## Error analysis
+Next, let's review the error analysis tree map, which you will find at the top of the dashboard.  
+This chart simplifies the process of discovering and highlighting common failure patterns. 
+Look for the nodes with the darkest red color (i.e. high error rate) and a high fill line (i.e. high error coverage).
+Error rate is the percentage of the selected node’s datapoints that receive erroneous predictions.
+Error coverage is the percentage of all errors that are concentrated in the given node.
+
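To make the two definitions concrete, here is a minimal sketch that computes both metrics for a single node. The function name and the sample counts are illustrative assumptions, not values taken from the dashboard or the notebook:

```python
def node_error_stats(node_is_error, total_errors):
    """Return (error_rate, error_coverage) for one tree node.

    node_is_error: list of booleans, True where a datapoint in the node
                   was misclassified.
    total_errors:  number of misclassified datapoints in the whole dataset.
    """
    node_size = len(node_is_error)
    errors_in_node = sum(node_is_error)
    # Error rate: share of this node's datapoints that are wrong.
    error_rate = errors_in_node / node_size if node_size else 0.0
    # Error coverage: share of all errors that fall inside this node.
    error_coverage = errors_in_node / total_errors if total_errors else 0.0
    return error_rate, error_coverage


# Illustrative node: 10 datapoints, 4 of them wrong, out of 8 errors overall.
rate, coverage = node_error_stats([True] * 4 + [False] * 6, total_errors=8)
print(f"error rate = {rate:.1%}, error coverage = {coverage:.1%}")
```

A node is most worth investigating when both numbers are high: a high error rate alone can come from a very small node that accounts for few of the model's mistakes.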
+The Error Tree visualization helps you uncover your model’s “blind spots”. 
+In this context, a blind spot is a group of datapoints, or a cohort, for which the model is less accurate and performant. 
+This could be any group of observations that, when grouped by a common characteristic, shows significantly worse model performance.
+
+
+Looking at the root node of the tree (representing errors on all data) we see the error rate for all predictions is about 24.4%. 
+
+![Error-Analysis-00](./media/nlp_classification/errorAnalysis_treeMap.png)
+
+**Explore nodes** -- Look at the left node on the second level of the left branch, with fewer than 1.5 negative words
+and fewer than 17.5 positive words.
+This node has a high error rate of 39.1%.
+
+![Error-Analysis-01](./media/nlp_classification/errorAnalysis_treeMap_leftBranch_leftNode.png)
+
+This contrasts sharply with the right node, on the same level, with greater than 17.5 positive words. 
+This node has an error rate of only 9.5%.  
+
+![Error-Analysis-02](./media/nlp_classification/errorAnalysis_treeMap_leftBranch_rightNode.png)
+
+These two cohorts are two good options to save in order to explore further. 
+Click on the more erroneous node again and choose the “Save as new cohort” button, at the upper-right corner of the tree map.
+
+![Error-Analysis-03](./media/nlp_classification/errorAnalysis_saveCohort.png)
+
+Do the same thing for the lower-right node.
+Here we name it "positive_words > 17.50, negative_words <= 1.50".
+
+**Top features leading to errors** – Click on the feature list icon at the top of the error analysis section. This will surface a list of features, ranked by their correlations to the model’s errors.
+
+![Error-Analysis-04](./media/nlp_classification/errorAnalysis_featureList.png)
+
+The **Heat map** can also be a very helpful tool to identify areas of lower performance.
+Click on the “Heat map” tab, next to the “Feature list” tab at the top of the Error analysis section.
+
+The heat map is useful for taking a closer look at certain groups, to explore different “slices” of the data.
+Select up to two features to see how errors are distributed across these groups.
+For Feature 1, select "negative_words".
+For Feature 2, select "positive_words".
+Set the Binning threshold to 8.
+
+
+Each cell of the heatmap represents a slice of the dataset, 
+and the percentage of errors out of the total number of data points in that slice. 
+Here we can see that the worst performance tends to happen when there are only one or two negative words.
+It is also interesting to note that accuracy suffers across the board when there are no negative words, 
+while accuracy is very good when there are at least three negative words in the article. 
+
+
+![Heat-Map-00](./media/nlp_classification/errorAnalysis_heatMap.png)
+
+Next, hover over one of the squares. 
+You will see Error Rate and Error Coverage. 
+Now compare the error rates and error coverage values for different cells. 
+You will see that error coverage is not consistent, 
+even in cells with the same error rate. 
+In this way, the RAI dashboard can help deepen your understanding of different slices of the data. 
+And as we saw with error nodes on the tree map, 
+you can select individual cells and save the data points as a custom cohort. 
+
+![Heat-Map-01](./media/nlp_classification/errorAnalysis_heatMap_errorCoverage.png)
+
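The difference between error rate and error coverage in heat map cells can be reproduced with a small sketch. This is not the dashboard's implementation: it uses fixed-width bins rather than the dashboard's binning threshold, and the rows, bin widths, and function name are all hypothetical:

```python
from collections import defaultdict


def heatmap_cells(rows, neg_bin_width=1, pos_bin_width=5):
    """Bin rows by two features and compute per-cell error metrics.

    rows: iterable of (negative_words, positive_words, is_error) tuples.
    Returns {(neg_bin, pos_bin): {"error_rate": ..., "error_coverage": ...}}.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [points, errors]
    for neg, pos, is_error in rows:
        cell = (neg // neg_bin_width, pos // pos_bin_width)
        counts[cell][0] += 1
        counts[cell][1] += int(is_error)
    total_errors = sum(errors for _, errors in counts.values())
    return {
        cell: {
            "error_rate": errors / points,
            "error_coverage": errors / total_errors if total_errors else 0.0,
        }
        for cell, (points, errors) in counts.items()
    }


# Hypothetical rows: (negative_words, positive_words, misclassified?)
rows = [
    (0, 3, True), (0, 4, False),
    (1, 10, True), (1, 11, True), (1, 12, True),
    (1, 12, False), (1, 13, False), (1, 14, False),
    (3, 20, False), (3, 21, False),
]
cells = heatmap_cells(rows)
for cell, stats in sorted(cells.items()):
    print(cell, stats)
```

In this toy data, cells `(0, 0)` and `(1, 2)` share a 50% error rate, but the second holds three of the four errors, so its error coverage is three times higher.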
+
+## Model overview and performance analysis
+In the Model overview section, we can look at all the data and compare individual cohorts. 
+Each of the pre-built and custom cohorts defined above is included in this view. 
+You will see the accuracy score for each cohort.
+
+![Model-Perf-00](./media/nlp_classification/modelPerf_overviewGrid.png)
+
+There are also several options for visualizing metrics, 
+including a bar chart of accuracy scores and a confusion matrix. These can be customized by different dimensions, such as cohort or metric.
+
+**Accuracy for all cohorts**
+
+![Model-Perf-01](./media/nlp_classification/modelPerf_overviewBarchart.png)  
+
+**Confusion matrix** 
+
+![Model-Perf-02](./media/nlp_classification/modelPerf_confusionMatrix.png) 
+
+Similar investigations can be performed using the visualization options presented below the metrics table. 
+- Select the "Chart view" tab
+- Click on the vertical axis and set the Feature to "positive_words"
+- For the horizontal axis, set this value to "True Y" 
+- In the "Select a dataset cohort to explore" drop-down, pick the more error-prone cohort "positive_words <= 17.50, negative_words <= 1.50"
+
+
+![Model-Perf-03](./media/nlp_classification/dataAnalysis_chartView_positiveWordsLessThan.png) 
+
+Now switch the dataset cohort to "positive_words > 17.50, negative_words <= 1.50".
+You will see that only 4 categories have articles that fit these criteria.
+
+![Model-Perf-04](./media/nlp_classification/dataAnalysis_chartView_positiveWordsGreaterThan.png) 
+
+
+
+
+## Fairness and bias
+Does this model consistently classify news articles from across the different news categories? 
+The RAI dashboard can help identify if different groups within your overall population are being impacted differently by model results. 
+These can be cohorts you previously defined or newly defined groups.  
+
+**Configure the Feature Analysis tool** -- At the top of the Model overview section, select the Feature cohort option. 
+This component also allows you to look more closely at how the model performs with respect to certain features.
+Use the Feature(s) drop-down to select positive_words. 
+
+![Fairness-00](./media/nlp_classification/modelOverview_featureCohorts_positiveWords.png)
+
+Here you will see that there are differences in performance, depending on the number of positive words in the article. 
+Articles with the most positive words, greater than 18.7, have the highest accuracy. 
+
+
+
+## Data analysis
+Next, we come to the data analysis section. 
+This tool allows you to look at the data behind the cohorts 
+and can often give you clues as to why some groups are more error-prone than others. 
+This information allows you to not just identify where the blind spots are,
+but also understand why they occur.
+For example, your training data may have only a handful of observations for the error-prone cohort.
+
+Start by selecting "Chart view," then:
+- Select your cohort
+- Select the y-axis and change it to "negative_words"
+- Select the x-axis and change it to "True Y" data  
+
+This will allow us to explore the "ground truth" data from the cohort.
+
+Here we see that most categories contain articles with very few negative words. 
+In fact, articles about Personal Finance and Real Estate almost always have no negative words, 
+and the few that do can be considered outliers. 
+
+![Data-Analysis-00](./media/nlp_classification/dataAnalysis_chartView_negativeWords.png)
+
+Finally, scroll down to the Feature importances section. 
+This tool gives you information about which model inputs, in this case words in the text,
+are most important in determining the model output.
+- Adjust the slider at the top of the chart to show the "Top 10 feature by their importance" 
+
+![Feature-Importance-00](./media/nlp_classification/dataAnalysis_featureImportance.png)
+
+## Conclusion
+The Responsible AI dashboard provides valuable tools to help you debug model performance and improve customer experience. 
+In this example, we saw how RAI dashboard components provided valuable insight into text classification performance for a variety of business news articles.  
+These tools can help you focus your model tuning efforts,
+making your model development more efficient and your models more performant across a variety of scenarios.
+
+## Next steps
+1) Complete the text classification demo using the Azure OpenAI service and compare your results
+1) Make the RAI dashboard part of your regular ML Ops practice
--- a/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/responsibleaidashboard-text-classification-financial-news.ipynb
+++ b/sdk/python/responsible-ai/text/responsibleaidashboard-text-classification-financial-news/responsibleaidashboard-text-classification-financial-news.ipynb