Combining Azure Synapse with Microsoft Power BI

The code provides a step-by-step guide to creating a DataMart with transport information.

This DataMart is built using Azure Data Factory for ELT/ETL and Azure Synapse as the database. The final stage is a Power BI report that reads the information from the Synapse database.

An important aspect of the solution is how to take advantage of dynamic content in Azure Data Factory to create a clean end-to-end solution.

What can be achieved?

The idea of the solution is to speed up ELT/ETL development and simplify how you code your pipelines in Azure Data Factory, and to show how you can take advantage of combining Azure Synapse and Power BI as your standard serving-layer solution.

The solution is split into:

  1. Create Azure Services.
  2. Deploy database scripts.
  3. Deploy Azure Data Factory ARM template.
  4. Process data in Azure Data Factory.
  5. Import data into a Power BI template.

Final Power BI data model:


General architecture:


Azure Data Factory solution:


Understanding configuration table

Requirements

  • Access to an Azure account with permission to create Azure services
  • Permission to install PowerShell components on your local computer
  • A Power BI Free or Pro license

You need around one hour to deploy the entire solution.
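If the Az PowerShell module is not yet installed, something like the following installs it for the current user and signs in to Azure. This is a minimal sketch, assuming the setup scripts use Az cmdlets; check the scripts in "1 - Setup - PowerShell scripts" for their exact requirements.

```powershell
# Install the Az module for the current user (no administrator rights needed)
# and sign in to the subscription that will host the services.
Install-Module -Name Az -Scope CurrentUser -Repository PSGallery -Force
Connect-AzAccount
```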

1 - Create Azure Services

The Azure services can be created in the Azure Portal or by executing the PowerShell scripts:

Open: Scripts location

Follow the steps below to create the Azure services; a condensed PowerShell sketch follows the list:

  1. Create resource group.ps1

  2. Create Storage Account.ps1

  3. Create Azure Synapse Workspace.ps1

  4. Create Azure SQL Pool.ps1

  5. Create Azure Data Factory.ps1
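For orientation, the sketch below shows roughly what the five scripts do using standard Az cmdlets. All resource names are placeholders, and the exact parameters in the repository scripts may differ; the file-system name is only an assumption taken from the lookup-file path in step 4.

```powershell
# All names below are placeholders / assumptions - the repository scripts may use
# different values. Names marked "globally unique" must be changed before running.
$rg         = "rg-synapse-powerbi"
$location   = "eastus"
$storage    = "stsynapsedemo001"        # storage account name, globally unique
$fileSystem = "nyccabdata"              # assumed from the lookup-file path in step 4
$workspace  = "synw-transport-demo"     # Synapse workspace name, globally unique
$sqlPool    = "TransportDW"             # database name used by the Power BI template
$adf        = "adf-transport-demo"      # data factory name, globally unique

# 1. Resource group
New-AzResourceGroup -Name $rg -Location $location

# 2. ADLS Gen2 storage account (hierarchical namespace enabled)
New-AzStorageAccount -ResourceGroupName $rg -Name $storage -Location $location `
    -SkuName Standard_LRS -Kind StorageV2 -EnableHierarchicalNamespace $true

# 3. Synapse workspace bound to the storage account
$cred = Get-Credential -Message "Synapse SQL administrator login"
New-AzSynapseWorkspace -ResourceGroupName $rg -Name $workspace -Location $location `
    -DefaultDataLakeStorageAccountName $storage `
    -DefaultDataLakeStorageFilesystem $fileSystem `
    -SqlAdministratorLoginCredential $cred

# 4. Dedicated SQL pool
New-AzSynapseSqlPool -ResourceGroupName $rg -WorkspaceName $workspace `
    -Name $sqlPool -PerformanceLevel "DW100c"

# 5. Data factory
New-AzDataFactoryV2 -ResourceGroupName $rg -Name $adf -Location $location
```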

2 - Create Synapse objects

Once you have created the Azure services, it is necessary to deploy the database objects. Please execute the next steps:

Open: Scripts location

Follow the steps below to create the Synapse objects; a scripted way to run them follows the list:

  1. Create schemas.sql

  2. Create operational tables.sql

  3. Create data warehouse tables.sql

  4. Create stage tables.sql

  5. Create external tables.sql

  6. Insert dimensions data.sql

  7. Insert configuration table.sql
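The scripts can be run in Synapse Studio, or, as a possible alternative, from PowerShell with the SqlServer module's Invoke-Sqlcmd against the dedicated SQL pool endpoint. The server and database values below are assumptions; adjust them to your own deployment and local clone of the repository.

```powershell
# Placeholders throughout: replace the server and database with your own values.
# The folder name matches the repository layout ("2 - Setup database scripts").
Install-Module -Name SqlServer -Scope CurrentUser -Force

$server   = "synw-transport-demo.sql.azuresynapse.net"   # <workspace>.sql.azuresynapse.net
$database = "TransportDW"                                # dedicated SQL pool name
$cred     = Get-Credential -Message "Synapse SQL administrator login"

# The scripts are numbered 1-7, so sorting by name runs them in the documented order.
Get-ChildItem -Path ".\2 - Setup database scripts" -Filter "*.sql" |
    Sort-Object Name |
    ForEach-Object {
        Write-Host "Running $($_.Name)..."
        Invoke-Sqlcmd -ServerInstance $server -Database $database `
            -Credential $cred -InputFile $_.FullName
    }
```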

3 - Deploy ARM template

This is the ARM template you are about to deploy:

Once you have deployed the Azure services and the Synapse objects, you need to deploy the Azure Data Factory ARM template. Please execute the next steps:

Open: ARM template location

  1. Before deploying the ARM template file, you need to replace some values. There are 6 places in the file where the information must be replaced before deployment. Please replace these values carefully:


  2. Once you have replaced the values with your own information, do the following (a scripted alternative is sketched after these steps):
  • Open your Azure Data Factory;
  • On the Manage tab, select the "ARM Template" option;
  • Select the option "Import ARM Template";
  • In the "Custom Deployment" editor, select the option "Build your own template in the editor";
  • Load the file "arm_template.json" and save;
  • Select the resource group, region and factory name (the same name you deployed with PowerShell);

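If you prefer to script this step instead of using the "Import ARM Template" option, a sketch with the Az module could look like the following. The parameter name factoryName is an assumption (exported Data Factory templates usually expose it); verify it against the "parameters" section of arm_template.json.

```powershell
# Assumed parameter name: check "parameters" in arm_template.json before running this.
New-AzResourceGroupDeployment `
    -ResourceGroupName "rg-synapse-powerbi" `
    -TemplateFile ".\3 - ARM Template\arm_template.json" `
    -TemplateParameterObject @{ factoryName = "adf-transport-demo" }
```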

  3. Once the deployment is done, it will be necessary to run the pipeline (a PowerShell example follows):

DO NOT FORGET TO ADD YOUR SYNAPSE PASSWORD TO THE SYNAPSE LINKED SERVICES

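As an alternative to triggering the run from the Data Factory UI, a minimal PowerShell sketch is shown below. The pipeline name "PL_Main" is a placeholder; use the name you see in your factory.

```powershell
# "PL_Main" is a placeholder pipeline name - use the one shown in your Data Factory.
$runId = Invoke-AzDataFactoryV2Pipeline `
    -ResourceGroupName "rg-synapse-powerbi" `
    -DataFactoryName "adf-transport-demo" `
    -PipelineName "PL_Main"

# Check the status of the run (rerun this until the status is Succeeded or Failed).
Get-AzDataFactoryV2PipelineRun `
    -ResourceGroupName "rg-synapse-powerbi" `
    -DataFactoryName "adf-transport-demo" `
    -PipelineRunId $runId
```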

4 - Other steps

It is necessary to upload a file into the storage account. Please follow the instructions below; a PowerShell example follows the list:

  1. Download the file named "taxi_zone_lookup.csv": Download CSV file

  2. Upload the file to this location in the storage account: "nyccabdata\ingestion\lookup\taxi_zone_lookup.csv"
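A possible way to script the upload with the Az storage cmdlets is shown below. The split between container and blob path is an interpretation of the location above, and the storage account name is a placeholder; verify both against your deployment.

```powershell
# Assumes "nyccabdata" is the container (file system) and the rest is the blob path.
# -UseConnectedAccount requires Storage Blob Data Contributor rights on the account.
$ctx = New-AzStorageContext -StorageAccountName "stsynapsedemo001" -UseConnectedAccount

Set-AzStorageBlobContent -Context $ctx `
    -Container "nyccabdata" `
    -File ".\taxi_zone_lookup.csv" `
    -Blob "ingestion/lookup/taxi_zone_lookup.csv"
```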

5 - Power BI

After the pipeline execution, open the Power BI template file and fill in the parameter values:

Open: Power BI file template

  • RangeStart: '2019-01-01 00:00:00.000'
  • RangeEnd: '2019-01-01 23:59:59.000'
  • ServerName: [Use your server name]
  • DatabaseName: "TransportDW"


This Power BI file is configured with incremental refresh. You can refresh the Power BI file and then deploy it to the Power BI Service.
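Publishing can also be scripted with the MicrosoftPowerBIMgmt module, assuming you first save the filled-in template as a .pbix file. The file, report and workspace names below are placeholders.

```powershell
# Save the filled-in template as a .pbix first; names below are placeholders.
Install-Module -Name MicrosoftPowerBIMgmt -Scope CurrentUser -Force
Connect-PowerBIServiceAccount

$workspace = Get-PowerBIWorkspace -Name "Transport DataMart"
New-PowerBIReport -Path ".\TransportDW.pbix" -Name "Transport DataMart" `
    -WorkspaceId $workspace.Id -ConflictAction CreateOrOverwrite
```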


6 - Help us to improve!

Please check: Code of conduct

Please check: Security

Please check: Support

Contributing

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.

When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.

Trademarks

This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.