Updates to data factory and source content

@ -0,0 +1,100 @@

# Data pipeline

In this lab, we will guide you through the process of creating a comprehensive data ingestion solution using data pipelines in Microsoft Fabric. We will start by setting up the copy activity to transfer data from a sample source to a dynamic destination within a lakehouse. This includes using the expression builder to create a dynamic folder structure based on the current date of execution.
Throughout the lab, you will validate and run the pipeline, ensuring that the data ingestion process is successful and that the data is organized correctly in the lakehouse. By the end of this lab, you will have a solid understanding of how to efficiently manage a data ingestion workflow.

### Create a data pipeline

1. Continuing from the [Getting started](GettingStarted.md) tutorial, where we built a lakehouse and a sample data pipeline, we will now select a **New item** from the **High-volume data ingest** task.

![High volume new item](./Media/new-pipeline-contoso.png)

1. In the Create an item window, the available options are once again filtered to **Recommended items** only. Select the **Data pipeline** item.

![High volume data ingest](./Media/task-flow-new-item-high-volume.png)

1. In the New pipeline window, set the data pipeline name to "**getContosoSample**" and then select **Create**.

![New pipeline name getContosoSample](./Media/new-pipeline-name-getcontososample.png)

### Create a data connection

1. From the new and empty data pipeline, select the **Pipeline activity** watermark option and then choose **Copy data** to add this activity to the authoring canvas.

![Copy data from watermark](./Media/pipeline-activity-copy-data.png)

1. With the **Copy data** activity selected, navigate to the **Source** tab. Within the **Connection** drop-down menu, select the **More** option to launch the Get data navigator. This navigator provides a single interface for connecting to a wide variety of data sources, making it easy to integrate different data streams into your pipeline.

![Copy data connection more option](./Media/source-connection-more.png)

1. From the Get data navigator, select **Add** from the left side-rail and then choose the **Http** connector, which allows you to connect to web-based data sources such as files hosted online.

![Get data http](./Media/get-data-http.png)

1. Paste the following sample zip file address from GitHub into the **Url** path. Optionally, set the **Connection name** property to something more discoverable for future use, such as "ContosoSample"; naming your connections makes it easier to identify and manage the different data sources in your project. Once complete, select **Connect** to establish the connection.

```text
https://github.com/microsoft/pbiworkshops/raw/main/Day%20After%20Dashboard%20in%20a%20Day/Source_Files/ContosoSales.zip
```

![Contoso sample connection](./Media/contoso-sample-connection.png)

### Copy activity settings

1. With the **Copy data** activity selected and the **Source** tab displayed, select the **Settings** option next to the File format field. Within the **Compression type** setting, choose **ZipDeflate (.zip)** and select **OK** to complete.

![Zip deflate compression](./Media/zip-deflate-compression.png)

1. Next, with the Copy data activity still selected and the Source tab displayed, expand the **Advanced** section and deselect the option to **Preserve zip file name as folder**. Deselecting it lets you customize the folder name for the zip contents, giving you more flexibility in organizing your data.

![Deselect preserve zip file name](./Media/deselect-preserve-zip-file-name.png)

1. With the Copy data activity still selected, navigate to the **Destination** tab. From the list of connections, select the previously configured lakehouse **b_IADLake**. This ensures that the data is copied to the correct destination, which is essential for maintaining data integrity and organization.

![Destination lakehouse](./Media/destination-biadlake.png)

1. Within the Destination settings, select the **Files** option and then the **Directory** file path text input box. This will display the **Add dynamic content [Alt+Shift+D]** property; select it to open the pipeline expression builder. The expression builder allows us to create dynamic file paths built from values such as the date and time of execution, combined with static text.

![Destination files directory](./Media/destination-files-directory.png)

1. In the Pipeline expression builder window, select the **Functions** tab. Here, you can explore the functions available in the expression library; in this example, we'll use both date and string functions to create a dynamic folder path. When you're ready, copy and paste the code block below into the expression input box and select **OK** when complete.

```text
@concat(
    formatDateTime(
        convertFromUtc(
            utcnow(), 'Central Standard Time'
        ),
        'yyyy/MM/dd'
    ),
    '/ContosoSales'
)
```

![Expression builder date and folder](./Media/expression-builder-date-and-folder.png)

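As a quick check, assume the pipeline runs on March 15, 2024 in Central time (a hypothetical date for illustration). The expression above would then evaluate to the following **Directory** value; converting from UTC first matters for runs near midnight, which would otherwise be filed under the wrong local day:

```text
2024/03/15/ContosoSales
```
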
1. With the Copy data activity and Destination settings still selected, select the **Copy behavior** drop-down and choose the **Preserve hierarchy** option. This maintains the original file names as they exist within the zip file, ensuring that the file structure is preserved during the copy process.

![Copy behavior preserve hierarchy](./Media/copy-behavior-preserve.png)

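For context, here is a rough sketch of how the common copy behaviors differ; this summary is based on the Data Factory copy activity options, and exact behavior can vary by connector:

```text
Preserve hierarchy : keep the source folder/file structure in the destination
Flatten hierarchy  : write every file into the target folder with autogenerated names
Merge files        : combine all source files into a single destination file
```
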
1. Navigate to the General tab with the Copy data activity selected. Update the **Name** and **Description** fields with the text below. This helps in identifying and managing the activity within your pipeline, making it easier to understand its purpose.

| Property | Text |
| :-- | :-- |
| Name | Get and Unzip files |
| Description | Copies sample data from GitHub and stores raw content in lakehouse files |

![Copy data general descriptions](./Media/copy-data-general.png)

1. From the **Home** tab, select the **Validate** option to confirm that there are no issues with your pipeline; validation surfaces any errors to fix before the pipeline runs. Once validated, select **Save** and then **Run** to start the ingestion. Running the pipeline initiates the data transfer, allowing you to see the results of your configuration in the output window.

![Validate save and run the pipeline](./Media/pipeline-validate-save-run.png)

1. Deselect any previously selected activities within the authoring canvas to make the global properties and **Output** view visible. After the pipeline run starts, both the Pipeline status and the individual Activity statuses should show **Succeeded**, indicating that the data ingestion process completed as intended.

![Unzip copy output succeeded](./Media/unzip-pipeline-status-succeeded.png)

1. If we return to our previously created **b_IADLake** lakehouse item (either by selecting it on the left side rail if still open, or by opening it from the workspace item list), we can confirm that the zip file's contents have been added to the Files section. The files should be organized in a nested folder structure based on the year, month, and day of the pipeline run, with the data source title as the final folder.

![Copy output succeeded](./Media/unzip-lakehouse-contents.png)
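
To illustrate, a run on March 15, 2024 would produce a layout like the following; the extracted file names shown are placeholders, since the actual names come from the contents of ContosoSales.zip:

```text
Files/
└── 2024/
    └── 03/
        └── 15/
            └── ContosoSales/
                ├── <extracted file 1>
                └── <extracted file 2>
```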

@ -0,0 +1,42 @@

<svg width="48" height="48" viewBox="0 0 48 48" fill="none" xmlns="http://www.w3.org/2000/svg">
|
||||
<path d="M15.5574 44C14.729 44 14.0903 43.26 14.21 42.4401C15.9865 30.4101 26.2961 21.5001 38.4321 21.5001H40.8872C42.5639 21.5001 43.7915 20.5001 43.7915 18.8202L43.9312 18.1602V29.8301C43.9312 31.6701 42.4441 33.1601 40.6078 33.1601H39.2305C34.1206 33.1601 29.6095 36.5001 28.1025 41.4001L27.6634 42.82C27.4438 43.52 26.8051 44 26.0765 44H15.5773H15.5574Z" fill="url(#paint0_linear_35749_112798)"/>
|
||||
<path d="M15.5574 44C14.729 44 14.0903 43.26 14.21 42.4401C15.9865 30.4101 26.2961 21.5001 38.4321 21.5001H40.8872C42.5639 21.5001 43.7915 20.5001 43.7915 18.8202L43.9312 18.1602V29.8301C43.9312 31.6701 42.4441 33.1601 40.6078 33.1601H39.2305C34.1206 33.1601 29.6095 36.5001 28.1025 41.4001L27.6634 42.82C27.4438 43.52 26.8051 44 26.0765 44H15.5773H15.5574Z" fill="url(#paint1_linear_35749_112798)"/>
|
||||
<path d="M5.6667 30.6699C4.74852 30.6699 4 29.9199 4 28.9999V18.9999C4 18.08 4.74852 17.33 5.6667 17.33H35.6074C36.5256 17.33 37.2741 18.08 37.2741 18.9999V28.9999C37.2741 29.9199 36.5256 30.6699 35.6074 30.6699H5.6667Z" fill="url(#paint2_linear_35749_112798)"/>
|
||||
<path d="M5.6667 30.6699C4.74852 30.6699 4 29.9199 4 28.9999V18.9999C4 18.08 4.74852 17.33 5.6667 17.33H35.6074C36.5256 17.33 37.2741 18.08 37.2741 18.9999V28.9999C37.2741 29.9199 36.5256 30.6699 35.6074 30.6699H5.6667Z" fill="url(#paint3_radial_35749_112798)"/>
|
||||
<path d="M15.5771 4C14.7388 4 14.0901 4.75 14.2098 5.57999C15.9863 17.8299 26.4656 26.9099 38.8211 26.9099H40.887C42.5637 26.9099 43.921 28.2699 43.921 29.9499V18.1699C43.921 16.3299 42.434 14.84 40.5976 14.84H39.2203C34.1104 14.84 29.5994 11.5 28.0924 6.59999L27.6532 5.17999C27.4337 4.48 26.7949 4 26.0664 4H15.5871H15.5771Z" fill="url(#paint4_linear_35749_112798)"/>
|
||||
<path opacity="0.4" d="M40.5969 14.83H39.2196C36.8543 14.83 34.6187 14.11 32.7524 12.85V26.15C34.6986 26.64 36.7346 26.91 38.8304 26.91H40.8963C42.573 26.91 43.9303 28.27 43.9303 29.95V18.17C43.9303 16.33 42.4433 14.84 40.6069 14.84L40.5969 14.83Z" fill="url(#paint5_linear_35749_112798)"/>
|
||||
<g opacity="0.7">
|
||||
<path d="M15.6559 4.00006C14.8175 4.00006 14.1688 4.75006 14.2886 5.58005C16.0651 17.83 26.5443 26.91 38.8998 26.91H40.9658C42.6424 26.91 43.9998 28.27 43.9998 29.9499V18.17C43.9998 16.33 42.5127 14.84 40.6763 14.84H39.2991C34.1892 14.84 29.6781 11.5 28.1711 6.60005L27.732 5.18006C27.5124 4.48006 26.8737 4.00006 26.1451 4.00006H15.6658H15.6559Z" fill="url(#paint6_linear_35749_112798)"/>
|
||||
</g>
|
||||
<defs>
|
||||
<linearGradient id="paint0_linear_35749_112798" x1="16.5454" y1="45.96" x2="44.1552" y2="18.4047" gradientUnits="userSpaceOnUse">
|
||||
<stop offset="0.26" stop-color="#0D7012"/>
|
||||
<stop offset="1" stop-color="#085714"/>
|
||||
</linearGradient>
|
||||
<linearGradient id="paint1_linear_35749_112798" x1="57.4145" y1="26.1101" x2="5.02848" y2="37.7263" gradientUnits="userSpaceOnUse">
|
||||
<stop offset="0.04" stop-color="#114A8A"/>
|
||||
<stop offset="1" stop-color="#0C59A3" stop-opacity="0"/>
|
||||
</linearGradient>
|
||||
<linearGradient id="paint2_linear_35749_112798" x1="-16.8587" y1="23.9999" x2="36.2462" y2="23.9999" gradientUnits="userSpaceOnUse">
|
||||
<stop stop-color="#107C10"/>
|
||||
<stop offset="0.96" stop-color="#56B50E"/>
|
||||
</linearGradient>
|
||||
<radialGradient id="paint3_radial_35749_112798" cx="0" cy="0" r="1" gradientUnits="userSpaceOnUse" gradientTransform="translate(30.4728 21.8302) rotate(-56.3222) scale(10.1638 25.898)">
|
||||
<stop stop-opacity="0.3"/>
|
||||
<stop offset="1" stop-opacity="0"/>
|
||||
</radialGradient>
|
||||
<linearGradient id="paint4_linear_35749_112798" x1="45.8272" y1="37.6599" x2="19.9699" y2="0.797966" gradientUnits="userSpaceOnUse">
|
||||
<stop offset="0.04" stop-color="#33980F"/>
|
||||
<stop offset="1" stop-color="#BAE884"/>
|
||||
</linearGradient>
|
||||
<linearGradient id="paint5_linear_35749_112798" x1="37.7925" y1="21.4" x2="43.8006" y2="21.4" gradientUnits="userSpaceOnUse">
|
||||
<stop stop-color="#E5FAC1" stop-opacity="0"/>
|
||||
<stop offset="0.56" stop-color="#E5FAC1" stop-opacity="0.52"/>
|
||||
<stop offset="1" stop-color="#E5FAC1"/>
|
||||
</linearGradient>
|
||||
<linearGradient id="paint6_linear_35749_112798" x1="12.9412" y1="-14.9099" x2="28.5365" y2="12.0334" gradientUnits="userSpaceOnUse">
|
||||
<stop offset="0.14" stop-color="#FDE100"/>
|
||||
<stop offset="0.98" stop-color="#FFC600" stop-opacity="0"/>
|
||||
</linearGradient>
|
||||
</defs>
|
||||
</svg>

@ -1,4 +1,4 @@

![Microsoft Fabric](./Media/data_factory_48_color.svg)
<br/>
<br/>

@ -37,32 +37,17 @@ In this lab, you will learn how to create a project task flow and understand the

To start the lab, visit [Getting started](./GettingStarted.md)

## Data pipeline

To start the lab, visit [Data pipeline](./DataPipeline.md)

## Dataflow Gen2

In this lab, you'll learn how to shape and orchestrate your data using Data Factory experiences.

1. How to create a [Dataflow Gen2](https://docs.microsoft.com/power-bi/transform-model/dataflows/dataflows-introduction-self-service) to prepare and load data using Power Query Online.
1. Understanding the [storage and compute staging](https://blog.fabric.microsoft.com/blog/data-factory-spotlight-dataflows-gen2?ft=Data-factory:category) architecture for large-scale data transformations.
1. How to use [Pipelines](https://learn.microsoft.com/fabric/data-factory/activity-overview) to orchestrate and control your data movement.
1. Configuring [data destination outputs](https://learn.microsoft.com/fabric/data-factory/dataflow-gen2-data-destinations-and-managed-settings).

To start the lab, visit [Dataflow Gen2](./DataflowGen2.md)

## Data Modeling

In this lab, you'll learn how to create semantic models optimized for scale and performance using web model editing in the Microsoft Fabric service (cloud).

1. How to create a [Direct Lake](https://docs.microsoft.com/power-bi/transform-model/desktop-storage-mode) semantic model.
1. How to properly model your [data](https://learn.microsoft.com/power-bi/guidance/star-schema) using the web model editing experience.
1. How to add metadata to your [semantic model](https://learn.microsoft.com/en-us/power-bi/transform-model/) for deeper analysis and insights.

To start the lab, visit [Data Modeling](./DataModeling.md)

## Data Visualization

In this lab, you'll learn about designing efficient and stunning reports using Power BI Desktop.

1. How to use canvas backgrounds and shape elements to create professional report layouts.
1. How to create report-level measures for dynamic report elements.
1. How to leverage no-code artificial intelligence to find new insights in your data.

To start the lab, visit [Data Visualization](./DataVisualization.md)