Analysis-Services/UsqlScripts
KayUnkroth d6a2230ba9 RTF version with hyperlinks. 2017-08-02 14:16:29 -07:00
Modelling Full set of U-SQL scripts with storage account name replaced with placeholder. 2017-08-02 14:06:14 -07:00
all_single Full set of U-SQL scripts with storage account name replaced with placeholder. 2017-08-02 14:06:14 -07:00
large_multiple Full set of U-SQL scripts with storage account name replaced with placeholder. 2017-08-02 14:06:14 -07:00
last_available_year Full set of U-SQL scripts with storage account name replaced with placeholder. 2017-08-02 14:06:14 -07:00
readme.rtf RTF version with hyperlinks. 2017-08-02 14:16:29 -07:00
readme.txt Providing a text version of the readme. 2017-08-02 14:13:19 -07:00

readme.txt


U-SQL Scripts for Processing a TPC-DS Data Set
The U-SQL scripts for processing a TPC-DS data set demonstrate how to use Azure Data Lake Analytics to prepare raw data for import into an Azure Analysis Services data model. For a detailed discussion, see the blog article “Using Azure Analysis Services on Top of Azure Data Lake Storage” on the Analysis Services Team Blog.
To use these scripts, the TPC-DS data set must be generated by using the dsdgen tool, which can be downloaded as source code from the TPC-DS web site. Run the dsdgen tool with /PARALLEL 100 and /CHILD ids ranging from 1 – 100 to generate the source files with the expected file naming conventions and place the source files in an Azure Blob Storage account, as discussed in “Building an Azure Analysis Services Model on Top of Azure Blob Storage—Part 2” on the Analysis Services Team Blog. Finally, edit the U-SQL scripts and replace the storage account placeholder (@<blob storage account name>) with your actual storage account.
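The preparation steps above can be sketched in Python. This is only an illustration, not part of the published scripts: the executable name dsdgen and the helper names are hypothetical, any further dsdgen options (scale factor, output directory) are left out, and the placeholder is assumed to appear in the scripts as the literal text <blob storage account name> following an @ sign.

```python
# Sketch only: helper names and the executable name are hypothetical.
PLACEHOLDER = "<blob storage account name>"  # assumed literal token in the scripts


def dsdgen_commands(executable="dsdgen", parallel=100):
    """Build one argument list per child id, 1 through `parallel`.

    Only the /PARALLEL and /CHILD options named in the readme are included;
    other dsdgen options are intentionally omitted.
    """
    return [
        [executable, "/PARALLEL", str(parallel), "/CHILD", str(child)]
        for child in range(1, parallel + 1)
    ]


def fill_storage_account(script_text, account_name):
    """Replace the storage account placeholder in the text of one U-SQL script.

    The leading @ is left untouched; only the bracketed token is replaced.
    """
    if PLACEHOLDER not in script_text:
        raise ValueError("storage account placeholder not found in script")
    return script_text.replace(PLACEHOLDER, account_name)
```

Each argument list could be passed to subprocess.run, and fill_storage_account could be applied to the text of every script in a subfolder before the scripts are submitted to Azure Data Lake Analytics.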
The subfolders containing the U-SQL scripts highlight different scenarios:
* all_single   These scripts create a single csv file per table containing all the source data.
* large_multiple   These scripts create four csv files for each of the large tables (catalog_returns, catalog_sales, inventory, store_returns, store_sales, web_returns, and web_sales) and a single csv file for each of the remaining tables.
* last_available_year   These scripts create a single csv file per table containing only the source data for the last year in the data set, which is the year 2003.
* Modelling   These scripts create a data set for modelling purposes with a single csv file per table containing up to 100 rows of data.
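As a quick reference, the differences between the four scenarios can be captured in a small Python table. This is a sketch based only on the descriptions above; the class, field, and function names are hypothetical and do not come from the U-SQL scripts themselves.

```python
from dataclasses import dataclass
from typing import Optional

# The large tables listed in the large_multiple description.
LARGE_TABLES = frozenset({
    "catalog_returns", "catalog_sales", "inventory", "store_returns",
    "store_sales", "web_returns", "web_sales",
})


@dataclass(frozen=True)
class Scenario:
    files_per_large_table: int       # csv files written for each large table
    max_rows: Optional[int] = None   # row cap per table, if any
    only_year: Optional[int] = None  # year filter, if any


# Keys are the subfolder names in lower case.
SCENARIOS = {
    "all_single": Scenario(files_per_large_table=1),
    "large_multiple": Scenario(files_per_large_table=4),
    "last_available_year": Scenario(files_per_large_table=1, only_year=2003),
    "modelling": Scenario(files_per_large_table=1, max_rows=100),
}


def expected_file_count(folder, table):
    """Return how many csv files a scenario produces for one table."""
    scenario = SCENARIOS[folder.lower()]
    # Every table outside the large set gets a single csv file.
    return scenario.files_per_large_table if table in LARGE_TABLES else 1
```

For example, expected_file_count("large_multiple", "store_sales") gives 4, while the same call for any table outside the large set gives 1.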