Azure-TDSP-ProjectTemplate/Sample_Data
deguhath 5d4e8f4c1a Changing folder names to be more self-explanatory 2017-09-09 19:13:29 -07:00
For_Modeling
Processed
Raw
README.md

README.md

The Sample_Data directory in the project Git repository is the place to store small SAMPLE datasets, NOT the entire datasets. If your client does not allow you to store even sample data in the GitHub repository, store, if possible, a sample dataset with all confidential fields hashed. If that is still not allowed, do not store sample data here, but please still fill in the table in each sub-directory.
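As an illustration of hashing confidential fields before committing a sample, here is a minimal sketch using only the Python standard library. The function name `hash_confidential_fields`, the field names, and the salt value are hypothetical and not part of this template; adapt them to your own data and to your client's requirements.

```python
import hashlib


def hash_confidential_fields(rows, confidential_fields, salt=""):
    """Return copies of `rows` with the named fields replaced by SHA-256 hex digests.

    `rows` is a list of dicts (one per record). The input is not modified.
    """
    hashed_rows = []
    for row in rows:
        hashed = dict(row)  # shallow copy so the original record stays intact
        for field in confidential_fields:
            if field in hashed and hashed[field] is not None:
                value = (salt + str(hashed[field])).encode("utf-8")
                hashed[field] = hashlib.sha256(value).hexdigest()
        hashed_rows.append(hashed)
    return hashed_rows


# Hypothetical records, for illustration only.
sample = [
    {"customer_id": "C001", "email": "a@example.com", "amount": 12.5},
    {"customer_id": "C002", "email": "b@example.com", "amount": 7.0},
]
hashed_sample = hash_confidential_fields(
    sample, ["customer_id", "email"], salt="project-secret"
)
```

Identical inputs hash to identical digests, so joins on a hashed key still work within the sample. Values with few possible options (for example, short IDs) can be recovered by guessing, so keep the salt out of the repository and confirm with the client that hashing is sufficient.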

The small sample datasets make your data preprocessing, feature engineering, and modeling scripts runnable. This lets anyone quickly run the scripts that process or model the data and understand what they do.
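One possible way to produce such a small sample is to draw a fixed number of records with a seeded random generator, so that the sample can be regenerated later. The sketch below is an assumption about how you might do this, not a procedure prescribed by this template; `draw_sample` and the synthetic `full_dataset` are hypothetical names.

```python
import random


def draw_sample(rows, sample_size, seed=0):
    """Return up to `sample_size` records drawn without replacement.

    A fixed `seed` makes the draw reproducible across runs.
    """
    if sample_size >= len(rows):
        return list(rows)  # dataset is already small enough
    rng = random.Random(seed)
    return rng.sample(rows, sample_size)


# Synthetic stand-in for a full dataset, for illustration only.
full_dataset = [{"id": i, "value": i * i} for i in range(1000)]
small_sample = draw_sample(full_dataset, 50, seed=42)
```

Recording the seed and the sample size alongside the sample makes it easier for others to see how it relates to the full dataset.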

Each sub-directory contains a markdown file that lists all datasets in that sub-directory. Please provide a link to the full dataset in case someone needs to access it.