Commit Graph

3 Commits

Author SHA1 Message Date
Chenhui Hu 0607fd568f First Release of Forecasting Repo (#181)
* Handled edge case where ts_id_col_names is None

* Split long line into separate lines

* Added notebook template

* Added a test yml file

* Added yml file for python unit test pipeline

* Minor update

* Minor update

* Minor update

* Minor update

* Removed triggers

* Removed triggers

* Created a base ts estimator and inherit BaseTSFeaturizer from the BaseTSEstimator.

* Refactored featurizer class hierarchy.

* Added week of month method.

* add script to source entire

* formatting

* source only test files

* Inherit temporal featurizers from BaseTSFeaturizer.

* Minor update.

* Replaced max_test_timestamp with max_horizon

* Refactored rolling window featurizers.

* Renamed hour_of_year feature to normalized_hour_of_year

* Inherit all normalizers from base normalizer class.

* address review comments for the PR of contributing

* minor update

* address review comments for PR of r test pipeline

* add a test yml file

* Remove checking target column existence, because testing data may not have the target column.

* Create setter and getter of ts_id_col_names.

* Fixed bug caused by unexpected behavior of pandas.shift

* Some code cleanup.

* Updated some featurizer names.

* Some minor changes in df_config and feature configs.

* Some minor changes in feature names.

* Added usage examples in docstring.

* Computation time update after feature engineering refactoring.

* Removed setting frequency.

* Added docstring to convert_to_tsdf function.

* Removed frequency in convert_to_tsdf call.

* Fixed week_of_month function.

* Added popularity featurizer

* Added utility function for checking Iterable but not string.

* Updated LightGBM feature engineering code to use new feature engineering classes.

* Improved checking whether input column names are Iterable and convert to list.

* Made future_value_available a read-only property.

* Minor docstring update.

* Removed extra space in docstring examples.

* Made some methods staticmethods.

* Minor QRF result update after feature engineering code change.

* Removed calling of validate_file and added catching of the exception

* Update python_unit_tests_base.yml for Azure Pipelines [skip ci]

Updated path of the test results

* Test if the download link is wrong

* Fixed minor format issues.

* Fixed minor format issues.

* Fixed formatting issues.

* Fixed line length.

* Removed data files before downloading and checked dimensions of energy data

* Removed the change made for testing

* Changed folder structure of tests and added table to show build status

* Added missing files

* Updated based on review comments

* new folder structure

* add repo metrics

* remove prototypes folder

* add models placeholder

* adjust featurizers to the new structure of folders

* changes in README and evaluation files

* adjust data download to new folders

* delete unnecessary files

* energy load baseline model with new folders

* delete data files

* fix links in benchmarks file

* fix bug

* adjust GBM, QRF and FNN submissions to the new folder structure

* Replace pd.to_timedelta with pd.offsets.

* Added get_offset_by_frequency helper function.

* fix small bugs

* fix small bugs

* Update TSCVSplitter.

* refactored high-level folders

* added a placeholder folder for PR/issue templates

* added subfolders under notebooks/

* updated tests folder

* renamed notebooks/ to examples/

* Update to CONTRIBUTING instructions (#34)

* style checking and formatting files

* git hook installation guide

* issue and PR templates

* minor change

* working with github instructions

* added specific issue templates

* addressed PR comments

* addressed Chenhui's comment

* addressing chenhuis comments

* conda environment file (#36)

* conda environment file

* updated environment file

* updated instructions for installing conda env

* Vapaunic/lib (#37)

* initial core for forecasting library

* syncing with new structure

* __init__ files in modules

* renamed lib directory

* Added legal headers and some formatting of py files

* restructured benchmarking directory in lib

* fixed imports, warnings, legal headers

* more import fixes and legal headers

* updated instructions with package installation

* barebones library README

* moved energy benchmark to contrib

* formatting changes plus more legal headers

* Added license to the setup

* moved .swp file to contrib, not sure we need to keep it at all

* added missing headers and a brief snippet to README file

* minor wording change in readme

* Chenhui/cpu unit test pipeline (#38)

* address review comments

* added full conda path

* minor change

* added conda to PATH

* added build status in README

* removed energy data prep placeholder notebook

* moved out data energy explore notebook into contrib

* moved data download script to tools/

* Added getting started section to readme

* Added rbase and rbayesm to conda environment

* modified data download script

* added instructions for data download

* renamed data download script

* fixing issues with test pipeline

* parsing issue in yml file

* cleaning up ci test yaml file for more diagnostic info

* fixed a missing argument in instructions

* removed retail directory under dataset module

* moved feature_engineering.py to the feature engineering module

* moved evaluate.py to evaluation module

* combined benchmark settings into a single file

* moved download script to the package and modified the tests

* modified instructions

* fixed the build pipeline yml

* fix to the pipeline yml

* fix to the pipeline yml

* moved serve_folds into ojdata.py

* removed data_schema.py file as all content moved to ojdata.py

* fixed split_train_test in ojdata.py

* moved retail_data_schema into ojdata.py

* moved all oj utilities to ojdata.py

* removed paths from benchmark_settings

* fixed up a docstring

* quick fix a typo

* removed benchmark_settings

* parameterized experiment settings

* refactored experiment settings

* Fixed docstrings

* addressed chenhuis comment around round file naming

* renamed experiment to forecast settings

* Chenhui/light gbm quick start (#40)

* initial example notebook for lightgbm

* reduced to one round forecast

* added text

* added text

* added text

* moved week_of_month to feature engineering utils

* moved df_from_cartesian_product to feature utils

* moved functions to feature utils

* moved functions to feature utils

* added lightgbm model utils

* updated plots

* added text and renamed predict function

* reduced print out frequency in model training

* moved data visualization code to utils

* added text

* updated plot function and added docstring

* renamed the notebook

* updated text

* added NOTICE file, currently empty as we're not redistributing any packages

* Chenhui/add scrapbook (#43)

* added scrapbook support

* Added gitpython to environment.yml file

* added git_repo_path function to utils

* updated notebook

* added test for lightgbm notebook

* included testing of notebooks

* resolve test error

* resolve test error

* added kernel name

* updated kernel name

* trying installing bayesm from cmd

* trying installing bayesm from cmd

* trying installing bayesm from cmd

* excluded notebook test

* excluded notebook test

* added lapack.so link fix

* included notebook tests

* excluded files for notebook test

Co-authored-by: vapaunic <15053814+vapaunic@users.noreply.github.com>

* added integration test

* added initial data prep notebook

* updated notebook

* updated notebook

* updated notebook

* updated url

* init

* model parameters

* removed blank quick start notebooks

* removed blank modeling notebooks

* removed blank evaluation notebooks

* Removed blank model selection notebooks

* removed blank o16n notebooks

* removed outdated text from contrib/README

* removed outdated swp file

* updating .gitignore

* removed change log, as we don't plan to maintain this

* Excluding irrelevant directories

* fix settings

* separated out the setup guide

* fix settings

* simplemodel init

* typo

* add rproj file

* Renaming forecasting_lib to fclib (#59)

* renamed forecasting_lib directory

* modified references to forecasting_lib

* Vapaunic/envname (#61)

* renamed conda env

* modified setup instructions

* minor change in contributing guide

* keep top-level gitignore only

* formatting fixes

* Chenhui/add automl example (#62)

* added multiple linear models and example notebook for AutoML

* removed commented code

* address review comments

* minor update to the notebook

* minor update to the notebook

* added text

* changed types in lightgbm to be consistent with the rest of the code

* modified docstrings in multiple_linear_regression.py

* updated ci yaml files

* changed import statement in conftest.py

* updated gitpython version to the latest

Co-authored-by: vapaunic <15053814+vapaunic@users.noreply.github.com>

* Vapaunic/split bug (#65)

* fixed a yield bug

* removed two blank files

* modified split data function to auto-calculate the splits based on the parameters

* removed forecast_settings module

* removed unused parameter

* modified splitting function to use non-overlapping testing

* tested the split function after the update

* minor fix

* defaults changed in split function

* modified lightgbm example with new split function

* modified automl example (needs verification)

* modified data explore notebook

* quick fix:

* updated data preparation notebook

* changed defaults in split function

* Addressed changes in lightgbm

* addressed issues in automl notebook

* fixed typo in lightgbm plot

* first images of time series split

* updated the pictures

* updated evaluation periods (#66)

* Chenhui/env setup script (#67)

* added a shell script for setting up environment

* changed yaml to yml

* added comments and updated SETUP.md

* modified data preparation notebook with images

* moved r exploration notebook to contrib directory

* modified data explore notebook, updated info about the data, and removed reference to TSPerf

* addressed review feedback and fixed the explore notebook

* Chenhui/multiround lightgbm (#68)

* added initial multiround notebook for lightgbm

* updated data splitting

* updated text

* updated week list

* addressed review comments

* added pyramid-automl to conda file

* first draft of arima notebook

* replace pyramid with pmdarima

* Added a complete function

* minor type

* forecasting across many stores/brands

* complete arima notebook

* renamed data preparation/exploration notebooks

* added git clone to setup

* addressed PR comments

* typo

* Arima to ARIMA

* fixed docstring in plot function

* fixed a bug in MAPE calculation and added plotting

* fixed a bug in predict

* modeling arima on log scale

* Fixing AML Example Notebook (#84)

* Cleaning notebook output, adding get_or_create workspace call, and fixing get_or_create AmlCompute

* Add regression-based models (#64)

* modelling updates

* code tweak

* rebuild

* update mape

* update mape 2

* new forecasting structure

* update eval

* rebuild dataprep

* rebuild with profit

* rm profit

* add plot

* typo

* tidy up

* expand readme

* oops

* clarified setup guide (#94)

* Update SETUP (#95)

minor fix

* Cleaned up unused files and directories (#96)

* removed non-used files

* moved docs into a docs/ dir

* fixed broken links

* Chenhui/dilated cnn example and utils (#76)

* added initial model util file for DCNN

* initial notebook

* added feature utils for DCNN

* updated evaluation and visualization

* removed plot function

* replaced PRED_HORIZON, PRED_STEPS by HORIZON, GAP

* removed log dir if it exists

* updated model utils

* generalized categorical features in dcnn model util

* generalized network definition

* update training code

* format with blackcellmagic

* address review comments and added README

* Chenhui/add ci tests (#146)

* Update conda env with versions (#99)

* 💥

* revert

* minor changes

Co-authored-by: Chenhui Hu <chenhhu@microsoft.com>

* Adding missing Jupyter Extension (#90)

* Update environment.yml

* specified version

Co-authored-by: Chenhui Hu <chenhhu@microsoft.com>

* fix links to examples/ (#104)

* Chenhui/rename notebooks and update automl notebook (#106)

* removed unused module

* added outputs in automl notebook

* fixed a notebook name

* Arima multi-round notebook (#91)

* working arima model

* final auto arima example

* added tqdm to requirements

* addressed review comments

* Revert "Chenhui/rename notebooks and update automl notebook (#106)" (#107)

This reverts commit 032c91d9bfa389f22ae1f1f2150913a4f063bd18 [formerly 15d25213dc].

Co-authored-by: Chenhui Hu <chenhhu@microsoft.com>

* Fixing data download issue (#109)

* removed dependency on __file__ from data download, doesn't work in jupyter

* changed aux to auxdata

* fixed data download function

* fixed path

* auxdata -> auxi

* adding tl;dr directions for setup to README.md (#88)

* adding tl;dr directions for setup to README.md

* added a bit more text

* Cleaned up obsolete (tsperf) code in fclib (#112)

* moved out tsperf files from evaluation module

* moved out tsperf tuning code

* removed more unused files

* Addressing documentation related issues (#111)

* Added conda activate to the setup readme

* added instructions for starting jupyter to setup

* minor

* deleted duplicate instructions

* addressed PR comments

* Chenhui/rename notebooks and updated AutoML example (#108)

* removed unused module

* added outputs in automl notebook

* fixed a notebook name

* updated pytest file

* address review comments

* reran notebook with blackcellmagic

* adding pylint (#93)

* adding tl;dr directions for setup to README.md

* removing pylint hook and pylint_junit from the env file

* removed pylint config file

* Chenhui/update example folder (#115)

* restructure examples folder

* updated readme

* added readme

* minor update

* removed R folder

* minor change

* fixed a broken link

* another broken link

* fixing notebook tests

* Chenhui/fix aux file path (#118)

* fixed figure links

* changed to auxi_i.csv

* minor change

* [MINOR] Small changes to Arima notebooks (#121)

* fixed a broken link

* minor text changes

* Documentation (#120)

* added target audience section

* added intro on forecasting

* Added fclib documentation

* improved examples readme

* address comments

* added info about the dataset

* added items to be ignored (#123)

* added items to be ignored

* added *.log and score.py

* Chenhui/toplevel readme (#127)

* added content table

* added references

* added external repo links

* minor update

* Chenhui/tune deploy lgbm (#122)

* added notebook and utils

* updated readme links

* fix data path

* updated text

* group imports

* minor update

* using azureml utils to create workspace and compute (#126)

* using azureml utils to create workspace and compute

* group imports

* Download ojdata directly from github (#128)

* new function to download and load oj data directly from bayesm repo

* removed bayesm

* new R function to only load the data

* removed download R function

* minor fix

* added documentation to load_oj_data.R

* added requests to requirements

* fixed a syntax error (#130)

* fix setup.md link (#129)

* fix setup.md link

* mention related use cases

* Vapaunic/cgbuild (#133)

* added files to generate reqs.txt and the ci yml file

* Added notice generation task

* Checking if notice is there

* Update component_governance.yml for Azure Pipelines

* check in notice file

* Update component_governance.yml for Azure Pipelines

* fixed heading

* Chenhui/windows setup (#131)

* initial test

* added batch script and instructions

* align image to center

* adjust image size

* added text

* adjust image size

* address comments

* Readds R material (#116)

* redo R stuff in new dirs

* dirname fixup

* add Rproj file

* rebuild

* fixups

* roxygenise

* copyright notice

* dataprep

* updated yaml

* more updates

* more tweaks

* reg models

* update reg models

* more updates

* reword

* rendered prophet html

* name fix

* add lintr file

* move stuff

* renamed use case folder (#138)

* renamed use case folder

* dirname change

* updated readme

* added notebooks

* fix ci test

* Vapaunic/featutils (#137)

* moved feature engineering module to contrib

* removed lag submod

* cleaned up feature engineering

* rebuild R notebooks (#139)

* Chenhui/toplevel readme (#140)

* added content table

* added references

* added external repo links

* minor update

* updated setup instructions

* added text

* align text

* removed duplicated Content section

* address review comments

* Chenhui/hyperdrive example update (#142)

* removed blackcellmagic

* removed utils under aml_scripts and updated notebook

* added notebook path

* added ci test of lightgbm multi round example

* make forecast round as parameter

* Make -Agent Name

* resolve duplicated function name

* increased time limit and reduce number of rounds

* increase time limit

* added parameters tag to multiround lightgbm and dilatedcnn

* README change (#147)

* minor change

* hide tags

* hide tags

* added parameters tag

* Revert "Chenhui/add ci tests (#146)" (#149)

This reverts commit de7a19cfa7637476b9ebfc92f5c18a26a8eca4da [formerly f8bd22733c].

* Chenhui/add ci tests (#150)

* Update conda env with versions (#99)

* 💥

* revert

* minor changes

Co-authored-by: Chenhui Hu <chenhhu@microsoft.com>

* Adding missing Jupyter Extension (#90)

* Update environment.yml

* specified version

Co-authored-by: Chenhui Hu <chenhhu@microsoft.com>

* fix links to examples/ (#104)

* Chenhui/rename notebooks and update automl notebook (#106)

* removed unused module

* added outputs in automl notebook

* fixed a notebook name

* Arima multi-round notebook (#91)

* working arima model

* final auto arima example

* added tqdm to requirements

* addressed review comments

* Revert "Chenhui/rename notebooks and update automl notebook (#106)" (#107)

This reverts commit 032c91d9bfa389f22ae1f1f2150913a4f063bd18 [formerly 15d25213dc].

Co-authored-by: Chenhui Hu <chenhhu@microsoft.com>

* Fixing data download issue (#109)

* removed dependency on __file__ from data download, doesn't work in jupyter

* changed aux to auxdata

* fixed data download function

* fixed path

* auxdata -> auxi

* adding tl;dr directions for setup to README.md (#88)

* adding tl;dr directions for setup to README.md

* added a bit more text

* Cleaned up obsolete (tsperf) code in fclib (#112)

* moved out tsperf files from evaluation module

* moved out tsperf tuning code

* removed more unused files

* Addressing documentation related issues (#111)

* Added conda activate to the setup readme

* added instructions for starting jupyter to setup

* minor

* deleted duplicate instructions

* addressed PR comments

* Chenhui/rename notebooks and updated AutoML example (#108)

* removed unused module

* added outputs in automl notebook

* fixed a notebook name

* updated pytest file

* address review comments

* reran notebook with blackcellmagic

* adding pylint (#93)

* adding tl;dr directions for setup to README.md

* removing pylint hook and pylint_junit from the env file

* removed pylint config file

* Chenhui/update example folder (#115)

* restructure examples folder

* updated readme

* added readme

* minor update

* removed R folder

* minor change

* fixed a broken link

* another broken link

* fixing notebook tests

* Chenhui/fix aux file path (#118)

* fixed figure links

* changed to auxi_i.csv

* minor change

* [MINOR] Small changes to Arima notebooks (#121)

* fixed a broken link

* minor text changes

* Documentation (#120)

* added target audience section

* added intro on forecasting

* Added fclib documentation

* improved examples readme

* address comments

* added info about the dataset

* added items to be ignored (#123)

* added items to be ignored

* added *.log and score.py

* Chenhui/toplevel readme (#127)

* added content table

* added references

* added external repo links

* minor update

* Chenhui/tune deploy lgbm (#122)

* added notebook and utils

* updated readme links

* fix data path

* updated text

* group imports

* minor update

* using azureml utils to create workspace and compute (#126)

* using azureml utils to create workspace and compute

* group imports

* Download ojdata directly from github (#128)

* new function to download and load oj data directly from bayesm repo

* removed bayesm

* new R function to only load the data

* removed download R function

* minor fix

* added documentation to load_oj_data.R

* added requests to requirements

* fixed a syntax error (#130)

* fix setup.md link (#129)

* fix setup.md link

* mention related use cases

* Vapaunic/cgbuild (#133)

* added files to generate reqs.txt and the ci yml file

* Added notice generation task

* Checking if notice is there

* Update component_governance.yml for Azure Pipelines

* check in notice file

* Update component_governance.yml for Azure Pipelines

* fixed heading

* Chenhui/windows setup (#131)

* initial test

* added batch script and instructions

* align image to center

* adjust image size

* added text

* adjust image size

* address comments

* Readds R material (#116)

* redo R stuff in new dirs

* dirname fixup

* add Rproj file

* rebuild

* fixups

* roxygenise

* copyright notice

* dataprep

* updated yaml

* more updates

* more tweaks

* reg models

* update reg models

* more updates

* reword

* rendered prophet html

* name fix

* add lintr file

* move stuff

* renamed use case folder (#138)

* renamed use case folder

* dirname change

* updated readme

* added notebooks

* fix ci test

* Vapaunic/featutils (#137)

* moved feature engineering module to contrib

* removed lag submod

* cleaned up feature engineering

* rebuild R notebooks (#139)

* Chenhui/toplevel readme (#140)

* added content table

* added references

* added external repo links

* minor update

* updated setup instructions

* added text

* align text

* removed duplicated Content section

* address review comments

* Chenhui/hyperdrive example update (#142)

* removed blackcellmagic

* removed utils under aml_scripts and updated notebook

* added notebook path

* added ci test of lightgbm multi round example

* make forecast round as parameter

* Make -Agent Name

* resolve duplicated function name

* increased time limit and reduce number of rounds

* increase time limit

* added parameters tag to multiround lightgbm and dilatedcnn

* README change (#147)

* minor change

* hide tags

* hide tags

* added parameters tag

* Revert "Chenhui/add ci tests (#150)" (#151)

This reverts commit 357453234088f2ebb8453bd8cd77527a1c6c2130 [formerly 21846168a7].

* Chenhui/Add CI tests for notebooks

This reverts commit 8a99549da8b9096b65130fd2f6634e2a217b2dd9 [formerly 89e986fe2c].

* minor update

* Added CI tests for example notebooks

* Update component governance pipeline

* Update component governance pipeline

* add ignored items

* Readds R material (#116)

* Chenhui/windows setup (#131)

* Vapaunic/featutils (#137)

* Chenhui/add CI tests for notebooks

* Vapaunic/arimaint (#154)

* modified conftests to add arima

* added tests

* modified notebooks with parameters

* Chenhui/code improvments (#157)

* updated docstring

* pinged package versions

* minor improvements

* minor improvement

* modified metrics to take any iterable (#158)

* improvement: using Ray to parallelize arima fitting (#159)

* using Ray to parallelize arima fitting

* added ray as dependency

* text about ray, disable warnings, and minor stuff

* scipy 1.4.1 or above

* reverting scipy, azuremlsdk issue

* minor mod

Co-authored-by: Vanja Paunic <15053814+vapaunic@users.noreply.github.com>

* chenhui/improve ray output (#166)

* modified arima multiround to run with ray (#167)

* Chenhui/improve doc (#168)

* minor changes

* remove redundancy

* updated text

* improved text in model tuning and deployment notebook

* clarify the data used

* updated text

* added description of the script

* add explanation of gaps in the curve

* add explanation of gaps in the curve

* updated text

* fix typos

* improve documentation and format

* Addressing a few issues around package dependencies (#169)

* synchronizing utils with other OSS AI repos

* exclude xlrd, leftover from tsperf

* exclude urllib3, leftover from tsperf

* moving tqdm to fclib as only used by lib at the moment

* included fclib dependencies in requirements.txt

* lower bounded package versions that we don't need specific versions of

* lower bound gitpython

* Chenhui/improve checking of run completion (#170)

* Chenhui/added ray dashboard (#171)

* Chenhui/update diagram (#172)

* update multiround training diagram

* minor change

* update diagram and minor change

* Addressing doc related issues (#173)

* taking out inventory optimization link

* pulled contributing out of docs

* Chenhui/ray windows (#177)

* add util to check if module exists

* use ray if available or use sequential training

* updated text

* updated text

* reduce code redundancy

* Chenhui/setup scripts (#178)

* move ray to linux setup script

* remove duplicated azureml-sdk to avoid errors

* add ray to ci yaml files

* update azureml-sdk

* update manual setup instructions

* minor change

* Chenhui/content table (#179)

* update readme

* minor change

* minor update

* Chenhui/multiround arima (#180)

* use ray if it is installed

* update text and reran notebook

* add reference

* Chenhui/dilatedcnn windows (#184)

* resolve format issues

* update log path and tensorboard path

* remove subprocess import

* fix path

* change env name to resolve pipeline failures

* Chenhui/hyperdrive windows (#185)

* resolve format issues

* update log path and tensorboard path

* remove subprocess import

* fetch common utils from chenhui/dilatedcnn_windows

* update notebook

* removed explain module and added notebooks module

* get updated ci yml files

* updated kernel name

* Chenhui/enhancement (#186)

* modified module_path

* updated tensorboard section

* rerun notebook

* only submit local run if python path is found

* minor change and rerun notebook

* updated content section (#187)

* updated content section

* minor change

* address comments

* add links

Co-authored-by: Hong Lu <honglu@microsoft.com>
Co-authored-by: ZhouFang928 <ZhouFang928@users.noreply.github.com>
Co-authored-by: pechyony <pechyony@outlook.com>
Co-authored-by: Ubuntu <chenhui@chhdsvmnc6.hyjxgt1qggauhj0g0g2jh3guwb.bx.internal.cloudapp.net>
Co-authored-by: vapaunic <15053814+vapaunic@users.noreply.github.com>
Co-authored-by: Hong Ooi <hongooi@microsoft.com>
Co-authored-by: Daniel Ciborowski <dciborow@microsoft.com>
Co-authored-by: Markus Cozowicz <marcozo@microsoft.com>
Former-commit-id: 6098ecf68c
2020-04-06 16:17:18 -04:00
hlums 3784befb89 Added energy data directory to .gitignore 2018-08-10 15:34:05 -04:00
angusrtaylor 0be9934e53 initial commit 2018-06-21 17:10:20 +00:00