microsoft/forecasting - forecasting - Git

29 lines

249 B

Plaintext

First Release of Forecasting Repo (#181)

Co-authored-by: Hong Lu <honglu@microsoft.com>, ZhouFang928 <ZhouFang928@users.noreply.github.com>, pechyony <pechyony@outlook.com>, Ubuntu <chenhui@chhdsvmnc6.hyjxgt1qggauhj0g0g2jh3guwb.bx.internal.cloudapp.net>, vapaunic <15053814+vapaunic@users.noreply.github.com>, Hong Ooi <hongooi@microsoft.com>, Daniel Ciborowski <dciborow@microsoft.com>, Markus Cozowicz <marcozo@microsoft.com>

Former-commit-id: 6098ecf68c12da3f7f839e6b034af34e2438421d

2020-04-06 23:17:18 +03:00

			`**/__pycache__`
			`**/.ipynb_checkpoints`
			`*.egg-info/`
			`.vscode/`
			`*.pkl`
			`*.h5`

			`# Data`
			`ojdata/*`
			`*.Rdata`

			`# AML Config`
			`aml_config/`
			`.azureml/`
			`.config/`

			`# Pytests`
			`.pytest_cache/`

			`# File for model deployment`
			`score.py`

			`# Environments`
			`myenv.yml`

			`# Logs`
			`logs/`
			`*.log`