maro/tests/citi_bike
Jinyu-W fa092f35b1
V0.2 update (#262)
* refine readme

* feat: refine data push/pull (#138)

* feat: refine data push/pull

* test: add cli provision testing

* fix: style fix

* fix: add necessary comments

* fix: from code review

* add fall back function in weather download (#112)

* fix deployment issue in multi envs

* fix typo

* fix ~/.maro not exist issue in build

* skip deploy when build

* update for comments

* temporarily disable weather info

* replace ecr with cim in setup.py

* replace ecr in manifest

* remove weather check when read data

* fix station id issue

* fix format

* add TODO in comments

* add noaa weather source

* fix weather reset and weather comment

* add comment for weather data url

* some format update

* add fall back function in weather download

* update comment

* update for comments

* update comment

* add period

* fix for pylint

* update for pylint check

* added example docs (#136)

* added example docs

* added citibike greedy example doc

* modified citibike doc

* fixed PR comments

* fixed more PR comments

* fixed small formatting issue

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* switch the key and value of handler_dict in decorator (#144)

* switch the key and value of handler_dict in decorator

* add dist decorator UT and fixed multithreading conflict in maro test suite

* pr comments update.

* resolved comments about decorator UT

* rename handler_fun in dist decorator

* change self.attr into class_name.attr

* update UT tests comments

* V0.1 annotation (#147)

* refine the annotation of simulator core

* remove reward from env(be)

* format refined

* white spaces test

* left-padding spaces refined

* format modified

* update the left-padding spaces of docstrings

* code format updated

* update according to comments

* update according to PR comments

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Event payload details for env.summary (#156)

* key_list of events added for env.summary

* code refined according to lint

* 2 kinds of Payload added for CIM scenario; citi bike summary refined according to comments

* code format refined

* try to trigger the git tests

* update github workflow

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* V0.2 online lp for citi bike (#159)

* key_list of events added for env.summary

* code refined according to lint

* 2 kinds of Payload added for CIM scenario; citi bike summary refined according to comments

* code format refined

* try to trigger the git tests

* update github workflow

* online LP example added for citi bike

* infeasible solution

* infeasible solution fixed: call snapshot before any env.step()

* experiment results of toy topos added

* experiment results of toy topos added

* experiment result update: better than naive baseline

* PuLP version added

* greedy experiment results update

* citibike result update

* modified according to PR comments

* update experiment results and forecasting comparison

* citi bike lp README updated

* README updated

* modified according to PR comments

* update according to PR comments

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu Wang <jinywan@microsoft.com>

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formatting issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM example

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* update according to flake8

* V0.2 Logical operator overloading for EarlyStoppingChecker (#178)

* 1. added logical operator overloading for early stopping checker; 2. added mean value checker

* fixed PR comments

* removed learner.exit() in single_process_launcher

* added another early stopping checker in example

* fixed PR comments and lint issues

* lint issue fix

* fixed lint issues

* fixed a bug

* fixed a bug

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 skip connection (#176)

* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* moved reward type casting to exp shaper

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* fixed a bug in learner's test() (#193)

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 double dqn (#188)

* added dueling action value model

* renamed params in dueling_action_value_model

* renamed shared_features to features

* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* mv dueling_action_value_model and fixed some bugs

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* added double DQN and dueling features to DQN

* fixed a bug

* added DuelingQModelHead enum

* fixed a bug

* removed unwanted file

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* fixed PR comments

* revised cim example according to DQN changes

* renamed eval_model to q_value_model in cim example

* more fixes

* fixed a bug

* fixed a bug

* added doc per PR comments

* removed learner.exit() in single_process_launcher

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* double DQN feature

* fixed a bug

* fixed a bug

* fixed PR comments

* fixed lint issue

* 1. fixed PR comments related to load/dump; 2. removed abstract load/dump methods from AbsAlgorithm

* added load_models in simple_learner

* minor docstring edits

* minor docstring edits

* set is_double to true in DQN config

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>

* V0.2 feature predefined image (#183)

* feat: support predefined image provision

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* fix: error scripts invocation after using relative import

* fix: missing init.py

* fixed a bug in learner's test()

* feat: add distributed_config for dqn example

* test: update test for grass

* test: update test for k8s

* feat: add promptings for steps

* fix: change relative imports to absolute imports

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>

* V0.2 feature proxy rejoin (#158)

* update dist decorator

* replace proxy.get_peers by proxy.peers

* update proxy rejoin (draft, not runnable for proxy rejoin)

* fix bugs in proxy

* add message cache, and redesign rejoin parameter

* feat: add checkpoint with test

* update proxy.rejoin

* fixed rejoin bug, rename func

* add test example(temp)

* feat: add FaultToleranceAgent, refine other MasterAgents and NodeAgents.

* capitalize env variable names

* rm json.dumps; change retries to 10; temp add warning level for rejoin

* fix: unable to load FaultToleranceAgent, missing params

* fix: delete mapping in StopJob if FaultTolerance is activated, add exception handler for FaultToleranceAgent

* feat: add node_id to node_details

* fix: add a new dependency for tests

* style: meet linting requirements

* style: remaining linting problems

* lint fixed; rm temp test folder.

* fixed lint f-string without placeholder

* fix: add a flag for "remove_container", refine restart logic and Redis keys naming

* proxy rejoin update.

* variable rename.

* fixed lint issues

* fixed lint issues

* add exit code for different error

* feat: add special errors handler

* add max rejoin times

* remove unused import

* add rejoin UT; resolve rejoin comments

* lint fixed

* fixed UT import problem

* rm MessageCache in proxy

* fix: refine key naming

* update proxy rejoin; add topic for broadcast

* feat: support predefined image provision

* update UT for communication

* add docstring for rejoin

* fixed isort and zmq driver import

* fixed isort and UT test

* fix isort issue

* proxy rejoin update (comments v2)

* fixed isort error

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* feat: add exists method for checkpoint

* fix: error scripts invocation after using relative import

* fix: missing init.py

* fixed a bug in learner's test()

* add driver close and socket SUB disconnect for rejoin

* feat: add distributed_config for dqn example

* test: update test for grass

* test: update test for k8s

* feat: add promptings for steps

* fix: change relative imports to absolute imports

* fixed comments and update logger level

* mv driver in proxy.__init__ for issue temp fixed.

* Update docstring and comments

* style: fix code reviews problems

* fix code format

Co-authored-by: Lyuchun Huang <romic.kid@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 feature cli windows (#203)

* fix: change local mkdir to os.makedirs

* fix: add utf8 encoding for logger

* fix: add powershell.exe prefix to subprocess functions

* feat: add debug_green

* fix: use fsutil to create fix-size files in Windows

* fix: use universal_newlines=True to handle encoding problem in different operating systems

* fix: use temp file to do copy when the operating system is not Linux

* fix: linting error

* fix: use fsutil in test_k8s.py

* feat: dynamic init ABS_PATH in GlobalParams

* fix: use -Command to execute Powershell command

* fix: refine code style in k8s_azure_executor.py, add Windows support for k8s mode

* fix: problems in code review

* EventBuffer refine (#197)

* merge uniform event changes back

* 1st step: move executing events into stack for better removing performance

* flush event pool

* typo

* add option for env to enable event pool

* refine stack functions

* fix comment issues, add typings

* lint fixing

* lint fix

* add missing fix

* linting

* lint

* use linked list instead of the original event list and execute stack

* add missing file

* linting, and fixes

* add missing file

* linting fix

* fixing comments

* add missing file

* rename event_list to event_linked_list

* correct import path

* change enable_event_pool to disable_finished_events

* add missing file

* V0.2 merge master (#214)

* fix the visualization of docs/key_components/distributed_toolkit

* add examples into isort ignore

* refine import path for examples (#195)

* refine import path for examples

* refine indents

* fixed formatting issues

* update code style

* add editorconfig-checker, add editorconfig path into lint, change super-linter version

* change path for code saving in cim.gnn

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>

* fix issue that sometimes there is conflict between distutils and setuptools (#208)

* fix issue that cython and setuptools conflict

* follow the accepted temp workaround

* update comment, it should be conflict between setuptools and distutils

* fixed bugs related to proxy interface changes

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>

* typo fix

* Bug fix: event buffer issue that caused Actions not to be passed into the business engine (#215)

* bug fix

* clear the reference after extract sub events, update ut to cover this issue

Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* fix flake8 style problem

* V0.2 feature refine mode namings (#212)

* feat: refine cli exception

* feat: refine mode namings

* EventBuffer refine (#197)

* merge uniform event changes back

* 1st step: move executing events into stack for better removing performance

* flush event pool

* typo

* add option for env to enable event pool

* refine stack functions

* fix comment issues, add typings

* lint fixing

* lint fix

* add missing fix

* linting

* lint

* use linked list instead of the original event list and execute stack

* add missing file

* linting, and fixes

* add missing file

* linting fix

* fixing comments

* add missing file

* rename event_list to event_linked_list

* correct import path

* change enable_event_pool to disable_finished_events

* add missing file

* fixed bugs in dist rl

* feat: rename files

* tests: set longer gracefully wait time

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* fix: rm redundant variables

* fix: refine error message

Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 vis new (#210)

Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>

* V0.2 local host process (#221)

* Update local process (not ready)

* update cli process mode

* add setup/clear/template for maro process

* fix process stop

* add logger and rename parameters

* add logger for setup/clear

* fixed closing a non-existent pid when given a pid list.

* Fixed comments and rename setup/clear with create/delete

* update ProcessInternalError

* V0.2 grass on premises (#220)

* feat: refine cli exception
* commit on v0.2_grass_on_premises

Co-authored-by: Lyuchun Huang <romic.kid@gmail.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 vm scheduling scenario (#189)

* Initialize

* Data center scenario init

* Code style modification

* V0.2 event buffer subevents expand (#180)

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formatting issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM example

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* unfold sub-events, insert after parent

* remove event category, use different class instead, add helper functions to gen decision and action event

* add a method to support add immediate event to cascade event with tick validation

* fix ut issue

* add action as 1st sub event to ensure the executing order

Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Data center scenario update

* Code style update

* Data scenario business engine update

* Isort update

* Fix lint code check

* Fix based on PR comments.

* Update based on PR comments.

* Add decision payload

* Add config file

* Update utilization series logic

* Update based on PR comment

* Update based on PR

* Update

* Update

* Add the ValidPm class

* Update docs string and naming

* Add energy consumption

* Lint code fixed

* Refining postpone function

* Lint style update

* Init data pipeline

* Update based on PR comment

* Add data pipeline download

* Lint style update

* Code style fix

* Temp update

* Data pipeline update

* Add aria2p download function

* Update based on PR comment

* Update based on PR comment

* Update based on PR comment

* Update naming of variables

* Rename topology

* Renaming

* Fix valid pm list

* Pylint fix

* Update comment

* Update docstring and comment

* Fix init import

* Update tick issue

* fix merge problem

* update style

* V0.2 datacenter data pipeline (#199)

* Data pipeline update

* Data pipeline update

* Lint update

* Update pipeline

* Add vmid mapping

* Update lint style

* Add VM data analytics

* Update notebook

* Add binary converter

* Modify vmtable yaml

* Update binary meta file

* Add cpu reader

* random example added for data center

* Fix bugs

* Fix pylint

* Add launcher

* Fix pylint

* best fit policy added

* Add reset

* Add config

* Add config

* Modify action object

* Modify config

* Fix naming

* Modify config

* Add snapshot list

* Modify a spelling typo

* Update based on PR comments.

* Rename scenario to vm scheduling

* Rename scenario

* Update print messages

* Lint fix

* Lint fix

* Rename scenario

* Modify the calculation of cpu utilization

* Add comment

* Modify data pipeline path

* Fix typo

* Modify naming

* Add unittest

* Add comment

* Unify naming

* Fix data path typo

* Update comments

* Update snapshot features

* Add take snapshot

* Add summary keys

* Update cpu reader

* Update naming

* Add unit test

* Rename snapshot node

* Add processed data pipeline

* Modify config

* Add comment

* Lint style fix

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Add package used in vm_scheduling

* add aria2p to test requirement

* best fit example: update the usage of snapshot

* Add aria2p to test requirement

* Remove finish event

* Fix unittest

* Add test dataset

* Update based on PR comment

* Refine cpu reader and unittest

* Lint update

* Refine based on PR comment

* Add agent index

* Add node mapping

* Refine based on PR comments

* Renaming postpone_step

* Renaming and refine based on PR comments

* Rename config

* Update

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* Resolve none action problem (#224)

* V0.2 vm_scheduling notebook (#223)

* Initialize

* Data center scenario init

* Code style modification

* V0.2 event buffer subevents expand (#180)

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formatting issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM example

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* unfold sub-events, insert after parent

* remove event category, use different class instead, add helper functions to gen decision and action event

* add a method to support add immediate event to cascade event with tick validation

* fix ut issue

* add action as 1st sub event to ensure the executing order

Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Data center scenario update

* Code style update

* Data scenario business engine update

* Isort update

* Fix lint code check

* Fix based on PR comments.

* Update based on PR comments.

* Add decision payload

* Add config file

* Update utilization series logic

* Update based on PR comment

* Update based on PR

* Update

* Update

* Add the ValidPm class

* Update docs string and naming

* Add energy consumption

* Lint code fixed

* Refining postpone function

* Lint style update

* Init data pipeline

* Update based on PR comment

* Add data pipeline download

* Lint style update

* Code style fix

* Temp update

* Data pipeline update

* Add aria2p download function

* Update based on PR comment

* Update based on PR comment

* Update based on PR comment

* Update naming of variables

* Rename topology

* Renaming

* Fix valid pm list

* Pylint fix

* Update comment

* Update docstring and comment

* Fix init import

* Update tick issue

* fix merge problem

* update style

* V0.2 datacenter data pipeline (#199)

* Data pipeline update

* Data pipeline update

* Lint update

* Update pipeline

* Add vmid mapping

* Update lint style

* Add VM data analytics

* Update notebook

* Add binary converter

* Modify vmtable yaml

* Update binary meta file

* Add cpu reader

* random example added for data center

* Fix bugs

* Fix pylint

* Add launcher

* Fix pylint

* best fit policy added

* Add reset

* Add config

* Add config

* Modify action object

* Modify config

* Fix naming

* Modify config

* Add snapshot list

* Modify a spelling typo

* Update based on PR comments.

* Rename scenario to vm scheduling

* Rename scenario

* Update print messages

* Lint fix

* Lint fix

* Rename scenario

* Modify the calculation of cpu utilization

* Add comment

* Modify data pipeline path

* Fix typo

* Modify naming

* Add unittest

* Add comment

* Unify naming

* Fix data path typo

* Update comments

* Update snapshot features

* Add take snapshot

* Add summary keys

* Update cpu reader

* Update naming

* Add unit test

* Rename snapshot node

* Add processed data pipeline

* Modify config

* Add comment

* Lint style fix

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Add package used in vm_scheduling

* add aria2p to test requirement

* best fit example: update the usage of snapshot

* Add aria2p to test requirement

* Remove finish event

* Fix unittest

* Add test dataset

* Update based on PR comment

* Refine cpu reader and unittest

* Lint update

* Refine based on PR comment

* Add agent index

* Add node mapping

* Init vm scheduling notebook

* Add notebook

* Refine based on PR comments

* Renaming postpone_step

* Renaming and refine based on PR comments

* Rename config

* Update based on the v0.2_datacenter

* Update notebook

* Update

* update filepath

* notebook updated

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* Update process mode docs and fixed on premises (#226)

* V0.2 Add github workflow integration (#222)

* test: add github workflow integration

* fix: split procedures && bug fixed

* test: add training only restriction

* fix: add 'approved' restriction

* fix: change default ssh port to 22

* style: in one line

* feat: add timeout for Subprocess.run

* test: change default node_size to Standard_D2s_v3

* style: refine style

* fix: add ssh_port param to on-premises mode

* fix: add missing init.py

* V0.2 explorer (#198)

* overhauled exploration abstraction

* fixed a bug

* fixed a bug

* fixed a bug

* added exploration related methods to abs_agent

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* separated learning with exploration schedule and without

* small fixes

* moved explorer logic to actor side

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* removed unwanted param from simple agent manager

* added noise explorer

* fixed formatting

* removed unnecessary comma

* fixed PR comments

* removed unwanted exception and imports

* fixed a bug

* fixed PR comments

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issue

* fixed a bug

* fixed lint issue

* fixed naming

* combined exploration param generation and early stopping in scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* moved logger inside scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* removed epsilon parameter from choose_action

* fixed some PR comments

* fixed some PR comments

* bug fix

* bug fix

* bug fix

* removed explorer abstraction from agent

* refined dqn example

* fixed lint issues

* simplified scheduler

* removed early stopping from CIM dqn example

* removed early stopping from cim example config

* renamed early_stopping_callback to early_stopping_checker

* removed action_dim from noise explorer classes and added some shape checks

* modified NoiseExplorer's __call__ logic to batch processing

* made NoiseExplorer's __call__ return type np array

* renamed update to set_parameters in explorer

* fixed old naming in test_grass

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 embedded optim (#191)

* added dueling action value model

* renamed params in dueling_action_value_model

* renamed shared_features to features

* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* mv dueling_action_value_model and fixed some bugs

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* added double DQN and dueling features to DQN

* fixed a bug

* added DuelingQModelHead enum

* fixed a bug

* removed unwanted file

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* fixed PR comments

* revised cim example according to DQN changes

* renamed eval_model to q_value_model in cim example

* more fixes

* fixed a bug

* fixed a bug

* added doc per PR comments

* removed learner.exit() in single_process_launcher

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* double DQN feature

* fixed a bug

* fixed a bug

* fixed PR comments

* fixed lint issue

* embedded optimizer into SingleHeadLearningModel

* 1. fixed PR comments related to load/dump; 2. removed abstract load/dump methods from AbsAlgorithm

* added load_models in simple_learner

* minor docstring edits

* minor docstring edits

* minor docstring edits

* mv optimizer options inside LearningModel

* modified example accordingly

* fixed a bug

* fixed a bug

* fixed a bug

* added dueling DQN feature

* revised and refined docstrings

* fixed a bug

* fixed lint issues

* added load/dump functions to LearningModel

* fixed a bug

* fixed a bug

* fixed lint issues

* refined DQN docstrings

* removed load/dump functions from DQN

* added task validator

* fixed decorator use

* fixed a typo

* fixed a bug

* fixed lint issues

* changed LearningModel's step() to take a single loss

* revised learning model design

* revised example

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* added decorator utils to algorithm

* fixed a bug

* renamed core_model to model

* fixed a bug

* 1. fixed lint formatting issues; 2. refined learning model docstrings

* rm trailing whitespaces

* added decorator for choose_action

* fixed a bug

* fixed a bug

* fixed version-related issues

* renamed add_zeroth_dim decorator to expand_dim

* overhauled exploration abstraction

* fixed a bug

* fixed a bug

* fixed a bug

* added exploration related methods to abs_agent

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* separated learning with exploration schedule and without

* small fixes

* moved explorer logic to actor side

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* removed unwanted param from simple agent manager

* small fixes

* added shared_module property to LearningModel

* added shared_module property to LearningModel

* revised __getstate__ for LearningModel

* fixed a bug

* added soft_update function to learningModel

* fixed a bug

* revised learningModel

* rm __getstate__ and __setstate__ from LearningModel

* added noise explorer

* fixed formatting

* removed unnecessary comma

* removed unnecessary comma

* fixed PR comments

* removed unwanted exception and imports

* removed unwanted exception and imports

* fixed a bug

* fixed PR comments

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issue

* fixed a bug

* fixed lint issue

* fixed naming

* combined exploration param generation and early stopping in scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* moved logger inside scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* removed epsilon parameter from choose_action

* removed epsilon parameter from choose_action

* changed agent manager's train parameter to experience_by_agent

* fixed some PR comments

* renamed zero_grad to zero_gradients in LearningModule

* fixed some PR comments

* bug fix

* bug fix

* bug fix

* removed explorer abstraction from agent

* added DEVICE env variable as first choice for torch device

* refined dqn example

* fixed lint issues

* removed unwanted import in cim example

* updated cim-dqn notebook

* simplified scheduler

* edited notebook according to merged scheduler changes

* refined dimension check for learning module manager and removed num_actions from DQNConfig

* bug fix for cim example

* added notebook output

* removed early stopping from CIM dqn example

* removed early stopping from cim example config

* moved decorator logic inside algorithms

* renamed early_stopping_callback to early_stopping_checker

* removed action_dim from noise explorer classes and added some shape checks

* modified NoiseExplorer's __call__ logic to batch processing

* made NoiseExplorer's __call__ return type np array

* renamed update to set_parameters in explorer

* fixed old naming in test_grass

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 VM scheduling docs (#228)

* Initialize

* Data center scenario init

* Code style modification

* V0.2 event buffer subevents expand (#180)

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formatting issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM example

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* unfold sub-events, insert after parent

* remove event category, use different class instead, add helper functions to gen decision and action event

* add a method to support adding an immediate event to a cascade event with tick validation

* fix ut issue

* add action as 1st sub event to ensure the executing order

Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Data center scenario update

* Code style update

* Data scenario business engine update

* Isort update

* Fix lint code check

* Fix based on PR comments.

* Update based on PR comments.

* Add decision payload

* Add config file

* Update utilization series logic

* Update based on PR comment

* Update based on PR

* Update

* Update

* Add the ValidPm class

* Update docs string and naming

* Add energy consumption

* Lint code fixed

* Refining postpone function

* Lint style update

* Init data pipeline

* Update based on PR comment

* Add data pipeline download

* Lint style update

* Code style fix

* Temp update

* Data pipeline update

* Add aria2p download function

* Update based on PR comment

* Update based on PR comment

* Update based on PR comment

* Update naming of variables

* Rename topology

* Renaming

* Fix valid pm list

* Pylint fix

* Update comment

* Update docstring and comment

* Fix init import

* Update tick issue

* fix merge problem

* update style

* V0.2 datacenter data pipeline (#199)

* Data pipeline update

* Data pipeline update

* Lint update

* Update pipeline

* Add vmid mapping

* Update lint style

* Add VM data analytics

* Update notebook

* Add binary converter

* Modify vmtable yaml

* Update binary meta file

* Add cpu reader

* random example added for data center

* Fix bugs

* Fix pylint

* Add launcher

* Fix pylint

* best fit policy added

* Add reset

* Add config

* Add config

* Modify action object

* Modify config

* Fix naming

* Modify config

* Add snapshot list

* Modify a spelling typo

* Update based on PR comments.

* Rename scenario to vm scheduling

* Rename scenario

* Update print messages

* Lint fix

* Lint fix

* Rename scenario

* Modify the calculation of cpu utilization

* Add comment

* Modify data pipeline path

* Fix typo

* Modify naming

* Add unittest

* Add comment

* Unify naming

* Fix data path typo

* Update comments

* Update snapshot features

* Add take snapshot

* Add summary keys

* Update cpu reader

* Update naming

* Add unit test

* Rename snapshot node

* Add processed data pipeline

* Modify config

* Add comment

* Lint style fix

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Add package used in vm_scheduling

* add aria2p to test requirement

* best fit example: update the usage of snapshot

* Add aria2p to test requirement

* Remove finish event

* Fix unittest

* Add test dataset

* Update based on PR comment

* vm doc init

* Update docs

* Update docs

* Update docs

* Update docs

* Remove old notebook

* Update docs

* Update docs

* Add figure

* Update docs

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* v0.2 VM Scheduling docs refinement (#231)

* Fix typo

* Refining vm scheduling docs

* V0.2 store refinement (#234)

* updated docs and images for rl toolkit

* 1. fixed import formats for maro/rl; 2. changed decorators to hypers in store

* fixed lint issues

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Fix bug (#237)

vm scenario: fix the event type bug of the postpone event

* V0.2 rl toolkit doc (#235)

* updated docs and images for rl toolkit

* updated cim example doc

* updated cim example docs

* updated cim example rst

* updated rl_toolkit and cim example docs

* replaced q_module with q_net in example rst

* refined doc

* refined doc

* updated figures

* updated figures

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Merge V0.2 vis into V0.2 (#233)

* Implemented dump snapshots and convert to CSV.

* Let BE support params when dumping snapshots.

* Refactor dump code to core.py

* Implemented decision event dump.

* replace is not '' with !=''

* Fixed issues that code review mentioned.

* removed path from hello.py

* Changed import sort.

* Fix import sorting in citi_bike/business_engine

* visualization 0.1

* Updated lint configurations.

* Fixed formatting error that caused lint errors.

* render html title function

* Try to fix lint errors.

* flake-8 style fix

* remove space around 18,35

* dump_csv_converter.py re-formatting.

* files re-formatting.

* style fixed

* tab delete

* white space fix

* white space fix-2

* vis redundant function delete

* refine

* re-formatting after merged upstream.

* Updated import section.

* Updated import section.

* pr refine

* isort fix

* white space

* lint error

* \n error

* test continuation

* indent

* continuation of indent

* indent 0.3

* comment update

* comment update 0.2

* f-string update

* f-string 0.2

* lint 0.3

* lint 0.4

* lint 0.4

* lint 0.5

* lint 0.6

* docstring update

* data version deploy update

* condition update

* add whitespace

* V0.2 vis dump feature enhancement. (#190)

* Dumps added manifest file.
* Code updated format by flake8
* Changed manifest file format for easy reading.

* deploy info update; docs update

* weird white space

* Update dashboard_visualization.md

* new endline?

* delete dependency

* delete irrelevant file

* change scenario to enum, divide file path into a separated class

* doc refine

* doc update

* params type

* data structure update

* doc&enum, formula refine

* refine

* add ut, refine doc

* style refine

* isort

* strong type fix

* os._exit delete

* revert datalib

* import new line

* change test case

* change file name & doc

* change deploy path

* delete params

* revert file

* delete duplicate file

* delete single process

* update naming

* manually change import order

* delete blank

* edit error

* requirement txt

* style fix & refine

* comments&docstring refine

* add parameter name

* test & dump

* comments update

* Added manifest file. (#201)

Only a few changes that need to meet requirements of manifest file format.

* comments fix

* delete toolkit change

* doc update

* citi bike update

* deploy path

* datalib update

* revert datalib

* revert

* maro file format

* comments update

* doc update

* update param name

* doc update

* new link

* image update

* V0.2 visualization-0.1 (#181)

* visualization 0.1

* render html title function

* flake-8 style fix

* style fixed

* tab delete

* white space fix

* white space fix-2

* vis redundant function delete

* refine

* pr refine

* isort fix

* white space

* lint error

* \n error

* test continuation

* indent

* continuation of indent

* indent 0.3

* comment update

* comment update 0.2

* f-string update

* f-string 0.2

* lint 0.3

* lint 0.4

* lint 0.4

* lint 0.5

* lint 0.6

* docstring update

* data version deploy update

* condition update

* add whitespace

* deploy info update; docs update

* weird white space

* Update dashboard_visualization.md

* new endline?

* delete dependency

* delete irrelevant file

* change scenario to enum, divide file path into a separated class

* fix the visualization of docs/key_components/distributed_toolkit

* doc refine

* doc update

* params type

* add examples into isort ignore

* data structure update

* doc&enum, formula refine

* refine

* add ut, refine doc

* style refine

* isort

* strong type fix

* os._exit delete

* revert datalib

* import new line

* change test case

* change file name & doc

* change deploy path

* delete params

* revert file

* delete duplicate file

* delete single process

* update naming

* manually change import order

* delete blank

* edit error

* requirement txt

* style fix & refine

* comments&docstring refine

* add parameter name

* test & dump

* comments update

* comments fix

* delete toolkit change

* doc update

* citi bike update

* deploy path

* datalib update

* revert datalib

* revert

* maro file format

* comments update

* doc update

* update param name

* doc update

* new link

* image update

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>

* image change

* add reset snapshot

* delete dump

* add new line

* add next steps

* import change

* relative import

* add init file

* import change

* change utils file

* change cliexception to clierror

* dashboard test

* change result

* change assertion

* move not

* unit test change

* core change

* unit test delete name_mapping_file

* update cim business engine

* doc update

* change relative path

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* change import sequence

* comments update

* doc add pic

* add dependency

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* Update dashboard_visualization.rst

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* delete white space

* doc update

* doc update

* update doc

* update doc

* update doc

Co-authored-by: Michael Li <mic_lee2000@hotmail.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>
Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* V0.2 docs process mode (#230)

* Update process mode docs and fixed on premises

* Update orchestration docs

* Update process mode docs add JOB_NAME as env variable

* fixed bugs

* fixed isort issue

* update docs index

Co-authored-by: kaiqli <v-kaiqli@microsoft.com>

* V0.2 learning model refinement (#236)

* moved optimizer options to LearningModel

* typo fix

* fixed lint issues

* updated notebook

* misc edits

* 1. renamed CIMAgent to DQNAgent; 2. moved create_dqn_agents to Agent section in notebook

* renamed single_host_cim_learner to cim_learner in notebook

* updated notebook output

* typo fix

* removed dimension check in absence of shared stack

* fixed a typo

* fixed lint issues

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Update vm docs (#241)

Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* V0.2 info update (#240)

* update readme

* update version

* refine readme format

* add vis gif

* add citation

* update citation

* update badge

Co-authored-by: Arthur Jiang <sjian@microsoft.com>

* Fix typo (#242)

* Fix typo

* fix typo

* fix

* syntax fix (#253)

* syntax fix

* syntax fix

* syntax fix

* rm unwanted import

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 vm oversubscription (#246)

* Remove topology

* Update pipeline

* Update pipeline

* Update pipeline

* Modify metafile

* Add two attributes of VM

* Update pipeline

* Add vm category

* Add todo

* Add oversub config

* Add oversubscription feature

* Lint fix

* Update based on PR comment.

* Update pipeline

* Update pipeline

* Update config.

* Update based on PR comment

* Update

* Add pm sku feature

* Add sku setting

* Add sku feature

* Lint fix

* Lint style

* Update sku, overloading

* Lint fix

* Lint style

* Fix bug

* Modify config

* Remove sku and replace it with pm type

* Add and refactor vm category

* Comment out config

* Unify the enum format

* Fix lint style

* Fix import order

* Update based on PR comment

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* V0.2 vm scheduling decision event (#257)

* Fix data preparation bug

* Add frame index

* V0.2 PG, K-step and lambda return utils  (#155)

* fixed a bug

* fixed lint issues

* added load/dump functions to LearningModel

* fixed a bug

* fixed a bug

* fixed lint issues

* merged with v0.2_embedded_optims

* refined DQN docstrings

* removed load/dump functions from DQN

* added task validator

* fixed decorator use

* fixed a typo

* fixed a bug

* revised

* fixed lint issues

* changed LearningModel's step() to take a single loss

* revised learning model design

* revised example

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* added decorator utils to algorithm

* fixed a bug

* renamed core_model to model

* fixed a bug

* 1. fixed lint formatting issues; 2. refined learning model docstrings

* rm trailing whitespaces

* added decorator for choose_action

* fixed a bug

* fixed a bug

* fixed version-related issues

* renamed add_zeroth_dim decorator to expand_dim

* overhauled exploration abstraction

* fixed a bug

* fixed a bug

* fixed a bug

* added exploration related methods to abs_agent

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* separated learning with exploration schedule and without

* small fixes

* moved explorer logic to actor side

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* removed unwanted param from simple agent manager

* small fixes

* revised code based on revised abstractions

* fixed some bugs

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* added shared_module property to LearningModel

* added shared_module property to LearningModel

* fixed a bug with k-step return in AC

* fixed a bug

* fixed a bug

* merged pg, ac and ppo examples

* fixed a bug

* fixed a bug

* fixed naming for ppo

* renamed some variables in PPO

* added ActionWithLogProbability return type for PO-type algorithms

* fixed a bug

* fixed a bug

* fixed lint issues

* revised __getstate__ for LearningModel

* fixed a bug

* added soft_update function to learningModel

* fixed a bug

* revised learningModel

* rm __getstate__ and __setstate__ from LearningModel

* added noise explorer

* formatting

* fixed formatting

* removed unnecessary comma

* removed unnecessary comma

* removed unnecessary comma

* fixed PR comments

* removed unwanted exception and imports

* removed unwanted exception and imports

* fixed a bug

* fixed PR comments

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issue

* fixed a bug

* fixed lint issue

* fixed naming

* combined exploration param generation and early stopping in scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* moved logger inside scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* removed epsilon parameter from choose_action

* removed epsilon parameter from choose_action

* changed agent manager's train parameter to experience_by_agent

* fixed some PR comments

* renamed zero_grad to zero_gradients in LearningModule

* fixed some PR comments

* bug fix

* bug fix

* bug fix

* removed explorer abstraction from agent

* added DEVICE env variable as first choice for torch device

* refined dqn example

* fixed lint issues

* removed unwanted import in cim example

* updated cim-dqn notebook

* simplified scheduler

* edited notebook according to merged scheduler changes

* refined dimension check for learning module manager and removed num_actions from DQNConfig

* bug fix for cim example

* added notebook output

* updated cim PO example code according to changes in maro/rl

* removed early stopping from CIM dqn example

* combined ac and ppo and simplified example code and config

* removed early stopping from cim example config

* moved decorator logic inside algorithms

* renamed early_stopping_callback to early_stopping_checker

* put PG and AC under PolicyOptimization class and refined examples accordingly

* fixed lint issues

* removed action_dim from noise explorer classes and added some shape checks

* modified NoiseExplorer's __call__ logic to batch processing

* made NoiseExplorer's __call__ return type np array

* renamed update to set_parameters in explorer

* fixed old naming in test_grass

* moved optimizer options to LearningModel

* typo fix

* fixed lint issues

* updated notebook

* updated cim example for policy optimization

* typo fix

* typo fix

* typo fix

* typo fix

* misc edits

* minor edits to rl_toolkit.rst

* checked out docs from master

* fixed typo in k-step shaper

* fixed lint issues

* bug fix in store

* lint issue fix

* changed default max_ep to 100 for policy_optimization algos

* vis doc update to master (#244)

* refine readme

* feat: refine data push/pull (#138)

* feat: refine data push/pull

* test: add cli provision testing

* fix: style fix

* fix: add necessary comments

* fix: from code review

* add fall back function in weather download (#112)

* fix deployment issue in multi envs

* fix typo

* fix ~/.maro not exist issue in build

* skip deploy when build

* update for comments

* temporarily disable weather info

* replace ecr with cim in setup.py

* replace ecr in manifest

* remove weather check when read data

* fix station id issue

* fix format

* add TODO in comments

* add noaa weather source

* fix weather reset and weather comment

* add comment for weather data url

* some format update

* add fall back function in weather download

* update comment

* update for comments

* update comment

* add period

* fix for pylint

* update for pylint check

* added example docs (#136)

* added example docs

* added citibike greedy example doc

* modified citibike doc

* fixed PR comments

* fixed more PR comments

* fixed small formatting issue

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* switch the key and value of handler_dict in decorator (#144)

* switch the key and value of handler_dict in decorator

* add dist decorator UT and fixed multithreading conflict in maro test suite

* pr comments update.

* resolved comments about decorator UT

* rename handler_fun in dist decorator

* change self.attr into class_name.attr

* update UT tests comments

* V0.1 annotation (#147)

* refine the annotation of simulator core

* remove reward from env(be)

* format refined

* white spaces test

* left-padding spaces refined

* format modified

* update the left-padding spaces of docstrings

* code format updated

* update according to comments

* update according to PR comments

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Event payload details for env.summary (#156)

* key_list of events added for env.summary

* code refined according to lint

* 2 kinds of Payload added for CIM scenario; citi bike summary refined according to comments

* code format refined

* try trigger the git tests

* update github workflow

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Implemented dump snapshots and convert to CSV.

* Let BE support params when dumping snapshots.

* Refactor dump code to core.py

* Implemented decision event dump.

* V0.2 online lp for citi bike (#159)

* key_list of events added for env.summary

* code refined according to lint

* 2 kinds of Payload added for CIM scenario; citi bike summary refined according to comments

* code format refined

* try trigger the git tests

* update github workflow

* online LP example added for citi bike

* infeasible solution

* infeasible solution fixed: call snapshot before any env.step()

* experiment results of toy topos added

* experiment results of toy topos added

* experiment result update: better than naive baseline

* PuLP version added

* greedy experiment results update

* citibike result update

* modified according to PR comments

* update experiment results and forecasting comparison

* citi bike lp README updated

* README updated

* modified according to PR comments

* update according to PR comments

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu Wang <jinywan@microsoft.com>

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formatting issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM example

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* replace is not '' with !=''

* Fixed issues that code review mentioned.

* removed path from hello.py

* Changed import sort.

* Fix import sorting in citi_bike/business_engine

* visualization 0.1

* Updated lint configurations.

* Fixed formatting error that caused lint errors.

* render html title function

* Try to fix lint errors.

* flake-8 style fix

* remove space around 18,35

* dump_csv_converter.py re-formatting.

* files re-formatting.

* style fixed

* tab delete

* white space fix

* white space fix-2

* vis redundant function delete

* refine

* update according to flake8

* re-formatting after merged upstream.

* Updated import section.

* Updated import section.

* V0.2 Logical operator overloading for EarlyStoppingChecker (#178)

* 1. added logical operator overloading for early stopping checker; 2. added mean value checker

* fixed PR comments

* removed learner.exit() in single_process_launcher

* added another early stopping checker in example

* fixed PR comments and lint issues

* lint issue fix

* fixed lint issues

* fixed a bug

* fixed a bug

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 skip connection (#176)

* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* moved reward type casting to exp shaper

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* pr refine

* isort fix

* white space

* lint error

* \n error

* test continuation

* indent

* continuation of indent

* indent 0.3

* comment update

* comment update 0.2

* f-string update

* f-string 0.2

* lint 0.3

* lint 0.4

* lint 0.4

* lint 0.5

* lint 0.6

* docstring update

* data version deploy update

* condition update

* add whitespace

* V0.2 vis dump feature enhancement. (#190)

* Dumps added manifest file.
* Code updated format by flake8
* Changed manifest file format for easy reading.

* deploy info update; docs update

* weird white space

* Update dashboard_visualization.md

* new endline?

* delete dependency

* delete irrelevant file

* change scenario to enum, divide file path into a separated class

* fixed a bug in learner's test() (#193)

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 double dqn (#188)

* added dueling action value model

* renamed params in dueling_action_value_model

* renamed shared_features to features

* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* mv dueling_action_value_model and fixed some bugs

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* added double DQN and dueling features to DQN

* fixed a bug

* added DuelingQModelHead enum

* fixed a bug

* removed unwanted file

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* fixed PR comments

* revised cim example according to DQN changes

* renamed eval_model to q_value_model in cim example

* more fixes

* fixed a bug

* fixed a bug

* added doc per PR comments

* removed learner.exit() in single_process_launcher

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* double DQN feature

* fixed a bug

* fixed a bug

* fixed PR comments

* fixed lint issue

* 1. fixed PR comments related to load/dump; 2. removed abstract load/dump methods from AbsAlgorithm

* added load_models in simple_learner

* minor docstring edits

* minor docstring edits

* set is_double to true in DQN config

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>

* V0.2 feature predefined image (#183)

* feat: support predefined image provision

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* fix: error scripts invocation after using relative import

* fix: missing init.py

* fixed a bug in learner's test()

* feat: add distributed_config for dqn example

* test: update test for grass

* test: update test for k8s

* feat: add promptings for steps

* fix: change relative imports to absolute imports

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>

* doc refine

* doc update

* params type

* data structure update

* doc&enum, formula refine

* refine

* add ut, refine doc

* style refine

* isort

* strong type fix

* os._exit delete

* revert datalib

* import new line

* change test case

* change file name & doc

* change deploy path

* delete params

* revert file

* delete duplicate file

* delete single process

* update naming

* manually change import order

* delete blank

* edit error

* requirement txt

* style fix & refine

* comments&docstring refine

* add parameter name

* test & dump

* comments update

* V0.2 feature proxy rejoin (#158)

* update dist decorator

* replace proxy.get_peers by proxy.peers

* update proxy rejoin (draft, not runnable for proxy rejoin)

* fix bugs in proxy

* add message cache, and redesign rejoin parameter

* feat: add checkpoint with test

* update proxy.rejoin

* fixed rejoin bug, rename func

* add test example(temp)

* feat: add FaultToleranceAgent, refine other MasterAgents and NodeAgents.

* capital env vari name

* rm json.dumps; change retries to 10; temp add warning level for rejoin

* fix: unable to load FaultToleranceAgent, missing params

* fix: delete mapping in StopJob if FaultTolerance is activated, add exception handler for FaultToleranceAgent

* feat: add node_id to node_details

* fix: add a new dependency for tests

* style: meet linting requirements

* style: remaining linting problems

* lint fixed; rm temp test folder.

* fixed lint f-string without placeholder

* fix: add a flag for "remove_container", refine restart logic and Redis keys naming

* proxy rejoin update.

* variable rename.

* fixed lint issues

* fixed lint issues

* add exit code for different error

* feat: add special errors handler

* add max rejoin times

* remove unused import

* add rejoin UT; resolve rejoin comments

* lint fixed

* fixed UT import problem

* rm MessageCache in proxy

* fix: refine key naming

* update proxy rejoin; add topic for broadcast

* feat: support predefined image provision

* update UT for communication

* add docstring for rejoin

* fixed isort and zmq driver import

* fixed isort and UT test

* fix isort issue

* proxy rejoin update (comments v2)

* fixed isort error

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* feat: add exists method for checkpoint

* fix: error scripts invocation after using relative import

* fix: missing init.py

* fixed a bug in learner's test()

* add driver close and socket SUB disconnect for rejoin

* feat: add distributed_config for dqn example

* test: update test for grass

* test: update test for k8s

* feat: add promptings for steps

* fix: change relative imports to absolute imports

* fixed comments and update logger level

* mv driver in proxy.__init__ for issue temp fixed.

* Update docstring and comments

* style: fix code reviews problems

* fix code format

Co-authored-by: Lyuchun Huang <romic.kid@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 feature cli windows (#203)

* fix: change local mkdir to os.makedirs

* fix: add utf8 encoding for logger

* fix: add powershell.exe prefix to subprocess functions

* feat: add debug_green

* fix: use fsutil to create fix-size files in Windows

* fix: use universal_newlines=True to handle encoding problem in different operating systems

* fix: use temp file to do copy when the operating system is not Linux

* fix: linting error

* fix: use fsutil in test_k8s.py

* feat: dynamic init ABS_PATH in GlobalParams

* fix: use -Command to execute Powershell command

* fix: refine code style in k8s_azure_executor.py, add Windows support for k8s mode

* fix: problems in code review

* EventBuffer refine (#197)

* merge uniform event changes back

* 1st step: move executing events into stack for better removing performance

* flush event pool

* typo

* add option for env to enable event pool

* refine stack functions

* fix comment issues, add typings

* lint fixing

* lint fix

* add missing fix

* linting

* lint

* use linked list instead of original event list and execute stack

* add missing file

* linting, and fixes

* add missing file

* linting fix

* fixing comments

* add missing file

* rename event_list to event_linked_list

* correct import path

* change enable_event_pool to disable_finished_events

* add missing file

* V0.2 merge master (#214)

* fix the visualization of docs/key_components/distributed_toolkit

* add examples into isort ignore

* refine import path for examples (#195)

* refine import path for examples

* refine indents

* fixed formatting issues

* update code style

* add editorconfig-checker, add editorconfig path into lint, change super-linter version

* change path for code saving in cim.gnn

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>

* fix issue where distutils and setuptools sometimes conflict (#208)

* fix issue that cython and setuptools conflict

* follow the accepted temp workaround

* update comment, it should be conflict between setuptools and distutils

* fixed bugs related to proxy interface changes

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>

* typo fix

* Bug fix: event buffer issue that prevented Actions from being passed into business engine (#215)

* bug fix

* clear the reference after extract sub events, update ut to cover this issue

Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* fix flake8 style problem

* V0.2 feature refine mode namings (#212)

* feat: refine cli exception

* feat: refine mode namings

* EventBuffer refine (#197)

* merge uniform event changes back

* 1st step: move executing events into stack for better removing performance

* flush event pool

* typo

* add option for env to enable event pool

* refine stack functions

* fix comment issues, add typings

* lint fixing

* lint fix

* add missing fix

* linting

* lint

* use linked list instead of original event list and execute stack

* add missing file

* linting, and fixes

* add missing file

* linting fix

* fixing comments

* add missing file

* rename event_list to event_linked_list

* correct import path

* change enable_event_pool to disable_finished_events

* add missing file

* fixed bugs in dist rl

* feat: rename files

* tests: set longer gracefully wait time

* style: fix linting errors

* style: fix linting errors

* style: fix linting errors

* fix: rm redundant variables

* fix: refine error message

Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 vis new (#210)

Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>

* V0.2 local host process (#221)

* Update local process (not ready)

* update cli process mode

* add setup/clear/template for maro process

* fix process stop

* add logger and rename parameters

* add logger for setup/clear

* fixed closing of nonexistent pids when given a pid list.

* Fixed comments and rename setup/clear with create/delete

* update ProcessInternalError

* comments fix

* delete toolkit change

* doc update

* citi bike update

* deploy path

* datalib update

* revert datalib

* revert

* maro file format

* comments update

* doc update

* V0.2 grass on premises (#220)

* feat: refine cli exception
* commit on v0.2_grass_on_premises

Co-authored-by: Lyuchun Huang <romic.kid@gmail.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 vm scheduling scenario (#189)

* Initialize

* Data center scenario init

* Code style modification

* V0.2 event buffer subevents expand (#180)

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formatting issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM example

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* unfold sub-events, insert after parent

* remove event category, use different class instead, add helper functions to gen decision and action event

* add a method to support adding an immediate event to a cascade event with tick validation

* fix ut issue

* add action as 1st sub event to ensure the executing order

Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Data center scenario update

* Code style update

* Data scenario business engine update

* Isort update

* Fix lint code check

* Fix based on PR comments.

* Update based on PR comments.

* Add decision payload

* Add config file

* Update utilization series logic

* Update based on PR comment

* Update based on PR

* Update

* Update

* Add the ValidPm class

* Update docs string and naming

* Add energy consumption

* Lint code fixed

* Refining postpone function

* Lint style update

* Init data pipeline

* Update based on PR comment

* Add data pipeline download

* Lint style update

* Code style fix

* Temp update

* Data pipeline update

* Add aria2p download function

* Update based on PR comment

* Update based on PR comment

* Update based on PR comment

* Update naming of variables

* Rename topology

* Renaming

* Fix valid pm list

* Pylint fix

* Update comment

* Update docstring and comment

* Fix init import

* Update tick issue

* fix merge problem

* update style

* V0.2 datacenter data pipeline (#199)

* Data pipeline update

* Data pipeline update

* Lint update

* Update pipeline

* Add vmid mapping

* Update lint style

* Add VM data analytics

* Update notebook

* Add binary converter

* Modify vmtable yaml

* Update binary meta file

* Add cpu reader

* random example added for data center

* Fix bugs

* Fix pylint

* Add launcher

* Fix pylint

* best fit policy added

* Add reset

* Add config

* Add config

* Modify action object

* Modify config

* Fix naming

* Modify config

* Add snapshot list

* Modify a spelling typo

* Update based on PR comments.

* Rename scenario to vm scheduling

* Rename scenario

* Update print messages

* Lint fix

* Lint fix

* Rename scenario

* Modify the calculation of cpu utilization

* Add comment

* Modify data pipeline path

* Fix typo

* Modify naming

* Add unittest

* Add comment

* Unify naming

* Fix data path typo

* Update comments

* Update snapshot features

* Add take snapshot

* Add summary keys

* Update cpu reader

* Update naming

* Add unit test

* Rename snapshot node

* Add processed data pipeline

* Modify config

* Add comment

* Lint style fix

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Add package used in vm_scheduling

* add aria2p to test requirement

* best fit example: update the usage of snapshot

* Add aria2p to test requirement

* Remove finish event

* Fix unittest

* Add test dataset

* Update based on PR comment

* Refine cpu reader and unittest

* Lint update

* Refine based on PR comment

* Add agent index

* Add node mapping

* Refine based on PR comments

* Renaming postpone_step

* Renaming and refine based on PR comments

* Rename config

* Update

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* Resolve none action problem (#224)

* V0.2 vm_scheduling notebook (#223)

* Initialize

* Data center scenario init

* Code style modification

* V0.2 event buffer subevents expand (#180)

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formatting issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM example

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* unfold sub-events, insert after parent

* remove event category, use different class instead, add helper functions to gen decision and action event

* add a method to support adding an immediate event to a cascade event with tick validation

* fix ut issue

* add action as 1st sub event to ensure the executing order

Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Data center scenario update

* Code style update

* Data scenario business engine update

* Isort update

* Fix lint code check

* Fix based on PR comments.

* Update based on PR comments.

* Add decision payload

* Add config file

* Update utilization series logic

* Update based on PR comment

* Update based on PR

* Update

* Update

* Add the ValidPm class

* Update docs string and naming

* Add energy consumption

* Lint code fixed

* Refining postpone function

* Lint style update

* Init data pipeline

* Update based on PR comment

* Add data pipeline download

* Lint style update

* Code style fix

* Temp update

* Data pipeline update

* Add aria2p download function

* Update based on PR comment

* Update based on PR comment

* Update based on PR comment

* Update naming of variables

* Rename topology

* Renaming

* Fix valid pm list

* Pylint fix

* Update comment

* Update docstring and comment

* Fix init import

* Update tick issue

* fix merge problem

* update style

* V0.2 datacenter data pipeline (#199)

* Data pipeline update

* Data pipeline update

* Lint update

* Update pipeline

* Add vmid mapping

* Update lint style

* Add VM data analytics

* Update notebook

* Add binary converter

* Modify vmtable yaml

* Update binary meta file

* Add cpu reader

* random example added for data center

* Fix bugs

* Fix pylint

* Add launcher

* Fix pylint

* best fit policy added

* Add reset

* Add config

* Add config

* Modify action object

* Modify config

* Fix naming

* Modify config

* Add snapshot list

* Modify a spelling typo

* Update based on PR comments.

* Rename scenario to vm scheduling

* Rename scenario

* Update print messages

* Lint fix

* Lint fix

* Rename scenario

* Modify the calculation of cpu utilization

* Add comment

* Modify data pipeline path

* Fix typo

* Modify naming

* Add unittest

* Add comment

* Unify naming

* Fix data path typo

* Update comments

* Update snapshot features

* Add take snapshot

* Add summary keys

* Update cpu reader

* Update naming

* Add unit test

* Rename snapshot node

* Add processed data pipeline

* Modify config

* Add comment

* Lint style fix

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Add package used in vm_scheduling

* add aria2p to test requirement

* best fit example: update the usage of snapshot

* Add aria2p to test requirement

* Remove finish event

* Fix unittest

* Add test dataset

* Update based on PR comment

* Refine cpu reader and unittest

* Lint update

* Refine based on PR comment

* Add agent index

* Add node mapping

* Init vm scheduling notebook

* Add notebook

* Refine based on PR comments

* Renaming postpone_step

* Renaming and refine based on PR comments

* Rename config

* Update based on the v0.2_datacenter

* Update notebook

* Update

* update filepath

* notebook updated

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* Update process mode docs and fixed on premises (#226)

* V0.2 Add github workflow integration (#222)

* test: add github workflow integration

* fix: split procedures && bug fixed

* test: add training only restriction

* fix: add 'approved' restriction

* fix: change default ssh port to 22

* style: in one line

* feat: add timeout for Subprocess.run

* test: change default node_size to Standard_D2s_v3

* style: refine style

* fix: add ssh_port param to on-premises mode

* fix: add missing init.py

* update param name

* V0.2 explorer (#198)

* overhauled exploration abstraction

* fixed a bug

* fixed a bug

* fixed a bug

* added exploration related methods to abs_agent

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* separated learning with exploration schedule and without

* small fixes

* moved explorer logic to actor side

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* removed unwanted param from simple agent manager

* added noise explorer

* fixed formatting

* removed unnecessary comma

* fixed PR comments

* removed unwanted exception and imports

* fixed a bug

* fixed PR comments

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issue

* fixed a bug

* fixed lint issue

* fixed naming

* combined exploration param generation and early stopping in scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* moved logger inside scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* removed epsilon parameter from choose_action

* fixed some PR comments

* fixed some PR comments

* bug fix

* bug fix

* bug fix

* removed explorer abstraction from agent

* refined dqn example

* fixed lint issues

* simplified scheduler

* removed early stopping from CIM dqn example

* removed early stopping from cim example config

* renamed early_stopping_callback to early_stopping_checker

* removed action_dim from noise explorer classes and added some shape checks

* modified NoiseExplorer's __call__ logic to batch processing

* made NoiseExplorer's __call__ return type np array

* renamed update to set_parameters in explorer

* fixed old naming in test_grass

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 embedded optim (#191)

* added dueling action value model

* renamed params in dueling_action_value_model

* renamed shared_features to features

* replaced IdentityLayers with nn.Identity

* 1. added skip connection option in FC_net; 2. generalized learning model

* added skip_connection option in config

* removed type casting in fc_net

* fixed lint formatting issues

* refined docstring

* mv dueling_actiovalue_model and fixed some bugs

* added multi-head functionality to LearningModel

* refined learning model docstring

* added head_key param in learningModel forward

* added double DQN and dueling features to DQN

* fixed a bug

* added DuelingQModelHead enum

* fixed a bug

* removed unwanted file

* fixed PR comments

* added top layer logic and is_top option in fc_net

* fixed a bug

* fixed a bug

* reverted some changes in learning model

* reverted some changes in learning model

* added members to learning model to fix the mode issue

* fixed a bug

* fixed mode setting issue in learning model

* fixed PR comments

* revised cim example according to DQN changes

* renamed eval_model to q_value_model in cim example

* more fixes

* fixed a bug

* fixed a bug

* added doc per PR comments

* removed learner.exit() in single_process_launcher

* removed learner.exit() in single_process_launcher

* fixed PR comments

* fixed rl/__init__

* fixed issues in example

* fixed a bug

* fixed a bug

* fixed lint formatting issues

* double DQN feature

* fixed a bug

* fixed a bug

* fixed PR comments

* fixed lint issue

* embedded optimizer into SingleHeadLearningModel

* 1. fixed PR comments related to load/dump; 2. removed abstract load/dump methods from AbsAlgorithm

* added load_models in simple_learner

* minor docstring edits

* minor docstring edits

* minor docstring edits

* mv optimizer options inside LearningMode

* modified example accordingly

* fixed a bug

* fixed a bug

* fixed a bug

* added dueling DQN feature

* revised and refined docstrings

* fixed a bug

* fixed lint issues

* added load/dump functions to LearningModel

* fixed a bug

* fixed a bug

* fixed lint issues

* refined DQN docstrings

* removed load/dump functions from DQN

* added task validator

* fixed decorator use

* fixed a typo

* fixed a bug

* fixed lint issues

* changed LearningModel's step() to take a single loss

* revised learning model design

* revised example

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* added decorator utils to algorithm

* fixed a bug

* renamed core_model to model

* fixed a bug

* 1. fixed lint formatting issues; 2. refined learning model docstrings

* rm trailing whitespaces

* added decorator for choose_action

* fixed a bug

* fixed a bug

* fixed version-related issues

* renamed add_zeroth_dim decorator to expand_dim

* overhauled exploration abstraction

* fixed a bug

* fixed a bug

* fixed a bug

* added exploration related methods to abs_agent

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* separated learning with exploration schedule and without

* small fixes

* moved explorer logic to actor side

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* removed unwanted param from simple agent manager

* small fixes

* added shared_module property to LearningModel

* added shared_module property to LearningModel

* revised __getstate__ for LearningModel

* fixed a bug

* added soft_update function to learningModel

* fixed a bug

* revised learningModel

* rm __getstate__ and __setstate__ from LearningModel

* added noise explorer

* fixed formatting

* removed unnecessary comma

* removed unnecessary comma

* fixed PR comments

* removed unwanted exception and imports

* removed unwanted exception and imports

* fixed a bug

* fixed PR comments

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issue

* fixed a bug

* fixed lint issue

* fixed naming

* combined exploration param generation and early stopping in scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* moved logger inside scheduler

* fixed a bug

* fixed a bug

* fixed a bug

* fixed lint issues

* fixed lint issue

* removed epsilon parameter from choose_action

* removed epsilon parameter from choose_action

* changed agent manager's train parameter to experience_by_agent

* fixed some PR comments

* renamed zero_grad to zero_gradients in LearningModule

* fixed some PR comments

* bug fix

* bug fix

* bug fix

* removed explorer abstraction from agent

* added DEVICE env variable as first choice for torch device

* refined dqn example

* fixed lint issues

* removed unwanted import in cim example

* updated cim-dqn notebook

* simplified scheduler

* edited notebook according to merged scheduler changes

* refined dimension check for learning module manager and removed num_actions from DQNConfig

* bug fix for cim example

* added notebook output

* removed early stopping from CIM dqn example

* removed early stopping from cim example config

* moved decorator logic inside algorithms

* renamed early_stopping_callback to early_stopping_checker

* removed action_dim from noise explorer classes and added some shape checks

* modified NoiseExplorer's __call__ logic to batch processing

* made NoiseExplorer's __call__ return type np array

* renamed update to set_parameters in explorer

* fixed old naming in test_grass

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* V0.2 VM scheduling docs (#228)

* Initialize

* Data center scenario init

* Code style modification

* V0.2 event buffer subevents expand (#180)

* V0.2 rl toolkit refinement (#165)

* refined rl abstractions

* fixed formatting issues

* checked out error-code related code from v0.2_pg

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* fixed a bug

* renamed save_models to dump_models

* 1. set default batch_norm_enabled to True; 2. used state_dict in dqn model saving

* renamed dump_experience_store to dump_experience_pool

* fixed a bug in the dump_experience_pool method

* fixed some PR comments

* fixed more PR comments

* 1.fixed some PR comments; 2.added early_stopping_checker; 3.revised explorer class

* fixed cim example according to rl toolkit changes

* fixed some more PR comments

* rewrote multi_process_launcher to eliminate the distributed section in config

* 1. fixed a typo; 2. added logging before early stopping

* fixed a bug

* fixed a bug

* fixed a bug

* added early stopping feature to CIM example

* fixed a typo

* fixed some issues with early stopping

* changed early stopping metric func

* fixed a bug

* fixed a bug

* added early stopping to dist mode cim

* added experience collecting func

* edited notebook according to changes in CIM example

* fixed bugs in nb

* fixed lint formatting issues

* fixed a typo

* fixed some PR comments

* fixed more PR comments

* revised docs

* removed nb output

* fixed a bug in simple_learner

* fixed a typo in nb

* fixed a bug

* fixed a bug

* fixed a bug

* removed unused import

* fixed a bug

* 1. changed early stopping default config; 2. renamed param in early stopping checker and added typing

* fixed some doc issues

* added output to nb

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* unfold sub-events, insert after parent

* remove event category, use different class instead, add helper functions to gen decision and action event

* add a method to support adding an immediate event to a cascade event with tick validation

* fix ut issue

* add action as 1st sub event to ensure the executing order

Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Data center scenario update

* Code style update

* Data scenario business engine update

* Isort update

* Fix lint code check

* Fix based on PR comments.

* Update based on PR comments.

* Add decision payload

* Add config file

* Update utilization series logic

* Update based on PR comment

* Update based on PR

* Update

* Update

* Add the ValidPm class

* Update docs string and naming

* Add energy consumption

* Lint code fixed

* Refining postpone function

* Lint style update

* Init data pipeline

* Update based on PR comment

* Add data pipeline download

* Lint style update

* Code style fix

* Temp update

* Data pipeline update

* Add aria2p download function

* Update based on PR comment

* Update based on PR comment

* Update based on PR comment

* Update naming of variables

* Rename topology

* Renaming

* Fix valid pm list

* Pylint fix

* Update comment

* Update docstring and comment

* Fix init import

* Update tick issue

* fix merge problem

* update style

* V0.2 datacenter data pipeline (#199)

* Data pipeline update

* Data pipeline update

* Lint update

* Update pipeline

* Add vmid mapping

* Update lint style

* Add VM data analytics

* Update notebook

* Add binary converter

* Modify vmtable yaml

* Update binary meta file

* Add cpu reader

* random example added for data center

* Fix bugs

* Fix pylint

* Add launcher

* Fix pylint

* best fit policy added

* Add reset

* Add config

* Add config

* Modify action object

* Modify config

* Fix naming

* Modify config

* Add snapshot list

* Modify a spelling typo

* Update based on PR comments.

* Rename scenario to vm scheduling

* Rename scenario

* Update print messages

* Lint fix

* Lint fix

* Rename scenario

* Modify the calculation of cpu utilization

* Add comment

* Modify data pipeline path

* Fix typo

* Modify naming

* Add unittest

* Add comment

* Unify naming

* Fix data path typo

* Update comments

* Update snapshot features

* Add take snapshot

* Add summary keys

* Update cpu reader

* Update naming

* Add unit test

* Rename snapshot node

* Add processed data pipeline

* Modify config

* Add comment

* Lint style fix

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>

* Add package used in vm_scheduling

* add aria2p to test requirement

* best fit example: update the usage of snapshot

* Add aria2p to test requirement

* Remove finish event

* Fix unittest

* Add test dataset

* Update based on PR comment

* vm doc init

* Update docs

* Update docs

* Update docs

* Update docs

* Remove old notebook

* Update docs

* Update docs

* Add figure

* Update docs

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* doc update

* new link

* image update

* v0.2 VM Scheduling docs refinement (#231)

* Fix typo

* Refining vm scheduling docs

* image change

* V0.2 store refinement (#234)

* updated docs and images for rl toolkit

* 1. fixed import formats for maro/rl; 2. changed decorators to hypers in store

* fixed lint issues

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Fix bug (#237)

vm scenario: fix the event type bug of the postpone event

* V0.2 rl toolkit doc (#235)

* updated docs and images for rl toolkit

* updated cim example doc

* updated cim example docs

* updated cim example rst

* updated rl_toolkit and cim example docs

* replaced q_module with q_net in example rst

* refined doc

* refined doc

* updated figures

* updated figures

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Merge V0.2 vis into V0.2 (#233)

* Implemented dump snapshots and convert to CSV.

* Let BE support params when dumping snapshots.

* Refactor dump code to core.py

* Implemented decision event dump.

* replace is not '' with !=''

* Fixed issues that code review mentioned.

* removed path from hello.py

* Changed import sort.

* Fix import sorting in citi_bike/business_engine

* visualization 0.1

* Updated lint configurations.

* Fixed formatting error that caused lint errors.

* render html title function

* Try to fix lint errors.

* flake-8 style fix

* remove space around 18,35

* dump_csv_converter.py re-formatting.

* files re-formatting.

* style fixed

* tab delete

* white space fix

* white space fix-2

* vis redundant function delete

* refine

* re-formatting after merged upstream.

* Updated import section.

* Updated import section.

* pr refine

* isort fix

* white space

* lint error

* \n error

* test continuation

* indent

* continuation of indent

* indent 0.3

* comment update

* comment update 0.2

* f-string update

* f-string 0.2

* lint 0.3

* lint 0.4

* lint 0.4

* lint 0.5

* lint 0.6

* docstring update

* data version deploy update

* condition update

* add whitespace

* V0.2 vis dump feature enhancement. (#190)

* Dumps added manifest file.
* Code updated format by flake8
* Changed manifest file format for easy reading.

* deploy info update; docs update

* weird white space

* Update dashboard_visualization.md

* new endline?

* delete dependency

* delete irrelevant file

* change scenario to enum, divide file path into a separated class

* doc refine

* doc update

* params type

* data structure update

* doc&enum, formula refine

* refine

* add ut, refine doc

* style refine

* isort

* strong type fix

* os._exit delete

* revert datalib

* import new line

* change test case

* change file name & doc

* change deploy path

* delete params

* revert file

* delete duplicate file

* delete single process

* update naming

* manually change import order

* delete blank

* edit error

* requirement txt

* style fix & refine

* comments&docstring refine

* add parameter name

* test & dump

* comments update

* Added manifest file. (#201)

Only a few changes that need to meet requirements of manifest file format.

* comments fix

* delete toolkit change

* doc update

* citi bike update

* deploy path

* datalib update

* revert datalib

* revert

* maro file format

* comments update

* doc update

* update param name

* doc update

* new link

* image update

* V0.2 visualization-0.1 (#181)

* visualization 0.1

* render html title function

* flake-8 style fix

* style fixed

* tab delete

* white space fix

* white space fix-2

* vis redundant function delete

* refine

* pr refine

* isort fix

* white space

* lint error

* \n error

* test continuation

* indent

* continuation of indent

* indent 0.3

* comment update

* comment update 0.2

* f-string update

* f-string 0.2

* lint 0.3

* lint 0.4

* lint 0.4

* lint 0.5

* lint 0.6

* docstring update

* data version deploy update

* condition update

* add whitespace

* deploy info update; docs update

* weird white space

* Update dashboard_visualization.md

* new endline?

* delete dependency

* delete irrelevant file

* change scenario to enum, divide file path into a separated class

* fix the visualization of docs/key_components/distributed_toolkit

* doc refine

* doc update

* params type

* add examples into isort ignore

* data structure update

* doc&enum, formula refine

* refine

* add ut, refine doc

* style refine

* isort

* strong type fix

* os._exit delete

* revert datalib

* import new line

* change test case

* change file name & doc

* change deploy path

* delete params

* revert file

* delete duplicate file

* delete single process

* update naming

* manually change import order

* delete blank

* edit error

* requirement txt

* style fix & refine

* comments&docstring refine

* add parameter name

* test & dump

* comments update

* comments fix

* delete toolkit change

* doc update

* citi bike update

* deploy path

* datalib update

* revert datalib

* revert

* maro file format

* comments update

* doc update

* update param name

* doc update

* new link

* image update

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>

* image change

* add reset snapshot

* delete dump

* add new line

* add next steps

* import change

* relative import

* add init file

* import change

* change utils file

* change cliexception to clierror

* dashboard test

* change result

* change assertion

* move not

* unit test change

* core change

* unit test delete name_mapping_file

* update cim business engine

* doc update

* change relative path

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* change import sequence

* comments update

* doc add pic

* add dependency

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* Update dashboard_visualization.rst

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* delete white space

* doc update

* doc update

* update doc

* update doc

* update doc

Co-authored-by: Michael Li <mic_lee2000@hotmail.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>
Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* V0.2 docs process mode (#230)

* Update process mode docs and fixed on premises

* Update orchestration docs

* Update process mode docs add JOB_NAME as env variable

* fixed bugs

* fixed isort issue

* update docs index

Co-authored-by: kaiqli <v-kaiqli@microsoft.com>

* V0.2 learning model refinement (#236)

* moved optimizer options to LearningModel

* typo fix

* fixed lint issues

* updated notebook

* misc edits

* 1. renamed CIMAgent to DQNAgent; 2. moved create_dqn_agents to Agent section in notebook

* renamed single_host_cim_learner to cim_learner in notebook

* updated notebook output

* typo fix

* removed dimension check in absence of shared stack

* fixed a typo

* fixed lint issues

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Update vm docs (#241)

Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* V0.2 info update (#240)

* update readme

* update version

* refine readme format

* add vis gif

* add citation

* update citation

* update badge

Co-authored-by: Arthur Jiang <sjian@microsoft.com>

* Fix typo (#242)

* Fix typo

* fix typo

* fix

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

* doc update

Co-authored-by: Arthur Jiang <sjian@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>
Co-authored-by: Romic Huang <romic.kid@gmail.com>
Co-authored-by: zhanyu wang <pocket_2001@163.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: kaiqli <59279714+kaiqli@users.noreply.github.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu Wang <jinywan@microsoft.com>
Co-authored-by: Michael Li <mic_lee2000@hotmail.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: kyu-kuanwei <72911362+kyu-kuanwei@users.noreply.github.com>
Co-authored-by: kaiqli <v-kaiqli@microsoft.com>

* bug fix related to np array divide (#245)

Co-authored-by: ysqyang <v-yangqi@microsoft.com>

* Master.simple bike (#250)

* notebook for simple bike repositioning added

* add simple rule-based algorithms

* unify input

* add policy based on statistics

* update be for simple bike scenario to fit latest event buffer changes (#247)

* change rendered graph

* figures updated

* change notebook

* matplot updated

* figures updated

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: wesley <Wenlei.Shi@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>

* simple bike repositioning article: formula updated

* checked out docs/source from v0.2

* aligned with v0.2

* rm unwanted import

* added references in policy_optimization.py

* fixed lint issues

Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: Meroy Chen <39452768+Meroy9819@users.noreply.github.com>
Co-authored-by: Arthur Jiang <sjian@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>
Co-authored-by: Romic Huang <romic.kid@gmail.com>
Co-authored-by: zhanyu wang <pocket_2001@163.com>
Co-authored-by: kaiqli <59279714+kaiqli@users.noreply.github.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>
Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu Wang <jinywan@microsoft.com>
Co-authored-by: Michael Li <mic_lee2000@hotmail.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: kyu-kuanwei <72911362+kyu-kuanwei@users.noreply.github.com>
Co-authored-by: kaiqli <v-kaiqli@microsoft.com>

* V0.2 backend dynamic node support (#172)

* update lint workflow

* fix workflow issue

* Update lint.yml

* Create tox.ini

* Update lint.yml

* Update lint.yml

* Update tox.ini

* Update lint.yml

* Delete tox.ini from root folder, move it to .github/linters

* Update CONTRIBUTING.md

* add more comments

* update lint conf to ignore cli banner issue

* change extension implementation from c to cpp

* update script to gen cpp files

* backend base interface redefine

* interface revamp for np backend

* 1st step for revamp

* bug fix

* draft

* implementation of attribute

* implementation of backend

* remove backend switching

* draft raw backend wrapper

* correct function parameter type

* 1st runnable version

* bug fix for types

* ut passed

* change CRLF to LF

* fix get_node_info interface

* add raw test in frame ut

* return np.array for all query result

* use ticks from backend

* set init value

* snapshot ut passed

* support setting the default backend by environment variable

* env ut with different backend

* fix take snapshot index bug

* test under both backends

* ignore generated cpp file

* fix lint issues

* more lint fix

* use ordered map to store ticks to keep the order

* remove test code

* refine dup code

* refine code to avoid too much if/else

* handle and raise exception for attr getter

* change the way to handle cpp exception, use cython runtimeerror instead

* add missing function, and fix bug in np impl

* fix lint issue

* specify c++11 flag for compilers

* use normal field assignment instead initializer list, as linux gcc will complain it

* add np ignore macro

* try to refine token pasting operator to avoid error on linux

* more pasting operator issue fix

* remove un-used options

* update workflow files to fit new backend

* 1st version of dynamic backend structure

* setup ut for cpp using lest

* bitset complete

* attributestore and ut

* arrange

* copy_to

* current frame

* ut for frame

* bug fix and ut correct

* fix issue that value not correct after arrange

* fix bug in test case

* frame update

* change the way to add nodes, support add node from middle

* frame in backend

* snapshotlist code complete

* add size method for snapshotlist, add ut template

* make sure snapshot max size not be 0

* add max size

* fix query parameters

* fix attribute store extend error

* add function to retrieve attribute from snapshotlist

* return nan for invalid index

* add function to check if nan for float attribute only

* fix bug where _last_tick was not updated for the snapshot list, which caused a crash when taking a snapshot for the same tick

* add functions to expose internal state under debug mode, make it easy to do unit test

* fix issue that caused overlap logic to be skipped

* ut passed for all implemented functions

* remove query in ut, as it not completed yet

* refine querying interfaces, use 2 functions for 1 querying

* snapshot query,

* use pointer instead weak_ptr

* backend impl

* set default parameters value

* query bug fix,

* bug fix: new_attr should return attr id not node id

* use macro to create attribute getters

* add reset support

* change the way to reset, avoid allocation time

* test reset for attributestore

* use Bitset instead vector<bool> to make it easy to reset

* refine backend interfaces to make them compatible with the old one

* correct querying interface, cython compile passed

* bug fix: get_ticks not set correct index

* correct cpp backend binding, add type for frame

* correct ut for snapshot

* bug fix: query cause crash after snapshot reset

* fix env test

* bug fix: is_nan should check data type first

* fix cim ut issues with raw backend

* fix citibike ut issues for raw backend

* add interfaces to support dynamic nodes, not tested

* bug fix: access cpp object without cdef

* bug fix: missing impl for dynamic methods

* ut for append nodes

* return node number dynamically

* remove unused parameters for snapshot

* remove unused code

* allow get attribute for deleted node

* ut for delete and resume node

* function to set attribute slot

* bug fix: set attribute will cause crash

* bug fix: remove append node when reset cause exception

* bug fix: frame.backend_type return incorrect name

* backends performance comparison

* correct internal type

* correct warnings

* missing ;

* formatting

* fix lint issue

* simplify the way to copy mapping

* add dump interfaces

* frame dump

* ignore if dump path does not exist

* bug fix: use max slots instead of current slots for padding in snapshot querying

* use max slot number in history instead of current for padding

* dump for snapshot

* close file at the end

* refine snapshot dump function

* fix lint issue

* avoid too much allocate operation

* use pointer instead of reference for further changes

* avoid 2 times map copy

* add comments for missing functions

* performance optimize

* use emplace instead push

* use emplace instead push

* remove cpp files

* add missing license

* ignore .vs folder

* add lest license for cpp unittest

* Delete CMakeLists.txt

* add error msg for exception, make it easy to identify error at python side

* remove old codes

* replace with new code

* change IDENTIER to NODE_TYPE and ATTR_TYPE

* build pass

* fix attr type not correct bug

* remove unused comment

* make frame ut pass

* correct the max snapshots checking

* fix test case

* add missing file

* correct performance test

* refine attribute code

* refine bitset code

* update FrameBase doc about switch backend

* correct the exception name

* refine frame code

* refine node code

* refine snapshot list code

* add is_const and is_list when adding attribute

* support query const attribute without tick exist

* add operations for list attribute

* remove cache as we have list attribute

* add remove and insert for list attribute

* add for-loop support for list attribute

* fix bug that not update list attribute slot number after operations

* test for dynamic features

* frame dump

* dump for snapshot list

* fix issue on gcc compiler

* add missing file

* fix lint issues

* refine the exception, more comments

* fix lint issue

* fix lint issue

* use simulate enum instead of str

* Use new type instead old in tests

* using mapping instead if-else

* remove generated code

* use mapping to reduce too much if-else

* add default attribute type int if not provided or invalid provided

* remove generated code

* update workflow with code gen

* more frame test

* add missing files

* test: cover maro.simulator.utils.common

* update test with new scenario

* comments

* tests

* update doc

* fix lint and comments

* CRLF to LF

* fix lint issue

Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

* V0.2 vm oversub docs (#256)

* Remove topology

* Update pipeline

* Update pipeline

* Update pipeline

* Modify metafile

* Add two attributes of VM

* Update pipeline

* Add vm category

* Add todo

* Add oversub config

* Add oversubscription feature

* Lint fix

* Update based on PR comment.

* Update pipeline

* Update pipeline

* Update config.

* Update based on PR comment

* Update

* Add pm sku feature

* Add sku setting

* Add sku feature

* Lint fix

* Lint style

* Update sku, overloading

* Lint fix

* Lint style

* Fix bug

* Modify config

* Remove sku and replace it with pm type

* Add and refactor vm category

* Comment out config

* Unify the enum format

* Fix lint style

* Fix import order

* Update based on PR comment

* Update overload to the VM docs

* Update docs

* Update vm docs

Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu-W <53509467+Jinyu-W@users.noreply.github.com>

Co-authored-by: Arthur Jiang <sjian@microsoft.com>
Co-authored-by: Arthur Jiang <ArthurSJiang@gmail.com>
Co-authored-by: Romic Huang <romic.kid@gmail.com>
Co-authored-by: zhanyu wang <pocket_2001@163.com>
Co-authored-by: ysqyang <ysqyang@gmail.com>
Co-authored-by: ysqyang <v-yangqi@microsoft.com>
Co-authored-by: kaiqli <59279714+kaiqli@users.noreply.github.com>
Co-authored-by: Jinyu Wang <Wang.Jinyu@microsoft.com>
Co-authored-by: Jinyu Wang <jinywan@microsoft.com>
Co-authored-by: Chaos Yu <chaos.you@gmail.com>
Co-authored-by: Wenlei Shi <Wenlei.Shi@microsoft.com>
Co-authored-by: Michael Li <mic_lee2000@hotmail.com>
Co-authored-by: kyu-kuanwei <72911362+kyu-kuanwei@users.noreply.github.com>
Co-authored-by: Meroy Chen <39452768+Meroy9819@users.noreply.github.com>
Co-authored-by: Miaoran Chen (Wicresoft) <v-miaorc@microsoft.com>
Co-authored-by: kaiqli <v-kaiqli@microsoft.com>
Co-authored-by: Kuan Wei Yu <v-kyu@microsoft.com>
2021-01-25 18:23:41 +08:00
..
test_bike_scenario.py V0.2 update (#262) 2021-01-25 18:23:41 +08:00