Commit Graph

1 Commit

Author SHA1 Message Date
Maryam Honari df96d5c835 Develop custom trainers (#73)
* Make create_policy more generic (#54)

* add on/off policy classes and inherit from them

* trainers as plugins (entry-point sketch at the end of this log)

* remove swap files

* clean up registration debug

* clean up all pre-commit

* a2c plugin passes pre-commit

* move gae to trainer utils (GAE/lambda-return sketch after this block)

* move lambda return to trainer util

* add validator for num_epoch (sketch after this block)

* add types for settings/type methods

* move create policy into highest level api

* move update_reward_signal into optimizer

* move get_policy into Trainer

* remove get settings type

* dummy_config settings

* move all stats from actor into dict, enables arbitrary actor data

* remove shared_critic flag, cleanups

* refactor create_policy

* remove sample_actions, evaluate_actions, update_norm from policy

* remove comments

* fix return type get stat

* update poca create_policy

* clean up policy init

* remove conftest

* add shared_critic to settings

* fix test_networks

* fix test_policy

* fix test network

* fix some ppo/sac tests

* add back conftest.py

* improve specification of trainer type

* add defaults for trainer_type/hyperparam

* fix test_saver

* fix reward providers

* add settings check utility for tests

* fix some settings tests

* add trainer types to run_experiment

* type check for arbitrary actor data

* cherrypick rename ml-agents/trainers/torch to torch_entities (#55)

* make all trainer types and settings visible at module level

* remove settings from run_experiment console script

* fix test_settings and upgrade config scripts

* remove need for trainer_type argument up to TrainerFactory

* fix ghost trainer behavior id in policy queue

* fix torch shadow in tests

* update trainers, rl trainers tests

* update tests to match the refactors

* fixing behavior name in ghost trainer

* update ml-agents-envs test configs

* separating the plugin package changes

* bring get_policy back for the sake of the ghost trainer

* add return types and remove unused returns

* remove duplicate methods in poca (_update_policy, add_policy)

Co-authored-by: mahon94 <maryam.honari@unity3d.com>
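
The "move gae to trainer utils" and "move lambda return to trainer util" items above factor standard advantage estimation out of individual trainers into shared utilities. A minimal sketch of what such helpers compute, assuming illustrative names and signatures rather than the actual ml-agents API:

```python
import numpy as np

def discount_rewards(r: np.ndarray, gamma: float = 0.99, value_next: float = 0.0) -> np.ndarray:
    """Discounted sum of rewards, bootstrapped from value_next."""
    discounted = np.zeros_like(r)
    running = value_next
    for t in reversed(range(len(r))):
        running = r[t] + gamma * running
        discounted[t] = running
    return discounted

def get_gae(rewards, value_estimates, value_next=0.0, gamma=0.99, lambd=0.95):
    """Generalized Advantage Estimation over one trajectory."""
    values = np.append(value_estimates, value_next)
    deltas = rewards + gamma * values[1:] - values[:-1]
    return discount_rewards(deltas, gamma * lambd)

def lambda_return(rewards, value_estimates, value_next=0.0, gamma=0.99, lambd=0.95):
    """TD(lambda) return: the GAE advantage plus the value baseline."""
    return get_gae(rewards, value_estimates, value_next, gamma, lambd) + value_estimates
```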
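Similarly, "add validator for num_epoch" adds a settings-level sanity check. The ml-agents settings classes are attrs-based, so a validator along these lines would fit; the class name and default below are assumptions for illustration:

```python
import attr

@attr.s(auto_attribs=True)
class OnPolicyHyperparamSettings:
    num_epoch: int = attr.ib(default=3)  # illustrative default

    @num_epoch.validator
    def _check_num_epoch(self, attribute, value):
        # Reject non-positive epoch counts before training starts.
        if value < 1:
            raise ValueError(f"num_epoch must be a positive integer, got {value}")
```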

* Online/offline custom trainer examples with plugin system (#52)

* add on/off policy classes and inherit from them

* trainers as plugins

* a2c trains

* remove swap files

* clean up registration debug

* clean up all pre-commit

* a2c plugin passes pre-commit

* move gae to trainer utils

* move lambda return to trainer util

* add validator for num_epoch

* add types for settings/type methods

* move create policy into highest level api

* move update_reward_signal into optimizer

* move get_policy into Trainer

* remove get settings type

* dummy_config settings

* move all stats from actor into dict, enables arbitrary actor data

* remove shared_critic flag, cleanups

* refactor create_policy

* remove sample_actions, evaluate_actions, update_norm from policy

* remove comments

* fix return type get stat

* update poca create_policy

* clean up policy init

* remove conftest

* add shared_critic to settings

* fix test_networks

* fix test_policy

* fix test network

* fix some ppo/sac tests

* add back conftest.py

* improve specification of trainer type

* add defaults for trainer_type/hyperparam

* fix test_saver

* fix reward providers

* add settings check utility for tests

* fix some settings tests

* add trainer types to run_experiment

* type check for arbitrary actor data

* cherrypick rename ml-agents/trainers/torch to torch_entities (#55)

* make all trainer types and settings visible at module level

* remove settings from run_experiment console script

* fix test_settings and upgrade config scripts

* remove need for trainer_type argument up to TrainerFactory

* fix ghost trainer behavior id in policy queue

* fix torch shadow in tests

* update trainers, rl trainers tests

* update tests to match the refactors

* fixing behavior name in ghost trainer

* update ml-agents-envs test configs

* fix precommit

* separating the plugin package changes

* bring get_policy back for the sake of the ghost trainer

* add return types and remove unused returns

* remove duplicate methods in poca (_update_policy, add_policy)

* add a2c trainer back

* Add DQN cleaned up trainer/optimizer

* nit naming

* fix logprob/entropy types in torch_policy.py

* clean up DQN/SAC

* add docs for custom trainers, TODO: reference tutorial

* add docs for custom trainers, TODO: reference tutorial

* add clipping to loss function

* set old importlib-metadata version

* bump pre-commit hook env to 3.8.x

* use smooth l1 loss (loss sketch after this block)

Co-authored-by: mahon94 <maryam.honari@unity3d.com>
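
The "use smooth l1 loss" and "add clipping to loss function" items apply to the example DQN trainer added here. A hedged PyTorch sketch of that update step; the function, batch layout, and hyperparameters are illustrative assumptions, and gradient-norm clipping is one plausible reading of "clipping":

```python
import torch
import torch.nn.functional as F

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99, max_grad_norm=10.0):
    """One TD update. `batch` holds obs, actions (B, 1), rewards, dones, next_obs."""
    q_values = q_net(batch["obs"]).gather(1, batch["actions"].long())
    with torch.no_grad():
        next_q = target_net(batch["next_obs"]).max(dim=1, keepdim=True).values
        target = batch["rewards"] + gamma * (1.0 - batch["dones"]) * next_q
    # Smooth L1 (Huber) damps outlier TD errors compared to plain MSE.
    loss = F.smooth_l1_loss(q_values, target)
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(q_net.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item()
```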

* add tutorial for validation

* fix formatting errors

* clean up

* minor changes

Co-authored-by: Andrew Cohen <andrew.cohen@unity3d.com>
Co-authored-by: zhuo <zhuo@unity3d.com>
2022-10-20 16:06:58 -04:00
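
For context on the plugin system referenced throughout this log: custom trainers are discovered through setuptools entry points, which is likely why "set old importlib-metadata version" appears above (newer importlib-metadata releases changed the entry_points() interface). A hedged sketch of a plugin package's setup.py; the group name and helper follow the example trainer plugin and may differ from the final API:

```python
from setuptools import setup

setup(
    name="mlagents_custom_trainer_plugin",  # hypothetical package name
    version="0.0.1",
    entry_points={
        # ml-agents scans this entry-point group at startup to register
        # additional trainer types alongside the built-in ppo/sac/poca.
        "mlagents.trainer_type": [
            "a2c=mlagents_custom_trainer_plugin.a2c.a2c_trainer:get_type_and_setting"
        ]
    },
)
```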