* Make create_policy more generic (#54)
* add on/off policy classes and inherit from
* trainers as plugins
* remove swap files
* clean up registration debug
* clean up all pre-commit
* a2c plugin pass precommit
* move gae to trainer utils
* move lambda return to trainer util
* add validator for num_epoch
* add types for settings/type methods
* move create policy into highest level api
* move update_reward_signal into optimizer
* move get_policy into Trainer
* remove get settings type
* dummy_config settings
* move all stats from actor into dict, enables arbitrary actor data
* remove shared_critic flag, cleanups
* refactor create_policy
* remove sample_actions, evaluate_actions, update_norm from policy
* remove comments
* fix return type get stat
* update poca create_policy
* clean up policy init
* remove conftest
* add shared_critic to settings
* fix test_networks
* fix test_policy
* fix test network
* fix some ppo/sac tests
* add back conftest.py
* improve specification of trainer type
* add defaults for trainer_type/hyperparam
* fix test_saver
* fix reward providers
* add settings check utility for tests
* fix some settings tests
* add trainer types to run_experiment
* type check for arbitrary actor data
* cherrypick rename ml-agents/trainers/torch to torch_entities (#55)
* make all trainer types and settings visible at module level
* remove settings from run_experiment console script
* fix test_settings and upgrade config scripts
* remove need for trainer_type argument up to TrainerFactory
* fix ghost trainer behavior id in policy queue
* fix torch shadow in tests
* update trainers, rl trainers tests
* update tests to match the refactors
* fix behavior name in ghost trainer
* update ml-agents-envs test configs
* separating the plugin package changes
* bring get_policy back for sake of ghost trainer
* add return types and remove unused returns
* remove duplicate methods in poca (_update_policy, add_policy)
Co-authored-by: mahon94 <maryam.honari@unity3d.com>
* Online/offline custom trainer examples with plugin system (#52)
* add on/off policy classes and inherit from
* trainers as plugins
* a2c trains
* remove swap files
* clean up registration debug
* clean up all pre-commit
* a2c plugin pass precommit
* move gae to trainer utils
* move lambda return to trainer util
* add validator for num_epoch
* add types for settings/type methods
* move create policy into highest level api
* move update_reward_signal into optimizer
* move get_policy into Trainer
* remove get settings type
* dummy_config settings
* move all stats from actor into dict, enables arbitrary actor data
* remove shared_critic flag, cleanups
* refactor create_policy
* remove sample_actions, evaluate_actions, update_norm from policy
* remove comments
* fix return type get stat
* update poca create_policy
* clean up policy init
* remove conftest
* add shared_critic to settings
* fix test_networks
* fix test_policy
* fix test network
* fix some ppo/sac tests
* add back conftest.py
* improve specification of trainer type
* add defaults for trainer_type/hyperparam
* fix test_saver
* fix reward providers
* add settings check utility for tests
* fix some settings tests
* add trainer types to run_experiment
* type check for arbitrary actor data
* cherrypick rename ml-agents/trainers/torch to torch_entities (#55)
* make all trainer types and settings visible at module level
* remove settings from run_experiment console script
* fix test_settings and upgrade config scripts
* remove need for trainer_type argument up to TrainerFactory
* fix ghost trainer behavior id in policy queue
* fix torch shadow in tests
* update trainers, rl trainers tests
* update tests to match the refactors
* fix behavior name in ghost trainer
* update ml-agents-envs test configs
* fix precommit
* separating the plugin package changes
* bring get_policy back for sake of ghost trainer
* add return types and remove unused returns
* remove duplicate methods in poca (_update_policy, add_policy)
* add a2c trainer back
* Add DQN cleaned up trainer/optimizer
* nit naming
* fix logprob/entropy types in torch_policy.py
* clean up DQN/SAC
* add docs for custom trainers, TODO: reference tutorial
* add clipping to loss function
* set old importlib-metadata version
* bump pre-commit hook env to 3.8.x
* use smooth l1 loss
Co-authored-by: mahon94 <maryam.honari@unity3d.com>
* add tutorial for validation
* fix formatting errors
* clean up
* minor changes
Co-authored-by: Andrew Cohen <andrew.cohen@unity3d.com>
Co-authored-by: zhuo <zhuo@unity3d.com>