Commit Graph

117 Commits

Author SHA1 Message Date
Thomas Simonini 06968765ba
Integrating Hugging Face Hub 🤗 (updated) (#5856)
* Add Hugging Face Integration

* Update setup.py

* Update push_to_hf.py

* Update push_to_hf.py

* Update push_to_hf.py

* Update push_to_hf.py

Remove use_auth_token

* Update push_to_hf.py

* Update push_to_hf.py

* Create Huggy

* Update load_from_hf.py

* Change loading to snapshot_download (able to use cache)

* Update push_to_hf.py

* Use create_repo and upload_folder instead of git

* Apply suggestions from code review

Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Delete Huggy + update load and push to hf

* Delete Huggy config file
* Update load_from_hf
* Update push_to_hf

* Apply suggestions from code review

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Lucain <lucainp@gmail.com>

* Update push_to_hf.py

* Update with Omar Feedback

* Black style formatter

* Create __init__.py

* Updates based on flake8 review

* Change logging to MLAgents logger

* Update python version

* Update Python Version

* Update setup.py

* Update logger

* Update logger

* Pre-commit

* Update Tuple

* Update metadata generation

* Ignore mypy error

* Create Hugging-Face-Integration.md

* Update ML-Agents-Toolkit-Documentation.md

* Update style of Hugging-Face-Integration.md

* Typo

* Remove spaces

---------

Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
2023-06-06 13:10:27 -04:00
Hunter-Unity 9ee80bf490
Retrain Walker (#5911)
* reduce hidden nodes to 256 and retrain 30M steps

* update demo file
2023-04-27 13:10:41 -04:00
Ruo-Ping Dong 9a6def3003
Fix FoodCollector behavior name in SAC config (#5468) 2021-07-22 15:54:41 -07:00
Ervin T 8230987921
[ci] Shorten SAC runs (#5354) 2021-05-11 20:06:54 -04:00
Ervin T 4078fa6283
Better hyperparameters for Hallway-SAC (#5339) 2021-05-04 15:00:27 -04:00
Ervin T 0c96d7dd05
[config] Disable `threading` by default (#5221)
* Remove threading as default

* New description

* Remove threaded option from YAML configs

* Remove from Match3
2021-04-05 18:49:08 -04:00
Ervin T 2ce6810846
[bug-fix] Fix POCA LSTM, pad sequences in the back (#5206)
* Pad buffer at the end

* Fix padding in optimizer value estimate

* Fix additional bugs and POCA

* Fix groupmate obs, add tests

* Update changelog

* Improve tests

* Address comments

* Fix poca test

* Fix buffer test

* Increase entropy for Hallway

* Add EOF newline

* Fix Behavior Name

* Address comments
2021-04-05 18:42:14 -04:00
Vincent-Pierre BERGES 92ff2c26fe
Goal conditioning grid world : Example of goal conditioning (#5193)
* Added the Goal conditioned GridWorld to replace regular gridworld

* adding missing files

* Code improvements

* Documentation change on gridworld

* resolving conflicts

* new model

* Addressing comments

* comments and renames

* Update docs/Learning-Environment-Examples.md

Co-authored-by: Ervin T. <ervin@unity3d.com>

* adding reference to gridworld in docs about goal signal

Co-authored-by: Chris Elion <chris.elion@unity3d.com>
Co-authored-by: Ervin T. <ervin@unity3d.com>
2021-03-31 15:17:53 -07:00
andrewcoh 875feb0150
Fix path to PushBlock demo (#5198) 2021-03-30 12:40:44 -04:00
andrewcoh 41818c5e42
Reduce pb collab steps to 15M (#5196) 2021-03-30 10:55:54 -04:00
andrewcoh 1c595e4d5a
Remove env settings from Sorter (#5146) 2021-03-17 12:18:13 -04:00
andrewcoh 23368fce04
Fix GridFoodCollector yaml (#5134) 2021-03-16 17:58:41 -04:00
Hunter-Unity 682a2856dd
Add DungeonEscape POCA Environment (#5128)
* Add DungeonEscape assets from working branch

* Add Dungeon Escape docs

* Create dungeon_escape.png
2021-03-16 17:20:51 -04:00
andrewcoh 450e5220db
Integrate Group Manager to soccer/retrain with POCA (#5115) 2021-03-15 18:37:04 -04:00
andrewcoh 8545a0dadb
Move PushBlockCollab config to poca directory (#5097) 2021-03-12 09:52:31 -05:00
Ervin T 07997d0096
[environment] Push Block Collaborative (#5090)
* Add pushblock collab

* Make SimpleMultiAgentGroup public

* Remove GoalDetectTrigger

* Remove GDT meta file

* Remove some comments

* Add training configuration

* Rename behavior

* Add to docs

* Change the reward structure in docs

* Add back GoalDetectTrigger

Co-authored-by: HH <brandonh@unity3d.com>
2021-03-11 21:11:23 -05:00
Vincent-Pierre BERGES 686518f8ac
renaming of behavior name for imitation crawler (#5039) 2021-03-05 10:35:07 -08:00
Vincent-Pierre BERGES cedc75cff5
Removing some scenes (#4997)
* Removing some scenes: all the Static and all the non-variable-speed environments. Also removed Bouncer, PushBlock, WallJump and Reacher. Removed a bunch of visual environments as well. Removed 3DBallHard and FoodCollector (kept Visual and Grid FoodCollector)

* readding 3DBallHard

* readding pushblock and walljump

* Removing tennis

* removing mentions of removed environments

* removing unused images

* Renaming Crawler demos

* renaming some demo files

* removing and modifying some config files

* new examples image?

* removing Bouncer from build list

* replacing the Bouncer environment with Match3 for llapi tests

* Typo in yamato test
2021-03-04 16:00:53 -08:00
Christopher Goy 06d2f759b4 Merge branch 'master' into release_13_branch-to-master 2021-02-24 14:57:14 -08:00
Christopher Goy ce48a6e61f Merge master -> release_13_branch-to-master 2021-02-24 14:43:44 -08:00
vincentpierre b3dbfdc7cf Fixing the number of layers in the config of PyramidsRND 2021-02-24 14:28:56 -08:00
andrewcoh ad2680ea65
Set ignore done=False in GAIL (#4971) 2021-02-22 18:33:02 -05:00
Chris Elion a06b1dac85
[MLA-1768] retrain Match3 scene (#4943)
* improved settings and move to default_settings

* update models
2021-02-16 22:56:08 -08:00
vincentpierre 076c37f164 New curriculum, new model 2021-02-11 09:09:29 -08:00
vincentpierre 7f302492f3 new curriculum 2021-02-10 09:12:37 -08:00
vincentpierre 63a88c5ec0 [skip ci] Attempting new config 2021-02-08 10:35:33 -08:00
vincentpierre eb31fd6063 addressing some of the comments 2021-02-05 09:46:14 -08:00
vincentpierre 833ec81825 - 2021-02-02 20:51:40 -08:00
Andrew Cohen 6ba4b9b823 refactored sequence env 2021-01-18 15:38:29 -05:00
Ervin T ec850ae075
[bug-fix] Disable threading for self-play envs (#4679) 2020-11-19 16:54:23 -08:00
Chris Elion f7ef326392
Match3 example (#4515) 2020-11-04 16:33:33 -08:00
Ruo-Ping (Rachel) Dong 53c27ee8c8
Add Visual3DBall scene (#4513)
* Add Visual3DBall scene which use visual observations with stacking
2020-10-07 11:50:10 -07:00
Ruo-Ping (Rachel) Dong a261b407af
Add vector flag of agent's frozen state to VisualFoodCollector (#4511)
VisualFoodCollector is now an example environment of using a mix of visual and vector observation and is able to train with default config file.

Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
2020-10-06 14:02:47 -07:00
Vincent-Pierre BERGES 73fa8bd062
Random Network Distillation for Torch (#4473)
* initial commit

* works with Pyramids

* added unit tests and a separate config file

* Adding first batch of documentation

* adding in the docs that rnd is only for PyTorch

* adding newline at the end of the config files

* adding some docs

* Code comments

* no normalization of the reward

* Fixing the tests

* [skip ci]

* [skip ci] Make sure RND will only work for Torch by editing the config file

* [skip ci] Additional information in the Documentation

* Remove the _has_updated_once flag
2020-09-23 15:11:03 -07:00
Hunter-Unity 9cea1524e3
Worm Ragdoll & Env Updates (#4413)
* add worm updates

* add rewman

* cp

* normalize rewards

* only cookie

* try 20M. Add3.5Mnn file

* reduce strength to 3000spring

* facing reward troubleshooting

* Update WormAgent.cs

* troubleshoot nan

* try product of rewards

* train 5M steps

* try end episode on target touch

* fix joint obsv

* use 7M steps

* added nn file for observation joint fix. looks great

* don't end episode

* remove old code

* refactor to patterns used in walker & crawler

* add auto-setup code

* reformat

* use head vel

* remove unneeded observ. update prefabs

* update static scenes

* keeps rolling. added debug. try 5 m/s

* gate the facing reward based on angle tolerance

* added 10ms_angle30rew_nn files

* use fromto rot

* use 7M steps

* add new trained files. cleanup code and prefabs

* use avgvel. add code comments

* remove unused method

* add more comments

* Update Learning-Environment-Examples.md

* Update DynamicTargetPlatform.prefab

* remove testing tools

* reset targetcontroller to master

* reset mat to master

* update case

* change property name, update prefab

* add new set up methods

* update format
2020-08-27 14:58:33 -07:00
Hunter-Unity 000e9e264d
New Crawler Variable Speed Scenes (#4382)
* init

* updating prefabs

* spawn a target

* add brains

* update static prefabs

* enable enhanced determinism

* reset manifest

* add nn files. update to 15M steps

* update prefabs

* increase max speed to 15

* add new local model for 15 speed

* update prefabs

* add configs

* update configs/prefabs

* cleanup

* added final nn models

* add new demos and do more cleanup.

* add meta files

* add RigidbodySensor

* update prefab. about to retrain

* remove body pen

* add fixed crawler & retrained nn file, new demos

* train 10M steps

* Update Crawler Docs

* more prefab cleanup

* add meta files

* Update Project/Assets/ML-Agents/Examples/Crawler/Scripts/CrawlerAgent.cs

Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>

* remove unused prefab

* update comment

* add summary tags

* cleanup and add more comments

* remove unused prefab

* Update Project/Assets/ML-Agents/Examples/Crawler/Scripts/CrawlerAgent.cs

Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>

* update nnmodel property names, restructure initialization

* fix pre-format

* update comment

* update format

* remove NaN checks

Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>
2020-08-27 11:46:08 -07:00
Jaden Travnik 4cb9168a0c
Grid Sensor (#4399)
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
2020-08-26 14:08:14 -07:00
Ervin T 3a20ce3dd1
[ci] Shorten max steps for strikergoalie (#4394) 2020-08-20 13:44:42 -07:00
Ervin T cc524a1cbc
Reduce max steps for striker vs. goalie (#4377) 2020-08-18 16:06:00 -07:00
Hunter-Unity 69d8cd8ece
New Variable Speed Walker Environments (#4301)
* init

* Add reward manager and hurryUpReward

* fix hurry reward/ add awful first training

* Turn off head height and hurry rew

* changed max speed to 15. added small hh rew

* add NaN check for reward manager. start vel penalty

* add bpVel pen

* add new BPVelPen nn file

* remove outdated nn file

* add randomize speed bool

* try reward product

* change coeff to 1

* try avg vel of all bp for reward

* move outside loop

* try linear inverselerp for vel

* add avg rew matchspeed15 nn file. looks much better

* save scene

* no hand penalty, random walk speed

* fix inverse lerp

* try new reward falloff

* cleanup

* added new nn file. don't allow hand contact

* update obsv

* remove hh rew. add trained no-hh model

* add new nn file

* new curve

* add new models. try no reset

* add hh rew

* clamp hh

* zero rewards if ground contact

* switch to approved with moving target

* try new dot

* add shifted dot and reg dot nn file

* add WalkerStaticVariableSpeedScene and PPO config

* add a NaN debug for action values

* start dynamic cleanup and more debug for NaNs

* more cleanup

* add WalkerDynamicVarialbeSpeed scene and update prefabs

* add trained static walker nn file

* About to do cleanup

* add all scenes

* reduce numpy ver

* add new no hh nn models and update prefabs

* add hh rew

* try 15k strength. reset jdcontroller to master

* remove h rew 10k strength

* increase to 30M

* trying to figure out shuffle foot regression. added 10k no hh model

* about to train 20k strength, no hh, no rolling targ 30M

* fixed shuffle step regr with 20k no hh

* update prefabs with new models. walkerstatic failed to train

* saved scene

* implemented distToTarget Instead of targetPos

* add dist observ nn files

* more cleanup

* reduce maxSpeed to 10, update prefabs

* max dist 50 avg core vel

* cleanup

* use all bp for avg vel

* reimplement cube relTargetPos

* update prefabs

* add relPos clamped to 100m models

* cleanup

* more prefab cleanup

* more cleanup

* remove unused prefabs

* remove unused code

* replace demo files

* remove demorecorder

* reset ppo learn.py to master

* reset these to master

* Update Learning-Environment-Examples.md

* cleanup from PR review

* more cleanup

* add code comments

* observe velocity delta

* add additional velocity observations

* add hh

* add trained models. remove hh rew

* remove multiple walk dir methods because its confusing

* update walker static vs prefabs

* add new trained models and remove old ones

* add new demo files

* reset script to master

* custom setter for TargetWalkingSpeed

* update benchmarks based on new models

* cleanup per PR suggestions
2020-08-13 17:16:07 -07:00
Vincent-Pierre BERGES bb61418fd7
Refactor of Curriculum and parameter sampling (#4160)
* Introduced the Constant Parameter Sampler that will be useful later as samplers and floats can be used interchangeably

* Refactored the settings.py to reflect the new format of the config.yaml

* First working version

* Added the unit tests

* Update to Upgrade for Updates

* fixing the tests

* Upgraded the config files

* Fixes

* Additional error catching

* addressing some comments

* Making the code nicer with cattr

* Added and registered an unstructure hook for PrameterRandomization

* Updating C# Walljump

* Adding comments

* Add test for settings export (#4164)

* Add test for settings export

* Update ml-agents/mlagents/trainers/tests/test_settings.py

Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>

Co-authored-by: Vincent-Pierre BERGES <vincentpierre@unity3d.com>

* Including environment parameters for the test for settings export

* First documentation update

* Fixing a link

* Updating changelog and migrating

* adding some more tests for the conversion script

* fixing bugs and using samplers in the walljump curriculum

* Changing the format of the curriculum file as per discussion

* Addressing comments

* Update ml-agents/mlagents/trainers/settings.py

Co-authored-by: Ervin T. <ervin@unity3d.com>

* Update docs/Migrating.md

Co-authored-by: Chris Elion <chris.elion@unity3d.com>

* addressing comments

Co-authored-by: Ervin T <ervin@unity3d.com>
Co-authored-by: Chris Elion <chris.elion@unity3d.com>
2020-07-07 15:10:30 -07:00
andrewcoh a599e116c8
Fix 3DBall PPO hard regression (#4133) 2020-07-07 11:47:05 -07:00
Hunter-Unity 3363b44c26
Add TargetController/OrientationCubeController Components & Bugfix (#4157)
* added Target and OCube controllers. updated crawler envs

* update walker prefab

* add refs to prefab

* Update Crawler.prefab

* update platform, ragdoll,  ocube prefabs

* reformat file

* reformat files

* fix behavior name

* add final retrained crawler and walker nn files

* collect hip ocube rot in world space

* update crawler observations and update prefabs

* change to 20M steps

* update crwl prefab to 142 observ

* update obsvs to 241. add expvel  reward

* change walkspeed to 3

* add new crawler and walker nn files

* adjust rewards

* enable other pairs

* add RewardManager

* cleanup about to do final training

* cleanup add nn files for increased facing rew reduced height rew

* try no facing rew

* add vel only policy, try dy target

* inc torq on cube

* added dynamic cube nn. gonna try 40M steps

* add 40M step test, more cleanup

* change back to 20M steps

* Update WalkerStatic.unity

* add no vel pen nn file

* .005 head height rew

* remove extra walker in scene

* Update WalkerWithTargetPair.prefab

* Update WalkerStatic.unity

* more cleanup add new nn file with less head height reward

* added Target and OCube controllers. updated crawler envs

* update walker prefab

* add refs to prefab

* Update Crawler.prefab

* update platform, ragdoll,  ocube prefabs

* reformat file

* reformat files

* fix behavior name

* add final retrained crawler and walker nn files

* collect hip ocube rot in world space

* update crawler observations and update prefabs

* change to 20M steps

* update crwl prefab to 142 observ

* update obsvs to 241. add expvel  reward

* change walkspeed to 3

* add new crawler and walker nn files

* adjust rewards

* enable other pairs

* add RewardManager

* cleanup about to do final training

* cleanup add nn files for increased facing rew reduced height rew

* try no facing rew

* add vel only policy, try dy target

* inc torq on cube

* added dynamic cube nn. gonna try 40M steps

* add 40M step test, more cleanup

* change back to 20M steps

* Update WalkerStatic.unity

* add no vel pen nn file

* .005 head height rew

* remove extra walker in scene

* Update WalkerWithTargetPair.prefab

* Update WalkerStatic.unity

* more cleanup add new nn file with less head height reward

* cleanup

* remove comment

* more cleanup

* correct format

* Update ProjectVersion.txt

* change to Log()

* cleanup

* use the starting y position instead of a hard coded height

* test old fromtorot

* add 236 model

* testing new 236 nn files

* add final walker nn files

* cleanup

* crawler cleanup

* update crawler observ size

* add final crawler nn files

* fixed formatting issues
2020-07-02 14:14:33 -07:00
Ervin T 5e60954bf0
[CI] Better hyperparameters for Pyramids-SAC, WalkerStatic-SAC, and Reacher-PPO (#4154) 2020-06-29 12:48:32 -07:00
andrewcoh dba27ad244
Fix 3DBall and 3DBallHard SAC regressions (#4132) 2020-06-17 15:30:51 -07:00
andrewcoh 20527d1012
Moving domain randomization to C# (#4065) 2020-06-12 16:44:44 -07:00
Ervin T 09853def2b
[refactor] Remove nonfunctional `output_path` option from TrainerSettings (#4087) 2020-06-09 11:42:53 -07:00
Hunter-Unity ccca2fb4a7
Add Dynamic Walker. Improved Ragdoll Stability/Performance (#4037)
* about to implement orientation cube

* oCube spawining works. ready to train

* working. about to try com

* ready for training

* add random rot on episode start

* feet now alternate but runs backwards

* still running with right leg in front

* increased joint strength to 40k

* removed texture example

* reduced maxAngVel, enabled enhanced determinism, cont spec

* rebuilt walker ragdoll to scale 1

* rebuilt ragdoll ready

* update walker pair prefab

* fixed bp heirarchy

* added trained model, renamed scene, usecollisioncallbacks

* updated dynamic platforms

* added dynamic walker tf file. max speed 5

* DynamicWalker working. has working nn file

* collect local rotations

* added new dynamic nn file

* hip facing reward

* Create WalkerDynamic.yaml

* fix hip rotation

* about to clean up code

* added dirIndicator and orentCubeGizmo

* clean up

* cleanup

* updated WalkerStatic scene with new ragdoll

* cleanup

* updated walker dynamic demo file. cleanup

* iterate through list not dict to collect observations

* increase gravity to 1.5

* try 100M steps on walkerdynamic

* 100M steps

* add dir vector obsv

* 2e7 steps

* testing  new nn models

* testing bigger batch size

* try 8x mem for cloud

* 8x batch size for cloud test

* epoch 10

* hyptest

* cp

* increase timescale for cloudtraining

* cp

* try new cluster

* cp

* 200k buff cloud

* cleanup & put direction indicator in separate script

* update configs

* about to implement orientation cube

* oCube spawining works. ready to train

* working. about to try com

* ready for training

* add random rot on episode start

* feet now alternate but runs backwards

* still running with right leg in front

* increased joint strength to 40k

* removed texture example

* reduced maxAngVel, enabled enhanced determinism, cont spec

* rebuilt walker ragdoll to scale 1

* rebuilt ragdoll ready

* update walker pair prefab

* fixed bp heirarchy

* added trained model, renamed scene, usecollisioncallbacks

* updated dynamic platforms

* added dynamic walker tf file. max speed 5

* DynamicWalker working. has working nn file

* collect local rotations

* added new dynamic nn file

* hip facing reward

* Create WalkerDynamic.yaml

* fix hip rotation

* about to clean up code

* added dirIndicator and orentCubeGizmo

* clean up

* cleanup

* updated WalkerStatic scene with new ragdoll

* cleanup

* updated walker dynamic demo file. cleanup

* iterate through list not dict to collect observations

* increase gravity to 1.5

* try 100M steps on walkerdynamic

* 100M steps

* add dir vector obsv

* 2e7 steps

* testing  new nn models

* testing bigger batch size

* try 8x mem for cloud

* 8x batch size for cloud test

* epoch 10

* hyptest

* cp

* increase timescale for cloudtraining

* cp

* try new cluster

* cp

* 200k buff cloud

* cleanup & put direction indicator in separate script

* update configs

* update configs to new class format

* added final nn files

* more cleanup

* new walker image for docs

* Update walker docs

* remove old gitignore item

* cleanup

* Delete trainer_config.yaml

* Update CHANGELOG.md

* remove code comment

* changed property to float

* rename variable

* remove header

* rename function

* added code comment and consolidated similar properties

* removed unused asset

* make maxAngularVelocity a constant

* cleanup remove tab

* cleanup - remove unneeded header attr

* added code comments

* auto-format doc to remove unwanted tabs

* add new trained model. increase max step for dynamic

* add code comments. update oCube system. cleanup

* move orientation cube to shared prefabs

* refactored reward function variables

* removed header

* add SAC configs

* added new dynamic walker nn file

* remove old config

* add new config

* fix project ver
2020-06-04 10:49:20 -07:00
Ervin T 3023f45dea
[refactor] Improve config upgrade script and add test (#4056) 2020-06-03 17:17:06 -07:00
andrewcoh afb94e46f5
Self play hyperparameter improvements (#4063) 2020-06-03 12:18:25 -07:00