Fix documentation typos and list rendering (#6066)
* Fix list being rendered incorrectly in webdocs
  I assume this extra blank line will fix the list not being correctly formatted on https://unity-technologies.github.io/ml-agents/#releases-documentation
* Fix typos in docs
* Fix more mis-rendered lists
  Add a blank line before bulleted lists in markdown files to avoid them being rendered as in-paragraph sentences that all start with hyphens.
* Fix typos in python comments used to generate docs
Parent: 1bee58f5bb
Commit: 4f2cfd1b6b
@@ -620,6 +620,7 @@ the order of the entities, so there is no need to properly "order" the
 entities before feeding them into the `BufferSensor`.
 
 The `BufferSensorComponent` Editor inspector has two arguments:
+
 - `Observation Size` : This is how many floats each entities will be
   represented with. This number is fixed and all entities must
   have the same representation. For example, if the entities you want to
@@ -231,7 +231,7 @@ you would like to contribute environments, please see our
   objects around agent's forward direction (40 by 40 with 6 different categories).
 - Actions:
   - 3 continuous actions correspond to Forward Motion, Side Motion and Rotation
-  - 1 discrete acion branch for Laser with 2 possible actions corresponding to
+  - 1 discrete action branch for Laser with 2 possible actions corresponding to
     Shoot Laser or No Action
 - Visual Observations (Optional): First-person camera per-agent, plus one vector
   flag representing the frozen state of the agent. This scene uses a combination
@@ -434,6 +434,7 @@ Similarly to Curiosity, Random Network Distillation (RND) is useful in sparse or
 reward environments as it helps the Agent explore. The RND Module is implemented following
 the paper [Exploration by Random Network Distillation](https://arxiv.org/abs/1810.12894).
 RND uses two networks:
+
 - The first is a network with fixed random weights that takes observations as inputs and
   generates an encoding
 - The second is a network with similar architecture that is trained to predict the
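The two-network setup this hunk describes is compact enough to sketch directly. Below is a minimal PyTorch illustration of the RND idea only, not the ML-Agents module itself; the layer sizes and learning rate are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal RND sketch: a fixed, randomly initialized target network and a trained
# predictor; the prediction error acts as an exploration bonus for novel observations.
obs_size, enc_size = 64, 32
target = nn.Sequential(nn.Linear(obs_size, 128), nn.ReLU(), nn.Linear(128, enc_size))
predictor = nn.Sequential(nn.Linear(obs_size, 128), nn.ReLU(), nn.Linear(128, enc_size))
for p in target.parameters():
    p.requires_grad = False  # the first network keeps its random weights

optimizer = torch.optim.Adam(predictor.parameters(), lr=1e-4)

def rnd_bonus(obs: torch.Tensor) -> torch.Tensor:
    # Error is large for observations the predictor has rarely seen, i.e. novel states.
    error = (predictor(obs) - target(obs)).pow(2).mean(dim=-1)
    optimizer.zero_grad()
    error.mean().backward()
    optimizer.step()
    return error.detach()
```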
@@ -491,9 +492,9 @@ to the expert, the agent is incentivized to remain alive for as long as possible
 This can directly conflict with goal-oriented tasks like our PushBlock or Pyramids
 example environments where an agent must reach a goal state thus ending the
 episode as quickly as possible. In these cases, we strongly recommend that you
-use a low strength GAIL reward signal and a sparse extrinisic signal when
+use a low strength GAIL reward signal and a sparse extrinsic signal when
 the agent achieves the task. This way, the GAIL reward signal will guide the
-agent until it discovers the extrnisic signal and will not overpower it. If the
+agent until it discovers the extrinsic signal and will not overpower it. If the
 agent appears to be ignoring the extrinsic reward signal, you should reduce
 the strength of GAIL.
 
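The advice in this hunk amounts to weighting two reward streams so the sparse extrinsic signal wins once the agent finds it. A hedged, generic sketch of that blending in PyTorch (not the ML-Agents GAIL module; shapes and strength values are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Generic GAIL-style sketch: a discriminator scores how "expert-like" a transition
# looks, and that score becomes a reward blended with the extrinsic reward.
obs_size, act_size = 8, 2
discriminator = nn.Sequential(
    nn.Linear(obs_size + act_size, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid()
)

def gail_reward(obs: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
    d = discriminator(torch.cat([obs, act], dim=-1)).squeeze(-1)
    return -torch.log(1.0 - d + 1e-7)

def blended_reward(extrinsic: torch.Tensor, obs: torch.Tensor, act: torch.Tensor,
                   gail_strength: float = 0.01,
                   extrinsic_strength: float = 1.0) -> torch.Tensor:
    # A low GAIL strength keeps the imitation signal from overpowering the sparse
    # extrinsic reward once the agent starts reaching the goal.
    return extrinsic_strength * extrinsic + gail_strength * gail_reward(obs, act)
```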
@@ -21,7 +21,7 @@ from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper
 
 ## Migrating the package to version 2.x
 - The official version of Unity ML-Agents supports is now 2022.3 LTS. If you run
-  into issues, please consider deleting your project's Library folder and reponening your
+  into issues, please consider deleting your project's Library folder and reopening your
   project.
 - If you used any of the APIs that were deprecated before version 2.0, you need to use their replacement. These
   deprecated APIs have been removed. See the migration steps bellow for specific API replacements.
@@ -130,7 +130,7 @@ values from `GetMaxBoardSize()`.
 
 ### GridSensor changes
 The sensor configuration has changed:
-* The sensor implementation has been refactored and exsisting GridSensor created from extension package
+* The sensor implementation has been refactored and existing GridSensor created from extension package
   will not work in newer version. Some errors might show up when loading the old sensor in the scene.
   You'll need to remove the old sensor and create a new GridSensor.
 * These parameters names have changed but still refer to the same concept in the sensor: `GridNumSide` -> `GridSize`,
@@ -151,8 +151,8 @@ data type changed from `float` to `int`. The index of first detectable tag will
 * The observation data should be written to the input `dataBuffer` instead of creating and returning a new array.
 * Removed the constraint of all data required to be normalized. You should specify it in `IsDataNormalized()`.
   Sensors with non-normalized data cannot use PNG compression type.
-* The sensor will not further encode the data recieved from `GetObjectData()` anymore. The values
-  recieved from `GetObjectData()` will be the observation sent to the trainer.
+* The sensor will not further encode the data received from `GetObjectData()` anymore. The values
+  received from `GetObjectData()` will be the observation sent to the trainer.
 
 ### LSTM models from previous releases no longer supported
 The way that Sentis processes LSTM (recurrent neural networks) has changed. As a result, models
@@ -169,7 +169,7 @@ the model using the python trainer from this release.
 - `VectorSensor.AddObservation(IEnumerable<float>)` is deprecated. Use `VectorSensor.AddObservation(IList<float>)`
   instead.
 - `ObservationWriter.AddRange()` is deprecated. Use `ObservationWriter.AddList()` instead.
-- `ActuatorComponent.CreateAcuator()` is deprecated. Please use override `ActuatorComponent.CreateActuators`
+- `ActuatorComponent.CreateActuator()` is deprecated. Please use override `ActuatorComponent.CreateActuators`
   instead. Since `ActuatorComponent.CreateActuator()` is abstract, you will still need to override it in your
   class until it is removed. It is only ever called if you don't override `ActuatorComponent.CreateActuators`.
   You can suppress the warnings by surrounding the method with the following pragma:
@@ -376,7 +376,7 @@ vector observations to be used simultaneously.
 method names will be removed in a later release:
 - `InitializeAgent()` was renamed to `Initialize()`
 - `AgentAction()` was renamed to `OnActionReceived()`
-- `AgentReset()` was renamed to `OnEpsiodeBegin()`
+- `AgentReset()` was renamed to `OnEpisodeBegin()`
 - `Done()` was renamed to `EndEpisode()`
 - `GiveModel()` was renamed to `SetModel()`
 - The `IFloatProperties` interface has been removed.
@@ -532,7 +532,7 @@ vector observations to be used simultaneously.
   depended on [PEP420](https://www.python.org/dev/peps/pep-0420/), which caused
   problems with some of our tooling such as mypy and pylint.
 - The official version of Unity ML-Agents supports is now 2022.3 LTS. If you run
-  into issues, please consider deleting your library folder and reponening your
+  into issues, please consider deleting your library folder and reopening your
   projects. You will need to install the Sentis package into your project in
   order to ML-Agents to compile correctly.
 
@@ -9,7 +9,7 @@ You can find them at `Edit` > `Project Settings...` > `ML-Agents`. It lists out
 ## Create Custom Settings
 In order to to use your own settings for your project, you'll need to create a settings asset.
 
-You can do this by clicking the `Create Settings Asset` buttom or clicking the gear on the top right and select `New Settings Asset...`.
+You can do this by clicking the `Create Settings Asset` button or clicking the gear on the top right and select `New Settings Asset...`.
 The asset file can be placed anywhere in the `Asset/` folder in your project.
 After Creating the settings asset, you'll be able to modify the settings for your project and your settings will be saved in the asset.
 
@@ -21,7 +21,7 @@ You can create multiple settings assets in one project.
 
 By clicking the gear on the top right you'll see all available settings listed in the drop-down menu to choose from.
 
-This allows you to create different settings for different scenatios. For example, you can create two
+This allows you to create different settings for different scenarios. For example, you can create two
 separate settings for training and inference, and specify which one you want to use according to what you're currently running.
 
 ![Multiple Settings](images/multiple-settings.png)
@@ -1,6 +1,6 @@
 # Profiling in Python
 
-As part of the ML-Agents Tookit, we provide a lightweight profiling system, in
+As part of the ML-Agents Toolkit, we provide a lightweight profiling system, in
 order to identity hotspots in the training process and help spot regressions
 from changes.
 
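For context on what the corrected sentence introduces: the profiling hooks are used via a decorator and a context manager. A hedged usage sketch, assuming `timed` and `hierarchical_timer` are exposed by `mlagents_envs.timers` (check the rest of this page for the exact import path):

```python
from mlagents_envs.timers import timed, hierarchical_timer

def run_forward_pass() -> None:
    pass  # placeholder for the work being profiled

@timed  # records the total time spent in this function under its own timer node
def training_step() -> None:
    with hierarchical_timer("forward_pass"):  # a named child node in the timing tree
        run_forward_pass()
```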
@@ -5,7 +5,7 @@ capabilities. we introduce an extensible plugin system to define new trainers ba
 in `Ml-agents` Package. This will allow rerouting `mlagents-learn` CLI to custom trainers and extending the config files
 with hyper-parameters specific to your new trainers. We will expose a high-level extensible trainer (both on-policy,
 and off-policy trainers) optimizer and hyperparameter classes with documentation for the use of this plugin. For more
-infromation on how python plugin system works see [Plugin interfaces](Training-Plugins.md).
+information on how python plugin system works see [Plugin interfaces](Training-Plugins.md).
 ## Overview
 Model-free RL algorithms generally fall into two broad categories: on-policy and off-policy. On-policy algorithms perform updates based on data gathered from the current policy. Off-policy algorithms learn a Q function from a buffer of previous data, then use this Q function to make decisions. Off-policy algorithms have three key benefits in the context of ML-Agents: They tend to use fewer samples than on-policy as they can pull and re-use data from the buffer many times. They allow player demonstrations to be inserted in-line with RL data into the buffer, enabling new ways of doing imitation learning by streaming player data.
 
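To make the on-policy/off-policy contrast in this overview concrete, here is a tiny, generic replay-buffer sketch (plain Python, not ML-Agents code) showing why off-policy methods can reuse data and mix in demonstrations; the names and sizes are illustrative.

```python
import random
from collections import deque

# Generic off-policy replay buffer: transitions are stored once and sampled many
# times, and demonstration data can be appended alongside regular agent experience.
buffer = deque(maxlen=100_000)

def add_transition(obs, action, reward, next_obs, done):
    buffer.append((obs, action, reward, next_obs, done))

def sample_batch(batch_size=256):
    return random.sample(buffer, min(batch_size, len(buffer)))
```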
@@ -11,7 +11,7 @@ Unity environment via Python.
 
 ## Installation
 
-The gym wrapper is part of the `mlgents_envs` package. Please refer to the
+The gym wrapper is part of the `mlagents_envs` package. Please refer to the
 [mlagents_envs installation instructions](ML-Agents-Envs-README.md).
 
 
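A hedged end-to-end sketch of the gym wrapper once `mlagents_envs` is installed. The `UnityToGymWrapper` import path appears in another hunk of this commit; the executable path and the `uint8_visual` flag are illustrative assumptions, so check the wrapper docs for the exact constructor options.

```python
from mlagents_envs.environment import UnityEnvironment
from mlagents_envs.envs.unity_gym_env import UnityToGymWrapper

# Path to a built Unity environment; replace with your own executable.
unity_env = UnityEnvironment("path/to/YourEnvironment")
env = UnityToGymWrapper(unity_env, uint8_visual=True)  # assumed constructor flag

obs = env.reset()
obs, reward, done, info = env.step(env.action_space.sample())
env.close()
```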
@@ -678,7 +678,7 @@ of downloading the Unity Editor.
 The UnityEnvRegistry implements a Map, to access an entry of the Registry, use:
 ```python
 registry = UnityEnvRegistry()
-entry = registry[<environment_identifyier>]
+entry = registry[<environment_identifier>]
 ```
 An entry has the following properties :
 * `identifier` : Uniquely identifies this environment
@@ -689,7 +689,7 @@ An entry has the following properties :
 To launch a Unity environment from a registry entry, use the `make` method:
 ```python
 registry = UnityEnvRegistry()
-env = registry[<environment_identifyier>].make()
+env = registry[<environment_identifier>].make()
 ```
 
 <a name="mlagents_envs.registry.unity_env_registry.UnityEnvRegistry.register"></a>
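A runnable variant of the snippet being corrected here, using the built-in default registry; the environment name "3DBall" is an assumption and any identifier present in the registry works the same way.

```python
from mlagents_envs.registry import default_registry

env = default_registry["3DBall"].make()  # downloads and launches the binary
env.reset()
env.close()
```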
@@ -694,7 +694,7 @@ class Lesson()
 ```
 
 Gathers the data of one lesson for one environment parameter including its name,
-the condition that must be fullfiled for the lesson to be completed and a sampler
+the condition that must be fulfilled for the lesson to be completed and a sampler
 for the environment parameter. If the completion_criteria is None, then this is
 the last lesson in the curriculum.
 
@@ -43,8 +43,8 @@ Get value estimates and memories for a trajectory, in batch form.
 **Arguments**:
 
 - `batch`: An AgentBuffer that consists of a trajectory.
-- `next_obs`: the next observation (after the trajectory). Used for boostrapping
-if this is not a termiinal trajectory.
+- `next_obs`: the next observation (after the trajectory). Used for bootstrapping
+if this is not a terminal trajectory.
 - `done`: Set true if this is a terminal trajectory.
 - `agent_id`: Agent ID of the agent that this trajectory belongs to.
 
@@ -7,7 +7,7 @@ interfacing with a Unity environment via Python.
 
 ## Installation and Examples
 
-The PettingZoo wrapper is part of the `mlgents_envs` package. Please refer to the
+The PettingZoo wrapper is part of the `mlagents_envs` package. Please refer to the
 [mlagents_envs installation instructions](ML-Agents-Envs-README.md).
 
 [[Colab] PettingZoo Wrapper Example](https://colab.research.google.com/github/Unity-Technologies/ml-agents/blob/develop-python-api-ga/ml-agents-envs/colabs/Colab_PettingZoo.ipynb)
@@ -52,6 +52,7 @@ to get started with the latest release of ML-Agents.**
 
 The table below lists all our releases, including our `main` branch which is
 under active development and may be unstable. A few helpful guidelines:
+
 - The [Versioning page](Versioning.md) overviews how we manage our GitHub
   releases and the versioning process for each of the ML-Agents components.
 - The [Releases page](https://github.com/Unity-Technologies/ml-agents/releases)
@@ -165,7 +166,7 @@ We have also published a series of blog posts that are relevant for ML-Agents:
 ### More from Unity
 
 - [Unity Sentis](https://unity.com/products/sentis)
-- [Introductin Unity Muse and Sentis](https://blog.unity.com/engine-platform/introducing-unity-muse-and-unity-sentis-ai)
+- [Introducing Unity Muse and Sentis](https://blog.unity.com/engine-platform/introducing-unity-muse-and-unity-sentis-ai)
 
 ## Community and Feedback
 
@@ -413,7 +413,7 @@ Unless otherwise specified, omitting a configuration will revert it to its defau
 In some cases, you may want to specify a set of default configurations for your Behaviors.
 This may be useful, for instance, if your Behavior names are generated procedurally by
 the environment and not known before runtime, or if you have many Behaviors with very similar
-settings. To specify a default configuraton, insert a `default_settings` section in your YAML.
+settings. To specify a default configuration, insert a `default_settings` section in your YAML.
 This section should be formatted exactly like a configuration for a Behavior.
 
 ```yaml
@@ -13,7 +13,7 @@ Users of the plug-in system are responsible for implementing the trainer class s
 
 Please refer to the internal [PPO implementation](../ml-agents/mlagents/trainers/ppo/trainer.py) for a complete code example. We will not provide a workable code in the document. The purpose of the tutorial is to introduce you to the core components and interfaces of our plugin framework. We use code snippets and patterns to demonstrate the control and data flow.
 
-Your custom trainers are responsible for collecting experiences and training the models. Your custom trainer class acts like a co-ordinator to the policy and optimizer. To start implementing methods in the class, create a policy class objects from method `create_policy`:
+Your custom trainers are responsible for collecting experiences and training the models. Your custom trainer class acts like a coordinator to the policy and optimizer. To start implementing methods in the class, create a policy class objects from method `create_policy`:
 
 
 ```python
@@ -243,7 +243,7 @@ Before installing your custom trainer package, make sure you have `ml-agents-env
 pip3 install -e ./ml-agents-envs && pip3 install -e ./ml-agents
 ```
 
-Install your cutom trainer package(if your package is pip installable):
+Install your custom trainer package(if your package is pip installable):
 ```shell
 pip3 install your_custom_package
 ```
@@ -28,7 +28,8 @@ env.close()
 
 ## Create and share your own registry
 
-In order to share the `UnityEnvironemnt` you created, you must :
+In order to share the `UnityEnvironment` you created, you must:
+
 - [Create a Unity executable](Learning-Environment-Executable.md) of your environment for each platform (Linux, OSX and/or Windows)
 - Place each executable in a `zip` compressed folder
 - Upload each zip file online to your preferred hosting platform
@@ -16,7 +16,7 @@ class UnityEnvRegistry(Mapping):
 The UnityEnvRegistry implements a Map, to access an entry of the Registry, use:
 ```python
 registry = UnityEnvRegistry()
-entry = registry[<environment_identifyier>]
+entry = registry[<environment_identifier>]
 ```
 An entry has the following properties :
 * `identifier` : Uniquely identifies this environment
@@ -27,7 +27,7 @@ class UnityEnvRegistry(Mapping):
 To launch a Unity environment from a registry entry, use the `make` method:
 ```python
 registry = UnityEnvRegistry()
-env = registry[<environment_identifyier>].make()
+env = registry[<environment_identifier>].make()
 ```
 """
 
@@ -148,8 +148,8 @@ class TorchOptimizer(Optimizer):
 """
 Get value estimates and memories for a trajectory, in batch form.
 :param batch: An AgentBuffer that consists of a trajectory.
-:param next_obs: the next observation (after the trajectory). Used for boostrapping
-if this is not a termiinal trajectory.
+:param next_obs: the next observation (after the trajectory). Used for bootstrapping
+if this is not a terminal trajectory.
 :param done: Set true if this is a terminal trajectory.
 :param agent_id: Agent ID of the agent that this trajectory belongs to.
 :returns: A Tuple of the Value Estimates as a Dict of [name, np.ndarray(trajectory_len)],
@@ -565,8 +565,8 @@ class TorchPOCAOptimizer(TorchOptimizer):
 """
 Get value estimates, baseline estimates, and memories for a trajectory, in batch form.
 :param batch: An AgentBuffer that consists of a trajectory.
-:param next_obs: the next observation (after the trajectory). Used for boostrapping
-if this is not a termiinal trajectory.
+:param next_obs: the next observation (after the trajectory). Used for bootstrapping
+if this is not a terminal trajectory.
 :param next_groupmate_obs: the next observations from other members of the group.
 :param done: Set true if this is a terminal trajectory.
 :param agent_id: Agent ID of the agent that this trajectory belongs to.
@@ -517,7 +517,7 @@ class CompletionCriteriaSettings:
 class Lesson:
 """
 Gathers the data of one lesson for one environment parameter including its name,
-the condition that must be fullfiled for the lesson to be completed and a sampler
+the condition that must be fulfilled for the lesson to be completed and a sampler
 for the environment parameter. If the completion_criteria is None, then this is
 the last lesson in the curriculum.
 """