This commit is contained in:
Arthur Jiang 2020-09-22 18:35:39 +08:00
Parent d8c98deea4
Commit ce435b61fe
120 changed files: 3686 additions and 1082 deletions

View file

@ -15,55 +15,73 @@ MARO has complete support on data processing, simulator building, RL algorithms
| `examples` | Showcase of MARO. |
| `notebooks` | MARO quick-start notebooks. |
### Prerequisites
## Prerequisites
- [Python == 3.6/3.7](https://www.python.org/downloads/)
- C++ Compiler
- Linux or Mac OS X: `gcc`
- Windows: [Build Tools for Visual Studio 2017](https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=BuildTools&rel=15)
### Install MARO from PyPI
## Install MARO from PyPI
```sh
pip install maro
```
### Install MARO from Source
## Install MARO from Source ([editable mode](https://pip.pypa.io/en/stable/reference/pip_install/#editable-installs))
```sh
# If your environment is not clean, create a virtual environment first
python -m venv maro_venv
source maro_venv/bin/activate
- Prerequisites
- C++ Compiler
- Linux or Mac OS X: `gcc`
- Windows: [Build Tools for Visual Studio 2017](https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=BuildTools&rel=15)
# Install MARO from source if you don't need the full CLI features
pip install -r ./maro/requirements.build.txt
- Enable Virtual Environment
- Mac OS / Linux
# compile cython files
bash scripts/compile_cython.sh
pip install -e .
```sh
# If your environment is not clean, create a virtual environment first.
python -m venv maro_venv
source ./maro_venv/bin/activate
```
# Or with script
bash scripts/build_maro.sh
```
- Windows
### Quick example
```ps
# If your environment is not clean, create a virtual environment first.
python -m venv maro_venv
.\maro_venv\Scripts\activate
```
- Install MARO
- Mac OS / Linux
```sh
# Install MARO from source.
bash scripts/install_maro.sh
```
- Windows
```ps
# Install MARO from source.
.\scripts\install_maro.bat
```
## Quick example
```python
from maro.simulator import Env
env = Env(scenario="ecr", topology="toy.5p_ssddd_l0.0", start_tick=0, durations=100)
_, decision_event, is_done = env.step(None)
metrics, decision_event, is_done = env.step(None)
while not is_done:
    reward, decision_event, is_done = env.step(None)
    metrics, decision_event, is_done = env.step(None)
tot_shortage = env.snapshot_list["ports"][::"shortage"].sum()
print(f"total shortage: {tot_shortage}")
print(f"environment metrics: {env.metrics}")
```
### Run playground
## Run playground
```sh
# Build playground image
@ -75,7 +93,7 @@ docker build -f ./docker_files/cpu.play.df . -t maro/playground:cpu
docker run -p 40009:40009 -p 40010:40010 -p 40011:40011 maro/playground:cpu
```
### Contributing
## Contributing
This project welcomes contributions and suggestions. Most contributions require you to agree to a
Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us

View file

@ -20,9 +20,9 @@ RUN jupyter contrib nbextension install --system
RUN jt -t onedork -fs 95 -altp -tfs 11 -nfs 115 -cellw 88% -T
# Install maro
COPY --from=ext_build /maro_build/dist/maro-0.0.1a0-cp36-cp36m-manylinux2010_x86_64.whl ./maro-0.0.1a0-cp36-cp36m-manylinux2010_x86_64.whl
RUN pip install maro-0.0.1a0-cp36-cp36m-manylinux2010_x86_64.whl
RUN rm maro-0.0.1a0-cp36-cp36m-manylinux2010_x86_64.whl
COPY --from=ext_build /maro_build/dist/maro-0.1.1a0-cp36-cp36m-manylinux2010_x86_64.whl ./maro-0.1.1a0-cp36-cp36m-manylinux2010_x86_64.whl
RUN pip install maro-0.1.1a0-cp36-cp36m-manylinux2010_x86_64.whl
RUN rm maro-0.1.1a0-cp36-cp36m-manylinux2010_x86_64.whl
# Install redis
RUN wget http://download.redis.io/releases/redis-6.0.6.tar.gz; tar xzf redis-6.0.6.tar.gz; cd redis-6.0.6; make

View file

@ -1,11 +1,14 @@
# MARO Documentation
## Pre-install
## Generate API docs
```sh
pip install -U -r requirements.docs.txt
```
## Build docs
```sh
# For Linux, Darwin
make html
@ -15,10 +18,31 @@ make html
```
## Generate API docs
```sh
sphinx-apidoc -f -o ./source/apidoc ../maro/
```
## Local host
```sh
python -m http.server -d ./_build/html 8000 -b 0.0.0.0
```
## Auto-build/Auto-refresh
### Prerequisites
- [Watchdog](https://pypi.org/project/watchdog/)
- [Browser-sync](https://www.browsersync.io/)
```sh
# Watch file change, auto-build
watchmedo shell-command --patterns="*.rst;*.md;*.py;*.png;*.ico;*.svg" --ignore-pattern="_build/*" --recursive --command="APIDOC_GEN=False make html"
# Watch file change, auto-refresh
browser-sync start --server --startPath ./_build/html --port 8000 --files "**/*"
```
## Local host
```sh
python -m http.server -d ./_build/html 8000 -b 0.0.0.0

View file

@ -32,47 +32,45 @@ Quick Start
from maro.simulator import Env
from maro.simulator.scenarios.ecr.common import Action
start_tick = 0
durations = 100 # 100 days
# Initialize an environment with a specific scenario, related topology.
env = Env(scenario="ecr", topology="5p_ssddd_l0.0",
start_tick=start_tick, durations=durations)
# In ECR, 1 tick means 1 day, here durations=100 means a length of 100 days
env = Env(scenario="ecr", topology="toy.5p_ssddd_l0.0", start_tick=0, durations=100)
# Query environment summary, which includes business instances, intra-instance attributes, etc.
print(env.summary)
for ep in range(2):
# Gym-like step function
# Gym-like step function.
metrics, decision_event, is_done = env.step(None)
while not is_done:
past_week_ticks = [x for x in range(
decision_event.tick - 7, decision_event.tick)]
past_week_ticks = [
x for x in range(decision_event.tick - 7, decision_event.tick)
]
decision_port_idx = decision_event.port_idx
intr_port_infos = ["booking", "empty", "shortage"]
# Query the decision port booking, empty container inventory, shortage information in the past week
past_week_info = env.snapshot_list["ports"][past_week_ticks:
decision_port_idx:
intr_port_infos]
# Query the snapshot list of the environment to get the information of
# the booking, empty container inventory, shortage of the decision port in the past week.
past_week_info = env.snapshot_list["ports"][
past_week_ticks : decision_port_idx : intr_port_infos
]
dummy_action = Action(decision_event.vessel_idx,
decision_event.port_idx, 0)
dummy_action = Action(
vessel_idx=decision_event.vessel_idx,
port_idx=decision_event.port_idx,
quantity=0
)
# Drive environment with dummy action (no repositioning)
# Drive environment with dummy action (no repositioning).
metrics, decision_event, is_done = env.step(dummy_action)
# Query environment business metrics at the end of an episode, it is your optimized object (usually includes multi-target).
print(f"ep: {ep}, environment metrics: {env.get_metrics()}")
# Query environment business metrics at the end of an episode,
# it is your optimization objective (often with multiple targets).
print(f"ep: {ep}, environment metrics: {env.metrics}")
env.reset()
Contents
====================
.. toctree::
:maxdepth: 2
:caption: Installation
installation/pip_install.md
installation/playground.md
installation/grass_cluster_provisioning_on_azure.md
@ -85,14 +83,6 @@ Contents
scenarios/ecr.md
scenarios/citi_bike.md
.. toctree::
:maxdepth: 2
:caption: Examples
examples/hello_world.md
examples/ecr_single_host.md
examples/ecr_distributed.md
.. toctree::
:maxdepth: 2
:caption: Key Components
@ -106,13 +96,6 @@ Contents
key_components/communication.md
key_components/orchestration.md
.. toctree::
:maxdepth: 2
:caption: Experiments
experiments/ecr.md
experiments/citi_bike.md
.. toctree::
:maxdepth: 2
:caption: API Documents

View file

@ -7,7 +7,8 @@ on Azure and run your training job in a distributed environment.
## Prerequisites
- [Install the Azure CLI and login](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest)
- [Install docker](https://docs.docker.com/engine/install/) and [Configure docker to make sure it can be managed as a non-root user](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user)
- [Install docker](https://docs.docker.com/engine/install/) and
[Configure docker to make sure it can be managed as a non-root user](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user)
## Cluster Management

View file

@ -8,7 +8,8 @@ on Azure and run your training job in a distributed environment.
- [Install the Azure CLI and login](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli?view=azure-cli-latest)
- [Install and set up kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/)
- [Install docker](https://docs.docker.com/engine/install/) and [Configure docker to make sure it can be managed as a non-root user](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user)
- [Install docker](https://docs.docker.com/engine/install/) and
[Configure docker to make sure it can be managed as a non-root user](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user)
## Cluster Management

View file

@ -6,27 +6,43 @@
pip install maro
```
## Install from Source
## Install MARO from Source ([Editable Mode](https://pip.pypa.io/en/stable/reference/pip_install/#editable-installs))
### Prerequisites
- Prerequisites
- [Python >= 3.6, < 3.8](https://www.python.org/downloads/)
- C++ Compiler
- Linux or Mac OS X: `gcc`
- Windows: [Build Tools for Visual Studio 2017](https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=BuildTools&rel=15)
- [Python >= 3.6, < 3.8](https://www.python.org/downloads/)
- C++ Compiler
- Linux or Mac OS X: `gcc`
- Windows: [Build Tools for Visual Studio 2017](https://visualstudio.microsoft.com/thank-you-downloading-visual-studio/?sku=BuildTools&rel=15)
- Enable Virtual Environment
- Mac OS / Linux
```sh
# If your environment is not clean, create a virtual environment first
python -m venv maro_venv
source maro_venv/bin/activate
```sh
# If your environment is not clean, create a virtual environment first.
python -m venv maro_venv
source ./maro_venv/bin/activate
```
# Install MARO from source if you don't need the full CLI features
pip install -r ./maro/requirements.build.txt
- Windows
# compile cython files
bash scripts/compile_cython.sh
pip install -e .
```powershell
# If your environment is not clean, create a virtual environment first.
python -m venv maro_venv
.\maro_venv\Scripts\activate
```
# Or with script
bash scripts/build_maro.sh
```
- Install MARO
- Mac OS / Linux
```sh
# Install MARO from source.
bash scripts/install_maro.sh
```
- Windows
```powershell
# Install MARO from source.
.\scripts\install_maro.bat
```

View file

@ -36,4 +36,4 @@ docker run -p 40009:40009 -p 40010:40010 -p 40011:40011 maro/playground:cpu
| `examples` | Showcases of predefined scenarios. |
| `notebooks` | Quick-start tutorial. |
*(Those not mentioned in the table can be ignored.)*
*(Those not mentioned in the table can be ignored.)*

View file

@ -42,4 +42,4 @@ Generally, the business time series data is read from the historical log or
generated by a data generation model. Currently, for topologies in Citi Bike
scenario, data processing is needed before starting the simulation. You can find
the brief introduction of the data processing command in
[Data Processing](../scenarios/citi_bike.html#data-processing).
[Data Processing](../scenarios/citi_bike.html#data-preparation).

View file

@ -56,39 +56,42 @@ workflow and code snippet.
from maro.simulator import Env
from maro.simulator.scenarios.ecr.common import Action
start_tick = 0
durations = 100 # 100 days
# Initialize an environment with a specific scenario, related topology.
env = Env(scenario="ecr", topology="5p_ssddd_l0.0",
start_tick=start_tick, durations=durations)
# In ECR, 1 tick means 1 day, here durations=100 means a length of 100 days
env = Env(scenario="ecr", topology="toy.5p_ssddd_l0.0", start_tick=0, durations=100)
# Query environment summary, which includes business instances, intra-instance attributes, etc.
print(env.summary)
for ep in range(2):
# Gym-like step function
# Gym-like step function.
metrics, decision_event, is_done = env.step(None)
while not is_done:
past_week_ticks = [x for x in range(
decision_event.tick - 7, decision_event.tick)]
past_week_ticks = [
x for x in range(decision_event.tick - 7, decision_event.tick)
]
decision_port_idx = decision_event.port_idx
intr_port_infos = ["booking", "empty", "shortage"]
# Query the decision port booking, empty container inventory, shortage information in the past week
past_week_info = env.snapshot_list["ports"][past_week_ticks:
decision_port_idx:
intr_port_infos]
# Query the snapshot list of the environment to get the information of
# the booking, empty container inventory, shortage of the decision port in the past week.
past_week_info = env.snapshot_list["ports"][
past_week_ticks : decision_port_idx : intr_port_infos
]
dummy_action = Action(decision_event.vessel_idx,
decision_event.port_idx, 0)
dummy_action = Action(
vessel_idx=decision_event.vessel_idx,
port_idx=decision_event.port_idx,
quantity=0
)
# Drive environment with dummy action (no repositioning)
# Drive environment with dummy action (no repositioning).
metrics, decision_event, is_done = env.step(dummy_action)
# Query environment business metrics at the end of an episode, it is your optimized object (usually includes multi-target).
print(f"ep: {ep}, environment metrics: {env.get_metrics()}")
# Query environment business metrics at the end of an episode,
# it is your optimization objective (often with multiple targets).
print(f"ep: {ep}, environment metrics: {env.metrics}")
env.reset()
```

View file

@ -1,4 +1,4 @@
# Citi Bike (Bike Repositioning)
# Bike Repositioning (Citi Bike)
The Citi Bike scenario simulates the bike repositioning problem triggered by the
one-way bike trips based on the public trip data from
@ -144,70 +144,20 @@ topologies, the definition of the bike flow and the trigger mechanism of
repositioning actions are the same as those in the toy topologies. We provide
this series of topologies to better simulate the actual Citi Bike scenario.
### Naive Baseline
## Quick Start
Below is the performance of *no repositioning* and *random repositioning* in
different topologies. The performance metric used here is the *fulfillment ratio*.
| Topology | No Repositioning | Random Repositioning |
| :-------: | :--------------: | :------------------: |
| toy.3s_4t | | |
| toy.4s_4t | | |
| toy.5s_6t | | |
| Topology | No Repositioning | Random Repositioning |
| :-------: | :--------------: | :------------------: |
| ny.201801 | | |
| ny.201802 | | |
| ny.201803 | | |
| ny.201804 | | |
| ny.201805 | | |
| ny.201806 | | |
| ny.201807 | | |
| ny.201808 | | |
| ny.201809 | | |
| ny.201810 | | |
| ny.201811 | | |
| ny.201812 | | |
| Topology | No Repositioning | Random Repositioning |
| :-------: | :--------------: | :------------------: |
| ny.201901 | | |
| ny.201902 | | |
| ny.201903 | | |
| ny.201904 | | |
| ny.201905 | | |
| ny.201906 | | |
| ny.201907 | | |
| ny.201908 | | |
| ny.201909 | | |
| ny.201910 | | |
| ny.201911 | | |
| ny.201912 | | |
| Topology | No Repositioning | Random Repositioning |
| :-------: | :--------------: | :------------------: |
| ny.202001 | | |
| ny.202002 | | |
| ny.202003 | | |
| ny.202004 | | |
| ny.202005 | | |
| ny.202006 | | |
<!-- ## Quick Start
### Data Processing
### Data Preparation
To start the simulation of Citi Bike scenario, users need to first generate the
related data. Below are the introduction to the related commands:
related data. Below is an introduction to the related commands:
#### Environment List Command
The data environment list command is used to list the environments that need the
The data environment `list` command is used to list the environments that need the
data files generated before the simulation.
```console
user@maro:~/MARO$ maro env data list
```sh
maro env data list
scenario: citi_bike, topology: ny.201801
scenario: citi_bike, topology: ny.201802
@ -221,9 +171,9 @@ scenario: citi_bike, topology: ny.201806
#### Generate Command
The data generate command is used to automatically download and build the
specified predefined scenario and topology data files for the simulation.
Currently, there are three arguments for the data generate command:
The data `generate` command is used to automatically download and build the specified
predefined scenario and topology data files for the simulation. Currently, there
are three arguments for the data `generate` command:
- `-s`: required, used to specify the predefined scenario. Valid scenarios are
listed in the result of [environment list command](#environment-list-command).
@ -232,8 +182,8 @@ listed in the result of [environment list command](#environment-list-command).
- `-f`: optional; if set, force a re-download and re-generation of the data files,
overwriting the existing ones.
```console
user@maro:~/MARO$ maro env data generate -s citi_bike -t ny.201802
```sh
maro env data generate -s citi_bike -t ny.201802
The data files for citi_bike-ny201802 will then be downloaded and deployed to ~/.maro/data/citibike/_build/ny201802.
```
@ -254,9 +204,9 @@ For the example above, the directory structure should be like:
#### Convert Command
The data convert command is used to convert the CSV data files to binary data
The data `convert` command is used to convert the CSV data files to binary data
files that the simulator needs. Currently, there are three arguments for the data
convert command:
`convert` command:
- `--meta`: required, used to specify the path of the meta file. The source
columns to be converted and the data type of each column should be
@ -266,101 +216,255 @@ If multiple source CSV data files are needed, you can list all the full paths of
the source files in a specific file and use a `@` symbol to specify it (a
hypothetical invocation follows the example below).
- `--output`: required, used to specify the path of the target binary file.
```console
user@maro:~/MARO$ maro data convert --meta ~/.maro/data/citibike/meta/trips.yml --file ~/.maro/data/citibike/source/_clean/ny201801/trip.csv --output ~/.maro/data/citibike/_build/ny201801/trip.bin
```sh
maro data convert --meta ~/.maro/data/citibike/meta/trips.yml --file ~/.maro/data/citibike/source/_clean/ny201801/trip.csv --output ~/.maro/data/citibike/_build/ny201801/trip.bin
```
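For the multi-file case, the exact list-file syntax is not shown above, so the
following is only a sketch under that assumption: `csv_list.txt` is a
hypothetical plain-text file holding one full source-CSV path per line.

```sh
# Hypothetical multi-file invocation: the @ symbol points the command at a
# file that lists the source CSV paths, one per line.
maro data convert --meta ~/.maro/data/citibike/meta/trips.yml --file @~/.maro/data/citibike/source/csv_list.txt --output ~/.maro/data/citibike/_build/ny2018/trip.bin
```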
### DecisionEvent
### Environment Interface
Once the environment needs the agent's response to promote the simulation, it will throw a **DecisionEvent**. In the scenario of citi_bike, the information of each DecisionEvent is listed below:
Before starting interaction with the environment, we need to know the definition
of `DecisionEvent` and `Action` in Citi Bike scenario first. Besides, you can query
the environment [snapshot list](../key_components/data_model.html#advanced-features)
to get more detailed information for the decision making.
- **station_idx**: the id of the station/agent that needs to respond to the environment
- **tick**: the corresponding tick
- **frame_index**: the corresponding frame index, that is the index of the corresponding snapshot in the snapshot list
- **type**: the decision type of this decision event. In citi_bike scenario, there are 2 types:
- **Supply**: There are too many bikes in the corresponding station; it's better to reposition some of them to other stations.
- **Demand**: There are not enough bikes in the corresponding station; it's better to reposition bikes from other stations.
#### DecisionEvent
- **action_scope**: a dictionary of valid action items.
- The key of the item indicates the station/agent id;
Once the environment needs the agent's response to reposition bikes, it will
throw a `DecisionEvent`. In the scenario of Citi Bike, the information of each
`DecisionEvent` is listed below:
- **station_idx** (int): The id of the station/agent that needs to respond to the
environment.
- **tick** (int): The corresponding tick.
- **frame_index** (int): The corresponding frame index, that is the index of the
corresponding snapshot in the environment snapshot list.
- **type** (DecisionType): The decision type of this decision event. In Citi Bike
scenario, there are 2 types:
- `Supply` indicates there are too many bikes in the corresponding station, so
it is better to reposition some of them to other stations.
- `Demand` indicates there are not enough bikes in the corresponding station, so
it is better to reposition bikes from other stations.
- **action_scope** (dict): A dictionary that maintains the information for
calculating the valid action scope:
- The keys of the items indicate the station/agent ids.
- The meaning of the value differs for different decision types:
- If the decision type is Supply, the value of the station itself means its bike inventory at that moment, while the value of other target stations means the number of their empty docks;
- If the decision type is Demand, the value of the station itself means the number of its empty docks, while the value of other target stations means their bike inventory.
- If the decision type is `Supply`, the value of the station itself means its
bike inventory at that moment, while the value of other target stations means
the number of their empty docks.
- If the decision type is `Demand`, the value of the station itself means the
number of its empty docks, while the value of other target stations means
their bike inventory.
### Action
#### Action
Once we get a **DecisionEvent** from the environment, we should respond with an **Action**. A valid Action could be:
Once we get a `DecisionEvent` from the environment, we should respond with an
`Action` (a short sketch follows the list below). A valid `Action` could be:
- None, which means do nothing.
- A valid Action instance, including:
- **from_station_idx**: int, the id of the source station of the bike transportation
- **to_station_idx**: int, the id of the destination station of the bike transportation
- **number**: int, the quantity of the bike transportation
- `None`, which means do nothing.
- A valid `Action` instance, including:
- **from_station_idx** (int): The id of the source station of the bike
transportation.
- **to_station_idx** (int): The id of the destination station of the bike
transportation.
- **number** (int): The quantity of the bike transportation.
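As a minimal sketch of the two definitions above (the station ids and counts
here are made-up values, not output from the simulator):

```python
from maro.simulator.scenarios.citi_bike.common import Action

# Suppose a Supply event arrives at station 0 with the (hypothetical) scope
# {0: 12, 1: 3, 2: 5}: station 0 holds 12 bikes, while stations 1 and 2 have
# 3 and 5 empty docks respectively. One valid response is to send
# min(12, 5) = 5 bikes to station 2:
action = Action(from_station_idx=0, to_station_idx=2, number=5)

# For a Demand event the meanings flip: the station's own value is its empty
# docks, the other values are bike inventories, and bikes flow toward the
# station, so the decision station becomes to_station_idx instead.
```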
### Example
Here we will show you a simple example of interaction with the environment in
random mode. We hope this could help you learn how to use the environment interfaces:
```python
from maro.simulator import Env
from maro.simulator.scenarios.citi_bike.common import Action, DecisionEvent, DecisionType
import random
# Initialize an Env for citi_bike scenario
env = Env(scenario="citi_bike", topology="ny201912", start_tick=0, durations=1440, snapshot_resolution=30)
# Initialize an environment of Citi Bike scenario, with a specific topology.
# In CitiBike, 1 tick means 1 minute, durations=1440 here indicates a length of 1 day.
# In CitiBike, one snapshot will be maintained every snapshot_resolution ticks,
# snapshot_resolution=30 here indicates 1 snapshot per 30 minutes.
env = Env(scenario="citi_bike", topology="toy.3s_4t", start_tick=0, durations=1440, snapshot_resolution=30)
is_done: bool = False
reward: int = None
# Query for the environment summary, the business instances and intra-instance attributes
# will be listed in the output for your reference.
print(env.summary)
metrics: object = None
decision_event: DecisionEvent = None
is_done: bool = False
action: Action = None
# Start the env with a None Action
reward, decision_event, is_done = env.step(action)
num_episode = 2
for ep in range(num_episode):
# Gym-like step function.
metrics, decision_event, is_done = env.step(None)
while not is_done:
if decision_event.type == DecisionType.Supply:
# the value of the station itself means the bike inventory if Supply
self_bike_inventory = decision_event.action_scope[decision_event.station_idx]
# the value of other stations means the quantity of empty docks if Supply
target_idx_dock_tuple_list = [
(k, v) for k, v in decision_event.action_scope.items() if k != decision_event.station_idx
while not is_done:
past_2hour_frames = [
x for x in range(decision_event.frame_index - 4, decision_event.frame_index)
]
# random choose a target station weighted by the quantity of empty docks
target_idx, target_dock = random.choices(
target_idx_dock_tuple_list,
weights=[item[1] for item in target_idx_dock_tuple_list]
)[0]
# generate the corresponding random Action
action = Action(
from_station_idx=decision_event.station_idx,
to_station_idx=target_idx,
number=random.randint(0, min(self_bike_inventory, target_dock))
)
decision_station_idx = decision_event.station_idx
intr_station_infos = ["trip_requirement", "bikes", "shortage"]
elif decision_event.type == DecisionType.Demand:
# the value of the station itself means the quantity of empty docks if Demand
self_available_dock = decision_event.action_scope[decision_event.station_idx]
# the value of other stations means their bike inventory if Demand
target_idx_inventory_tuple_list = [
(k, v) for k, v in decision_event.action_scope.items() if k != decision_event.station_idx
# Query the snapshot list of this environment to get the information of
# the trip requirements, bikes, shortage of the decision station in the past 2 hours.
past_2hour_info = env.snapshot_list["stations"][
past_2hour_frames : decision_station_idx : intr_station_infos
]
# random choose a target station weighted by the bike inventory
target_idx, target_inventory = random.choices(
target_idx_inventory_tuple_list,
weights=[item[1] for item in target_idx_inventory_tuple_list]
)[0]
# generate the corresponding random Action
action = Action(
from_station_idx=target_idx,
to_station_idx=decision_event.station_idx,
number=random.randint(0, min(self_available_dock, target_inventory))
)
else:
action = None
if decision_event.type == DecisionType.Supply:
# Supply: the value of the station itself means the bike inventory.
self_bike_inventory = decision_event.action_scope[decision_event.station_idx]
# Supply: the value of other stations means the quantity of empty docks.
target_idx_dock_tuple_list = [
(k, v) for k, v in decision_event.action_scope.items() if k != decision_event.station_idx
]
# Randomly choose a target station weighted by the quantity of empty docks.
target_idx, target_dock = random.choices(
target_idx_dock_tuple_list,
weights=[item[1] for item in target_idx_dock_tuple_list],
k=1
)[0]
# Generate the corresponding random Action.
action = Action(
from_station_idx=decision_event.station_idx,
to_station_idx=target_idx,
number=random.randint(0, min(self_bike_inventory, target_dock))
)
# Random sampling some records to show in the output TODO
# if random.random() > 0.95:
# print("*************\n{decision_event}\n{action}")
# Respond the environment with the generated Action
reward, decision_event, is_done = env.step(action)
``` -->
elif decision_event.type == DecisionType.Demand:
# Demand: the value of the station itself means the quantity of empty docks.
self_available_dock = decision_event.action_scope[decision_event.station_idx]
# Demand: the value of other stations means their bike inventory.
target_idx_inventory_tuple_list = [
(k, v) for k, v in decision_event.action_scope.items() if k != decision_event.station_idx
]
# Randomly choose a target station weighted by the bike inventory.
target_idx, target_inventory = random.choices(
target_idx_inventory_tuple_list,
weights=[item[1] for item in target_idx_inventory_tuple_list],
k=1
)[0]
# Generate the corresponding random Action.
action = Action(
from_station_idx=target_idx,
to_station_idx=decision_event.station_idx,
number=random.randint(0, min(self_available_dock, target_inventory))
)
else:
action = None
# Drive the environment with the random action.
metrics, decision_event, is_done = env.step(action)
# Query for the environment business metrics at the end of each episode,
# it is usually the users' optimization objective (often with multiple targets).
print(f"ep: {ep}, environment metrics: {env.metrics}")
env.reset()
```
Jump to [this notebook](https://github.com/microsoft/maro/blob/master/notebooks/bike_repositioning/interact_with_simulator.ipynb)
for a quick experience.
<!--
### Naive Baseline
Below are the final environment metrics of the method *no repositioning* and
*random repositioning* in different topologies. For each experiment, we setup
the environment and test for a duration of 1 week.
#### No Repositioning
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :-------: | :---------------: | :---------------: | :---------------: |
| toy.3s_4t | +/- | +/- | +/- |
| toy.4s_4t | +/- | +/- | +/- |
| toy.5s_6t | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :-------: | :---------------: | :---------------: | :---------------: |
| ny.201801 | +/- | +/- | +/- |
| ny.201802 | +/- | +/- | +/- |
| ny.201803 | +/- | +/- | +/- |
| ny.201804 | +/- | +/- | +/- |
| ny.201805 | +/- | +/- | +/- |
| ny.201806 | +/- | +/- | +/- |
| ny.201807 | +/- | +/- | +/- |
| ny.201808 | +/- | +/- | +/- |
| ny.201809 | +/- | +/- | +/- |
| ny.201810 | +/- | +/- | +/- |
| ny.201811 | +/- | +/- | +/- |
| ny.201812 | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :-------: | :---------------: | :---------------: | :---------------: |
| ny.201901 | +/- | +/- | +/- |
| ny.201902 | +/- | +/- | +/- |
| ny.201903 | +/- | +/- | +/- |
| ny.201904 | +/- | +/- | +/- |
| ny.201905 | +/- | +/- | +/- |
| ny.201906 | +/- | +/- | +/- |
| ny.201907 | +/- | +/- | +/- |
| ny.201908 | +/- | +/- | +/- |
| ny.201909 | +/- | +/- | +/- |
| ny.201910 | +/- | +/- | +/- |
| ny.201911 | +/- | +/- | +/- |
| ny.201912 | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :-------: | :---------------: | :---------------: | :---------------: |
| ny.202001 | +/- | +/- | +/- |
| ny.202002 | +/- | +/- | +/- |
| ny.202003 | +/- | +/- | +/- |
| ny.202004 | +/- | +/- | +/- |
| ny.202005 | +/- | +/- | +/- |
| ny.202006 | +/- | +/- | +/- |
#### Random Repositioning
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :-------: | :---------------: | :---------------: | :---------------: |
| toy.3s_4t | +/- | +/- | +/- |
| toy.4s_4t | +/- | +/- | +/- |
| toy.5s_6t | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :-------: | :---------------: | :---------------: | :---------------: |
| ny.201801 | +/- | +/- | +/- |
| ny.201802 | +/- | +/- | +/- |
| ny.201803 | +/- | +/- | +/- |
| ny.201804 | +/- | +/- | +/- |
| ny.201805 | +/- | +/- | +/- |
| ny.201806 | +/- | +/- | +/- |
| ny.201807 | +/- | +/- | +/- |
| ny.201808 | +/- | +/- | +/- |
| ny.201809 | +/- | +/- | +/- |
| ny.201810 | +/- | +/- | +/- |
| ny.201811 | +/- | +/- | +/- |
| ny.201812 | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :-------: | :---------------: | :---------------: | :---------------: |
| ny.201901 | +/- | +/- | +/- |
| ny.201902 | +/- | +/- | +/- |
| ny.201903 | +/- | +/- | +/- |
| ny.201904 | +/- | +/- | +/- |
| ny.201905 | +/- | +/- | +/- |
| ny.201906 | +/- | +/- | +/- |
| ny.201907 | +/- | +/- | +/- |
| ny.201908 | +/- | +/- | +/- |
| ny.201909 | +/- | +/- | +/- |
| ny.201910 | +/- | +/- | +/- |
| ny.201911 | +/- | +/- | +/- |
| ny.201912 | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :-------: | :---------------: | :---------------: | :---------------: |
| ny.202001 | +/- | +/- | +/- |
| ny.202002 | +/- | +/- | +/- |
| ny.202003 | +/- | +/- | +/- |
| ny.202004 | +/- | +/- | +/- |
| ny.202005 | +/- | +/- | +/- |
| ny.202006 | +/- | +/- | +/- |
-->

View file

@ -130,128 +130,223 @@ manually.
*(To make it clearer, the figure above only shows the service routes among ports.)*
### Naive Baseline
Below is the performance of *no repositioning* and *random repositioning* in
different topologies. The performance metric used here is the *fulfillment ratio*.
| Topology | No Repositioning | Random Repositioning |
| :--------------: | :--------------: | :------------------: |
| toy.4p_ssdd_l0.0 | 11.16 +/- 0.00 | 36.76 +/- 3.19 |
| toy.4p_ssdd_l0.1 | 11.16 +/- 0.00 | 78.42 +/- 5.67 |
| toy.4p_ssdd_l0.2 | 11.16 +/- 0.00 | 71.37 +/- 12.51 |
| toy.4p_ssdd_l0.3 | 11.16 +/- 0.00 | 67.37 +/- 11.60 |
| toy.4p_ssdd_l0.4 | 11.16 +/- 0.03 | 68.99 +/- 12.11 |
| toy.4p_ssdd_l0.5 | 11.16 +/- 0.03 | 67.07 +/- 12.81 |
| toy.4p_ssdd_l0.6 | 11.16 +/- 0.03 | 67.90 +/- 4.44 |
| toy.4p_ssdd_l0.7 | 11.16 +/- 0.03 | 62.13 +/- 5.82 |
| toy.4p_ssdd_l0.8 | 11.17 +/- 0.03 | 64.05 +/- 5.16 |
| Topology | No Repositioning | Random Repositioning |
| :---------------: | :--------------: | :------------------: |
| toy.5p_ssddd_l0.0 | 22.32 +/- 0.00 | 57.15 +/- 4.38 |
| toy.5p_ssddd_l0.1 | 22.32 +/- 0.00 | 64.15 +/- 3.91 |
| toy.5p_ssddd_l0.2 | 22.32 +/- 0.00 | 64.14 +/- 3.54 |
| toy.5p_ssddd_l0.3 | 22.33 +/- 0.00 | 64.37 +/- 3.87 |
| toy.5p_ssddd_l0.4 | 22.32 +/- 0.06 | 63.53 +/- 3.93 |
| toy.5p_ssddd_l0.5 | 22.32 +/- 0.06 | 63.93 +/- 3.72 |
| toy.5p_ssddd_l0.6 | 22.32 +/- 0.06 | 54.60 +/- 5.40 |
| toy.5p_ssddd_l0.7 | 22.32 +/- 0.06 | 45.00 +/- 6.05 |
| toy.5p_ssddd_l0.8 | 22.34 +/- 0.06 | 46.32 +/- 4.96 |
| Topology | No Repositioning | Random Repositioning |
| :----------------: | :--------------: | :------------------: |
| toy.6p_sssbdd_l0.0 | 34.15 +/- 0.00 | 44.69 +/- 6.84 |
| toy.6p_sssbdd_l0.1 | 34.15 +/- 0.00 | 59.35 +/- 5.11 |
| toy.6p_sssbdd_l0.2 | 34.15 +/- 0.00 | 59.35 +/- 4.97 |
| toy.6p_sssbdd_l0.3 | 34.16 +/- 0.00 | 56.69 +/- 4.45 |
| toy.6p_sssbdd_l0.4 | 34.14 +/- 0.09 | 56.72 +/- 4.37 |
| toy.6p_sssbdd_l0.5 | 34.14 +/- 0.09 | 56.13 +/- 4.34 |
| toy.6p_sssbdd_l0.6 | 34.14 +/- 0.09 | 56.76 +/- 1.52 |
| toy.6p_sssbdd_l0.7 | 34.14 +/- 0.09 | 55.86 +/- 2.70 |
| toy.6p_sssbdd_l0.8 | 34.18 +/- 0.09 | 55.36 +/- 2.11 |
| Topology | No Repositioning | Random Repositioning |
| :-------------------: | :--------------: | :------------------: |
| global_trade.22p_l0.0 | 68.57 +/- 0.00 | 59.27 +/- 1.56 |
| global_trade.22p_l0.1 | 66.64 +/- 0.00 | 64.56 +/- 0.70 |
| global_trade.22p_l0.2 | 66.55 +/- 0.00 | 64.73 +/- 0.57 |
| global_trade.22p_l0.3 | 65.24 +/- 0.00 | 63.31 +/- 0.68 |
| global_trade.22p_l0.4 | 65.22 +/- 0.15 | 63.46 +/- 0.76 |
| global_trade.22p_l0.5 | 64.90 +/- 0.15 | 63.10 +/- 0.79 |
| global_trade.22p_l0.6 | 63.74 +/- 0.49 | 60.98 +/- 0.50 |
| global_trade.22p_l0.7 | 60.14 +/- 0.47 | 56.38 +/- 0.75 |
| global_trade.22p_l0.8 | 60.17 +/- 0.45 | 56.45 +/- 0.67 |
<!-- ## Quick Start
## Quick Start
### Data Preparation
To start a simulation in ECR scenario, no extra data processing is needed. You
can just specify the scenario and the topology when initializing an environment and
enjoy your exploration in this scenario.
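As a minimal sketch of this (the same construction the quick-start example
below uses):

```python
from maro.simulator import Env

# No data preparation step is needed: constructing the environment is enough.
env = Env(scenario="ecr", topology="toy.5p_ssddd_l0.0", start_tick=0, durations=100)
```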
### Environment Interface
Before starting interaction with the environment, we need to know the definition
of `DecisionEvent` and `Action` in ECR scenario first. Besides, you can query the
environment [snapshot list](../key_components/data_model.html#advanced-features)
to get more detailed information for the decision making.
#### DecisionEvent
Once the environment needs the agent's response to promote the simulation, it will throw a **DecisionEvent**. In the scenario of ECR, the information of each DecisionEvent is listed below:
Once the environment needs the agent's response to promote the simulation, it will
throw a `DecisionEvent`. In the scenario of ECR, the information of each
`DecisionEvent` is listed below:
- **tick**: (int) the corresponding tick
- **port_idx**: (int) the id of the port/agent that needs to respond to the environment
- **vessel_idx**: (int) the id of the vessel/operation object of the port/agent.
- **snapshot_list**: (int) **Snapshots of the environment to input into the decision model** TODO: confirm the meaning
- **action_scope**: **Load and discharge scope for agent to generate decision**
- **early_discharge**: **Early discharge number of corresponding vessel**
- **tick** (int): The corresponding tick.
- **port_idx** (int): The id of the port/agent that needs to respond to the
environment.
- **vessel_idx** (int): The id of the vessel/operation object of the port/agent.
- **action_scope** (ActionScope): ActionScope has two attributes (see the
sketch after this list):
- `load` indicates the maximum quantity that can be loaded from the port to the
vessel.
- `discharge` indicates the maximum quantity that can be discharged from the
vessel to the port.
- **early_discharge** (int): When the available capacity in the vessel is not
enough to load the ladens, some of the empty containers in the vessel will be
early discharged to free the space. The quantity of empty containers that have
been early discharged due to the laden loading is recorded in this field.
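A tiny sketch of reading these fields off an incoming event (attribute names
as listed above; the variable names are just for illustration):

```python
# Upper bounds for the two transfer directions, plus any forced early discharge.
max_load = decision_event.action_scope.load
max_discharge = decision_event.action_scope.discharge
forced_empties = decision_event.early_discharge
```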
#### Action
Once we get a DecisionEvent from the environment, we should respond with an Action. A valid Action could be:
Once we get a `DecisionEvent` from the environment, we should respond with an
`Action` (a sketch of the quantity sign convention follows the list below). A
valid `Action` could be:
- None, which means do nothing.
- A valid Action instance, including:
- **vessel_idx**: (int) the id of the vessel/operation object of the port/agent.
- **port_idx**: (int) the id of the port/agent that takes this action.
- **quantity**: (int) the sign of this value denotes different meanings:
- positive quantity means unloading empty containers from vessel to port.
- negative quantity means loading empty containers from port to vessel.
- `None`, which means do nothing.
- A valid `Action` instance, including:
- **vessel_idx** (int): The id of the vessel/operation object of the port/agent.
- **port_idx** (int): The id of the port/agent that takes this action.
- **quantity** (int): The sign of this value denotes different meanings:
- Positive quantity means discharging empty containers from vessel to port.
- Negative quantity means loading empty containers from port to vessel.
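A minimal sketch of the sign convention above (the indices are made-up; real
ones come from the `DecisionEvent`):

```python
from maro.simulator.scenarios.ecr.common import Action

# Positive quantity: discharge 10 empty containers from vessel 0 to port 1.
discharge_10 = Action(vessel_idx=0, port_idx=1, quantity=10)

# Negative quantity: load 10 empty containers from port 1 onto vessel 0.
load_10 = Action(vessel_idx=0, port_idx=1, quantity=-10)
```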
### Example (Random Action)
### Example
Here we will show you a simple example of interaction with the environment in
random mode. We hope this could help you learn how to use the environment interfaces:
```python
from maro.simulator import Env
from maro.simulator.scenarios.ecr.common import Action
from maro.simulator.scenarios.ecr.common import Action, DecisionEvent
start_tick = 0
durations = 100 # 100 days
import random
# Initialize an environment with a specific scenario, related topology.
env = Env(scenario="ecr", topology="5p_ssddd_l0.0",
start_tick=start_tick, durations=durations)
# Initialize an environment of ECR scenario, with a specific topology.
# In ECR, 1 tick means 1 day, durations=100 here indicates a length of 100 days.
env = Env(scenario="ecr", topology="toy.5p_ssddd_l0.0", start_tick=0, durations=100)
# Query environment summary, which includes business instances, intra-instance attributes, etc.
# Query for the environment summary, the business instances and intra-instance attributes
# will be listed in the output for your reference.
print(env.summary)
for ep in range(2):
# Gym-like step function
metrics: object = None
decision_event: DecisionEvent = None
is_done: bool = False
action: Action = None
num_episode = 2
for ep in range(num_episode):
# Gym-like step function.
metrics, decision_event, is_done = env.step(None)
while not is_done:
past_week_ticks = [x for x in range(
decision_event.tick - 7, decision_event.tick)]
past_week_ticks = [
x for x in range(decision_event.tick - 7, decision_event.tick)
]
decision_port_idx = decision_event.port_idx
intr_port_infos = ["booking", "empty", "shortage"]
# Query the decision port booking, empty container inventory, shortage information in the past week
past_week_info = env.snapshot_list["ports"][past_week_ticks:
decision_port_idx:
intr_port_infos]
# Query the snapshot list of this environment to get the information of
# the booking, empty, shortage of the decision port in the past week.
past_week_info = env.snapshot_list["ports"][
past_week_ticks : decision_port_idx : intr_port_infos
]
dummy_action = Action(decision_event.vessel_idx,
decision_event.port_idx, 0)
# Generate a random Action according to the action_scope in DecisionEvent.
random_quantity = random.randint(
-decision_event.action_scope.load,
decision_event.action_scope.discharge
)
action = Action(
vessel_idx=decision_event.vessel_idx,
port_idx=decision_event.port_idx,
quantity=random_quantity
)
# Drive environment with dummy action (no repositioning)
metrics, decision_event, is_done = env.step(dummy_action)
# Drive the environment with the random action.
metrics, decision_event, is_done = env.step(action)
# Query environment business metrics at the end of an episode, it is your optimized object (usually includes multi-target).
print(f"ep: {ep}, environment metrics: {env.get_metrics()}")
# Query for the environment business metrics at the end of each episode,
# it is usually the users' optimization objective in the ECR scenario (often with multiple targets).
print(f"ep: {ep}, environment metrics: {env.metrics}")
env.reset()
```
Detail link -->
Jump to [this notebook](https://github.com/microsoft/maro/blob/master/notebooks/empty_container_repositioning/interact_with_simulator.ipynb)
for a quick experience.
<!--
### Naive Baseline
Below are the final environment metrics of the method *no repositioning* and
*random repositioning* in different topologies. For each experiment, we setup
the environment and test for a duration of 1120 ticks (days).
#### No Repositioning
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :--------------: | :---------------: | :---------------: | :---------------: |
| toy.4p_ssdd_l0.0 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.1 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.2 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.3 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.4 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.5 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.6 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.7 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.8 | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :---------------: | :---------------: | :---------------: | :---------------: |
| toy.5p_ssddd_l0.0 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.1 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.2 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.3 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.4 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.5 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.6 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.7 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.8 | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :----------------: | :---------------: | :---------------: | :---------------: |
| toy.6p_sssbdd_l0.0 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.1 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.2 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.3 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.4 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.5 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.6 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.7 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.8 | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :-------------------: | :---------------: | :---------------: | :---------------: |
| global_trade.22p_l0.0 | +/- | +/- | +/- |
| global_trade.22p_l0.1 | +/- | +/- | +/- |
| global_trade.22p_l0.2 | +/- | +/- | +/- |
| global_trade.22p_l0.3 | +/- | +/- | +/- |
| global_trade.22p_l0.4 | +/- | +/- | +/- |
| global_trade.22p_l0.5 | +/- | +/- | +/- |
| global_trade.22p_l0.6 | +/- | +/- | +/- |
| global_trade.22p_l0.7 | +/- | +/- | +/- |
| global_trade.22p_l0.8 | +/- | +/- | +/- |
#### Random Repositioning
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :--------------: | :---------------: | :---------------: | :---------------: |
| toy.4p_ssdd_l0.0 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.1 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.2 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.3 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.4 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.5 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.6 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.7 | +/- | +/- | +/- |
| toy.4p_ssdd_l0.8 | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :---------------: | :---------------: | :---------------: | :---------------: |
| toy.5p_ssddd_l0.0 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.1 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.2 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.3 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.4 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.5 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.6 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.7 | +/- | +/- | +/- |
| toy.5p_ssddd_l0.8 | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :----------------: | :---------------: | :---------------: | :---------------: |
| toy.6p_sssbdd_l0.0 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.1 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.2 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.3 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.4 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.5 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.6 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.7 | +/- | +/- | +/- |
| toy.6p_sssbdd_l0.8 | +/- | +/- | +/- |
| Topology | Total Requirement | Resource Shortage | Repositioning Cost|
| :-------------------: | :---------------: | :---------------: | :---------------: |
| global_trade.22p_l0.0 | +/- | +/- | +/- |
| global_trade.22p_l0.1 | +/- | +/- | +/- |
| global_trade.22p_l0.2 | +/- | +/- | +/- |
| global_trade.22p_l0.3 | +/- | +/- | +/- |
| global_trade.22p_l0.4 | +/- | +/- | +/- |
| global_trade.22p_l0.5 | +/- | +/- | +/- |
| global_trade.22p_l0.6 | +/- | +/- | +/- |
| global_trade.22p_l0.7 | +/- | +/- | +/- |
| global_trade.22p_l0.8 | +/- | +/- | +/- |
-->

View file

@ -0,0 +1,10 @@
env:
scenario: "citi_bike"
topology: "ny.201801"
start_tick: 0
durations: 2880
resolution: 10
seed: 128
agent:
supply_top_k: 1
demand_top_k: 1
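For reference, a minimal sketch of how a config file like this is consumed
(it mirrors the loading code in the greedy-policy example below):

```python
import io

import yaml

from maro.utils import convert_dottable

# Load the YAML above and expose its keys as attributes, e.g. config.env.durations.
with io.open("config.yml", "r") as in_file:
    config = convert_dottable(yaml.safe_load(in_file))
```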

View file

@ -0,0 +1,77 @@
import heapq
import io
import random
import yaml
from maro.simulator import Env
from maro.simulator.scenarios.citi_bike.common import Action, DecisionEvent, DecisionType
from maro.utils import convert_dottable
with io.open("config.yml", "r") as in_file:
raw_config = yaml.safe_load(in_file)
config = convert_dottable(raw_config)
class GreedyAgent:
    def __init__(self, supply_top_k: int = 1, demand_top_k: int = 1):
        """
        Agent that executes a greedy policy. If the event type is supply, send as many bikes as possible to one of the
        demand_k stations with the most empty slots. If the event type is demand, request as many bikes as possible from
        one of the supply_k stations with the most bikes.

        Args:
            supply_top_k (int): number of top supply candidates to choose from.
            demand_top_k (int): number of top demand candidates to choose from.
        """
        self._supply_top_k = supply_top_k
        self._demand_top_k = demand_top_k

    def choose_action(self, decision_event: DecisionEvent):
        if decision_event.type == DecisionType.Supply:
            # find k target stations with the most empty slots, randomly choose one of them and send as many bikes to
            # it as allowed by the action scope
            top_k_demands = []
            for demand_candidate, available_docks in decision_event.action_scope.items():
                if demand_candidate == decision_event.station_idx:
                    continue
                heapq.heappush(top_k_demands, (available_docks, demand_candidate))
                if len(top_k_demands) > self._demand_top_k:
                    heapq.heappop(top_k_demands)
            max_reposition, target_station_idx = random.choice(top_k_demands)
            action = Action(decision_event.station_idx, target_station_idx, max_reposition)
        else:
            # find k source stations with the most bikes, randomly choose one of them and request as many bikes from
            # it as allowed by the action scope
            top_k_supplies = []
            for supply_candidate, available_bikes in decision_event.action_scope.items():
                if supply_candidate == decision_event.station_idx:
                    continue
                heapq.heappush(top_k_supplies, (available_bikes, supply_candidate))
                if len(top_k_supplies) > self._supply_top_k:
                    heapq.heappop(top_k_supplies)
            max_reposition, source_idx = random.choice(top_k_supplies)
            action = Action(source_idx, decision_event.station_idx, max_reposition)
        return action


if __name__ == "__main__":
    env = Env(scenario=config.env.scenario, topology=config.env.topology, start_tick=config.env.start_tick,
              durations=config.env.durations, snapshot_resolution=config.env.resolution)
    if config.env.seed is not None:
        env.set_seed(config.env.seed)
    agent = GreedyAgent(config.agent.supply_top_k, config.agent.demand_top_k)
    metrics, decision_event, done = env.step(None)
    while not done:
        metrics, decision_event, done = env.step(agent.choose_action(decision_event))
    print(f"Greedy agent policy performance: {env.metrics}")
    env.reset()
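A usage note: the candidate-pool sizes come from `config.yml` above, but the
agent can also be constructed directly; a hypothetical widening of both pools:

```python
# Hypothetical: consider the top 3 candidate stations on each side instead of 1.
agent = GreedyAgent(supply_top_k=3, demand_top_k=3)
```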

View file

@ -0,0 +1,18 @@
This file contains instructions on how to use MARO on the Empty Container Repositioning (ECR) scenario.
### Overview
The ECR problem is one of the quintessential use cases of MARO. The example can be run with a set of scenario
configurations that can be found under maro/simulator/scenarios/ecr. General experimental parameters (e.g., type of
topology, type of algorithm to use, number of training episodes) can be configured through config.yml. Each RL
formulation has a dedicated folder, e.g., dqn, and all algorithm-specific parameters can be configured through
the config.py file in that folder.
### Single-host Single-process Mode:
To run the ECR example using the DQN algorithm under single-host mode, go to examples/ecr/dqn and run
single_process_launcher.py. You may play around with the configuration if you want to try out different
settings.
### Distributed Mode:
The examples/ecr/dqn/components folder contains dist_learner.py and dist_actor.py for distributed
training. For debugging purposes, we provide a script that simulates distributed mode using multi-processing.
Simply go to examples/ecr/dqn and run multi_process_launcher.py to start the learner and actor processes.
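A sketch of the shell steps described above (paths and script names taken from
the text; run them from the repository root):

```sh
cd examples/ecr/dqn

# Single-host single-process mode.
python single_process_launcher.py

# Simulated distributed mode (starts the learner and actor processes).
python multi_process_launcher.py
```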

View file

View file

@ -0,0 +1,33 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from maro.rl import ActionShaper
from maro.simulator.scenarios.ecr.common import Action
class ECRActionShaper(ActionShaper):
    def __init__(self, action_space):
        super().__init__()
        self._action_space = action_space
        self._zero_action_index = action_space.index(0)

    def __call__(self, model_action, decision_event, snapshot_list):
        scope = decision_event.action_scope
        tick = decision_event.tick
        port_idx = decision_event.port_idx
        vessel_idx = decision_event.vessel_idx

        port_empty = snapshot_list["ports"][tick: port_idx: ["empty", "full", "on_shipper", "on_consignee"]][0]
        vessel_remaining_space = snapshot_list["vessels"][tick: vessel_idx: ["empty", "full", "remaining_space"]][2]
        early_discharge = snapshot_list["vessels"][tick: vessel_idx: "early_discharge"][0]
        assert 0 <= model_action < len(self._action_space)

        if model_action < self._zero_action_index:
            actual_action = max(round(self._action_space[model_action] * port_empty), -vessel_remaining_space)
        elif model_action > self._zero_action_index:
            plan_action = self._action_space[model_action] * (scope.discharge + early_discharge) - early_discharge
            actual_action = round(plan_action) if plan_action > 0 else round(self._action_space[model_action] * scope.discharge)
        else:
            actual_action = 0

        return Action(vessel_idx, port_idx, actual_action)
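A hypothetical wiring sketch for this shaper, using the same evenly spaced
action space that `dist_actor.py` builds (`num_actions` is an assumed value;
the real one comes from `config.agents.algorithm.num_actions`):

```python
import numpy as np

num_actions = 21  # assumed for illustration; must be odd so the space contains 0
action_shaper = ECRActionShaper(action_space=list(np.linspace(-1.0, 1.0, num_actions)))

# At decision time, a discrete model output (an index into the action space)
# would be converted into an environment Action roughly like this:
# env_action = action_shaper(model_action, decision_event, env.snapshot_list)
```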

View file

@ -0,0 +1,28 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
import numpy as np
from maro.rl import AbsAgent, ColumnBasedStore
class ECRAgent(AbsAgent):
    def __init__(self, name, algorithm, experience_pool: ColumnBasedStore, min_experiences_to_train,
                 num_batches, batch_size):
        super().__init__(name, algorithm, experience_pool)
        self._min_experiences_to_train = min_experiences_to_train
        self._num_batches = num_batches
        self._batch_size = batch_size

    def train(self):
        if len(self._experience_pool) < self._min_experiences_to_train:
            return

        for _ in range(self._num_batches):
            indexes, sample = self._experience_pool.sample_by_key("loss", self._batch_size)
            state = np.asarray(sample["state"])
            action = np.asarray(sample["action"])
            reward = np.asarray(sample["reward"])
            next_state = np.asarray(sample["next_state"])
            loss = self._algorithm.train(state, action, reward, next_state)
            self._experience_pool.update(indexes, {"loss": loss})

View file

@ -0,0 +1,45 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
import io
import yaml
from torch.nn.functional import smooth_l1_loss
from torch.optim import RMSprop
from maro.rl import AbsAgentManager, LearningModel, MLPDecisionLayers, DQN, DQNHyperParams, ColumnBasedStore
from maro.utils import convert_dottable, set_seeds
from .agent import ECRAgent
with io.open("config.yml", "r") as in_file:
raw_config = yaml.safe_load(in_file)
config = convert_dottable(raw_config)
config = config.agents
class DQNAgentManager(AbsAgentManager):
    def _assemble(self, agent_dict):
        set_seeds(config.seed)
        num_actions = config.algorithm.num_actions
        for agent_id in self._agent_id_list:
            eval_model = LearningModel(decision_layers=MLPDecisionLayers(name=f'{agent_id}.policy',
                                                                         input_dim=self._state_shaper.dim,
                                                                         output_dim=num_actions,
                                                                         **config.algorithm.model)
                                       )
            algorithm = DQN(model_dict={"eval": eval_model},
                            optimizer_opt=(RMSprop, config.algorithm.optimizer),
                            loss_func_dict={"eval": smooth_l1_loss},
                            hyper_params=DQNHyperParams(**config.algorithm.hyper_parameters,
                                                        num_actions=num_actions))
            experience_pool = ColumnBasedStore(**config.experience_pool)
            agent_dict[agent_id] = ECRAgent(name=agent_id, algorithm=algorithm, experience_pool=experience_pool,
                                            **config.training_loop_parameters)

    def store_experiences(self, experiences):
        for agent_id, exp in experiences.items():
            exp.update({"loss": [1e8] * len(exp[next(iter(exp))])})
            self._agent_dict[agent_id].store_experiences(exp)

View file

@ -0,0 +1,50 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
import io
import yaml
import numpy as np
from maro.simulator import Env
from maro.rl import AgentMode, SimpleActor, ActorWorker, KStepExperienceShaper, TwoPhaseLinearExplorer
from maro.utils import convert_dottable
from examples.ecr.dqn.components.state_shaper import ECRStateShaper
from examples.ecr.dqn.components.action_shaper import ECRActionShaper
from examples.ecr.dqn.components.experience_shaper import TruncatedExperienceShaper
from examples.ecr.dqn.components.agent_manager import DQNAgentManager
with io.open("config.yml", "r") as in_file:
raw_config = yaml.safe_load(in_file)
config = convert_dottable(raw_config)
if __name__ == "__main__":
    env = Env(config.env.scenario, config.env.topology, durations=config.env.durations)
    agent_id_list = [str(agent_id) for agent_id in env.agent_idx_list]
    state_shaper = ECRStateShaper(**config.state_shaping)
    action_shaper = ECRActionShaper(action_space=list(np.linspace(-1.0, 1.0, config.agents.algorithm.num_actions)))
    if config.experience_shaping.type == "truncated":
        experience_shaper = TruncatedExperienceShaper(**config.experience_shaping.truncated)
    else:
        experience_shaper = KStepExperienceShaper(reward_func=lambda mt: mt["perf"], **config.experience_shaping.k_step)

    exploration_config = {"epsilon_range_dict": {"_all_": config.exploration.epsilon_range},
                          "split_point_dict": {"_all_": config.exploration.split_point},
                          "with_cache": config.exploration.with_cache
                          }
    explorer = TwoPhaseLinearExplorer(agent_id_list, config.general.total_training_episodes, **exploration_config)
    agent_manager = DQNAgentManager(name="ecr_remote_actor",
                                    agent_id_list=agent_id_list,
                                    mode=AgentMode.INFERENCE,
                                    state_shaper=state_shaper,
                                    action_shaper=action_shaper,
                                    experience_shaper=experience_shaper,
                                    explorer=explorer)
    proxy_params = {"group_name": config.distributed.group_name,
                    "expected_peers": config.distributed.actor.peer,
                    "redis_address": (config.distributed.redis.host_name, config.distributed.redis.port)
                    }
    actor_worker = ActorWorker(local_actor=SimpleActor(env=env, inference_agents=agent_manager),
                               proxy_params=proxy_params)
    actor_worker.launch()

View file

@ -0,0 +1,41 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
import os
import io
import yaml
from maro.simulator import Env
from maro.rl import ActorProxy, SimpleLearner, AgentMode, TwoPhaseLinearExplorer
from examples.ecr.dqn.components.state_shaper import ECRStateShaper
from maro.utils import Logger, convert_dottable
from examples.ecr.dqn.components.agent_manager import DQNAgentManager
with io.open("config.yml", "r") as in_file:
raw_config = yaml.safe_load(in_file)
config = convert_dottable(raw_config)
if __name__ == "__main__":
    env = Env(config.env.scenario, config.env.topology, durations=config.env.durations)
    agent_id_list = [str(agent_id) for agent_id in env.agent_idx_list]
    state_shaper = ECRStateShaper(**config.state_shaping)
    exploration_config = {"epsilon_range_dict": {"_all_": config.exploration.epsilon_range},
                          "split_point_dict": {"_all_": config.exploration.split_point},
                          "with_cache": config.exploration.with_cache
                          }
    explorer = TwoPhaseLinearExplorer(agent_id_list, config.general.total_training_episodes, **exploration_config)
    agent_manager = DQNAgentManager(name="ecr_remote_learner", agent_id_list=agent_id_list, mode=AgentMode.TRAIN,
                                    state_shaper=state_shaper, explorer=explorer)
    proxy_params = {"group_name": config.distributed.group_name,
                    "expected_peers": config.distributed.learner.peer,
                    "redis_address": (config.distributed.redis.host_name, config.distributed.redis.port)
                    }
    learner = SimpleLearner(trainable_agents=agent_manager,
                            actor=ActorProxy(proxy_params=proxy_params),
                            logger=Logger("distributed_ecr_learner", auto_timestamp=False))
    learner.train(total_episodes=config.general.total_training_episodes)
    learner.test()
    learner.dump_models(os.path.join(os.getcwd(), "models"))

View file

@ -0,0 +1,49 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from collections import defaultdict
import numpy as np
from maro.rl import ExperienceShaper
class TruncatedExperienceShaper(ExperienceShaper):
def __init__(self, *, time_window: int, time_decay_factor: float, fulfillment_factor: float,
shortage_factor: float):
super().__init__(reward_func=None)
self._time_window = time_window
self._time_decay_factor = time_decay_factor
self._fulfillment_factor = fulfillment_factor
self._shortage_factor = shortage_factor
def __call__(self, trajectory, snapshot_list):
experiences_by_agent = {}
for i in range(len(trajectory) - 1):
transition = trajectory[i]
agent_id = transition["agent_id"]
if agent_id not in experiences_by_agent:
experiences_by_agent[agent_id] = defaultdict(list)
experiences = experiences_by_agent[agent_id]
experiences["state"].append(transition["state"])
experiences["action"].append(transition["action"])
experiences["reward"].append(self._compute_reward(transition["event"], snapshot_list))
experiences["next_state"].append(trajectory[i+1]["state"])
return experiences_by_agent
def _compute_reward(self, decision_event, snapshot_list):
start_tick = decision_event.tick + 1
end_tick = decision_event.tick + self._time_window
ticks = list(range(start_tick, end_tick))
# calculate the time-decayed fulfillment and shortage reward over the window
future_fulfillment = snapshot_list["ports"][ticks::"fulfillment"]
future_shortage = snapshot_list["ports"][ticks::"shortage"]
decay_list = [self._time_decay_factor ** i for i in range(end_tick - start_tick)
for _ in range(future_fulfillment.shape[0]//(end_tick-start_tick))]
tot_fulfillment = np.dot(future_fulfillment, decay_list)
tot_shortage = np.dot(future_shortage, decay_list)
return np.float(self._fulfillment_factor * tot_fulfillment - self._shortage_factor * tot_shortage)
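The decay weights above repeat each tick's factor once per port, which matches a flattened, tick-major snapshot slice (all ports for tick t, then tick t+1, and so on). A minimal numeric sketch of that dot product, with made-up numbers (2 ports, a 3-tick window):

```python
import numpy as np

time_decay_factor = 0.97
num_ticks, num_ports = 3, 2
# Flattened tick-major shortage values: [t0p0, t0p1, t1p0, t1p1, t2p0, t2p1].
future_shortage = np.array([5.0, 3.0, 4.0, 2.0, 1.0, 0.0])

# One factor per tick, repeated once per port, mirroring the shaper above.
decay_list = [time_decay_factor ** i for i in range(num_ticks) for _ in range(num_ports)]
print(np.dot(future_shortage, decay_list))  # time-decayed total shortage over the window
```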

View file

@ -0,0 +1,28 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
import numpy as np
from maro.rl import StateShaper
class ECRStateShaper(StateShaper):
def __init__(self, *, look_back, max_ports_downstream, port_attributes, vessel_attributes):
super().__init__()
self._look_back = look_back
self._max_ports_downstream = max_ports_downstream
self._port_attributes = port_attributes
self._vessel_attributes = vessel_attributes
self._dim = (look_back + 1) * (max_ports_downstream + 1) * len(port_attributes) + len(vessel_attributes)
def __call__(self, decision_event, snapshot_list):
tick, port_idx, vessel_idx = decision_event.tick, decision_event.port_idx, decision_event.vessel_idx
ticks = [tick - rt for rt in range(self._look_back-1)]
future_port_idx_list = snapshot_list["vessels"][tick: vessel_idx: 'future_stop_list'].astype('int')
port_features = snapshot_list["ports"][ticks: [port_idx] + list(future_port_idx_list): self._port_attributes]
vessel_features = snapshot_list["vessels"][tick: vessel_idx: self._vessel_attributes]
state = np.concatenate((port_features, vessel_features))
return str(port_idx), state
@property
def dim(self):
return self._dim
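For a concrete sense of the `dim` property, here is the arithmetic with the values from the config file later in this commit (look_back 7, 2 downstream ports, 7 port attributes, 3 vessel attributes):

```python
look_back, max_ports_downstream = 7, 2
port_attributes = ["empty", "full", "on_shipper", "on_consignee", "booking", "shortage", "fulfillment"]
vessel_attributes = ["empty", "full", "remaining_space"]

# Mirrors ECRStateShaper._dim above.
dim = (look_back + 1) * (max_ports_downstream + 1) * len(port_attributes) + len(vessel_attributes)
print(dim)  # (7 + 1) * (2 + 1) * 7 + 3 = 171
```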

View file

@ -0,0 +1,66 @@
env:
scenario: "ecr"
topology: "toy.4p_ssdd_l0.0"
durations: 1120
general:
total_training_episodes: 500 # max episode
state_shaping:
look_back: 7
max_ports_downstream: 2
port_attributes:
- "empty"
- "full"
- "on_shipper"
- "on_consignee"
- "booking"
- "shortage"
- "fulfillment"
vessel_attributes:
- "empty"
- "full"
- "remaining_space"
experience_shaping:
type: "truncated"
k_step:
reward_decay: 0.9
steps: 5
truncated:
time_window: 100
fulfillment_factor: 1.0
shortage_factor: 1.0
time_decay_factor: 0.97
exploration:
epsilon_range: [0.0, 0.4]
split_point: [0.5, 0.8]
with_cache: true
agents:
algorithm:
num_actions: 21
model:
hidden_dims:
- 256
- 128
- 64
dropout_p: 0.0
optimizer:
lr: 0.05
hyper_parameters:
reward_decay: .0
num_training_rounds_per_target_replacement: 5
tau: 0.1
experience_pool:
capacity: -1
training_loop_parameters:
min_experiences_to_train: 1024
num_batches: 10 # number of times the algorithm's step() method is called
batch_size: 128
seed: 1024 # for reproducibility
distributed:
group_name: "dqn_distributed_test"
actor:
peer: {"actor": 1}
learner:
peer: {"actor_worker": 1}
redis:
host_name: "localhost"
port: 6379

View file

@ -0,0 +1,20 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
"""
This script is used to debug distributed algorithms in single-host multi-process mode.
"""
import os
ACTOR_NUM = 1 # must be same as in config
LEARNER_NUM = 1
learner_path = "components/dist_learner.py &"
actor_path = "components/dist_actor.py &"
for l_num in range(LEARNER_NUM):
os.system(f"python " + learner_path)
for a_num in range(ACTOR_NUM):
os.system(f"python " + actor_path)

View file

@ -0,0 +1,52 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
import os
import io
import yaml
import numpy as np
from maro.simulator import Env
from maro.rl import SimpleLearner, SimpleActor, AgentMode, KStepExperienceShaper, TwoPhaseLinearExplorer
from maro.utils import Logger, convert_dottable
from examples.ecr.dqn.components.state_shaper import ECRStateShaper
from examples.ecr.dqn.components.action_shaper import ECRActionShaper
from examples.ecr.dqn.components.experience_shaper import TruncatedExperienceShaper
from examples.ecr.dqn.components.agent_manager import DQNAgentManager
with io.open("config.yml", "r") as in_file:
raw_config = yaml.safe_load(in_file)
config = convert_dottable(raw_config)
if __name__ == "__main__":
env = Env(config.env.scenario, config.env.topology, durations=config.env.durations)
agent_id_list = [str(agent_id) for agent_id in env.agent_idx_list]
state_shaper = ECRStateShaper(**config.state_shaping)
action_shaper = ECRActionShaper(action_space=list(np.linspace(-1.0, 1.0, config.agents.algorithm.num_actions)))
if config.experience_shaping.type == "truncated":
experience_shaper = TruncatedExperienceShaper(**config.experience_shaping.truncated)
else:
experience_shaper = KStepExperienceShaper(reward_func=lambda mt: mt["perf"], **config.experience_shaping.k_step)
exploration_config = {"epsilon_range_dict": {"_all_": config.exploration.epsilon_range},
"split_point_dict": {"_all_": config.exploration.split_point},
"with_cache": config.exploration.with_cache
}
explorer = TwoPhaseLinearExplorer(agent_id_list, config.general.total_training_episodes, **exploration_config)
agent_manager = DQNAgentManager(name="ecr_learner",
mode=AgentMode.TRAIN_INFERENCE,
agent_id_list=agent_id_list,
state_shaper=state_shaper,
action_shaper=action_shaper,
experience_shaper=experience_shaper,
explorer=explorer)
learner = SimpleLearner(trainable_agents=agent_manager,
actor=SimpleActor(env=env, inference_agents=agent_manager),
logger=Logger("single_host_ecr_learner", auto_timestamp=False))
learner.train(total_episodes=config.general.total_training_episodes)
learner.test()
learner.dump_models(os.path.join(os.getcwd(), "models"))

View file

@ -378,6 +378,8 @@ cdef class NPSnapshotList(SnapshotListAbc):
"""Reset snapshot list"""
self._cur_index = 0
self._tick2index_dict.clear()
self._index2tick_dict.clear()
self._history_dict.clear()
cdef str node_name
cdef AttrInfo attr_info

View file

@ -1,3 +1,6 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
import os
import shutil
import tempfile

View file

@ -317,7 +317,7 @@ class WeatherPipeline(DataPipeline):
return WeatherPipeline.WeatherEnum.SUNNY.value
def _parse_date(self, row: dict):
dstr = row["Date"] if row["Date"] != "" else None
dstr = row.get("Date", None)
return dstr
@ -327,9 +327,9 @@ class WeatherPipeline(DataPipeline):
wh = self._weather(row=row)
temp_str = row["Avg Temp"]
temp = round(float(temp_str), 2) if temp_str != "" else self._last_day_temp
temp = round(float(temp_str), 2) if temp_str != "" and temp_str is not None else self._last_day_temp
self.last_day_temp = temp
self._last_day_temp = temp
return {"date": date, "weather": wh, "temp": temp} if date is not None else None

View file

@ -4,10 +4,9 @@
import json
import numpy as np
import os
import pycurl
import requests
import shutil
import sys
import urllib.request
import uuid
from maro.cli.utils.params import GlobalPaths
@ -43,13 +42,10 @@ def download_file(source: str, destination: str):
os.makedirs(tmpdir, exist_ok=True)
os.makedirs(os.path.dirname(destination), exist_ok=True)
source_data = urllib.request.urlopen(source)
res_data = source_data.read()
with open(temp_file_name, "wb") as f:
curl = pycurl.Curl()
curl.setopt(pycurl.URL, source)
curl.setopt(pycurl.WRITEDATA, f)
curl.setopt(pycurl.FOLLOWLOCATION, True)
curl.perform()
curl.close()
f.write(res_data)
if os.path.exists(destination):
os.remove(destination)

View file

@ -143,7 +143,7 @@ class ContainerTrackingAgent(multiprocessing.Process):
encoding='utf8')
nvidia_smi_str = completed_process.stdout
node_details['resources']['actual_gpu_usage'] = f"{float(nvidia_smi_str)}%"
except:
except Exception:
pass
@staticmethod

View file

@ -21,7 +21,8 @@ def copy_files_to_node(local_path: str, remote_dir: str, admin_username: str, no
admin_username (str)
node_ip_address (str)
"""
copy_scripts = f"rsync -e 'ssh -o StrictHostKeyChecking=no' -az {local_path} {admin_username}@{node_ip_address}:{remote_dir}"
copy_scripts = f"rsync -e 'ssh -o StrictHostKeyChecking=no' " \
f"-az {local_path} {admin_username}@{node_ip_address}:{remote_dir}"
_ = SubProcess.run(copy_scripts)
@ -34,7 +35,8 @@ def copy_files_from_node(local_dir: str, remote_path: str, admin_username: str,
admin_username (str)
node_ip_address (str)
"""
copy_scripts = f"rsync -e 'ssh -o StrictHostKeyChecking=no' -az {admin_username}@{node_ip_address}:{remote_path} {local_dir}"
copy_scripts = f"rsync -e 'ssh -o StrictHostKeyChecking=no' " \
f"-az {admin_username}@{node_ip_address}:{remote_path} {local_dir}"
_ = SubProcess.run(copy_scripts)

View file

@ -1,3 +1,7 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
import hashlib

View file

@ -435,7 +435,10 @@ class K8sAzureExecutor:
sas = self._check_and_get_account_sas()
# Push data
copy_command = f'azcopy copy "{local_path}" "https://{cluster_id}st.file.core.windows.net/{cluster_id}-fs{remote_dir}?{sas}" --recursive=True'
copy_command = f'azcopy copy ' \
f'"{local_path}" ' \
f'"https://{cluster_id}st.file.core.windows.net/{cluster_id}-fs{remote_dir}?{sas}" ' \
f'--recursive=True'
_ = SubProcess.run(copy_command)
def pull_data(self, local_dir: str, remote_path: str):
@ -447,7 +450,10 @@ class K8sAzureExecutor:
sas = self._check_and_get_account_sas()
# Push data
copy_command = f'azcopy copy "https://{cluster_id}st.file.core.windows.net/{cluster_id}-fs{remote_path}?{sas}" "{local_dir}" --recursive=True'
copy_command = f'azcopy copy ' \
f'"https://{cluster_id}st.file.core.windows.net/{cluster_id}-fs{remote_path}?{sas}" ' \
f'"{local_dir}" ' \
f'--recursive=True'
_ = SubProcess.run(copy_command)
def remove_data(self, remote_path: str):
@ -461,7 +467,9 @@ class K8sAzureExecutor:
sas = self._check_and_get_account_sas()
# Remove data
copy_command = f'azcopy remove "https://{cluster_id}st.file.core.windows.net/{cluster_id}-fs{remote_path}?{sas}" --recursive=True'
copy_command = f'azcopy remove ' \
f'"https://{cluster_id}st.file.core.windows.net/{cluster_id}-fs{remote_path}?{sas}" ' \
f'--recursive=True'
_ = SubProcess.run(copy_command)
def _check_and_get_account_sas(self):
@ -523,12 +531,14 @@ class K8sAzureExecutor:
yaml.safe_dump(k8s_job_config, fw)
# Apply k8s config
command = f"kubectl apply -f {GlobalPaths.MARO_CLUSTERS}/{self.cluster_name}/jobs/{job_name}/k8s_configs/jobs.yml"
command = f"kubectl apply -f " \
f"{GlobalPaths.MARO_CLUSTERS}/{self.cluster_name}/jobs/{job_name}/k8s_configs/jobs.yml"
_ = SubProcess.run(command)
def stop_job(self, job_name: str):
# Stop job
command = f"kubectl delete -f {GlobalPaths.MARO_CLUSTERS}/{self.cluster_name}/jobs/{job_name}/k8s_configs/jobs.yml"
command = f"kubectl delete -f " \
f"{GlobalPaths.MARO_CLUSTERS}/{self.cluster_name}/jobs/{job_name}/k8s_configs/jobs.yml"
_ = SubProcess.run(command)
@staticmethod

View file

@ -43,7 +43,10 @@ def main():
# maro env
parser_env = subparsers.add_parser(
'env',
help='Get all environment-related information, such as the supported scenarios, topologies. And it is also responsible to generate data to the specific environment, which has external data dependency.',
help=('Get all environment-related information, '
'such as the supported scenarios, topologies. '
'And it is also responsible to generate data to the specific environment, '
'which has external data dependency.'),
parents=[global_parser]
)
parser_env.set_defaults(func=_help_func(parser=parser_env))
@ -777,7 +780,9 @@ def load_parser_data(prev_parser: ArgumentParser, global_parser: ArgumentParser)
type=int,
default=None,
required=False,
help="Specified start timestamp (in UTC) for binary file, then this timestamp will be considered as tick=0 for binary reader, this can be used to adjust the reader pipeline.")
help=("Specified start timestamp (in UTC) for binary file, "
"then this timestamp will be considered as tick=0 for binary reader, "
"this can be used to adjust the reader pipeline."))
build_cmd_parser.set_defaults(func=convert)

View file

@ -1,3 +1,7 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from functools import wraps
from maro.cli.utils.details import load_cluster_details

View file

@ -1,3 +1,7 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
MARO_GRASS_CREATE = """
Examples:
Create a cluster in grass mode with a deployment

View file

@ -1,3 +1,7 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
import logging
class GlobalParams:

View file

@ -1,60 +1,55 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from maro.rl.actor.abstract_actor import AbstractActor, RolloutMode
from maro.rl.actor.abs_actor import AbsActor
from maro.rl.actor.simple_actor import SimpleActor
from maro.rl.learner.abstract_learner import AbstractLearner
from maro.rl.learner.abs_learner import AbsLearner
from maro.rl.learner.simple_learner import SimpleLearner
from maro.rl.agent.agent import Agent, AgentParameters
from maro.rl.agent.agent_manager import AgentManager, AgentMode
from maro.rl.algorithms.torch.algorithm import Algorithm
from maro.rl.agent.abs_agent import AbsAgent
from maro.rl.agent.abs_agent_manager import AbsAgentManager, AgentMode
from maro.rl.algorithms.torch.abs_algorithm import AbsAlgorithm
from maro.rl.algorithms.torch.dqn import DQN, DQNHyperParams
from maro.rl.models.torch.mlp_representation import MLPRepresentation
from maro.rl.models.torch.decision_layers import MLPDecisionLayers
from maro.rl.models.torch.learning_model import LearningModel
from maro.rl.storage.abstract_store import AbstractStore
from maro.rl.storage.unbounded_store import UnboundedStore
from maro.rl.storage.fixed_size_store import FixedSizeStore, OverwriteType
from maro.rl.shaping.abstract_state_shaper import AbstractStateShaper
from maro.rl.shaping.abstract_action_shaper import AbstractActionShaper
from maro.rl.shaping.abstract_reward_shaper import AbstractRewardShaper
from maro.rl.shaping.k_step_reward_shaper import KStepRewardShaper
from maro.rl.explorer.abstract_explorer import AbstractExplorer
from maro.rl.storage.abs_store import AbsStore
from maro.rl.storage.column_based_store import ColumnBasedStore
from maro.rl.storage.utils import OverwriteType
from maro.rl.shaping.abs_shaper import AbsShaper
from maro.rl.shaping.state_shaper import StateShaper
from maro.rl.shaping.action_shaper import ActionShaper
from maro.rl.shaping.experience_shaper import ExperienceShaper
from maro.rl.shaping.k_step_experience_shaper import KStepExperienceShaper
from maro.rl.explorer.abs_explorer import AbsExplorer
from maro.rl.explorer.simple_explorer import LinearExplorer, TwoPhaseLinearExplorer
from maro.rl.dist_topologies.multi_actor_single_learner_sync import ActorProxy, ActorWorker
from maro.rl.common import ExperienceKey, ExperienceInfoKey, TransitionInfoKey
from maro.rl.dist_topologies.single_learner_multi_actor_sync_mode import ActorProxy, ActorWorker
__all__ = [
"AbstractActor",
"RolloutMode",
"AbsActor",
"SimpleActor",
"AbstractLearner",
"AbsLearner",
"SimpleLearner",
"Agent",
"AgentParameters",
"AgentManager",
"AbsAgent",
"AbsAgentManager",
"AgentMode",
"Algorithm",
"AbsAlgorithm",
"DQN",
"DQNHyperParams",
"MLPRepresentation",
"MLPDecisionLayers",
"LearningModel",
"AbstractStore",
"UnboundedStore",
"FixedSizeStore",
"AbsStore",
"ColumnBasedStore",
"OverwriteType",
"AbstractStateShaper",
"AbstractActionShaper",
"AbstractRewardShaper",
"KStepRewardShaper",
"AbstractExplorer",
"AbsShaper",
"StateShaper",
"ActionShaper",
"ExperienceShaper",
"KStepExperienceShaper",
"AbsExplorer",
"LinearExplorer",
"TwoPhaseLinearExplorer",
"ActorProxy",
"ActorWorker",
"ExperienceKey",
"ExperienceInfoKey",
"TransitionInfoKey"
"ActorWorker"
]

View file

@ -0,0 +1,44 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import ABC, abstractmethod
from typing import Union
from maro.rl.agent.abs_agent_manager import AbsAgentManager
from maro.simulator import Env
class AbsActor(ABC):
def __init__(self, env: Env, inference_agents: Union[dict, AbsAgentManager]):
"""
Actor contains env and agents, and it is responsible for collecting experience from the interaction
with the environment.
Args:
env (Env): an Env instance.
inference_agents (dict or AbsAgentManager): a dict of agents or an AgentManager instance that
manages all agents.
"""
self._env = env
self._inference_agents = inference_agents
@abstractmethod
def roll_out(self, model_dict: dict = None, epsilon_dict: dict = None, done: bool = None,
return_details: bool = True):
"""
Performs a single episode of roll-out to collect experiences and performance data from the environment.
Args:
model_dict (dict): if not None, the agents will load the models from model_dict and use these models
to perform roll-out.
epsilon_dict (dict): exploration rate by agent.
done (bool): if True, the current call is the last call, i.e., no more roll-outs will be performed.
This flag is used to signal remote actor workers to exit.
return_details (bool): if True, return episode details (e.g., experiences) as well as performance
metrics provided by the env.
"""
return NotImplementedError
@property
def inference_agents(self):
return self._inference_agents

View file

@ -1,47 +1,50 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from typing import Union
from maro.simulator import Env
from .abstract_actor import AbstractActor, RolloutMode
from maro.rl.agent.agent_manager import AgentManager
from .abs_actor import AbsActor
from maro.rl.agent.abs_agent_manager import AbsAgentManager
from maro.simulator import Env
class SimpleActor(AbstractActor):
class SimpleActor(AbsActor):
"""
A simple actor class that implements typical roll-out logic
A simple actor class that implements simple roll-out logic
"""
def __init__(self, env: Union[dict, Env], inference_agents: AgentManager):
assert isinstance(inference_agents, AgentManager), \
"SimpleActor only accepts type AgentManager for parameter inference_agents"
super().__int__(env, inference_agents)
def __init__(self, env: Env, inference_agents: AbsAgentManager):
super().__init__(env, inference_agents)
def roll_out(self, mode, models=None, epsilon_dict=None, seed: int = None):
if mode == RolloutMode.EXIT:
def roll_out(self, model_dict: dict = None, epsilon_dict: dict = None, done: bool = False,
return_details: bool = True):
"""
The main interface provided by the Actor class, in which the agents perform a single episode of roll-out
to collect experiences and performance data from the environment
Args:
model_dict (dict): if not None, the agents will load the models from model_dict and use these models
to perform roll-out.
epsilon_dict (dict): exploration rate by agent
done (bool): if True, the current call is the last call, i.e., no more roll-outs will be performed.
This flag is used to signal remote actor workers to exit.
return_details (bool): if True, return experiences as well as performance metrics provided by the env.
"""
if done:
return None, None
env = self._env if isinstance(self._env, Env) else self._env[mode]
if seed is not None:
env.set_seed(seed)
self._env.reset()
# assign epsilons
if epsilon_dict is not None:
self._inference_agents.explorer.epsilon = epsilon_dict
# load models
if models is not None:
self._inference_agents.load_models(models)
if model_dict is not None:
self._inference_agents.load_models(model_dict)
metrics, decision_event, is_done = env.step(None)
metrics, decision_event, is_done = self._env.step(None)
while not is_done:
action = self._inference_agents.choose_action(decision_event, env.snapshot_list)
metrics, decision_event, is_done = env.step(action)
action = self._inference_agents.choose_action(decision_event, self._env.snapshot_list)
metrics, decision_event, is_done = self._env.step(action)
self._inference_agents.on_env_feedback(metrics)
exp_by_agent = self._inference_agents.post_process(env.snapshot_list) if mode == RolloutMode.TRAIN else None
performance = env.metrics
env.reset()
details = self._inference_agents.post_process(self._env.snapshot_list) if return_details else None
return {'local': performance}, exp_by_agent
return self._env.metrics, details

View file

@ -0,0 +1,83 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import ABC, abstractmethod
import os
import pickle
import torch
from maro.rl.algorithms.torch.abs_algorithm import AbsAlgorithm
class AbsAgent(ABC):
def __init__(self,
name: str,
algorithm: AbsAlgorithm,
experience_pool
):
"""
RL agent class. It is a sandbox for the RL algorithm, keeping scenario-specific details out so that we can
focus on abstract algorithm development. Environment observations and decision events are converted to a uniform
format before being passed in, and the output is converted to an environment-executable format before being
returned to the environment. Its key responsibility is optimizing the policy based on interaction with the
environment.
Args:
name (str): agent's name.
algorithm: a concrete algorithm instance that inherits from AbsAlgorithm. This is the centerpiece
of the Agent class and is responsible for the most important tasks of an agent: choosing
actions and optimizing models.
experience_pool: a data store that stores experiences generated by the experience shaper
"""
self._name = name
self._algorithm = algorithm
self._experience_pool = experience_pool
@property
def algorithm(self):
return self._algorithm
@property
def experience_pool(self):
return self._experience_pool
def choose_action(self, model_state, epsilon: float = .0):
"""
Choose an action using the underlying algorithm based on a preprocessed env state.
Args:
model_state: state vector as accepted by the underlying algorithm.
epsilon (float): exploration rate.
Returns:
Action given by the underlying policy model.
"""
return self._algorithm.choose_action(model_state, epsilon)
@abstractmethod
def train(self):
"""
Runs a specified number of training steps, with each step consisting of sampling a batch from the experience
pool and running the underlying algorithm's train_on_batch() method.
"""
return NotImplementedError
def store_experiences(self, experiences):
self._experience_pool.put(experiences)
def load_model_dict(self, model_dict: dict):
self._algorithm.model_dict = model_dict
def load_model_dict_from_file(self, file_path):
model_dict = torch.load(file_path)
for model_key, state_dict in model_dict.items():
self._algorithm.model_dict[model_key].load_state_dict(state_dict)
def dump_model_dict(self, dir_path: str):
torch.save({model_key: model.state_dict() for model_key, model in self._algorithm.model_dict.items()},
os.path.join(dir_path, self._name))
def dump_experience_store(self, dir_path: str):
with open(os.path.join(dir_path, self._name), "wb") as fp:
pickle.dump(self._experience_pool, fp)
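A minimal sketch of a concrete agent built on this class, assuming a hypothetical uniform-sampling training loop and the "state"/"action"/"reward"/"next_state" columns produced by the truncated shaper earlier in this commit:

```python
import numpy as np

from maro.rl import AbsAgent

class MinimalAgent(AbsAgent):
    """Hypothetical subclass illustrating the train() contract."""

    def __init__(self, name, algorithm, experience_pool, batch_size: int = 128, num_batches: int = 10):
        super().__init__(name, algorithm, experience_pool)
        self._batch_size = batch_size
        self._num_batches = num_batches

    def train(self):
        # Each step samples a batch from the pool and delegates to the algorithm.
        for _ in range(self._num_batches):
            _, batch = self._experience_pool.sample(self._batch_size)
            self._algorithm.train(np.asarray(batch["state"]), np.asarray(batch["action"]),
                                  np.asarray(batch["reward"]), np.asarray(batch["next_state"]))
```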

View file

@ -0,0 +1,162 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import ABC, abstractmethod
from enum import Enum
import os
from maro.rl.shaping.state_shaper import StateShaper
from maro.rl.shaping.action_shaper import ActionShaper
from maro.rl.shaping.experience_shaper import ExperienceShaper
from maro.rl.explorer.abs_explorer import AbsExplorer
from maro.utils.exception.rl_toolkit_exception import UnsupportedAgentModeError, MissingShaperError, WrongAgentModeError
class AgentMode(Enum):
TRAIN = "train"
INFERENCE = "inference"
TRAIN_INFERENCE = "train_inference"
class AbsAgentManager(ABC):
def __init__(self,
name: str,
mode: AgentMode,
agent_id_list: [str],
state_shaper: StateShaper = None,
action_shaper: ActionShaper = None,
experience_shaper: ExperienceShaper = None,
explorer: AbsExplorer = None):
"""
Manages all agents.
Args:
name (str): name of agent manager.
mode (AgentMode): an AgentMode enum member that specifies the role of the agent. Some attributes may
be None under certain modes.
agent_id_list (list): list of agent identifiers.
experience_shaper: responsible for processing data in the replay buffer at the end of an episode, e.g.,
adjusting rewards and computing target states.
state_shaper: responsible for extracting information from a decision event and the snapshot list for
the event to form a state vector as accepted by the underlying algorithm
action_shaper: responsible for converting the output of an agent's action to an EnvAction object that can
be executed by the environment. Cannot be None under Inference and TrainInference modes.
explorer: responsible for storing and updating exploration rates.
"""
self._name = name
if mode not in AgentMode:
raise UnsupportedAgentModeError(msg='mode must be "train", "inference" or "train_inference"')
self._mode = mode
if mode in {AgentMode.INFERENCE, AgentMode.TRAIN_INFERENCE}:
if state_shaper is None:
raise MissingShaperError(msg=f"state shaper cannot be None under mode {self._mode}")
if action_shaper is None:
raise MissingShaperError(msg=f"action_shaper cannot be None under mode {self._mode}")
if experience_shaper is None:
raise MissingShaperError(msg=f"experience_shaper cannot be None under mode {self._mode}")
self._state_shaper = state_shaper
self._action_shaper = action_shaper
self._experience_shaper = experience_shaper
self._explorer = explorer
self._agent_id_list = agent_id_list
self._trajectory = []
self._agent_dict = {}
self._assemble(self._agent_dict)
def __getitem__(self, agent_id):
return self._agent_dict[agent_id]
def _assemble(self, agent_dict):
"""
Abstract method to populate the _agent_dict attribute.
"""
return NotImplementedError
def choose_action(self, decision_event, snapshot_list):
self._assert_inference_mode()
agent_id, model_state = self._state_shaper(decision_event, snapshot_list)
model_action = self._agent_dict[agent_id].choose_action(
model_state, self._explorer.epsilon[agent_id] if self._explorer else None)
self._trajectory.append({"state": model_state,
"action": model_action,
"reward": None,
"agent_id": agent_id,
"event": decision_event})
return self._action_shaper(model_action, decision_event, snapshot_list)
def on_env_feedback(self, metrics):
self._trajectory[-1]["metrics"] = metrics
def post_process(self, snapshot_list):
"""
Called at the end of an episode, this function processes data from the latest episode, including reward
adjustments and next-state computations, and returns experiences for individual agents.
Args:
snapshot_list: the snapshot list from the env at the end of an episode.
"""
experiences = self._experience_shaper(self._trajectory, snapshot_list)
self._trajectory.clear()
self._state_shaper.reset()
self._action_shaper.reset()
self._experience_shaper.reset()
return experiences
@abstractmethod
def store_experiences(self, experiences):
return NotImplementedError
def update_epsilon(self, performance):
"""
This updates the exploration rates for each agent.
Args:
performance: performance from the latest episode.
"""
if self._explorer:
self._explorer.update(performance)
def train(self):
self._assert_train_mode()
for agent in self._agent_dict.values():
agent.train()
def load_models(self, agent_model_dict):
for agent_id, model_dict in agent_model_dict.items():
self._agent_dict[agent_id].load_model_dict(model_dict)
def load_models_from_files(self, file_path_dict):
for agent_id, file_path in file_path_dict.items():
self._agent_dict[agent_id].load_model_dict_from_file(file_path)
def dump_models(self, dir_path: str):
os.makedirs(dir_path, exist_ok=True)
for agent in self._agent_dict.values():
agent.dump_model_dict(dir_path)
def get_models(self):
return {agent_id: agent.algorithm.model_dict for agent_id, agent in self._agent_dict.items()}
@property
def name(self):
return self._name
@property
def agents(self):
return self._agent_dict
@property
def explorer(self):
return self._explorer
def _assert_train_mode(self):
if self._mode != AgentMode.TRAIN and self._mode != AgentMode.TRAIN_INFERENCE:
raise WrongAgentModeError(msg=f"this method is unavailable under mode {self._mode}")
def _assert_inference_mode(self):
if self._mode != AgentMode.INFERENCE and self._mode != AgentMode.TRAIN_INFERENCE:
raise WrongAgentModeError(msg=f"this method is unavailable under mode {self._mode}")

View file

@ -0,0 +1,63 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import ABC, abstractmethod
import itertools
from typing import Union
class AbsAlgorithm(ABC):
def __init__(self, model_dict: dict, optimizer_opt: Union[dict, tuple], loss_func_dict: dict, hyper_params: object):
"""
This is the abstraction of an RL algorithm, which provides a uniform policy interface such as choose_action and
train_on_batch. We also provide some predefined RL algorithms based on it, such as DQN and A2C, and users can
inherit from it to customize their own algorithms.
Args:
model_dict (dict): underlying models for the algorithm (e.g., for A2C,
model_dict = {"actor": ..., "critic": ...})
optimizer_opt (tuple or dict): tuple or dict of tuples of (optimizer_class, optimizer_params) associated
with the models in model_dict. If it is a tuple, the optimizer to be
instantiated applies to all trainable parameters from model_dict. If it
is a dict, the optimizer will be applied to the related model with the same key.
loss_func_dict (dict): loss function types associated with the models in model_dict.
hyper_params (object): algorithm-specific hyper-parameter set.
"""
self._loss_func_dict = loss_func_dict
self._hyper_params = hyper_params
self._model_dict = model_dict
self._register_optimizers(optimizer_opt)
def _register_optimizers(self, optimizer_opt):
if isinstance(optimizer_opt, tuple):
# If a single optimizer_opt tuple is provided, a single optimizer will be created to jointly
# optimize all model parameters involved in the algorithm.
optim_cls, optim_params = optimizer_opt
model_params = [model.parameters() for model in self._model_dict.values()]
self._optimizer = optim_cls(itertools.chain(*model_params), **optim_params)
else:
self._optimizer = {}
for model_key, model in self._model_dict.items():
# No gradient required
if model_key not in optimizer_opt or optimizer_opt[model_key] is None:
self._model_dict[model_key].eval()
self._optimizer[model_key] = None
else:
optim_cls, optim_params = optimizer_opt[model_key]
self._optimizer[model_key] = optim_cls(model.parameters(), **optim_params)
@property
def model_dict(self):
return self._model_dict
@model_dict.setter
def model_dict(self, model_dict):
self._model_dict = model_dict
@abstractmethod
def train(self, *args, **kwargs):
return NotImplementedError
@abstractmethod
def choose_action(self, state, epsilon: float = None):
return NotImplementedError
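To illustrate the two accepted shapes of `optimizer_opt` (a sketch with made-up models, not MARO code):

```python
import torch.nn as nn
import torch.optim as optim

model_dict = {"actor": nn.Linear(8, 4), "critic": nn.Linear(8, 1)}

# A single (optimizer_class, params) tuple: one optimizer jointly covers all model parameters.
joint_opt = (optim.Adam, {"lr": 1e-3})

# A dict keyed like model_dict: one optimizer per model; None freezes that model (eval mode, no optimizer).
per_model_opt = {"actor": (optim.RMSprop, {"lr": 5e-4}), "critic": None}
```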

View file

@ -1,32 +1,50 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
import random
from copy import deepcopy
from typing import Union
import numpy as np
import torch
from maro.rl.algorithms.torch.algorithm import Algorithm
from maro.rl.common import ExperienceKey, ExperienceInfoKey
from maro.rl.algorithms.torch.abs_algorithm import AbsAlgorithm
from maro.utils import clone
class DQNHyperParams:
def __init__(self, num_actions: int, replace_target_frequency: int, tau: float):
__slots__ = ["num_actions", "reward_decay", "num_training_rounds_per_target_replacement", "tau"]
def __init__(self, num_actions: int, reward_decay: float, num_training_rounds_per_target_replacement: int,
tau: float = 1.0):
"""
DQN hyper-parameters.
Args:
num_actions (int): number of possible actions
reward_decay (float): reward decay as defined in standard RL terminology
num_training_rounds_per_target_replacement (int): number of training rounds between target model replacements
tau (float): soft update coefficient, e.g., target_model = tau * eval_model + (1-tau) * target_model
"""
self.num_actions = num_actions
self.replace_target_frequency = replace_target_frequency
self.reward_decay = reward_decay
self.num_training_rounds_per_target_replacement = num_training_rounds_per_target_replacement
self.tau = tau
class DQN(Algorithm):
def __init__(self, model_dict, optimizer_opt, loss_func_dict, hyper_params: DQNHyperParams):
class DQN(AbsAlgorithm):
def __init__(self, model_dict: dict, optimizer_opt: Union[dict, tuple], loss_func_dict: dict,
hyper_params: DQNHyperParams):
"""
DQN algorithm. The model_dict must contain the key "eval". Optionally a model corresponding to
the key "target" can be provided. If the key "target" is absent or model_dict["target"] is None,
the target model will be a deep copy of the provided eval model.
"""
if model_dict.get("target", None) is None:
model_dict['target'] = deepcopy(model_dict['eval'])
model_dict["target"] = clone(model_dict["eval"])
super().__init__(model_dict, optimizer_opt, loss_func_dict, hyper_params)
self._train_cnt = 0
self._device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
def choose_action(self, state, epsilon: float = None):
if epsilon is None or random.random() > epsilon:
def choose_action(self, state: np.ndarray, epsilon: float = None):
if epsilon is None or np.random.rand() > epsilon:
state = torch.from_numpy(state).unsqueeze(0)
self._model_dict["eval"].eval()
with torch.no_grad():
@ -34,27 +52,32 @@ class DQN(Algorithm):
best_action_idx = q_values.argmax(dim=1).item()
return best_action_idx
return random.choice(range(self._hyper_params.num_actions))
return np.random.choice(self._hyper_params.num_actions)
def train_on_batch(self, batch):
state = torch.from_numpy(batch[ExperienceKey.STATE])
action = torch.from_numpy(batch[ExperienceKey.ACTION])
reward = torch.from_numpy(batch[ExperienceKey.REWARD]).squeeze(1)
next_state = torch.from_numpy(batch[ExperienceKey.NEXT_STATE])
discount = torch.from_numpy(batch[ExperienceInfoKey.DISCOUNT])
q_value = self._model_dict["eval"](state).gather(1, action).squeeze(1)
target = (reward + discount * self._model_dict["target"](next_state).max(dim=1)[0]).detach()
loss = self._loss_func_dict["eval"](q_value, target)
def _prepare_batch(self, raw_batch):
return {key: torch.from_numpy(np.asarray(lst)).to(self._device) for key, lst in raw_batch.items()}
def train(self, state: np.ndarray, action: np.ndarray, reward: np.ndarray, next_state: np.ndarray):
state = torch.from_numpy(state).to(self._device)
action = torch.from_numpy(action).to(self._device)
reward = torch.from_numpy(reward).to(self._device)
next_state = torch.from_numpy(next_state).to(self._device)
if len(action.shape) == 1:
action = action.unsqueeze(1)
current_q_values = self._model_dict["eval"](state).gather(1, action).squeeze(1)
next_q_values = self._model_dict["target"](next_state).max(dim=1)[0]
target_q_values = (reward + self._hyper_params.reward_decay * next_q_values).detach()
loss = self._loss_func_dict["eval"](current_q_values, target_q_values)
self._model_dict["eval"].train()
self._optimizer.zero_grad()
loss.backward()
self._optimizer.step()
self._train_cnt += 1
if self._train_cnt % self._hyper_params.replace_target_frequency == 0:
if self._train_cnt % self._hyper_params.num_training_rounds_per_target_replacement == 0:
self._update_target_model()
return np.abs((q_value - target).detach().numpy())
return np.abs((current_q_values - target_q_values).detach().numpy())
def _update_target_model(self):
for evl, target in zip(self._model_dict["eval"].parameters(), self._model_dict["target"].parameters()):
target.data = self._hyper_params.tau * evl.data + (1 - self._hyper_params.tau) * target.data
for eval_params, target_params in zip(self._model_dict["eval"].parameters(), self._model_dict["target"].parameters()):
target_params.data = self._hyper_params.tau * eval_params.data + (1 - self._hyper_params.tau) * target_params.data
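The soft target update above is the standard Polyak rule; a tiny standalone sketch of the arithmetic with tau = 0.1:

```python
import torch

tau = 0.1
eval_param = torch.tensor([1.0, 2.0])
target_param = torch.tensor([0.0, 0.0])

# target <- tau * eval + (1 - tau) * target, as in DQN._update_target_model.
target_param = tau * eval_param + (1 - tau) * target_param
print(target_param)  # tensor([0.1000, 0.2000])
```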

View file

@ -2,9 +2,10 @@ from enum import Enum
class PayloadKey(Enum):
RolloutMode = "rollout_mode"
MODEL = "model"
EPSILON = "epsilon"
PERFORMANCE = "performance"
EXPERIENCE = "experience"
SEED = "seed"
DONE = "done"
RETURN_DETAILS = "return_details"

View file

@ -0,0 +1,81 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from enum import Enum
from collections import defaultdict
import sys
from maro.communication import Proxy, SessionType
from maro.communication.registry_table import RegisterTable
from maro.rl.dist_topologies.common import PayloadKey
from maro.rl.actor.abs_actor import AbsActor
class MessageTag(Enum):
ROLLOUT = "rollout"
UPDATE = "update"
class ActorProxy(object):
def __init__(self, proxy_params):
self._proxy = Proxy(component_type="actor", **proxy_params)
def roll_out(self, model_dict: dict = None, epsilon_dict: dict = None, done: bool = False,
return_details: bool = True):
if done:
self._proxy.ibroadcast(tag=MessageTag.ROLLOUT,
session_type=SessionType.NOTIFICATION,
payload={PayloadKey.DONE: True})
return None, None
else:
performance, exp_by_agent = {}, {}
payloads = [(peer, {PayloadKey.MODEL: model_dict,
PayloadKey.EPSILON: epsilon_dict,
PayloadKey.RETURN_DETAILS: return_details})
for peer in self._proxy.peers["actor_worker"]]
# TODO: double check when ack is enabled
replies = self._proxy.scatter(tag=MessageTag.ROLLOUT, session_type=SessionType.TASK,
destination_payload_list=payloads)
for msg in replies:
performance[msg.source] = msg.payload[PayloadKey.PERFORMANCE]
if msg.payload[PayloadKey.EXPERIENCE] is not None:
for agent_id, exp_set in msg.payload[PayloadKey.EXPERIENCE].items():
if agent_id not in exp_by_agent:
exp_by_agent[agent_id] = defaultdict(list)
for k, v in exp_set.items():
exp_by_agent[agent_id][k].extend(v)
return performance, exp_by_agent
class ActorWorker(object):
def __init__(self, local_actor: AbsActor, proxy_params):
self._local_actor = local_actor
self._proxy = Proxy(component_type="actor_worker", **proxy_params)
self._registry_table = RegisterTable(self._proxy.get_peers)
self._registry_table.register_event_handler("actor:rollout:1", self.on_rollout_request)
def on_rollout_request(self, message):
data = message.payload
if data.get(PayloadKey.DONE, False):
sys.exit(0)
performance, experiences = self._local_actor.roll_out(model_dict=data[PayloadKey.MODEL],
epsilon_dict=data[PayloadKey.EPSILON],
return_details=data[PayloadKey.RETURN_DETAILS])
self._proxy.reply(received_message=message,
tag=MessageTag.UPDATE,
payload={PayloadKey.PERFORMANCE: performance,
PayloadKey.EXPERIENCE: experiences}
)
def launch(self):
"""
This launches an ActorWorker instance.
"""
for msg in self._proxy.receive():
self._registry_table.push(msg)
triggered_events = self._registry_table.get()
for handler_fn, cached_messages in triggered_events:
handler_fn(cached_messages)

View file

@ -0,0 +1,51 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import ABC, abstractmethod
class AbsExplorer(ABC):
def __init__(self, agent_id_list: list, total_episodes: int, epsilon_range_dict: dict, with_cache: bool = True):
"""
Args:
agent_id_list (list): list of agent ID's.
total_episodes: total number of episodes in the training phase.
epsilon_range_dict (dict): a dictionary containing tuples of lower and upper bounds
for the generated exploration rate for each agent. If the
dictionary contains "_all_" as a key, the corresponding
value will be shared amongst all agents.
with_cache (bool): if True, incoming performances will be cached.
"""
self._total_episodes = total_episodes
self._epsilon_range_dict = epsilon_range_dict
self._performance_cache = [] if with_cache else None
if "_all_" in self._epsilon_range_dict:
self._current_epsilon = {agent_id: self._epsilon_range_dict["_all_"][1] for agent_id in agent_id_list}
else:
self._current_epsilon = {agent_id: self._epsilon_range_dict.get(agent_id, (.0, .0))[1]
for agent_id in agent_id_list}
# TODO: performance: summary -> total perf (current version), details -> per-agent perf
@abstractmethod
def update(self, performance=None):
"""
Updates current epsilon.
Args:
performance: performance from the latest episode.
"""
return NotImplementedError
@property
def epsilon_range_dict(self):
return self._epsilon_range_dict
@property
def epsilon(self):
return self._current_epsilon
@epsilon.setter
def epsilon(self, epsilon_dict: dict):
self._current_epsilon = epsilon_dict
def epsilon_range_by_id(self, agent_id):
return self._epsilon_range_dict[agent_id]

View file

@ -1,10 +1,10 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from .abstract_explorer import AbstractExplorer
from .abs_explorer import AbsExplorer
class LinearExplorer(AbstractExplorer):
class LinearExplorer(AbsExplorer):
def __init__(self, agent_id_list, total_episodes, epsilon_range_dict, with_cache=True):
super().__init__(agent_id_list, total_episodes, epsilon_range_dict, with_cache=with_cache)
self._step_dict = {}
@ -17,7 +17,7 @@ class LinearExplorer(AbstractExplorer):
self._current_epsilon[agent_id] = max(.0, self._current_epsilon[agent_id] - self._step_dict[agent_id])
class TwoPhaseLinearExplorer(AbstractExplorer):
class TwoPhaseLinearExplorer(AbsExplorer):
"""
An exploration scheme that consists of two linear schedules separated by a split point
"""

View file

@ -0,0 +1,22 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import ABC, abstractmethod
class AbsLearner(ABC):
def __init__(self):
pass
@abstractmethod
def train(self, total_episodes):
"""
Main loop for collecting experiences and performance from the actor and using them to optimize models
Args:
total_episodes (int): number of episodes for the main training loop
"""
pass
@abstractmethod
def test(self):
pass

View file

@ -1,55 +1,64 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from maro.rl.actor.abstract_actor import RolloutMode
from .abstract_learner import AbstractLearner
from maro.rl.agent.agent_manager import AgentManager
from .abs_learner import AbsLearner
from maro.rl.agent.abs_agent_manager import AbsAgentManager
from maro.rl.actor.simple_actor import SimpleActor
from maro.utils import DummyLogger
class SimpleLearner(AbstractLearner):
class SimpleLearner(AbsLearner):
"""
A learner class that executes simple roll-out-and-train cycles.
It is used to control the policy learning process...
"""
def __init__(self, trainable_agents: AgentManager, actor, logger=DummyLogger(), seed: int = None):
def __init__(self, trainable_agents: AbsAgentManager, actor, logger=DummyLogger()):
"""
seed (int): initial random seed value for the underlying simulator. If None, no manual seed setting \n
is performed.
seed (int): initial random seed value for the underlying simulator. If None, no manual seed setting is
performed.
Args:
trainable_agents (dict or AgentManager): an AgentManager instance that manages all agents
actor (Actor of ActorProxy): an Actor or VectorActorProxy instance.
logger: used for logging important events
seed (int): initial random seed value for the underlying simulator. If None, no seed fixing is done \n
for the underlying simulator.
trainable_agents (AbsAgentManager): an AgentManager instance that manages all agents.
actor (Actor or ActorProxy): an Actor or VectorActorProxy instance.
logger: used for logging important events.
"""
assert isinstance(trainable_agents, AgentManager), \
"SimpleLearner only accepts AgentManager for parameter trainable_agents"
super().__init__(trainable_agents=trainable_agents, actor=actor, logger=logger)
self._seed = seed
super().__init__()
self._trainable_agents = trainable_agents
self._actor = actor
self._logger = logger
def train(self, total_episodes):
"""
Main loop for collecting experiences and performance from the actor and using them to optimize models.
Args:
total_episodes (int): number of episodes for the main training loop.
"""
for current_ep in range(1, total_episodes+1):
models = None if self._is_shared_agent_instance() else self._trainable_agents.get_models()
performance, exp_by_agent = self._actor.roll_out(mode=RolloutMode.TRAIN,
models=models,
epsilon_dict=self._trainable_agents.explorer.epsilon,
seed=self._seed)
if self._seed is not None:
self._seed += len(performance)
for actor_id, perf in performance.items():
self._logger.info(f"ep {current_ep} - performance: {perf}, source: {actor_id}, "
f"epsilons: {self._trainable_agents.explorer.epsilon}")
model_dict = None if self._is_shared_agent_instance() else self._trainable_agents.get_models()
epsilon_dict = self._trainable_agents.explorer.epsilon if self._trainable_agents.explorer else None
performance, exp_by_agent = self._actor.roll_out(model_dict=model_dict, epsilon_dict=epsilon_dict)
if isinstance(performance, dict):
for actor_id, perf in performance.items():
self._logger.info(f"ep {current_ep} - performance: {perf}, source: {actor_id}, epsilons: {epsilon_dict}")
else:
self._logger.info(f"ep {current_ep} - performance: {performance}, epsilons: {epsilon_dict}")
self._trainable_agents.store_experiences(exp_by_agent)
self._trainable_agents.train()
self._trainable_agents.update_epsilon(performance)
def test(self):
performance, _ = self._actor.roll_out(mode=RolloutMode.TEST, models=self._trainable_agents.get_models())
"""
This tells the actor to perform one episode of roll-out for model testing purposes.
"""
performance, _ = self._actor.roll_out(model_dict=self._trainable_agents.get_models(), return_details=False)
for actor_id, perf in performance.items():
self._logger.info(f"test performance from {actor_id}: {perf}")
self._actor.roll_out(mode=RolloutMode.EXIT)
self._actor.roll_out(done=True)
def dump_models(self, dir_path: str):
self._trainable_agents.dump_models(dir_path)
def _is_shared_agent_instance(self):
"""
If true, the set of agents performing inference in actor is the same as self._trainable_agents.
"""
return isinstance(self._actor, SimpleActor) and id(self._actor.inference_agents) == id(self._trainable_agents)

View file

@ -33,11 +33,10 @@ class MLPDecisionLayers(nn.Module):
self._head = nn.Linear(self._input_dim, self._output_dim)
else:
self._head = nn.Linear(hidden_dims[-1], self._output_dim)
self._device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
self._net = nn.Sequential(*self._layers, self._head).to(self._device)
self._net = nn.Sequential(*self._layers, self._head)
def forward(self, x):
return self._net(x.to(self._device)).double()
return self._net(x).double()
@property
def input_dim(self):

View file

@ -34,11 +34,10 @@ class MLPRepresentation(nn.Module):
self._head = nn.Linear(self._input_dim, self._output_dim)
else:
self._head = nn.Linear(hidden_dims[-1], self._output_dim)
self._device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
self._net = nn.Sequential(*self._layers, self._head).to(self._device)
self._net = nn.Sequential(*self._layers, self._head)
def forward(self, x):
return self._net(x.to(self._device)).double()
return self._net(x).double()
@property
def input_dim(self):

View file

@ -0,0 +1,13 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import ABC, abstractmethod
class AbsShaper(ABC):
def __init__(self, *args, **kwargs):
pass
@abstractmethod
def __call__(self, *args, **kwargs):
pass

View file

@ -0,0 +1,18 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import ABC, abstractmethod
from .abs_shaper import AbsShaper
class ActionShaper(AbsShaper):
"""
An action shaper is used to convert an agent's model output to an Action object which can be executed by the
environment.
"""
@abstractmethod
def __call__(self, model_action, decision_event, snapshot_list):
pass
def reset(self):
pass

View file

@ -0,0 +1,33 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import ABC, abstractmethod
from typing import Callable, Iterable, Union
from .abs_shaper import AbsShaper
class ExperienceShaper(AbsShaper):
"""
An experience shaper is used to record transitions during a roll-out episode and perform necessary post-processing
at the end of the episode. The post-processing logic is encapsulated in the abstract __call__() method and needs
to be implemented for each scenario. In particular, it is necessary to specify how to determine the reward for
an action given the business metrics associated with the corresponding transition.
"""
def __init__(self, reward_func: Union[Callable, None], *args, **kwargs):
super().__init__(*args, **kwargs)
self._reward_func = reward_func
@abstractmethod
def __call__(self, trajectory, snapshot_list) -> Iterable:
"""
Converts transitions along a trajectory to experiences.
Args:
trajectory: a list of transitions recorded by the agent manager during the episode.
snapshot_list: snapshot list stored in the env at the end of an episode.
Returns:
Experiences that can be used by the algorithm
"""
pass
def reset(self):
pass

View file

@ -0,0 +1,61 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from collections import defaultdict, deque
from enum import Enum
from typing import Callable
from .experience_shaper import ExperienceShaper
class KStepExperienceKeys(Enum):
STATE = "state"
ACTION = "action"
REWARD = "reward"
RETURN = "return"
NEXT_STATE = "next_state"
NEXT_ACTION = "next_action"
DISCOUNT = "discount"
class KStepExperienceShaper(ExperienceShaper):
def __init__(self, reward_func: Callable, reward_decay: float, steps: int, is_per_agent: bool = True):
"""
An experience shaper that generates K-step returns as well as the full return for each transition
along a trajectory.
Args:
reward_func: a function used to compute immediate rewards from metrics given by the env.
reward_decay: decay factor used to evaluate multi-step returns.
steps: number of time steps used in computing returns
is_per_agent: if True, the generated experiences will be bucketed by agent ID.
"""
super().__init__(reward_func)
self._reward_decay = reward_decay
self._steps = steps
self._is_per_agent = is_per_agent
def __call__(self, trajectory, snapshot_list):
experiences = defaultdict(lambda: defaultdict(deque)) if self._is_per_agent else defaultdict(deque)
reward_list = deque()
full_return = partial_return = 0
for i in range(len(trajectory) - 2, -1, -1):
transition = trajectory[i]
next_transition = trajectory[min(len(trajectory) - 1, i + self._steps)]
reward_list.appendleft(self._reward_func(trajectory[i]["metrics"]))
# compute the full return
full_return = full_return * self._reward_decay + reward_list[0]
# compute the partial return
partial_return = partial_return * self._reward_decay + reward_list[0]
if len(reward_list) > self._steps:
partial_return -= reward_list.pop() * self._reward_decay ** self._steps  # drop the reward outside the k-step window
agent_exp = experiences[transition["agent_id"]] if self._is_per_agent else experiences
agent_exp[KStepExperienceKeys.STATE.value].appendleft(transition["state"])
agent_exp[KStepExperienceKeys.ACTION.value].appendleft(transition["action"])
agent_exp[KStepExperienceKeys.REWARD.value].appendleft(partial_return)
agent_exp[KStepExperienceKeys.RETURN.value].appendleft(full_return)
agent_exp[KStepExperienceKeys.NEXT_STATE.value].appendleft(next_transition["state"])
agent_exp[KStepExperienceKeys.NEXT_ACTION.value].appendleft(next_transition["action"])
agent_exp[KStepExperienceKeys.DISCOUNT.value].appendleft(self._reward_decay ** (min(self._steps, len(trajectory)-1-i)))
return experiences
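A standalone sketch of the return arithmetic above with made-up rewards (steps = 2, decay = 0.9): the k-step value drops terms beyond the window, while the full return keeps them all:

```python
rewards = [1.0, 2.0, 3.0]  # immediate rewards for transitions 0..2
decay, steps = 0.9, 2

full = partial = 0.0
window = []
for r in reversed(rewards):
    window.insert(0, r)
    full = full * decay + r
    partial = partial * decay + r
    if len(window) > steps:
        partial -= window.pop() * decay ** steps  # drop the term outside the k-step window
    print(f"r={r}: full={full:.2f}, {steps}-step={partial:.2f}")
# Final iteration: full = 1 + 0.9*2 + 0.81*3 = 5.23, 2-step = 1 + 0.9*2 = 2.80.
```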

View file

@ -0,0 +1,18 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import abstractmethod
from .abs_shaper import AbsShaper
class StateShaper(AbsShaper):
"""
A state shaper is used to convert a decision event and snapshot list to a state vector as input to value or
policy models by extracting relevant temporal and spatial information.
"""
@abstractmethod
def __call__(self, decision_event, snapshot_list):
pass
def reset(self):
pass

View file

@ -0,0 +1,78 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from abc import ABC, abstractmethod
from typing import Sequence
class AbsStore(ABC):
def __init__(self):
pass
@abstractmethod
def get(self, indexes: Sequence):
"""
Get contents.
Args:
indexes: a sequence of indexes where store contents are to be retrieved.
Returns:
retrieved contents.
"""
pass
@abstractmethod
def put(self, contents: Sequence, index_sampler: Sequence):
"""
Put new contents.
Args:
contents: Item object list.
index_sampler: optional custom sampler used to obtain indexes for overwriting
Returns:
The newly appended item indexes.
"""
pass
@abstractmethod
def update(self, indexes: Sequence, contents: Sequence):
"""
Update selected contents.
Args:
indexes: Item indexes list.
contents: Item list, which has the same length as indexes.
Returns:
The updated item index list.
"""
pass
def filter(self, filters):
"""
Multi-filter method.
Each filter takes the previous filter's output as its input.
Args:
filters (Iterable[Callable]): Filter list, each item is a lambda function.
The lambda input is a store record (a dict of column values).
i.e. [lambda x: x['a'] == 1 and x['b'] == 1]
Returns:
list: Filtered indexes or contents. i.e. [1, 2, 3], ['a', 'b', 'c']
"""
pass
@abstractmethod
def sample(self, size, weights: Sequence, replace: bool = True):
"""
Obtain a random sample from the experience pool.
Args:
size (int): sample sizes for each round of sampling in the chain. If this is a single integer, it is
used as the sample size for all samplers in the chain.
weights (Sequence): a sequence of sampling weights
replace (bool): if True, sampling is performed with replacement. Default is True.
Returns:
Tuple: Sampled indexes and contents.i.e. [1, 2, 3], ['a', 'b', 'c']
"""
pass

View file

@ -0,0 +1,194 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from collections import defaultdict
from typing import Callable, List, Sequence, Tuple
import numpy as np
from .abs_store import AbsStore
from .utils import check_uniformity, get_update_indexes, normalize, OverwriteType
from maro.utils import clone
class ColumnBasedStore(AbsStore):
def __init__(self, capacity: int = -1, overwrite_type: OverwriteType = None):
"""
A store that keeps one Python list per column as its internal storage data structure and supports both unlimited
and bounded capacity.
Args:
capacity: if -1, the store is of unlimited capacity. Default is -1.
overwrite_type (OverwriteType): If storage capacity is bounded, this specifies how existing entries
are overwritten.
"""
super().__init__()
self._capacity = capacity
self._store = defaultdict(lambda: [] if self._capacity < 0 else [None] * self._capacity)
self._size = 0
self._overwrite_type = overwrite_type
self._iter_index = 0
def __len__(self):
return self._size
def __iter__(self):
return self
def __next__(self):
if self._iter_index >= self._size:
self._iter_index = 0
raise StopIteration
index = self._iter_index
self._iter_index += 1
return {k: lst[index] for k, lst in self._store.items()}
def __getitem__(self, index: int):
return {k: lst[index] for k, lst in self._store.items()}
@property
def capacity(self):
return self._capacity
@property
def overwrite_type(self):
return self._overwrite_type
def get(self, indexes: [int]) -> dict:
return {k: [self._store[k][i] for i in indexes] for k in self._store}
@check_uniformity(arg_num=1)
def put(self, contents: dict, overwrite_indexes: Sequence = None) -> List[int]:
if len(self._store) > 0 and contents.keys() != self._store.keys():
raise ValueError(f"expected keys {list(self._store.keys())}, got {list(contents.keys())}")
added_size = len(contents[next(iter(contents))])
if self._capacity < 0:
for key, lst in contents.items():
self._store[key].extend(lst)
self._size += added_size
return list(range(self._size-added_size, self._size))
else:
write_indexes = get_update_indexes(self._size, added_size, self._capacity, self._overwrite_type,
overwrite_indexes=overwrite_indexes)
self.update(write_indexes, contents)
self._size = min(self._capacity, self._size + added_size)
return write_indexes
@check_uniformity(arg_num=2)
def update(self, indexes: Sequence, contents: dict) -> Sequence:
"""
Update selected contents.
Args:
indexes: Item indexes list.
contents: contents to write to the internal store at given positions
Returns:
The updated item indexes.
"""
for key, value_list in contents.items():
assert len(indexes) == len(value_list), f"expected updates at {len(indexes)} indexes, got {len(value_list)}"
for index, value in zip(indexes, value_list):
self._store[key][index] = value
return indexes
def apply_multi_filters(self, filters: Sequence[Callable]):
"""Multi-filter method.
The next layer filter input is the last layer filter output.
Args:
filters (Sequence[Callable]): Filter list, each item is a lambda function
whose input is a stored object, e.g. [lambda d: d['a'] == 1 and d['b'] == 1]
Returns:
Filtered indexes and corresponding objects.
"""
indexes = range(self._size)
for f in filters:
indexes = [i for i in indexes if f(self[i])]
return indexes, self.get(indexes)
def apply_multi_samplers(self, samplers: Sequence[Tuple[Callable, int]], replace: bool = True) -> Tuple:
"""Multi-samplers method.
The next layer sampler input is the last layer sampler output.
Args:
samplers (Sequence[Tuple[Callable, int]]): Sampler list, each sampler is a tuple.
The 1st item of the tuple is a lambda that maps a stored object to its sampling weight.
The 2nd item of the tuple is the sample size.
e.g. [(lambda o: o['a'], 3)]
replace: If True, sampling will be performed with replacement.
Returns:
Sampled indexes and corresponding objects.
"""
indexes = range(self._size)
for weight_fn, sample_size in samplers:
weights = np.asarray([weight_fn(self[i]) for i in indexes])
indexes = np.random.choice(indexes, size=sample_size, replace=replace, p=weights/np.sum(weights))
return indexes, self.get(indexes)
@normalize
def sample(self, size, weights: Sequence = None, replace: bool = True):
"""
Obtain a random sample from the experience pool.
Args:
size (int): sample size.
weights (Sequence): a sequence of sampling weights.
replace (bool): if True, sampling is performed with replacement. Default is True.
Returns:
Sampled indexes and the corresponding objects.
"""
indexes = np.random.choice(self._size, size=size, replace=replace, p=weights)
return indexes, self.get(indexes)
def sample_by_key(self, key, size, replace: bool = True):
"""
Obtain a random sample from the store using one of the columns as sampling weights.
Args:
key: the column whose values are to be used as sampling weights.
size: sample size.
replace: If True, sampling is performed with replacement.
Returns:
Sampled indexes and the corresponding objects.
"""
weights = np.asarray(self._store[key][:self._size] if self._size < self._capacity else self._store[key])
indexes = np.random.choice(self._size, size=size, replace=replace, p=weights/np.sum(weights))
return indexes, self.get(indexes)
def sample_by_keys(self, keys: Sequence, sizes: Sequence, replace: bool = True):
"""
Obtain a random sample from the store by chained sampling using multiple columns as sampling weights.
Args:
keys: the columns whose values are to be used as sampling weights, applied in sequence.
sizes: sample sizes for the successive sampling rounds, one per key.
replace: If True, sampling is performed with replacement.
Returns:
Sampled indexes and the corresponding objects.
"""
if len(keys) != len(sizes):
raise ValueError(f"expected sizes of length {len(keys)}, got {len(sizes)}")
indexes = range(self._size)
for key, size in zip(keys, sizes):
weights = np.asarray([self._store[key][i] for i in indexes])
indexes = np.random.choice(indexes, size=size, replace=replace, p=weights/np.sum(weights))
return indexes, self.get(indexes)
def dumps(self):
return clone(self._store)
def get_by_key(self, key):
return self._store[key]
def clear(self):
del self._store
self._store = defaultdict(lambda: [] if self._capacity < 0 else [None] * self._capacity)
self._size = 0
self._iter_index = 0
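
A short usage sketch for the store above; the column names and values are illustrative, not part of this commit.

```python
# Usage sketch for ColumnBasedStore; column names and values are illustrative.
store = ColumnBasedStore(capacity=4, overwrite_type=OverwriteType.ROLLING)

# put() takes a dict of equal-length columns and returns the written indexes.
indexes = store.put({"reward": [1.0, 2.0, 3.0], "done": [False, False, True]})
print(indexes)               # [0, 1, 2]
print(len(store), store[1])  # 3 {'reward': 2.0, 'done': False}

# Two more items overflow capacity 4; ROLLING reuses the oldest slot.
store.put({"reward": [4.0, 5.0], "done": [False, False]})
print(len(store))            # 4

# Weighted sampling using an existing column as weights.
idx, sampled = store.sample_by_key("reward", size=2)

# Chained filtering over the stored per-item dicts.
idx, not_done = store.apply_multi_filters([lambda d: not d["done"]])
```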

maro/rl/storage/utils.py (new file, 60 lines)

@ -0,0 +1,60 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from enum import Enum
from functools import wraps
import numpy as np
def check_uniformity(arg_num):
def decorator(func):
@wraps(func)
def wrapper(*args, **kwargs):
contents = args[arg_num]
length = len(contents[next(iter(contents))])
if any(len(lst) != length for lst in contents.values()):
raise ValueError(f"all sequences in contents should have the same length")
return func(*args, **kwargs)
return wrapper
return decorator
def normalize(func):
@wraps(func)
def wrapper(self, size, weights=None, replace=True):
if weights is not None:
if not isinstance(weights, np.ndarray):
weights = np.asarray(weights)
weights = weights / np.sum(weights)
return func(self, size, weights, replace)
return wrapper
class OverwriteType(Enum):
ROLLING = "rolling"
RANDOM = "random"
def get_update_indexes(size, added_size, capacity, overwrite_type, overwrite_indexes=None):
if added_size > capacity:
raise ValueError(f"size of added items should not exceed the store capacity.")
num_overwrites = size + added_size - capacity
if num_overwrites < 0:
return list(range(size, size + added_size))
if overwrite_indexes is not None:
write_indexes = list(range(size, capacity)) + list(overwrite_indexes)
else:
# follow the overwrite rule set at init
if overwrite_type == OverwriteType.ROLLING:
# using the negative index convention for convenience
start_index = size - capacity
write_indexes = list(range(start_index, start_index + added_size))
else:
random_indexes = np.random.choice(size, size=num_overwrites, replace=False)
write_indexes = list(range(size, capacity)) + list(random_indexes)
return write_indexes
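
The overwrite bookkeeping above is easiest to see with concrete numbers (chosen purely for illustration):

```python
# With 3 items stored, 2 arriving, and capacity 4, one slot must be reused.
# ROLLING expresses this via Python's negative-index convention:
print(get_update_indexes(size=3, added_size=2, capacity=4,
                         overwrite_type=OverwriteType.ROLLING))
# -> [-1, 0]: index -1 fills the one free slot, index 0 overwrites the oldest.

# Below capacity, the new items are simply appended:
print(get_update_indexes(size=1, added_size=2, capacity=4,
                         overwrite_type=OverwriteType.ROLLING))
# -> [1, 2]
```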


@ -135,7 +135,10 @@ class CitibikeBusinessEngine(AbsBusinessEngine):
return tick + 1 == self._max_tick
def get_node_mapping(self)->dict:
return {}
node_mapping = {}
for station in self._stations:
node_mapping[station.index] = station.id
return node_mapping
def reset(self):
"""Reset after episode"""
@ -154,6 +157,8 @@ class CitibikeBusinessEngine(AbsBusinessEngine):
station.reset()
self._matrices_node.reset()
self._decision_strategy.reset()
def get_agent_idx_list(self) -> List[int]:
return [station.index for station in self._stations]
@ -446,12 +451,13 @@ class CitibikeBusinessEngine(AbsBusinessEngine):
self._event_buffer.insert_event(transfer_evt)
def _build_temp_data(self):
"""build temporary data for predefined environment"""
logger.warning_yellow(f"Binary data files for scenario: citi_bike topology: {self._topology} not found.")
citi_bike_process = CitiBikeProcess(is_temp=True)
if self._topology in citi_bike_process.topologies:
pid = str(os.getpid())
logger.warning_yellow(
f"Generating temp binary data file for scenario: citi_bike topology: {self._topology} pid: {pid}. If you want to keep the data, please use MARO CLI command 'maro data generate -s citi_bike -t {self._topology}' to generate the binary data files first.")
f"Generating temp binary data file for scenario: citi_bike topology: {self._topology} pid: {pid}. If you want to keep the data, please use MARO CLI command 'maro env data generate -s citi_bike -t {self._topology}' to generate the binary data files first.")
self._citi_bike_data_pipeline = citi_bike_process.topologies[self._topology]
self._citi_bike_data_pipeline.download()
self._citi_bike_data_pipeline.clean()


@ -53,6 +53,9 @@ class DecisionEvent:
def __repr__(self):
return f"decision event {self.__getstate__()}"
def __str__(self):
return f'DecisionEvent(tick={self.tick}, station_idx={self.station_idx}, type={self.type}, action_scope={self.action_scope})'
class Action:
def __init__(self, from_station_idx: int, to_station_idx: int, number: int):


@ -30,6 +30,9 @@ class DistanceFilter:
return result
def reset(self):
pass
class RequirementsFilter:
def __init__(self, conf: dict):
self._output_num = conf["num"]
@ -42,27 +45,47 @@ class RequirementsFilter:
return {neighbor_scope[i][0]: neighbor_scope[i][1] for i in range(output_num)}
def reset(self):
pass
class TripsWindowFilter:
def __init__(self, conf: dict, snapshot_list):
self._output_num = conf["num"]
self._windows = conf["windows"]
self._snapshot_list = snapshot_list
self._window_states_cache = {}
def filter(self, station_idx: int, decision_type: DecisionType, source: Dict[int, int]) -> Dict[int, int]:
output_num = min(self._output_num, len(source))
available_frame_indices = self._snapshot_list.get_frame_index_list()
available_frame_indices = available_frame_indices[-output_num:]
# max windows we can get
available_windows = min(self._windows, len(available_frame_indices))
trip_states = self._snapshot_list["stations"][available_frame_indices::"trip_requirement"]
trip_states = trip_states.reshape(-1, len(self._snapshot_list["stations"]))
# get frame index list for latest N windows
available_frame_indices = available_frame_indices[-available_windows:]
source_trips = {}
for neighbor_idx, _ in source.items():
source_trips[neighbor_idx] = trip_states[:, neighbor_idx].sum()
for i, frame_index in enumerate(available_frame_indices):
if i == available_windows - 1 or frame_index not in self._window_states_cache:
# overwrite the latest window, since it may still change, and cache any missing ones
trip_state = self._snapshot_list["stations"][frame_index::"trip_requirement"]
self._window_states_cache[frame_index] = trip_state
trip_state = self._window_states_cache[frame_index]
for neighbor_idx, _ in source.items():
trip_num = trip_state[neighbor_idx]
if neighbor_idx not in source_trips:
source_trips[neighbor_idx] = trip_num
else:
source_trips[neighbor_idx] += trip_num
is_sort_reverse = False
if decision_type == DecisionType.Demand:
@ -75,8 +98,12 @@ class TripsWindowFilter:
for neighbor_idx, _ in sorted_neighbors[0: output_num]:
result[neighbor_idx] = source[neighbor_idx]
return result
def reset(self):
self._window_states_cache.clear()
class BikeDecisionStrategy:
"""Helper to provide decision related logic"""
@ -219,6 +246,11 @@ class BikeDecisionStrategy:
if bike_number == 0:
break
def reset(self):
"""Reset internal states"""
for filter_instance in self._filters:
filter_instance.reset()
def _construct_action_scope_filters(self, conf: dict):
for filter_conf in conf["filters"]:
filter_type = filter_conf["type"]


@ -1,5 +1,5 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.s
# Licensed under the MIT license.
import sys
import warnings


@ -33,4 +33,9 @@ ERROR_CODE = {
3001: "Command Error",
3002: "Parsing Error",
3003: "Deployment Error",
# 4000-4999: Error codes for RL toolkit
4001: "Unsupported Agent Mode Error",
4002: "Missing Shaper Error",
4003: "Wrong Agent Mode Error"
}


@ -0,0 +1,28 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.
from maro.utils.exception import MAROException
class UnsupportedAgentModeError(MAROException):
"""
Unsupported agent mode error
"""
def __init__(self, msg: str = None):
super().__init__(4001, msg)
class MissingShaperError(MAROException):
"""
Missing shaper error
"""
def __init__(self, msg: str = None):
super().__init__(4002, msg)
class WrongAgentModeError(MAROException):
"""
Wrong agent mode error
"""
def __init__(self, msg: str = None):
super().__init__(4003, msg)
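
These exceptions map to the 4000-range entries added to ERROR_CODE above. A small usage sketch follows; the `set_agent_mode` function is made up for illustration and is not part of this commit.

```python
# Usage sketch; set_agent_mode is a hypothetical function for illustration.
def set_agent_mode(mode: str):
    if mode not in ("train", "inference"):
        # Carries error code 4001 ("Unsupported Agent Mode Error").
        raise UnsupportedAgentModeError(f"unsupported agent mode: {mode}")

try:
    set_agent_mode("evaluate")
except UnsupportedAgentModeError as e:
    print(e)
```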


@ -95,8 +95,8 @@ class Logger:
Args:
tag (str): Log tag for stream and file output.
format_ (LogFormat): Predefined formatter, the default value is LogFormat.full. \n
i.e. LogFormat.full: full time | host | user | pid | tag | level | msg \n
format_ (LogFormat): Predefined formatter, the default value is LogFormat.full.
i.e. LogFormat.full: full time | host | user | pid | tag | level | msg
LogFormat.simple: simple time | tag | level | msg
dump_folder (str): Log dumped folder, the default value is the current folder. The dumped log level is
logging.DEBUG. The full path of the dumped log file is `dump_folder/tag.log`.


@ -2,6 +2,7 @@
# Licensed under the MIT license.
import configparser
from glob import glob
import io
import numpy as np
@ -11,7 +12,6 @@ import random
import shutil
import time
import warnings
import yaml
from maro import __data_version__
from maro.utils.exception.cli_exception import CommandError
@ -61,7 +61,7 @@ def set_seeds(seed):
random.seed(seed)
version_file_path = os.path.join(os.path.expanduser("~/.maro"), "version.yml")
version_file_path = os.path.join(os.path.expanduser("~/.maro"), "version.ini")
project_root = os.path.join(os.path.dirname(os.path.realpath(__file__)), "..")
@ -85,12 +85,12 @@ def deploy(hide_info=True):
for target_dir, source_dir in target_source_pairs:
shutil.copytree(source_dir, target_dir)
# deploy success
version_info = {
"data_version": __data_version__,
"deploy_time": time.time()
}
version_info = configparser.ConfigParser()
version_info["MARO_DATA"] = {}
version_info["MARO_DATA"]["version"] = __data_version__
version_info["MARO_DATA"]["deploy_time"] = str(int(time.time()))
with io.open(version_file_path, "w") as version_file:
yaml.dump(version_info, version_file)
version_info.write(version_file)
info_list.append("Data files for MARO deployed.")
except Exception as e:
error_list.append(f"An issue occured while deploying meta files for MARO. {e} Please run 'maro meta deploy' to deploy the data files.")
@ -111,10 +111,12 @@ def check_deployment_status():
ret = False
if os.path.exists(version_file_path):
with io.open(version_file_path, "r") as version_file:
version_info = yaml.safe_load(version_file)
if "deploy_time" in version_info \
and "data_version" in version_info \
and version_info["data_version"] == __data_version__:
version_info = configparser.ConfigParser()
version_info.read(version_file)
if "MARO_DATA" in version_info \
and "deploy_time" in version_info["MARO_DATA"] \
and "version" in version_info["MARO_DATA"] \
and version_info["MARO_DATA"]["version"] == __data_version__:
ret = True
return ret
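
For reference, a sketch of what the switch to configparser produces and how to read it back; the timestamp value is illustrative.

```python
import configparser
import os

# After deploy(), ~/.maro/version.ini looks roughly like:
#
#   [MARO_DATA]
#   version = <__data_version__>
#   deploy_time = 1600000000
#
# and can be read back the same way check_deployment_status() does:
version_file_path = os.path.join(os.path.expanduser("~/.maro"), "version.ini")
version_info = configparser.ConfigParser()
version_info.read(version_file_path)
if "MARO_DATA" in version_info:
    print(version_info["MARO_DATA"].get("version"))
    print(version_info["MARO_DATA"].get("deploy_time"))
```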


@ -4,13 +4,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Quick Start"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Quick Start\n",
"\n",
"Below is a simple demo of interaction with the environment."
]
},
@ -18,44 +13,54 @@
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10:54:35 | WARNING | \u001b[33mBinary data files for scenario: citi_bike topology: toy.3s_4t not found.\u001b[0m\n",
"10:54:35 | WARNING | \u001b[33mGenerating temp binary data file for scenario: citi_bike topology: toy.3s_4t pid: 77526. If you want to keep the data, please use MARO CLI command 'maro env data generate -s citi_bike -t toy.3s_4t' to generate the binary data files first.\u001b[0m\n",
"10:54:35 | INFO | \u001b[32mGenerating trip data for topology toy.3s_4t .\u001b[0m\n",
"10:54:36 | INFO | \u001b[32mCleaning weather data\u001b[0m\n",
"10:54:36 | INFO | \u001b[32mBuilding binary data from /home/Jinyu/.maro/data/citi_bike/.source/.clean/toy.3s_4t/9c6a7687fb9d42b7/trips.csv to /home/Jinyu/.maro/data/citi_bike/.build/toy.3s_4t/9c6a7687fb9d42b7/trips.bin\u001b[0m\n",
"10:54:44 | INFO | \u001b[32mBuilding binary data from /home/Jinyu/.maro/data/citi_bike/.source/.clean/toy.3s_4t/40c6de7db2cc44f1/weather.csv to /home/Jinyu/.maro/data/citi_bike/.build/toy.3s_4t/40c6de7db2cc44f1/KNYC_daily.bin\u001b[0m\n",
"{'perf': 0.4517766497461929, 'total_trips': 2167, 'total_shortage': 1188}\n"
]
}
],
"source": [
"from maro.simulator import Env\n",
"from maro.simulator.scenarios.citi_bike.common import Action, DecisionEvent\n",
"\n",
"env = Env(scenario=\"citi_bike\", topology=\"ny.201912\", start_tick=0, durations=1440, snapshot_resolution=30)\n",
"env = Env(scenario=\"citi_bike\", topology=\"toy.3s_4t\", start_tick=0, durations=1440, snapshot_resolution=30)\n",
"\n",
"is_done: bool = False\n",
"reward: int = None\n",
"metrics: object = None\n",
"decision_event: DecisionEvent = None\n",
"is_done: bool = False\n",
"\n",
"while not is_done:\n",
" action: Action = None\n",
" reward, decision_event, is_done = env.step(action)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Environment of the bike repositioning"
" metrics, decision_event, is_done = env.step(action)\n",
"\n",
"print(metrics)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Environment of the bike repositioning\n",
"\n",
"To initialize an environment, you need to specify the values of several parameters:\n",
"- **scenario**: The target scenario of this Env. \"citi_bike\" denotes for the bike repositioning.\n",
"- **topology**: The target topology of this Env.\n",
" + There are some predefined topologies in MARO, that you can directly use it as in the demo.\n",
" + Also, you can define your own topologies following the guidance in the [doc](docs/customization/new_topology.rst).\n",
"- **scenario**: The target scenario of this Env.\n",
" - `citi_bike` denotes for the bike repositioning.\n",
"- **topology**: The target topology of this Env. As shown below, you can get the predefined topology list by calling `get_topologies(scenario='citi_bike')`\n",
"- **start_tick**: The start tick of this Env, 1 tick corresponds to 1 minute in citi_bike.\n",
" + In the demo above, *start_tick=0* indicates a simulation start from the beginning of the given topology.\n",
" - In the demo above, `start_tick=0` indicates a simulation start from the beginning of the given topology.\n",
"- **durations**: The duration of thie Env, in the unit of tick/minute.\n",
" + In the demo above, *durations=1440* indicates a simulation length of 1 day (24h * 60min/h).\n",
" - In the demo above, `durations=1440` indicates a simulation length of 1 day (24h * 60min/h).\n",
"- **snapshot_resolution**: The time granularity of maintaining the snapshots of the environments, in the unit of tick/minute.\n",
" + In the demo above, *snapshot_resolution=30* indicates that a snapshot will be created and saved every 30 minutes during the simulation.\n",
" - In the demo above, `snapshot_resolution=30` indicates that a snapshot will be created and saved every 30 minutes during the simulation.\n",
"\n",
"You can get all available scenarios and topologies by calling:"
]
@ -66,57 +71,64 @@
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'scenario': 'citi_bike', 'topology': 'ny201912'},\n",
" {'scenario': 'citi_bike', 'topology': 'train'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.0'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.3'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.6'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.5'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.4'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.5'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.2'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.1'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.6'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.8'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.3'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.4'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.8'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.2'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.7'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.2'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.3'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.7'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.6'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.5'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.0'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.1'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.0'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.1'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.4'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.1'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.3'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.4'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.8'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.6'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.7'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.7'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.5'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.8'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.2'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.0'}]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
"'The available scenarios in MARO:'\n",
"['citi_bike', 'ecr']\n",
"\n",
"'The predefined topologies in Citi Bike:'\n",
"['ny.201912',\n",
" 'ny.201808',\n",
" 'ny.201907',\n",
" 'ny.202005',\n",
" 'ny.201812',\n",
" 'ny.201804',\n",
" 'toy.3s_4t',\n",
" 'toy.4s_4t',\n",
" 'ny.201908',\n",
" 'ny.201910',\n",
" 'train',\n",
" 'ny.201909',\n",
" 'ny.202002',\n",
" 'ny.201811',\n",
" 'ny.201906',\n",
" 'ny.201802',\n",
" 'ny.201803',\n",
" 'ny.201905',\n",
" 'ny.202003',\n",
" 'ny.201805',\n",
" 'ny.201809',\n",
" 'toy.5s_6t',\n",
" 'ny.201801',\n",
" 'ny.201904',\n",
" 'ny.201902',\n",
" 'ny.201901',\n",
" 'ny.201911',\n",
" 'ny.201903',\n",
" 'ny.202001',\n",
" 'ny.202004',\n",
" 'ny.201806',\n",
" 'ny.201807',\n",
" 'ny.202006',\n",
" 'ny.201810']\n"
]
}
],
"source": [
"from maro.simulator.utils import get_available_envs\n",
"from maro.simulator.utils import get_scenarios, get_topologies\n",
"from pprint import pprint\n",
"from typing import List\n",
"\n",
"get_available_envs() # TODO: specify the scenario"
"scenarios: List[str] = get_scenarios()\n",
"topologies: List[str] = get_topologies(scenario='citi_bike')\n",
"\n",
"pprint(f'The available scenarios in MARO:')\n",
"pprint(scenarios)\n",
"\n",
"print()\n",
"pprint(f'The predefined topologies in Citi Bike:') # TODO: update the ordered output\n",
"pprint(topologies)"
]
},
{
@ -135,35 +147,51 @@
"name": "stdout",
"output_type": "stream",
"text": [
"10:54:44 | WARNING | \u001b[33mBinary data files for scenario: citi_bike topology: toy.3s_4t not found.\u001b[0m\n",
"10:54:44 | WARNING | \u001b[33mGenerating temp binary data file for scenario: citi_bike topology: toy.3s_4t pid: 77526. If you want to keep the data, please use MARO CLI command 'maro env data generate -s citi_bike -t toy.3s_4t' to generate the binary data files first.\u001b[0m\n",
"10:54:44 | INFO | \u001b[32mGenerating trip data for topology toy.3s_4t .\u001b[0m\n",
"10:54:45 | INFO | \u001b[32mCleaning weather data\u001b[0m\n",
"10:54:45 | INFO | \u001b[32mBuilding binary data from /home/Jinyu/.maro/data/citi_bike/.source/.clean/toy.3s_4t/ef14844303414e4c/trips.csv to /home/Jinyu/.maro/data/citi_bike/.build/toy.3s_4t/ef14844303414e4c/trips.bin\u001b[0m\n",
"10:54:53 | INFO | \u001b[32mBuilding binary data from /home/Jinyu/.maro/data/citi_bike/.source/.clean/toy.3s_4t/900b86b777124bb6/weather.csv to /home/Jinyu/.maro/data/citi_bike/.build/toy.3s_4t/900b86b777124bb6/KNYC_daily.bin\u001b[0m\n",
"The current tick: 0.\n",
"The current frame index: 0.\n",
"There are 528 agents in this Env.\n",
"There are 3 agents in this Env.\n",
"There will be 48 snapshots in total.\n",
"\n",
"Env Summary:\n",
"{'node_detail': {'matrices': {'attributes': {...}, 'number': 1},\n",
" 'stations': {'attributes': {...}, 'number': 528}},\n",
"{'node_detail': {'matrices': {'attributes': {'trips_adj': {'slots': 9,\n",
" 'type': 'i'}},\n",
" 'number': 1},\n",
" 'stations': {'attributes': {'bikes': {'slots': 1, 'type': 'i'},\n",
" 'capacity': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'extra_cost': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'failed_return': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'fulfillment': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'holiday': {'slots': 1,\n",
" 'type': 'i2'},\n",
" 'min_bikes': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'shortage': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'temperature': {'slots': 1,\n",
" 'type': 'i2'},\n",
" 'transfer_cost': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'trip_requirement': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'weather': {'slots': 1,\n",
" 'type': 'i2'},\n",
" 'weekday': {'slots': 1,\n",
" 'type': 'i2'}},\n",
" 'number': 3}},\n",
" 'node_mapping': {}}\n",
"\n",
"Env Summary - matrices:\n",
"{'attributes': {'distance_adj': {'slots': 278784, 'type': 'f'},\n",
" 'trips_adj': {'slots': 278784, 'type': 'i'}},\n",
" 'number': 1}\n",
"\n",
"Env Summary - stations:\n",
"{'attributes': {'bikes': {'slots': 1, 'type': 'i'},\n",
" 'capacity': {'slots': 1, 'type': 'i'},\n",
" 'extra_cost': {'slots': 1, 'type': 'i'},\n",
" 'failed_return': {'slots': 1, 'type': 'i'},\n",
" 'fulfillment': {'slots': 1, 'type': 'i'},\n",
" 'holiday': {'slots': 1, 'type': 'i2'},\n",
" 'shortage': {'slots': 1, 'type': 'i'},\n",
" 'temperature': {'slots': 1, 'type': 'i2'},\n",
" 'transfer_cost': {'slots': 1, 'type': 'i'},\n",
" 'trip_requirement': {'slots': 1, 'type': 'i'},\n",
" 'weather': {'slots': 1, 'type': 'i2'},\n",
" 'weekday': {'slots': 1, 'type': 'i2'}},\n",
" 'number': 528}\n"
"Env Metrics:\n",
"{'perf': 1, 'total_trips': 0, 'total_shortage': 0}\n"
]
}
],
@ -175,7 +203,7 @@
"\n",
"\n",
"# Initialize an Env for citi_bike scenario\n",
"env = Env(scenario=\"citi_bike\", topology=\"ny201912\", start_tick=0, durations=1440, snapshot_resolution=30)\n",
"env = Env(scenario=\"citi_bike\", topology=\"toy.3s_4t\", start_tick=0, durations=1440, snapshot_resolution=30)\n",
"\n",
"# The current tick\n",
"tick: int = env.tick\n",
@ -197,82 +225,105 @@
"# The summary info of the environment\n",
"summary: dict = env.summary\n",
"print(f\"\\nEnv Summary:\")\n",
"pprint(summary, depth=3)\n",
"pprint(summary)\n",
"\n",
"print(f\"\\nEnv Summary - matrices:\")\n",
"pprint(summary['node_detail']['matrices'])\n",
"# The metrics of the environment\n",
"metrics: dict = env.metrics\n",
"print(f\"\\nEnv Metrics:\") # TODO: update the output with node mapping\n",
"pprint(metrics)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Interaction with the environment\n",
"\n",
"Before starting interaction with the environment, we need to know **DecisionEvent** and **Action** first.\n",
"\n",
"print(f\"\\nEnv Summary - stations:\")\n",
"pprint(summary['node_detail']['stations'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Interaction with the environment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before starting interaction with the environment, we need to know **DecisionEvent** and **Action** first."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## DecisionEvent\n",
"\n",
"Once the environment need the agent's response to promote the simulation, it will throw an **DecisionEvent**. In the scenario of citi_bike, the information of each DecisionEvent is listed as below:\n",
"- **station_idx**: the id of the station/agent that needs to respond to the environment\n",
"- **tick**: the corresponding tick\n",
"- **frame_index**: the corresponding frame index, that is the index of the corresponding snapshot in the snapshot list\n",
"- **type**: the decision type of this decision event. In citi_bike scenario, there are 2 types:\n",
" + **Supply**: There is too many bikes in the corresponding station, it's better to reposition some of them to other stations.\n",
" + **Demand**: There is no enough bikes in the corresponding station, it's better to reposition bikes from other stations\n",
"- **action_scope**: a dictionary of valid action items.\n",
" + The key of the item indicates the station/agent id;\n",
" + The meaning of the value differs for different decision type:\n",
" * If the decision type is Supply, the value of the station itself means its bike inventory at that moment, while the value of other target stations means the number of their empty docks;\n",
" * If the decision type is Demand, the value of the station itself means the number of its empty docks, while the value of other target stations means their bike inventory."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once the environment need the agent's response to promote the simulation, it will throw an **DecisionEvent**. In the scenario of citi_bike, the information of each `DecisionEvent` is listed as below:\n",
"- **station_idx**: (int) The id of the station/agent that needs to respond to the environment;\n",
"- **tick**: (int) The corresponding tick;\n",
"- **frame_index**: (int) The corresponding frame index, that is the index of the corresponding snapshot in the snapshot list;\n",
"- **type**: (DecisionType) The decision type of this decision event. In citi_bike scenario, there are two types:\n",
" - `Supply` indicates there is too many bikes in the corresponding station, so it is better to reposition some of them to other stations.\n",
" - `Demand` indicates there is no enough bikes in the corresponding station, so it is better to reposition bikes from other stations\n",
"- **action_scope**: (Dict) A dictionary that maintains the information for calculating the valid action scope:\n",
" - The key of the item indicates the station/agent id;\n",
" - The meaning of the value differs for different decision type:\n",
" - If the decision type is `Supply`, the value of the station itself means its bike inventory at that moment, while the value of other target stations means the number of their empty docks;\n",
" - If the decision type is `Demand`, the value of the station itself means the number of its empty docks, while the value of other target stations means their bike inventory.\n",
"\n",
"## Action\n",
"\n",
"Once we get a **DecisionEvent** from the envirionment, we should respond with an **Action**. Valid Action could be:\n",
"- None, which means do nothing.\n",
"- A valid Action instance, including:\n",
" + **from_station_idx**: int, the id of the source station of the bike transportation\n",
" + **to_station_idx**: int, the id of the destination station of the bike transportation\n",
" + **number**: int, the quantity of the bike transportation"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate random actions based on the DecisionEvent"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The demo code in the Quick Start part has shown an interaction mode that doing nothing(responding with None action). Here we read the detailed information about the DecisionEvent and generate random actions based on it."
"Once we get a `DecisionEvent` from the envirionment, we should respond with an `Action`. Valid `Action` could be:\n",
"- `None`, which means do nothing.\n",
"- A valid `Action` instance, including:\n",
" - **from_station_idx**: (int) The id of the source station of the bike transportation\n",
" - **to_station_idx**: (int) The id of the destination station of the bike transportation\n",
" - **number**: (int) The quantity of the bike transportation\n",
"\n",
"## Generate random actions based on the DecisionEvent\n",
"\n",
"The demo code in the Quick Start part has shown an interaction mode that doing nothing(responding with `None` action). Here we read the detailed information about the `DecisionEvent` and generate random `Action` based on it."
]
},
{
"cell_type": "code",
"execution_count": null,
"execution_count": 4,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10:54:53 | WARNING | \u001b[33mBinary data files for scenario: citi_bike topology: toy.3s_4t not found.\u001b[0m\n",
"10:54:53 | WARNING | \u001b[33mGenerating temp binary data file for scenario: citi_bike topology: toy.3s_4t pid: 77526. If you want to keep the data, please use MARO CLI command 'maro env data generate -s citi_bike -t toy.3s_4t' to generate the binary data files first.\u001b[0m\n",
"10:54:53 | INFO | \u001b[32mGenerating trip data for topology toy.3s_4t .\u001b[0m\n",
"10:54:55 | INFO | \u001b[32mCleaning weather data\u001b[0m\n",
"10:54:55 | INFO | \u001b[32mBuilding binary data from /home/Jinyu/.maro/data/citi_bike/.source/.clean/toy.3s_4t/03d1f989818548b3/trips.csv to /home/Jinyu/.maro/data/citi_bike/.build/toy.3s_4t/03d1f989818548b3/trips.bin\u001b[0m\n",
"10:55:03 | INFO | \u001b[32mBuilding binary data from /home/Jinyu/.maro/data/citi_bike/.source/.clean/toy.3s_4t/12e2ead45d4c4ec7/weather.csv to /home/Jinyu/.maro/data/citi_bike/.build/toy.3s_4t/12e2ead45d4c4ec7/KNYC_daily.bin\u001b[0m\n",
"*************\n",
"decision event {'station_idx': 0, 'tick': 19, 'frame_index': 0, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {2: 5, 1: 8, 0: 28}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe5a02e15d0>\n",
"*************\n",
"decision event {'station_idx': 1, 'tick': 99, 'frame_index': 3, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {0: 0, 2: 0, 1: 29}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe5a02f9350>\n",
"*************\n",
"decision event {'station_idx': 1, 'tick': 139, 'frame_index': 4, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {0: 0, 2: 0, 1: 30}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe5a02e6e90>\n",
"*************\n",
"decision event {'station_idx': 2, 'tick': 339, 'frame_index': 11, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {1: 1, 0: 0, 2: 30}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe5a02a7e90>\n",
"*************\n",
"decision event {'station_idx': 0, 'tick': 359, 'frame_index': 11, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {2: 1, 1: 1, 0: 30}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe5a0293fd0>\n",
"*************\n",
"decision event {'station_idx': 1, 'tick': 679, 'frame_index': 22, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {2: 0, 0: 1, 1: 30}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe5a0308e50>\n",
"*************\n",
"decision event {'station_idx': 0, 'tick': 759, 'frame_index': 25, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {2: 2, 1: 4, 0: 30}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe59e24b450>\n",
"*************\n",
"decision event {'station_idx': 0, 'tick': 779, 'frame_index': 25, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {2: 0, 1: 0, 0: 30}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe59e251a10>\n",
"*************\n",
"decision event {'station_idx': 1, 'tick': 919, 'frame_index': 30, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {2: 0, 0: 0, 1: 30}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe5a0308e10>\n",
"*************\n",
"decision event {'station_idx': 0, 'tick': 1199, 'frame_index': 39, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {2: 0, 1: 3, 0: 30}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe59e43efd0>\n",
"*************\n",
"decision event {'station_idx': 2, 'tick': 1319, 'frame_index': 43, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {1: 0, 0: 0, 2: 30}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe59e500450>\n",
"*************\n",
"decision event {'station_idx': 0, 'tick': 1339, 'frame_index': 44, 'type': <DecisionType.Demand: 'demand'>, 'action_scope': {2: 1, 1: 1, 0: 30}}\n",
"<maro.simulator.scenarios.citi_bike.common.Action object at 0x7fe59e500090>\n"
]
}
],
"source": [
"from maro.simulator import Env\n",
"from maro.simulator.scenarios.citi_bike.common import Action, DecisionEvent, DecisionType\n",
@ -280,30 +331,30 @@
"import random\n",
"\n",
"# Initialize an Env for citi_bike scenario\n",
"env = Env(scenario=\"citi_bike\", topology=\"ny201912\", start_tick=0, durations=1440, snapshot_resolution=30)\n",
"env = Env(scenario=\"citi_bike\", topology=\"toy.3s_4t\", start_tick=0, durations=1440, snapshot_resolution=30)\n",
"\n",
"is_done: bool = False\n",
"reward: int = None\n",
"metrics: object = None\n",
"decision_event: DecisionEvent = None\n",
"is_done: bool = False\n",
"action: Action = None\n",
"\n",
"# Start the env with a None Action\n",
"reward, decision_event, is_done = env.step(action)\n",
"metrics, decision_event, is_done = env.step(action)\n",
"\n",
"while not is_done:\n",
" if decision_event.type == DecisionType.Supply:\n",
" # the value of the station itself means the bike inventory if Supply\n",
" # Supply: the value of the station itself means the bike inventory\n",
" self_bike_inventory = decision_event.action_scope[decision_event.station_idx]\n",
" # the value of other stations means the quantity of empty docks if Supply\n",
" # Supply: the value of other stations means the quantity of empty docks\n",
" target_idx_dock_tuple_list = [\n",
" (k, v) for k, v in decision_event.action_scope.items() if k != decision_event.station_idx\n",
" ]\n",
" # random choose a target station weighted by the quantity of empty docks\n",
" # Randomly choose a target station weighted by the quantity of empty docks\n",
" target_idx, target_dock = random.choices(\n",
" target_idx_dock_tuple_list,\n",
" weights=[item[1] for item in target_idx_dock_tuple_list]\n",
" )[0]\n",
" # generate the corresponding random Action\n",
" # Generate the corresponding random Action\n",
" action = Action(\n",
" from_station_idx=decision_event.station_idx,\n",
" to_station_idx=target_idx,\n",
@ -311,18 +362,18 @@
" )\n",
"\n",
" elif decision_event.type == DecisionType.Demand:\n",
" # the value of the station itself means the quantity of empty docks if Demand\n",
" # Demand: the value of the station itself means the quantity of empty docks\n",
" self_available_dock = decision_event.action_scope[decision_event.station_idx]\n",
" # the value of other stations means their bike inventory if Demand\n",
" # Demand: the value of other stations means their bike inventory\n",
" target_idx_inventory_tuple_list = [\n",
" (k, v) for k, v in decision_event.action_scope.items() if k != decision_event.station_idx\n",
" ]\n",
" # random choose a target station weighted by the bike inventory\n",
" # Randomly choose a target station weighted by the bike inventory\n",
" target_idx, target_inventory = random.choices(\n",
" target_idx_inventory_tuple_list,\n",
" weights=[item[1] for item in target_idx_inventory_tuple_list]\n",
" )[0]\n",
" # generate the corresponding random Action\n",
" # Generate the corresponding random Action\n",
" action = Action(\n",
" from_station_idx=target_idx,\n",
" to_station_idx=decision_event.station_idx,\n",
@ -332,61 +383,64 @@
" else:\n",
" action = None\n",
" \n",
" # Random sampling some records to show in the output TODO\n",
"# if random.random() > 0.95:\n",
"# print(\"*************\\n{decision_event}\\n{action}\")\n",
" # Randomly sample some records to show in the output\n",
" if random.random() > 0.95:\n",
" print(f\"*************\\n{decision_event}\\n{action}\") # TODO: update the output\n",
" \n",
" # Respond the environment with the generated Action\n",
" reward, decision_event, is_done = env.step(action)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get the environment observation"
" metric, decision_event, is_done = env.step(action)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get the environment observation\n",
"\n",
"You can also implement other strategies or build models to take action. At this time, real-time information and historical records of the environment are very important for making good decisions. In this case, the the environment snapshot list is exactly what you need.\n",
"\n",
"The information in the snapshot list is indexed by 3 dimensions:\n",
"- A frame index or a frame index list. (int or list of int) Empty indicates for all time slides till now\n",
"- A station id (list). (int of list of int) Empty indicates for all stations/agents\n",
"- An Attribute name (list). (str of list of str) You can get all available attributes in env.summary as shown before.\n",
"- A frame index (list). (int / List[int]) Empty indicates for all time slides till now\n",
"- A station id (list). (int / List[int]) Empty indicates for all stations/agents\n",
"- An Attribute name (list). (str / List[str]) You can get all available attributes in `env.summary` as shown before.\n",
"\n",
"The return value from the snapshot list is a numpy.ndarray with shape **(frame * attribute * station, )**.\n",
"The return value from the snapshot list is a numpy.ndarray with shape **(num_frame * num_station * num_attribute, )**.\n",
"\n",
"More detailed introduction to the snapshot list is [here](). # TODO: add hyper-link"
]
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"10:55:03 | WARNING | \u001b[33mBinary data files for scenario: citi_bike topology: toy.3s_4t not found.\u001b[0m\n",
"10:55:03 | WARNING | \u001b[33mGenerating temp binary data file for scenario: citi_bike topology: toy.3s_4t pid: 77526. If you want to keep the data, please use MARO CLI command 'maro env data generate -s citi_bike -t toy.3s_4t' to generate the binary data files first.\u001b[0m\n",
"10:55:03 | INFO | \u001b[32mGenerating trip data for topology toy.3s_4t .\u001b[0m\n",
"10:55:04 | INFO | \u001b[32mCleaning weather data\u001b[0m\n",
"10:55:04 | INFO | \u001b[32mBuilding binary data from /home/Jinyu/.maro/data/citi_bike/.source/.clean/toy.3s_4t/14444479ac7e4a05/trips.csv to /home/Jinyu/.maro/data/citi_bike/.build/toy.3s_4t/14444479ac7e4a05/trips.bin\u001b[0m\n",
"10:55:12 | INFO | \u001b[32mBuilding binary data from /home/Jinyu/.maro/data/citi_bike/.source/.clean/toy.3s_4t/5cbe492813084a6e/weather.csv to /home/Jinyu/.maro/data/citi_bike/.build/toy.3s_4t/5cbe492813084a6e/KNYC_daily.bin\u001b[0m\n",
"{'matrices': {'attributes': {...}, 'number': 1},\n",
" 'stations': {'attributes': {...}, 'number': 3}}\n",
"\n",
"Env Summary - matrices:\n",
"{'attributes': {'bikes': {'slots': 1, 'type': 'i'},\n",
" 'capacity': {'slots': 1, 'type': 'i'},\n",
" 'extra_cost': {'slots': 1, 'type': 'i'},\n",
" 'failed_return': {'slots': 1, 'type': 'i'},\n",
" 'fulfillment': {'slots': 1, 'type': 'i'},\n",
" 'holiday': {'slots': 1, 'type': 'i2'},\n",
" 'min_bikes': {'slots': 1, 'type': 'i'},\n",
" 'shortage': {'slots': 1, 'type': 'i'},\n",
" 'temperature': {'slots': 1, 'type': 'i2'},\n",
" 'transfer_cost': {'slots': 1, 'type': 'i'},\n",
" 'trip_requirement': {'slots': 1, 'type': 'i'},\n",
" 'weather': {'slots': 1, 'type': 'i2'},\n",
" 'weekday': {'slots': 1, 'type': 'i2'}},\n",
" 'number': 528}\n"
" 'number': 3}\n"
]
}
],
@ -396,36 +450,32 @@
"\n",
"\n",
"# Initialize an Env for citi_bike scenario\n",
"env = Env(scenario=\"citi_bike\", topology=\"ny201912\", start_tick=0, durations=1440, snapshot_resolution=30)\n",
"env = Env(scenario=\"citi_bike\", topology=\"toy.3s_4t\", start_tick=0, durations=1440, snapshot_resolution=30)\n",
"\n",
"# The summary info of the environment\n",
"print(f\"\\nEnv Summary - matrices:\")\n",
"# To get the attribute list that can be accessed in snapshot_list\n",
"pprint(env.summary['node_detail'], depth=2)\n",
"print()\n",
"# The attribute list of stations\n",
"pprint(env.summary['node_detail']['stations'])"
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'numpy.ndarray'> (10,)\n",
"Trip requirements for station 5 with time going by: [1. 0. 0. 1. 4. 2. 1. 4. 2. 0.]\n",
"\n",
"<class 'numpy.ndarray'> (10560,)\n",
"<class 'numpy.ndarray'> (10, 2, 528)\n",
"Station 1: [12. 13. 13. 13. 13. 13. 14. 14. 12. 12.]\n",
"Station 3: [19. 18. 21. 20. 18. 17. 16. 16. 15. 16.]\n",
"Station 5: [11. 12. 12. 12. 9. 9. 8. 4. 7. 7.]\n",
"Station 7: [8. 8. 8. 7. 8. 7. 7. 7. 7. 7.]\n",
"Station 9: [15. 15. 15. 16. 17. 18. 22. 21. 23. 23.]\n",
"Station 11: [13. 13. 14. 13. 16. 17. 20. 17. 17. 16.]\n",
"Station 13: [13. 13. 13. 14. 14. 14. 12. 9. 11. 12.]\n",
"Station 17: [27. 28. 28. 27. 29. 28. 31. 32. 32. 33.]\n",
"Station 19: [18. 19. 19. 19. 20. 20. 20. 20. 20. 21.]\n"
"10:55:12 | WARNING | \u001b[33mBinary data files for scenario: citi_bike topology: toy.3s_4t not found.\u001b[0m\n",
"10:55:12 | WARNING | \u001b[33mGenerating temp binary data file for scenario: citi_bike topology: toy.3s_4t pid: 77526. If you want to keep the data, please use MARO CLI command 'maro env data generate -s citi_bike -t toy.3s_4t' to generate the binary data files first.\u001b[0m\n",
"10:55:12 | INFO | \u001b[32mGenerating trip data for topology toy.3s_4t .\u001b[0m\n",
"10:55:13 | INFO | \u001b[32mCleaning weather data\u001b[0m\n",
"10:55:13 | INFO | \u001b[32mBuilding binary data from /home/Jinyu/.maro/data/citi_bike/.source/.clean/toy.3s_4t/0387209299f74786/trips.csv to /home/Jinyu/.maro/data/citi_bike/.build/toy.3s_4t/0387209299f74786/trips.bin\u001b[0m\n",
"10:55:21 | INFO | \u001b[32mBuilding binary data from /home/Jinyu/.maro/data/citi_bike/.source/.clean/toy.3s_4t/d01e92f2501f48f2/weather.csv to /home/Jinyu/.maro/data/citi_bike/.build/toy.3s_4t/d01e92f2501f48f2/KNYC_daily.bin\u001b[0m\n",
"array([12., 0., 12., 17., 0., 17., 15., 0., 15., 12., 0., 12.],\n",
" dtype=float32)\n"
]
}
],
@ -436,36 +486,33 @@
"from typing import List\n",
"\n",
"\n",
"# Initialize an Env for citi_bike scenario, from 07:00 to 12:00\n",
"env = Env(scenario=\"citi_bike\", topology=\"ny201912\", start_tick=420, durations=300, snapshot_resolution=30)\n",
"# Initialize an Env for citi_bike scenario\n",
"env = Env(scenario=\"citi_bike\", topology=\"toy.3s_4t\", start_tick=0, durations=1440, snapshot_resolution=30)\n",
"\n",
"# Start the environment with None action\n",
"_, decision_event, is_done = env.step(None)\n",
"\n",
"# Run the environment to the end\n",
"_, _, is_done = env.step(None)\n",
"while not is_done:\n",
" _, _, is_done = env.step(None)\n",
" # Case of access snapshot after a certain number of frames\n",
" if env.frame_index >= 24:\n",
" # The frame list of past 2 hours\n",
" past_2hour_frames = [x for x in range(env.frame_index - 4, env.frame_index)]\n",
" decision_station_idx = decision_event.station_idx\n",
" intr_station_infos = [\"trip_requirement\", \"bikes\", \"shortage\"]\n",
"\n",
"# Get trip requirement from snapshot list by directly using station id and attribute name\n",
"station_id = 5\n",
"trip_info = env.snapshot_list[\"stations\"][:station_id:\"trip_requirement\"]\n",
"print(type(trip_info), trip_info.shape)\n",
"print(f\"Trip requirements for station {station_id} with time going by: {trip_info}\\n\")\n",
" # Query the snapshot list of this environment to get the information of\n",
" # the trip requirements, bikes, shortage of the decision station in the past 2 days\n",
" past_2hour_info = env.snapshot_list[\"stations\"][\n",
" past_2hour_frames : decision_station_idx : intr_station_infos\n",
" ]\n",
" pprint(past_2hour_info)\n",
" \n",
" # This demo code is used to show how to access the information in snapshot,\n",
" # so we terminate the env here for clear output\n",
" break\n",
"\n",
"# Get capacity and bikes from snapshot list simultaneously by using attribute list\n",
"attribute_list = [\"capacity\", \"bikes\"]\n",
"info = env.snapshot_list[\"stations\"][::attribute_list]\n",
"print(type(info), info.shape)\n",
"\n",
"# Reshape the info of capacity and bikes into a user-friendly shape\n",
"num_attributes = len(attribute_list)\n",
"num_frame = env.frame_index + 1\n",
"num_stations = len(env.agent_idx_list)\n",
"info = info.reshape(num_frame, num_attributes, num_stations)\n",
"print(type(info), info.shape)\n",
"\n",
"# Pring and show the change of bikes in some stations:\n",
"bikes_idx = 1\n",
"for station_id in [1, 3, 5, 7, 9, 11, 13, 17, 19]:\n",
" print(f\"Station {station_id}: {info[:, bikes_idx, station_id]}\")"
" # Drive the environment with None action\n",
" _, decision_event, is_done = env.step(None)"
]
}
],
@ -490,4 +537,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
}
}


@ -4,117 +4,125 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Quick Start"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Below is a simple demo of interaction with the environment."
"# Quick Start\n",
"\n",
"Below is a sample demo of interaction with the environment."
]
},
{
"cell_type": "code",
"execution_count": 3,
"execution_count": 1,
"metadata": {},
"outputs": [],
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'perf': 0.5, 'total_shortage': 100000, 'total_cost': 0}\n"
]
}
],
"source": [
"from maro.simulator import Env\n",
"from maro.simulator.scenarios.ecr.common import Action, DecisionEvent\n",
"\n",
"env = Env(scenario=\"ecr\", topology=\"toy.5p_ssddd_l0.0\", start_tick=0, durations=100)\n",
"\n",
"is_done: bool = False\n",
"reward: int = None\n",
"metrics: object = None\n",
"decision_event: DecisionEvent = None\n",
"is_done: bool = False\n",
"\n",
"while not is_done:\n",
" action: Action = None\n",
" reward, decision_event, is_done = env.step(action)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Environment of Empty Container Repositioning (ECR)"
" metrics, decision_event, is_done = env.step(action)\n",
"\n",
"print(metrics)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Environment of ECR\n",
"\n",
"To initialize an environment, you need to specify the values of several parameters:\n",
"- **scenario**: The target scenario of this Env. \"ecr\" denotes for the Empty Container Repositioning.\n",
"- **topology**: The target topology of this Env.\n",
" + There are some predefined topologies in MARO, that you can directly use it as in the demo.\n",
" + Also, you can define your own topologies following the guidance in the [doc](docs/customization/new_topology.rst).\n",
"- **start_tick**: The start tick of this Env, **1 tick corresponds to 1 minute in ecr.** (TODO: to confirm)\n",
" + In the demo above, *start_tick=0* indicates a simulation start from the beginning of the given topology.\n",
"- **durations**: The duration of thie Env, **in the unit of tick/minute**.(TODO: to confirm)\n",
" + In the demo above, *durations=1440* indicates a simulation length of 1 day (24h * 60min/h).\n",
"- **scenario**: The target scenario of this Env.\n",
" - `ecr` denotes for the Empty Container Repositioning (ECR).\n",
"- **topology**: The target topology of this Env. As shown below, you can get the predefined topology list by calling `get_topologies(scenario='ecr')`.\n",
"- **start_tick**: The start tick of this Env, 1 tick corresponds to 1 day in ecr.\n",
" - In the demo above, `start_tick=0` indicates a simulation start from the beginning of the given topology.\n",
"- **durations**: The duration of thie Env, in the unit of tick/day.\n",
" - In the demo above, `durations=100` indicates a simulation length of 100 days.\n",
"\n",
"You can get all available scenarios and topologies by calling:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[{'scenario': 'citi_bike', 'topology': 'ny201912'},\n",
" {'scenario': 'citi_bike', 'topology': 'train'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.0'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.3'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.6'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.5'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.4'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.5'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.2'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.1'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.6'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.8'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.3'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.4'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.8'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.2'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.7'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.2'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.3'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.7'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.6'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.5'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.0'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.1'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.0'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.1'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.4'},\n",
" {'scenario': 'ecr', 'topology': '6p_sssbdd_l0.1'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.3'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.4'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.8'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.6'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.7'},\n",
" {'scenario': 'ecr', 'topology': '5p_ssddd_l0.7'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.5'},\n",
" {'scenario': 'ecr', 'topology': '22p_global_trade_l0.8'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.2'},\n",
" {'scenario': 'ecr', 'topology': '4p_ssdd_l0.0'}]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
"name": "stdout",
"output_type": "stream",
"text": [
"'The available scenarios in MARO:'\n",
"['citi_bike', 'ecr']\n",
"\n",
"'The predefined topologies in ECR:'\n",
"['toy.6p_sssbdd_l0.3',\n",
" 'toy.4p_ssdd_l0.2',\n",
" 'global_trade.22p_l0.7',\n",
" 'toy.5p_ssddd_l0.1',\n",
" 'toy.6p_sssbdd_l0.1',\n",
" 'toy.5p_ssddd_l0.8',\n",
" 'toy.6p_sssbdd_l0.8',\n",
" 'toy.6p_sssbdd_l0.2',\n",
" 'toy.6p_sssbdd_l0.5',\n",
" 'toy.6p_sssbdd_l0.0',\n",
" 'toy.6p_sssbdd_l0.7',\n",
" 'toy.5p_ssddd_l0.3',\n",
" 'toy.4p_ssdd_l0.3',\n",
" 'toy.5p_ssddd_l0.5',\n",
" 'toy.4p_ssdd_l0.8',\n",
" 'toy.4p_ssdd_l0.1',\n",
" 'toy.6p_sssbdd_l0.4',\n",
" 'toy.5p_ssddd_l0.7',\n",
" 'global_trade.22p_l0.3',\n",
" 'global_trade.22p_l0.8',\n",
" 'global_trade.22p_l0.2',\n",
" 'toy.4p_ssdd_l0.4',\n",
" 'toy.6p_sssbdd_l0.6',\n",
" 'toy.4p_ssdd_l0.7',\n",
" 'toy.5p_ssddd_l0.4',\n",
" 'global_trade.22p_l0.1',\n",
" 'global_trade.22p_l0.0',\n",
" 'global_trade.22p_l0.6',\n",
" 'toy.4p_ssdd_l0.5',\n",
" 'toy.5p_ssddd_l0.2',\n",
" 'toy.4p_ssdd_l0.0',\n",
" 'global_trade.22p_l0.5',\n",
" 'toy.5p_ssddd_l0.0',\n",
" 'toy.5p_ssddd_l0.6',\n",
" 'global_trade.22p_l0.4',\n",
" 'toy.4p_ssdd_l0.6']\n"
]
}
],
"source": [
"from maro.simulator.utils import get_available_envs\n",
"from maro.simulator.utils import get_scenarios, get_topologies\n",
"from pprint import pprint\n",
"from typing import List\n",
"\n",
"get_available_envs() # TODO: specify the scenario"
"scenarios: List[str] = get_scenarios()\n",
"topologies: List[str] = get_topologies(scenario='ecr')\n",
"\n",
"pprint(f'The available scenarios in MARO:')\n",
"pprint(scenarios)\n",
"\n",
"print()\n",
"pprint(f'The predefined topologies in ECR:') # TODO: update the ordered output\n",
"pprint(topologies)"
]
},
{
@ -126,7 +134,7 @@
},
{
"cell_type": "code",
"execution_count": 26,
"execution_count": 3,
"metadata": {},
"outputs": [
{
@ -139,18 +147,70 @@
"There will be 100 snapshots in total.\n",
"\n",
"Env Summary:\n",
"{'node_detail': {},\n",
" 'node_mapping': {'ports': {0: 'demand_port_001',\n",
" 1: 'demand_port_002',\n",
" 2: 'supply_port_001',\n",
" 3: 'supply_port_002',\n",
" 4: 'transfer_port_001'},\n",
" 'vessels': {0: 'rt1_vessel_001',\n",
" 1: 'rt1_vessel_002',\n",
" 2: 'rt1_vessel_003',\n",
" 3: 'rt2_vessel_001',\n",
" 4: 'rt2_vessel_002',\n",
" 5: 'rt2_vessel_003'}}}\n"
"{'node_detail': {'matrices': {'attributes': {'full_on_ports': {'slots': 25,\n",
" 'type': 'i'},\n",
" 'full_on_vessels': {'slots': 30,\n",
" 'type': 'i'},\n",
" 'vessel_plans': {'slots': 30,\n",
" 'type': 'i'}},\n",
" 'number': 1},\n",
" 'ports': {'attributes': {'acc_booking': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'acc_fulfillment': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'acc_shortage': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'booking': {'slots': 1, 'type': 'i'},\n",
" 'capacity': {'slots': 1, 'type': 'f'},\n",
" 'empty': {'slots': 1, 'type': 'i'},\n",
" 'fulfillment': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'full': {'slots': 1, 'type': 'i'},\n",
" 'on_consignee': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'on_shipper': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'shortage': {'slots': 1, 'type': 'i'},\n",
" 'transfer_cost': {'slots': 1,\n",
" 'type': 'f'}},\n",
" 'number': 5},\n",
" 'vessels': {'attributes': {'capacity': {'slots': 1,\n",
" 'type': 'f'},\n",
" 'early_discharge': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'empty': {'slots': 1, 'type': 'i'},\n",
" 'full': {'slots': 1, 'type': 'i'},\n",
" 'future_stop_list': {'slots': 3,\n",
" 'type': 'i'},\n",
" 'future_stop_tick_list': {'slots': 3,\n",
" 'type': 'i'},\n",
" 'last_loc_idx': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'next_loc_idx': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'past_stop_list': {'slots': 4,\n",
" 'type': 'i'},\n",
" 'past_stop_tick_list': {'slots': 4,\n",
" 'type': 'i'},\n",
" 'remaining_space': {'slots': 1,\n",
" 'type': 'i'},\n",
" 'route_idx': {'slots': 1,\n",
" 'type': 'i'}},\n",
" 'number': 6}},\n",
" 'node_mapping': {'ports': {'demand_port_001': 0,\n",
" 'demand_port_002': 1,\n",
" 'supply_port_001': 2,\n",
" 'supply_port_002': 3,\n",
" 'transfer_port_001': 4},\n",
" 'vessels': {'rt1_vessel_001': 0,\n",
" 'rt1_vessel_002': 1,\n",
" 'rt1_vessel_003': 2,\n",
" 'rt2_vessel_001': 3,\n",
" 'rt2_vessel_002': 4,\n",
" 'rt2_vessel_003': 5}}}\n",
"\n",
"Env Metrics:\n",
"{'perf': 1, 'total_shortage': 0, 'total_cost': 0}\n"
]
}
],
@ -162,7 +222,7 @@
"\n",
"\n",
"# Initialize an Env for ECR scenario\n",
"env = Env(scenario=\"ecr\", topology=\"5p_ssddd_l0.0\", start_tick=0, durations=100)\n",
"env = Env(scenario=\"ecr\", topology=\"toy.5p_ssddd_l0.0\", start_tick=0, durations=100)\n",
"\n",
"# The current tick\n",
"tick: int = env.tick\n",
@ -184,72 +244,53 @@
"# The summary info of the environment\n",
"summary: dict = env.summary\n",
"print(f\"\\nEnv Summary:\")\n",
"pprint(summary, depth=3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Interaction with the environment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before starting interaction with the environment, we need to know DecisionEvent and Action first."
"pprint(summary)\n",
"\n",
"# The metrics of the environment\n",
"metrics: dict = env.metrics\n",
"print(f\"\\nEnv Metrics:\")\n",
"pprint(metrics)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Interaction with the environment\n",
"\n",
"Before starting interaction with the environment, we need to know `DecisionEvent` and `Action` first.\n",
"\n",
"## DecisionEvent\n",
"\n",
"Once the environment need the agent's response to promote the simulation, it will throw an **DecisionEvent**. In the scenario of ECR, the information of each DecisionEvent is listed as below:\n",
"- **tick**: (int) the corresponding tick\n",
"- **port_idx**: (int) the id of the port/agent that needs to respond to the environment\n",
"- **vessel_idx**: (int) the id of the vessel/operation object of the port/agnet.\n",
"- **snapshot_list**: (int) **Snapshots of the environment to input into the decision model** TODO: confirm the meaning\n",
"- **action_scope**: **Load and discharge scope for agent to generate decision**\n",
"- **early_discharge**: **Early discharge number of corresponding vessel**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once the environment need the agent's response to promote the simulation, it will throw an `DecisionEvent`. In the scenario of ECR, the information of each `DecisionEvent` is listed as below:\n",
"- **tick**: (int) The corresponding tick;\n",
"- **port_idx**: (int) The id of the port/agent that needs to respond to the environment;\n",
"- **vessel_idx**: (int) The id of the vessel/operation object of the port/agnet;\n",
"- **action_scope**: (ActionScope) ActionScope has two attributes:\n",
" - `load` indicates the maximum quantity that can be loaded from the port the vessel;\n",
" - `discharge` indicates the maximum quantity that can be discharged from the vessel to the port;\n",
"- **early_discharge**: (int) When the available capacity in the vessel is not enough to load the ladens, some of the empty containers in the vessel will be early discharged to free the space. The quantity of empty containers that have been early discharged due to the laden loading is recorded in this field.\n",
"\n",
"## Action\n",
"\n",
"Once we get a DecisionEvent from the envirionment, we should respond with an Action. Valid Action could be:\n",
"Once we get a `DecisionEvent` from the envirionment, we should respond with an `Action`. Valid `Action` could be:\n",
"\n",
"- None, which means do nothing.\n",
"- A valid Action instance, including:\n",
" + **vessel_idx**: (int) the id of the vessel/operation object of the port/agent.\n",
" + **port_idx**: (int) the id of the port/agent that take this action.\n",
" + **quantity**: (int) the sign of this value denotes different meanings:\n",
" * positive quantity means unloading empty containers from vessel to port.\n",
" * negative quantity means loading empty containers from port to vessel."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Generate random actions based on the DecisionEvent"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The demo code in the Quick Start part has shown an interaction mode that doing nothing(responding with None action). Here we read the detailed information about the DecisionEvent and generate random actions based on it."
"- `None`, which means do nothing.\n",
"- A valid `Action` instance, including:\n",
" - **vessel_idx**: (int) The id of the vessel/operation object of the port/agent;\n",
" - **port_idx**: (int) The id of the port/agent that take this action;\n",
" - **quantity**: (int) The sign of this value denotes different meanings:\n",
" - Positive quantity means unloading empty containers from vessel to port;\n",
" - Negative quantity means loading empty containers from port to vessel.\n",
"\n",
"## Generate random actions based on the DecisionEvent\n",
"\n",
"The demo code in the Quick Start part has shown an interaction mode that doing nothing(responding with `None` action). Here we read the detailed information from the `DecisionEvent` and generate random `Action` based on it."
]
},
{
"cell_type": "code",
"execution_count": 25,
"execution_count": 4,
"metadata": {},
"outputs": [
{
@ -257,23 +298,14 @@
"output_type": "stream",
"text": [
"*************\n",
"DecisionEvent(tick=7, port_idx=3, vessel_idx=1, action_scope=ActionScope {load: 20000, discharge: 0 })\n",
"Action {quantity: -6886, port: 3, vessel: 1 }\n",
"DecisionEvent(tick=7, port_idx=4, vessel_idx=2, action_scope=ActionScope {load: 12000, discharge: 0 })\n",
"Action {quantity: -3730, port: 4, vessel: 2 }\n",
"*************\n",
"DecisionEvent(tick=14, port_idx=3, vessel_idx=0, action_scope=ActionScope {load: 13114, discharge: 2073 })\n",
"Action {quantity: -6744, port: 3, vessel: 0 }\n",
"DecisionEvent(tick=56, port_idx=0, vessel_idx=5, action_scope=ActionScope {load: 692, discharge: 2988 })\n",
"Action {quantity: 1532, port: 0, vessel: 5 }\n",
"*************\n",
"DecisionEvent(tick=21, port_idx=0, vessel_idx=4, action_scope=ActionScope {load: 389, discharge: 5977 })\n",
"Action {quantity: 1936, port: 0, vessel: 4 }\n",
"*************\n",
"DecisionEvent(tick=42, port_idx=3, vessel_idx=2, action_scope=ActionScope {load: 29092, discharge: 6041 })\n",
"Action {quantity: 5316, port: 3, vessel: 2 }\n",
"*************\n",
"DecisionEvent(tick=77, port_idx=3, vessel_idx=0, action_scope=ActionScope {load: 26402, discharge: 12393 })\n",
"Action {quantity: -14075, port: 3, vessel: 0 }\n",
"*************\n",
"DecisionEvent(tick=91, port_idx=0, vessel_idx=3, action_scope=ActionScope {load: 0, discharge: 6462 })\n",
"Action {quantity: 6194, port: 0, vessel: 3 }\n"
"DecisionEvent(tick=77, port_idx=1, vessel_idx=3, action_scope=ActionScope {load: 0, discharge: 538 })\n",
"Action {quantity: 218, port: 1, vessel: 3 }\n"
]
}
],
@ -283,71 +315,81 @@
"\n",
"import random\n",
"\n",
"env = Env(scenario=\"ecr\", topology=\"5p_ssddd_l0.0\", start_tick=0, durations=100)\n",
"# Initialize an Env for ecr scenario\n",
"env = Env(scenario=\"ecr\", topology=\"toy.5p_ssddd_l0.0\", start_tick=0, durations=100)\n",
"\n",
"is_done: bool = False\n",
"reward: int = None\n",
"metrics: object = None\n",
"decision_event: DecisionEvent = None\n",
"is_done: bool = False\n",
"action: Action = None\n",
"\n",
"reward, decision_event, is_done = env.step(None)\n",
"# Start the env with a None Action\n",
"metrics, decision_event, is_done = env.step(None)\n",
"\n",
"while not is_done:\n",
" # Generate a random Action according to the action_scope in DecisionEvent\n",
" random_quantity = random.randint(\n",
" -decision_event.action_scope.load,\n",
" decision_event.action_scope.discharge\n",
" )\n",
" action = Action(\n",
" vessel_idx=decision_event.vessel_idx,\n",
" port_idx=decision_event.port_idx,\n",
" quantity=random.randint(-decision_event.action_scope.load, decision_event.action_scope.discharge)\n",
" quantity=random_quantity\n",
" )\n",
" # random sampling some records to show in the output\n",
" \n",
" # Randomly sample some records to show in the output\n",
" if random.random() > 0.95:\n",
" print(f\"*************\\n{decision_event}\\n{action}\")\n",
" reward, decision_event, is_done = env.step(action) "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get the environment observation"
" \n",
" # Respond the environment with the generated Action\n",
" metrics, decision_event, is_done = env.step(action)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Get the environment observation\n",
"\n",
"You can also implement other strategies or build models to take action. At this time, real-time information and historical records of the environment are very important for making good decisions. In this case, the the environment snapshot list is exactly what you need.\n",
"\n",
"The information in the snapshot list is indexed by 3 dimensions:\n",
"- A frame index or a frame index list. (int or list of int) Empty indicates for all time slides till now\n",
"- A station id (list). (int of list of int) Empty indicates for all ports/agents\n",
"- An Attribute name (list). (str of list of str) You can get all available attributes in env.summary as shown before.\n",
"- A tick index (list). (int / List[int]) Empty indicates for all time slides till now;\n",
"- A port id (list). (int / List[int]) Empty indicates for all ports/agents;\n",
"- An attribute name (list). (str / List[str]) You can get all available attributes in `env.summary` as shown before.\n",
"\n",
"The return value from the snapshot list is a numpy.ndarray with shape **(frame * attribute * station, )**.\n",
"The return value from the snapshot list is a numpy.ndarray with shape **(num_tick * num_port * num_attribute, )**.\n",
"\n",
"More detailed introduction to the snapshot list is [here](). # TODO: add hyper-link"
"More detailed introduction to the snapshot list is [here](docs/_build/html/key_components/data_model.html#advanced-features). # TODO: add hyper-link"
]
},
{
"cell_type": "code",
"execution_count": 27,
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'node_detail': {},\n",
" 'node_mapping': {'ports': {0: 'demand_port_001',\n",
" 1: 'demand_port_002',\n",
" 2: 'supply_port_001',\n",
" 3: 'supply_port_002',\n",
" 4: 'transfer_port_001'},\n",
" 'vessels': {0: 'rt1_vessel_001',\n",
" 1: 'rt1_vessel_002',\n",
" 2: 'rt1_vessel_003',\n",
" 3: 'rt2_vessel_001',\n",
" 4: 'rt2_vessel_002',\n",
" 5: 'rt2_vessel_003'}}}\n"
"{'matrices': {'attributes': {...}, 'number': 1},\n",
" 'ports': {'attributes': {...}, 'number': 5},\n",
" 'vessels': {'attributes': {...}, 'number': 6}}\n",
"\n",
"{'attributes': {'acc_booking': {'slots': 1, 'type': 'i'},\n",
" 'acc_fulfillment': {'slots': 1, 'type': 'i'},\n",
" 'acc_shortage': {'slots': 1, 'type': 'i'},\n",
" 'booking': {'slots': 1, 'type': 'i'},\n",
" 'capacity': {'slots': 1, 'type': 'f'},\n",
" 'empty': {'slots': 1, 'type': 'i'},\n",
" 'fulfillment': {'slots': 1, 'type': 'i'},\n",
" 'full': {'slots': 1, 'type': 'i'},\n",
" 'on_consignee': {'slots': 1, 'type': 'i'},\n",
" 'on_shipper': {'slots': 1, 'type': 'i'},\n",
" 'shortage': {'slots': 1, 'type': 'i'},\n",
" 'transfer_cost': {'slots': 1, 'type': 'f'}},\n",
" 'number': 5}\n"
]
}
],
@ -355,64 +397,27 @@
"from maro.simulator import Env\n",
"from pprint import pprint\n",
"\n",
"env = Env(scenario=\"ecr\", topology=\"5p_ssddd_l0.0\", start_tick=0, durations=100)\n",
"env = Env(scenario=\"ecr\", topology=\"toy.5p_ssddd_l0.0\", start_tick=0, durations=100)\n",
"\n",
"pprint(env.summary)"
"# To get the attribute list that can be accessed in snapshot_list\n",
"pprint(env.summary['node_detail'], depth=2)\n",
"print()\n",
"# The attribute list of ports\n",
"pprint(env.summary['node_detail']['ports'])"
]
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'numpy.ndarray'> (5,)\n",
"Port 0 capacity: 100000.0\n",
"Port 1 capacity: 100000.0\n",
"Port 2 capacity: 100000.0\n",
"Port 3 capacity: 100000.0\n",
"Port 4 capacity: 100000.0\n",
"\n",
"<class 'numpy.ndarray'> (1000,)\n",
"<class 'numpy.ndarray'> (100, 2, 5)\n",
"Port 0: [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 500. 500.\n",
" 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500.\n",
" 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500.\n",
" 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500.\n",
" 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500.\n",
" 500. 500.]\n",
"Port 1: [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 500. 500.\n",
" 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500.\n",
" 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500.\n",
" 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500.\n",
" 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500. 500.\n",
" 500. 500.]\n",
"Port 2: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0.]\n",
"Port 3: [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0.]\n",
"Port 4: [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
" 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000.\n",
" 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000.\n",
" 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000. 1000.\n",
" 1000. 1000. 1000. 1000.]\n"
"array([1000., 0., 1000., 1000., 0., 1000., 1000., 0., 1000.,\n",
" 1000., 0., 1000., 1000., 0., 1000., 1000., 0., 1000.,\n",
" 1000., 0., 1000.], dtype=float32)\n"
]
}
],
@ -424,36 +429,34 @@
"\n",
"\n",
"# Initialize an Env for ECR scenario\n",
"env = Env(scenario=\"ecr\", topology=\"5p_ssddd_l0.0\", start_tick=0, durations=100)\n",
"env = Env(scenario=\"ecr\", topology=\"toy.5p_ssddd_l0.0\", start_tick=0, durations=100)\n",
"\n",
"# Start the environment with None action\n",
"_, decision_event, is_done = env.step(None)\n",
"\n",
"# Run the environment to the end\n",
"_, _, is_done = env.step(None)\n",
"while not is_done:\n",
" _, _, is_done = env.step(None)\n",
" # Case of access snapshot after a certain number of ticks\n",
" if env.tick >= 80:\n",
" # The tick list of past 1 week\n",
" past_week_ticks = [x for x in range(env.tick - 7, env.tick)]\n",
" # The port index of the current decision_event\n",
" decision_port_idx = decision_event.port_idx\n",
" # The attribute list to access \n",
" intr_port_infos = [\"booking\", \"empty\", \"shortage\"]\n",
"\n",
"# Get the capacity info for each ports by directly using initial frame index and attribute name\n",
"capacity = env.snapshot_list[\"ports\"][0::\"capacity\"]\n",
"print(type(capacity), capacity.shape)\n",
"for i in range(len(env.agent_idx_list)):\n",
" print(f\"Port {i} capacity: {capacity[i]}\")\n",
"print()\n",
" \n",
"# Get fulfillment and shortage info simultaneously by using attribute list\n",
"attribute_list = [\"fulfillment\", \"shortage\"]\n",
"info = env.snapshot_list[\"ports\"][::attribute_list]\n",
"print(type(info), info.shape)\n",
" # Query the snapshot list of this environment to get the information of\n",
" # the booking, empty, shortage of the decision port in the past week\n",
" past_week_info = env.snapshot_list[\"ports\"][\n",
" past_week_ticks : decision_port_idx : intr_port_infos\n",
" ]\n",
" pprint(past_week_info)\n",
" \n",
" # This demo code is used to show how to access the information in snapshot,\n",
" # so we terminate the env here for clear output\n",
" break\n",
"\n",
"# Reshape the info of fulfillment and shortage into a user-friendly shape\n",
"num_attributes = len(attribute_list)\n",
"num_frame = env.frame_index + 1\n",
"num_ports = len(env.agent_idx_list)\n",
"info = info.reshape(num_frame, num_attributes, num_ports)\n",
"print(type(info), info.shape)\n",
"\n",
"# Pring and show the change of shortage in each port:\n",
"shortage_idx = 1\n",
"for port_id in env.agent_idx_list:\n",
" print(f\"Port {port_id}: {info[:, shortage_idx, port_id]}\")"
" # Drive the environment with None action\n",
" _, decision_event, is_done = env.step(None)"
]
}
],
@ -478,4 +481,4 @@
},
"nbformat": 4,
"nbformat_minor": 4
}
}

@ -1,15 +1,15 @@
jupyter==1.0.0
jupyter-client==5.3.4
jupyter-console==6.0.0
jupyter-contrib-core==0.3.3
jupyter-contrib-nbextensions==0.5.1
jupyter-core==4.6.1
jupyter-highlight-selected-word==0.2.0
jupyter-latex-envs==1.4.6
jupyter-nbextensions-configurator==0.4.1
jupyterlab==1.2.3
jupyterlab-server==1.0.6
jupyterthemes==0.20.0
jupyter-client
jupyter-console
jupyter-contrib-core
jupyter-contrib-nbextensions
jupyter-core
jupyter-highlight-selected-word
jupyter-latex-envs
jupyter-nbextensions-configurator
jupyterlab
jupyterlab-server
jupyterthemes
isort==4.3.21
autopep8==1.4.4
isort==4.3.21

@ -1,2 +1,2 @@
[build-system]
requires = ["setuptools", "wheel", "numpy == 1.19.1"]
requires = ["setuptools", "wheel", "numpy == 1.19.1"]

@ -1,8 +1,11 @@
#!/bin/bash
# script to build maro locally on linux/mac, usually for development
cd "$(dirname $(readlink -f $0))/.."
if [[ "$OSTYPE" == "linux-gnu"* ]]; then
cd "$(dirname $(readlink -f $0))/.."
elif [[ "$OSTYPE" == "darwin"* ]]; then
cd "$(cd "$(dirname "$0")"; pwd -P)/.."
fi
# compile cython files first
bash ./scripts/compile_cython.sh

@ -2,7 +2,11 @@
# script to build docker for playground image on linux/mac, this require the source code of maro
cd "$(dirname $(readlink -f $0))/.."
if [[ "$OSTYPE" == "linux-gnu"* ]]; then
cd "$(dirname $(readlink -f $0))/.."
elif [[ "$OSTYPE" == "darwin"* ]]; then
cd "$(cd "$(dirname "$0")"; pwd -P)/.."
fi
bash ./scripts/compile_cython.sh

@ -2,7 +2,11 @@
# script to create source package on linux
cd "$(dirname $(readlink -f $0))/.."
if [[ "$OSTYPE" == "linux-gnu"* ]]; then
cd "$(dirname $(readlink -f $0))/.."
elif [[ "$OSTYPE" == "darwin"* ]]; then
cd "$(cd "$(dirname "$0")"; pwd -P)/.."
fi
bash ./scripts/compile_cython.sh

@ -1,6 +1,10 @@
cd "$(dirname $(readlink -f $0))/.."
if [[ "$OSTYPE" == "linux-gnu"* ]]; then
cd "$(dirname $(readlink -f $0))/.."
elif [[ "$OSTYPE" == "darwin"* ]]; then
cd "$(cd "$(dirname "$0")"; pwd -P)/.."
fi
pip install -r ./maro/requirements.build.txt

scripts/install_maro.bat (new file)

@ -0,0 +1,17 @@
@ECHO OFF
rem Script to install MARO in editable mode on Windows,
rem usually for development.
chdir "%~dp0.."
rem Install dependencies.
pip install -r .\maro\requirements.build.txt
rem Compile cython files.
call .\scripts\compile_cython.bat
call .\scripts\install_torch.bat
rem Install MARO in editable mode.
pip install -e .

scripts/install_maro.sh (new file)

@ -0,0 +1,19 @@
#!/bin/bash
# Script to install maro in editable mode on linux/darwin,
# usually for development.
if [[ "$OSTYPE" == "linux-gnu"* ]]; then
cd "$(dirname $(readlink -f $0))/.."
elif [[ "$OSTYPE" == "darwin"* ]]; then
cd "$(cd "$(dirname "$0")"; pwd -P)/.."
fi
# Install dependencies.
pip install -r ./maro/requirements.build.txt
# Compile cython files.
bash scripts/compile_cython.sh
# Install MARO in editable mode.
pip install -e .

@ -2,8 +2,8 @@
# script to start playground environment within docker container
./redis-6.0.6/src/redis-server -p 6379 &
redis-commander -p 40009 &
./redis-6.0.6/src/redis-server --port 6379 &
redis-commander --port 40009 &
# Python 3.6
cd ./docs/_build/html; python -m http.server 40010 -b 0.0.0.0 &

@ -8,7 +8,7 @@ call scripts/build_maro.bat
rem install requirements
pip install -r ./tests/requirements.txt
pip install -r ./tests/requirements.test.txt
rem show coverage

@ -1,6 +1,10 @@
#!/bin/bash
cd "$(dirname $(readlink -f $0))/.."
if [[ "$OSTYPE" == "linux-gnu"* ]]; then
cd "$(dirname $(readlink -f $0))/.."
elif [[ "$OSTYPE" == "darwin"* ]]; then
cd "$(cd "$(dirname "$0")"; pwd -P)/.."
fi
bash ./scripts/build_maro.sh
@ -9,7 +13,7 @@ bash ./scripts/build_maro.sh
export PYTHONPATH="."
# install requirements
pip install -r ./tests/requirements.txt
pip install -r ./tests/requirements.test.txt
coverage run --rcfile=./tests/.coveragerc

@ -109,7 +109,6 @@ setup(
"azure-storage-common==2.1.0",
"geopy==2.0.0",
"pandas==0.25.3",
"pycurl==7.43.0.5",
"PyYAML==5.3.1"
],
entry_points={

@ -0,0 +1,42 @@
decision:
extra_cost_mode: source # how to assign extra cost, available values: source, target, target_neighbors
resolution: 1 # frequency to check if a cell needs an action
# random factor to set bikes transfer time
effective_time_mean: 20
effective_time_std: 10
# these 2 water marks will affect whether a decision should be generated
supply_water_mark_ratio: 0.8
demand_water_mark_ratio: 0.001
# ratio of action
action_scope:
low: 0.05 # min ratio of available bikes to keep for current cell, to supply to neighbors
high: 1 # max ratio of available bikes neighbors can provide to current cell
filters: # filters used to pick destinations
- type: "distance" # sort by distance, from neareast to farest
num: 20 # number of output
reward: # reward options
fulfillment_factor: 0.4
shortage_factor: 0.3
transfer_cost_factor: 0.3
# timezone of the data
# NOTE: we need this if we want to fit local time, as binary data converts timestamps into UTC
# name : https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
time_zone: "America/New_York"
# path to read trip data binary file
trip_data: "tests/data/citi_bike/case_1/trips.bin"
# path to read weather data
weather_data: "tests/data/citi_bike/weathers.bin"
# path to the csv file used to init stations, which also provides the station id -> index mapping
stations_init_data: "tests/data/citi_bike/case_1/stations.csv"
# path to distance adj matrix
distance_adj_data: "tests/data/citi_bike/case_1/distance_adj.csv"

@ -0,0 +1,3 @@
0,1
0,10.12
10.12,0

@ -0,0 +1,3 @@
station_index,capacity,init,station_id
0,10,5,111
1,20,10,222

@ -0,0 +1,5 @@
start_time,duration,start_station_index,end_station_index
2019-01-01 00:00:00,5,0,1
2019-01-01 00:01:00,5,0,1
2019-01-01 00:01:00,5,1,0
2019-01-01 00:05:00,5,0,1

@ -0,0 +1,42 @@
decision:
extra_cost_mode: source # how to assign extra cost, available values: source, target, target_neighbors
resolution: 1 # frequency to check if a cell needs an action
# random factor to set bikes transfer time
effective_time_mean: 20
effective_time_std: 10
# these 2 water marks will affect whether a decision should be generated
supply_water_mark_ratio: 0.8
demand_water_mark_ratio: 0.001
# ratio of action
action_scope:
low: 0.05 # min ratio of available bikes to keep for current cell, to supply to neighbors
high: 1 # max ratio of available bikes neighbors can provide to current cell
filters: # filters used to pick destinations
- type: "distance" # sort by distance, from neareast to farest
num: 20 # number of output
reward: # reward options
fulfillment_factor: 0.4
shortage_factor: 0.3
transfer_cost_factor: 0.3
# timezone of the data
# NOTE: we need this if we want to fit local time, as binary data converts timestamps into UTC
# name : https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
time_zone: "America/New_York"
# path to read trip data binary file
trip_data: "tests/data/citi_bike/case_2/trips.bin"
# path to read weather data
weather_data: "tests/data/citi_bike/weathers.bin"
# path to the csv file used to init stations, which also provides the station id -> index mapping
stations_init_data: "tests/data/citi_bike/case_2/stations.csv"
# path to distance adj matrix
distance_adj_data: "tests/data/citi_bike/case_2/distance_adj.csv"

@ -0,0 +1,3 @@
0,1
0,10.12
10.12,0

@ -0,0 +1,3 @@
station_index,capacity,init,station_id
0,10,5,111
1,20,10,222

@ -0,0 +1,10 @@
start_time,duration,start_station_index,end_station_index
2019-01-01 00:00:00,5,0,1
2019-01-01 00:01:00,5,0,1
2019-01-01 00:01:01,5,0,1
2019-01-01 00:01:02,5,0,1
2019-01-01 00:01:03,5,0,1
2019-01-01 00:01:04,5,0,1
2019-01-01 00:01:05,5,0,1
2019-01-01 00:01:09,5,1,0
2019-01-01 00:05:00,5,0,1

@ -0,0 +1,28 @@
events:
RequireBike: # type name
display_name: "require_bike" # can be empty, then will be same as type name (key)
ReturnBike:
display_name: "return_bike"
RebalanceBike:
display_name: "rebalance_bike"
DeliverBike:
display_name: "deliver_bike"
"_default": "RequireBike" # default event type if not event type in column, such as citibike scenario, all the rows are trip_requirement, so we do not need to specified event column
entity:
timestamp:
column: 'start_time'
dtype: 'i8'
tzone: "America/New_York"
durations:
column: 'duration'
dtype: 'i'
src_station:
column: 'start_station_index'
dtype: 'i'
dest_station:
column: 'end_station_index'
dtype: 'i'
slot: 1
"_event": "type"

@ -0,0 +1,9 @@
date,weather,temp
1/1/2019 0:00:00,0,30.5
1/2/2019 0:00:00,3,32.0
1/3/2019 0:00:00,1,30.0
1/4/2019 0:00:00,0,30.0
1/5/2019 0:00:00,1,34.5
1/6/2019 0:00:00,3,32.5
1/7/2019 0:00:00,3,32.5
1/8/2019 0:00:00,0,30.5

@ -0,0 +1,12 @@
entity:
timestamp:
column: 'date'
dtype: 'i8'
tzone: "America/New_York"
weather:
column: 'weather'
dtype: 'i'
temp:
column: 'temp'
dtype: 'f'

@ -0,0 +1,18 @@
entity:
timestamp:
column: 'start_time'
dtype: 'i8'
# used to specify the time zone in the source file; the converter will convert it into UTC,
# default is UTC if not specified:
# name : https://en.wikipedia.org/wiki/List_of_tz_database_time_zones
tzone: "America/New_York"
durations:
column: 'duration'
dtype: 'i'
src_station:
column: 'start_station_index'
dtype: 'i'
dest_station:
column: 'end_station_index'
dtype: 'i'
slot: 1

@ -0,0 +1,26 @@
events:
RequireBike: # type name
display_name: "require_bike" # can be empty, then will be same as type name (key)
ReturnBike:
display_name: "return_bike"
RebalanceBike:
display_name: "rebalance_bike"
DeliverBike:
display_name: "deliver_bike"
"_default": "RequireBike" # default event type if not event type in column, such as citibike scenario, all the rows are trip_requirement, so we do not need to specified event column
entity:
timestamp:
column: 'start_time'
dtype: 'i8'
tzone: "America/New_York"
durations:
column: 'duration'
dtype: 'i'
src_station:
column: 'start_station_index'
dtype: 'i'
dest_station:
column: 'end_station_index'
dtype: 'i'
slot: 1

@ -0,0 +1,11 @@
entity:
durations:
column: 'duration'
dtype: 'i'
src_station:
column: 'start_station_index'
dtype: 'i'
dest_station:
column: 'end_station_index'
dtype: 'i'
slot: 1

@ -0,0 +1,5 @@
start_time,duration,start_station_index,end_station_index
2019-01-01 00:00:00,5,0,1
2019-01-01 00:01:00,5,0,1
2019-01-01 00:01:00,5,1,0
2019-01-01 00:05:00,5,0,1

@ -0,0 +1,7 @@
this case is used to test if the vessel arrives and leaves at the specified tick
check points:
. vessel.location
. tick
. vessel.state

@ -0,0 +1 @@
tick, source, target, order_number

Some files were not shown because too many files have changed.