# Data driven model creation for simulators to train brains on Bonsai
Tooling to simplify the creation and use of data driven simulators using supervised learning, with the purpose of training brains with Project Bonsai. It ingests data as CSV and generates simulation models that can then be used directly to train a reinforcement learning agent.
🚩 Disclaimer: This is not an official Microsoft product. This application is considered an experimental addition to Microsoft Project Bonsai's software toolchain. Its primary goal is to reduce barriers of entry to using Project Bonsai's core Machine Teaching. Pull requests for fixes and small enhancements are welcome, but we expect this to be replaced by out-of-the-box features of Project Bonsai in the near future.
## Dependencies

Install and activate the conda environment:

```bash
conda env update -f environment.yml
conda activate datadriven
```
## Main steps to follow
### Step 1. Obtain historical or surrogate sim data in CSV format

- include header names
- a single row should be a slice in time
- ensure data ranges cover what reinforcement learning might explore
- smooth noisy data and remove outliers
- remove NaN or NA values

Refer to later sections for help with checking data quality before using this tool; a minimal cleaning sketch also follows below.
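As a minimal cleaning sketch, assuming pandas and the example headers used later in this README (adapt the column names and thresholds to your data):

```python
import pandas as pd

# Load raw data
df = pd.read_csv("./csv_data/example_data.csv")

# Remove NaN/NA values
df = df.dropna()

# Remove simple outliers: keep rows within 3 standard deviations of the mean
df = df[(df - df.mean()).abs().le(3 * df.std()).all(axis=1)]

# Smooth noisy signals with a short rolling mean
df = df.rolling(window=5, min_periods=1).mean()

df.to_csv("./csv_data/cleaned_data.csv", index=False)
```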
### Step 2. Change the `config_model.yml` file in the `config/` folder

Enter the csv file name. Enter the `timelag`, i.e. the number of rows or iterations that define a state transition in the data. A `timelag` of `1` uses every row of data: if one makes a change in the system/process, the result is shown in the next sample measurement. (A worked sketch of this row pairing follows the config example below.)

Define the names of the features used as input to the simulator model you will create. The names should match the headers of the csv file you provide. Set each value to either `state` or `action`. Define the `output_name` entries to match the headers in your csv.

Define the model type as either `gb`, `poly`, `nn`, or `lstm`. Depending on the specific model one chooses, alter the hyperparameters in this config file as well.
```yaml
# Define csv file path to train a simulator with
DATA:
  path: ./csv_data/example_data.csv
  timelag: 1
# Define the inputs and outputs of datadriven simulator
IO:
  feature_name:
    theta: state
    alpha: state
    theta_dot: state
    alpha_dot: state
    Vm: action
  output_name:
    - theta
    - alpha
    - theta_dot
    - alpha_dot
# Select the model type gb, poly, nn, or lstm
MODEL:
  type: gb
# Polynomial Regression hyperparameters
POLY:
  degree: 1
# Gradient Boost hyperparameters
GB:
  n_estimators: 100
  lr: 0.1
  max_depth: 3
# MLP Neural Network hyperparameters
NN:
  epochs: 100
  batch_size: 512
  activation: linear
  n_layer: 5
  n_neuron: 12
  lr: 0.00001
  decay: 0.0000003
  dropout: 0.5
# LSTM Neural Network hyperparameters
LSTM:
  epochs: 100
  batch_size: 512
  activation: linear
  num_hidden_layer: 5
  n_neuron: 12
  lr: 0.00001
  decay: 0.0000003
  dropout: 0.5
  markovian_order: 2
  num_lstm_units: 1
```
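To make `timelag` concrete, here is a sketch (assuming pandas; this is our reading of the setting, not the tool's actual code) of how rows pair up into state transitions:

```python
import pandas as pd

df = pd.read_csv("./csv_data/example_data.csv")
timelag = 1  # as in the config above

# Each row's features predict the outputs `timelag` rows ahead
features = df[["theta", "alpha", "theta_dot", "alpha_dot", "Vm"]].iloc[:-timelag]
outputs = df[["theta", "alpha", "theta_dot", "alpha_dot"]].iloc[timelag:]
```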
### Step 3. Run the tool

```bash
python datamodeler.py
```

The tool will ingest your csv file as input and create a simulator model of the type you selected. The resultant model will be placed into `models/`.
### Step 4. Use the model directly

An adaptor class is available for building custom integrations in the following way. We've already done this for you in Step 5, but this provides a good understanding. Initialize the class with the model type, which is one of `'gb'`, `'poly'`, `'nn'`, or `'lstm'`.

Specify a `noise_percentage` to optionally add noise to the states of the simulator; leaving it at zero adds no noise. Training a brain can benefit from adding noise to the states of an approximated simulator to promote robustness.

Define the `state_space_dim` and the `action_space_dim`. The `markovian_order` is needed to set the sequence length of the features when using an `lstm`.
```python
from predictor import ModelPredictor

predictor = ModelPredictor(
    modeltype="gb",
    noise_percentage=0,
    state_space_dim=4,
    action_space_dim=1,
    markovian_order=0,
)
```
Calculate the next state as a function of the current state and current action. IMPORTANT: the input state and action are arrays. You need to convert the brain action, i.e. a dictionary, to an array before feeding it into the predictor class, as shown in the sketch below.

```python
next_state = predictor.predict(action=action, state=state)
```
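For example, as a minimal sketch (the `Vm` key comes from the example config above; the values are illustrative):

```python
import numpy as np

brain_action = {"Vm": 0.83}                     # action dictionary from the brain
action = np.array(list(brain_action.values()))  # convert to array([0.83])
state = np.array([0.01, 0.02, 0.04, 0.05])      # current state as an array

next_state = predictor.predict(action=action, state=state)
```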
The thing to watch out for with datadriven simulators is that the approximations cannot be trusted when the feature inputs fall outside the range the model was trained on, i.e. you may get erroneous results. One can optionally evaluate whether this occurs by using the `warn_limitation()` functionality.

```python
features = np.concatenate([state, action])
predictor.warn_limitation(features)
```

```
Sim should not be necessarily trusted since predicting with the feature Vm outside of range it was trained on, i.e. extrapolating.
```
### Step 5. Train with Bonsai

Create a brain and write Inkling with type definitions that match what the simulator can provide, which you defined in `config_model.yml`. Run the `train_bonsai_main.py` file to register your newly created simulator. The integration is already done! Then connect the simulator to your brain.

Be sure to specify `noise_percentage` in your Inkling's scenario. Training a brain can benefit from adding noise to the states of an approximated simulator to promote robustness.

The `episode_start` in `train_bonsai_main.py` expects the initial conditions of the states defined in `config_model.yml` to match the scenario dictionary passed in (see the sketch after the Inkling example below). If you want to pass in other variables that are not modeled by the datadrivenmodel tool (except for `noise_percentage`), you'll likely have to modify `train_bonsai_main.py`.
```
lesson `Start Inverted` {
    scenario {
        theta: number<-1.4 .. 1.4>,
        alpha: number<-0.05 .. 0.05>,  # reset inverted
        theta_dot: number<-0.05 .. 0.05>,
        alpha_dot: number<-0.05 .. 0.05>,
        noise_percentage: 5,
    }
}
```
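As a hypothetical sketch of the `episode_start` behavior described above (the names and structure are illustrative assumptions, not the file's actual code):

```python
import numpy as np

def episode_start(predictor, scenario_config):
    # State ordering must match the headers defined in config_model.yml
    state_names = ["theta", "alpha", "theta_dot", "alpha_dot"]
    initial_state = np.array([scenario_config[name] for name in state_names])
    predictor.noise_percentage = scenario_config.get("noise_percentage", 0)
    return initial_state
```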
Ensure the `SimConfig` in Inkling matches the names of the headers in `config_model.yml` to allow `train_bonsai_main.py` to work.

```bash
python train_bonsai_main.py --workspace <workspace-id> --accesskey <accesskey>
```
## Optional Flags

### Use pickle instead of csv as data input

Name your datasets `x_set.pickle` and `y_set.pickle`.

```bash
python datamodeler.py --pickle <foldername>
```

For example, one might use a `<foldername>` of `env_data`.
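As a hedged sketch of producing the two pickle files (the shapes assume the 5-feature / 4-output example used throughout this README; replace the random arrays with your real data):

```python
import pickle
import numpy as np

x_set = np.random.rand(1000, 5)  # features: theta, alpha, theta_dot, alpha_dot, Vm
y_set = np.random.rand(1000, 4)  # outputs: theta, alpha, theta_dot, alpha_dot

with open("env_data/x_set.pickle", "wb") as f:
    pickle.dump(x_set, f)
with open("env_data/y_set.pickle", "wb") as f:
    pickle.dump(y_set, f)
```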
### Hyperparameter tuning

Gradient Boost should not require much tuning at all. Polynomial Regression may benefit from changing the degree. Neural Networks, however, may require significant hyperparameter tuning. Use the following flag to randomly search over the ranges specified in the `config_model.yml` file.

```bash
python datamodeler.py --tune-rs=True
```
### LSTM

After creating an LSTM model for your sim, you can use the predictor class the same way as with the other models. The predictor class initializes a sequence and will keep stacking a history of state transitions, popping off the oldest information. To maintain a sequence of valid distributions when starting a sim with the LSTM model, the predictor class takes a single timestep of initial conditions and automatically steps through the sim using the mean value of each of the actions captured from the data in `model_limits.yml`.
```python
from predictor import ModelPredictor
import numpy as np

predictor = ModelPredictor(
    modeltype="lstm",
    noise_percentage=0,
    state_space_dim=4,
    action_space_dim=1,
    markovian_order=3,
)

config = {
    'theta': 0.01,
    'alpha': 0.02,
    'theta_dot': 0.04,
    'alpha_dot': 0.05,
    'noise_percentage': 5,
}

predictor.reset_state(config)

for i in range(1):
    next_state = predictor.predict(
        action=np.array([0.83076303]),
        state=np.array([0.6157155, 0.19910571, 0.15492792, 0.09268583])
    )
    print('next_state: ', next_state)
    print('history state: ', predictor.state)
    print('history action: ', predictor.action_history)
```
Running the LSTM snippet above results in the following history of states and actions. Note that `reset_state(config)` auto-populates realistic trajectories for the sequence using the mean action. This means that the `0th` iteration of the simulation does not necessarily start where the user specified in the `config`. Continuing to call `predict()` steps through the sim and automatically maintains a history of the state transitions for you, matching the sequence of information required by the LSTM.
```
next_state:  [ 0.41453919  0.07664483 -0.13645072  0.81335021]
history state:  deque([0.6157155, 0.19910571, 0.15492792, 0.09268583, 0.338321704451164, 0.018040116405597596, -0.5707406858943783, 0.3023940018967715, 0.01, 0.02, 0.04, 0.05], maxlen=12)
history action:  deque([0.83076303, -0.004065768182411825, -0.004065768182411825], maxlen=3)
```
## Build Simulator Package

```bash
az acr build --image <IMAGE_NAME>:<IMAGE_VERSION> --file Dockerfile --registry <ACR_REGISTRY> .
```
## Data Evaluation

Use the release version of the Jupyter notebook to help qualify your data for creating a simulator using supervised learning. The notebook is split into three parts:

- Data Relevance
- Sparsity
- Data Distribution Confidence

The notebook uses the `nbgrader` package, where the user should click the `Validate` button to determine whether all tests have passed. You are responsible for loading the data, running cells to see if you successfully pass the tests, and manipulating the data in Python if the notebook finds things like NaNs and outliers. It will ask for the desired operating limits of the model you wish to create and compare them against what is available in your provided datasets. Assuming you pass the tests for data relevance, your data will be exported to a single csv named `approved_data.csv`, which is ready to be ingested by the `datadrivenmodel` tool.

```bash
jupyter notebook release/presales_evaluation/presales_evaluation.ipynb
```

Once you have successfully qualified your data using the `Validate` button, it is recommended to export the notebook as a PDF to share the results without requiring access to the data.
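The operating-limits comparison amounts to a range check along these lines (a hedged sketch; the limits dictionary is an illustrative assumption, and the notebook's actual tests are more thorough):

```python
import pandas as pd

df = pd.read_csv("./csv_data/example_data.csv")
desired_limits = {"theta": (-1.4, 1.4), "Vm": (-3.0, 3.0)}

for col, (lo, hi) in desired_limits.items():
    covered = df[col].min() <= lo and df[col].max() >= hi
    print(f"{col}: data covers desired operating range? {covered}")
```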
## Contribute Code
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repositories using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
## Telemetry
The software may collect information about you and your use of the software and send it to Microsoft. Microsoft may use this information to provide services and improve our products and services. You may turn off the telemetry as described in the repository. There are also some features in the software that may enable you and Microsoft to collect data from users of your applications. If you use these features, you must comply with applicable law, including providing appropriate notices to users of your applications together with a copy of Microsoft's privacy statement. Our privacy statement is located at https://go.microsoft.com/fwlink/?LinkID=824704. You can learn more about data collection and use in the help documentation and our privacy statement. Your use of the software operates as your consent to these practices.
## Trademarks
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos is subject to those third parties' policies.