Reinforcement 2 post-quiz
This commit is contained in:
Parent: 80e0986409
Commit: 724652c688

@@ -17,7 +17,7 @@ We will use a simplified version of balancing known as a **CartPole** problem. I
 
 In this lesson, we will be using a library called **OpenAI Gym** to simulate different **environments**. You can run this lesson's code locally (e.g. from Visual Studio Code), in which case the simulation will open in a new window. When running the code online, you may need to make some tweaks to the code, as described [here](https://towardsdatascience.com/rendering-openai-gym-envs-on-binder-and-google-colab-536f99391cc7).
 
 ## OpenAI Gym
 
-In the previous lesson, the rules of the game and the state were given by the `Board` class which we defined ourselves. Here we will use a special **sumulation environment**, which will simulate the physics behind the balancing pole. One of the most popular simulation environments for training reinforcement learning algorithms is called a [Gym](https://gym.openai.com/), which is maintained by [OpenAI](https://openai.com/). By using this gym we can create difference **environments** from a cartpole simulation to Atari games.
+In the previous lesson, the rules of the game and the state were given by the `Board` class which we defined ourselves. Here we will use a special **simulation environment**, which will simulate the physics behind the balancing pole. One of the most popular simulation environments for training reinforcement learning algorithms is called a [Gym](https://gym.openai.com/), which is maintained by [OpenAI](https://openai.com/). By using this gym we can create different **environments**, from a cartpole simulation to Atari games.
 
 > **Note**: You can see other environments available from OpenAI Gym [here](https://gym.openai.com/envs/#classic_control).
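For context on the hunk above, here is a minimal sketch of what creating and stepping this CartPole environment looks like with the classic Gym API. It is an illustration, not the lesson's actual code: it assumes gym < 0.26 (where `step` returns a four-tuple), and the environment id "CartPole-v1" is also an assumption.

```python
import gym  # OpenAI Gym, the library named in the lesson

env = gym.make("CartPole-v1")  # create the cartpole simulation environment
env.reset()

for _ in range(100):
    env.render()                         # opens a window when run locally
    action = env.action_space.sample()   # the action space defines all possible actions
    obs, reward, done, info = env.step(action)
    if done:                             # pole fell over or cart left the track
        env.reset()

env.close()
```

Note the `env.action_space` attribute: in Gym, the set of all possible actions is defined by the environment's action space, which is exactly what q1 in the quiz diff below asks about.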
@@ -2640,48 +2640,52 @@
     "title": "Reinforcement 2: Post-Lecture Quiz",
     "quiz": [
       {
-        "questionText": "q1",
+        "questionText": "Where do we define all possible actions in an environment?",
         "answerOptions": [
           {
-            "answerText": "a",
+            "answerText": "methods",
             "isCorrect": "false"
           },
           {
-            "answerText": "b",
+            "answerText": "action space",
             "isCorrect": "true"
           },
           {
-            "answerText": "c",
+            "answerText": "action list",
             "isCorrect": "false"
           }
         ]
       },
       {
-        "questionText": "q2",
+        "questionText": "What pair did we use as the dictionary key-value?",
         "answerOptions": [
           {
-            "answerText": "a",
+            "answerText": "(state, action) as the key, Q-Table entry as the value",
             "isCorrect": "true"
           },
           {
-            "answerText": "b",
+            "answerText": "state as the key, action as the value",
             "isCorrect": "false"
           },
           {
-            "answerText": "c",
+            "answerText": "the value of the qvalues function as the key, action as the value",
             "isCorrect": "false"
           }
         ]
       },
       {
-        "questionText": "q3",
+        "questionText": "What are the hyperparameters we used during Q-Learning?",
         "answerOptions": [
           {
-            "answerText": "a",
+            "answerText": "q-table value, current reward, random action",
             "isCorrect": "false"
           },
           {
-            "answerText": "b",
+            "answerText": "learning rate, discount factor, exploration/exploitation factor",
             "isCorrect": "true"
           },
           {
-            "answerText": "c",
+            "answerText": "cumulative rewards, learning rate, exploration factor",
             "isCorrect": "false"
           }
         ]
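The three rewritten questions above all point at the lesson's tabular Q-Learning setup. As a hedged illustration (not the lesson's actual code; the names, default values, and the epsilon-greedy policy are assumptions), here is how a Q-Table keyed by `(state, action)` pairs (q2) is updated using the three hyperparameters from q3, with actions drawn from the environment's action space (q1):

```python
import random
from collections import defaultdict

# Hyperparameters from q3; the values are common defaults, not the lesson's.
alpha = 0.3    # learning rate: how far each update moves the Q-value
gamma = 0.9    # discount factor: how much future reward is worth now
epsilon = 0.1  # exploration/exploitation factor for epsilon-greedy

# Q-Table from q2: the (state, action) pair is the key, the Q-value is the value.
Q = defaultdict(float)

def choose_action(state, actions):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if random.random() < epsilon:
        return random.choice(actions)                 # explore: random action
    return max(actions, key=lambda a: Q[(state, a)])  # exploit: best known action

def q_learning_update(state, action, reward, next_state, actions):
    """One tabular Q-Learning step toward the bootstrapped target."""
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```

In the Gym setting above, `actions` would come from the environment's action space, e.g. `list(range(env.action_space.n))` for a discrete environment such as CartPole.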