Fix graph overlaps and some typos. (#16)
* Minor typo fixes for 2_gym_wrappers_saving_loading.ipynb

* Adjust plot spacing and fix some typos.

* The first plot cell defines a nice spacing to use which ensures
  that the y axis of the second plot doesn't overlap the first plot.
  This just copies that block of code to the other plotting cells.

* Extract plotting into a helper function.

* Clean up plot util function.

* Undo unrelated colab changes.

* Fix my own typo.
dan-pandori authored Oct 24, 2022
1 parent baca7e5 commit ea54c27
Showing 2 changed files with 54 additions and 144 deletions.
11 changes: 5 additions & 6 deletions 2_gym_wrappers_saving_loading.ipynb
@@ -378,7 +378,7 @@
"source": [
"## Second example: normalize actions\n",
"\n",
-"It is usually a good idea to normalize observations and actions before giving it to the agent, this prevent [hard to debug issue](https://github.com/hill-a/stable-baselines/issues/473).\n",
+"It is usually a good idea to normalize observations and actions before giving it to the agent, this prevents this [hard to debug issue](https://github.com/hill-a/stable-baselines/issues/473).\n",
"\n",
"In this example, we are going to normalize the action space of *Pendulum-v1* so it lies in [-1, 1] instead of [-2, 2].\n",
"\n",
@@ -425,7 +425,6 @@
" \"\"\"\n",
" Reset the environment \n",
" \"\"\"\n",
-" # Reset the counter\n",
" return self.env.reset()\n",
"\n",
" def step(self, action):\n",
@@ -505,7 +504,7 @@
"source": [
"#### Test with a RL algorithm\n",
"\n",
-"We are going to use the Monitor wrapper of stable baselines, wich allow to monitor training stats (mean episode reward, mean episode length)"
+"We are going to use the Monitor wrapper of stable baselines, which allow to monitor training stats (mean episode reward, mean episode length)"
]
},
{
@@ -610,7 +609,7 @@
"source": [
"## Additional wrappers: VecEnvWrappers\n",
"\n",
-"In the same vein as gym wrappers, stable baselines provide wrappers for `VecEnv`. Among the different that exist (and you can create your own), you should know: \n",
+"In the same vein as gym wrappers, stable baselines provide wrappers for `VecEnv`. Among the different wrappers that exist (and you can create your own), you should know: \n",
"\n",
"- VecNormalize: it computes a running mean and standard deviation to normalize observation and returns\n",
"- VecFrameStack: it stacks several consecutive observations (useful to integrate time in the observation, e.g. sucessive frame of an atari game)\n",
@@ -760,7 +759,7 @@
"\n",
"# Reset the environment\n",
"\n",
-"# Take random actions in the enviromnent and check\n",
+"# Take random actions in the environment and check\n",
"# that it returns the correct values after the end of each episode\n",
"\n",
"# ====================== #"
@@ -851,7 +850,7 @@
" time_feature = 1 - (self._current_step / self._max_steps)\n",
" if self._test_mode:\n",
" time_feature = 1.0\n",
-" # Optionnaly: concatenate [time_feature, time_feature ** 2]\n",
+" # Optionally: concatenate [time_feature, time_feature ** 2]\n",
" return np.concatenate((obs, [time_feature]))"
],
"execution_count": 0,

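For context on the first hunk: the touched cell describes rescaling the *Pendulum-v1* action space from [-2, 2] to [-1, 1]. The wrapper code itself is not part of this diff, so the following is only a minimal sketch of the usual linear mapping, written in plain Python rather than against the gym API. The function names `rescale_action` and `normalize_action` are hypothetical and do not come from the notebook.

```python
def rescale_action(scaled_action, low, high):
    """Map each action component from [-1, 1] back to [low, high].

    Hypothetical helper for illustration; not the notebook's wrapper.
    """
    return [
        lo + 0.5 * (a + 1.0) * (hi - lo)
        for a, lo, hi in zip(scaled_action, low, high)
    ]


def normalize_action(action, low, high):
    """Inverse mapping: squash each component from [low, high] into [-1, 1]."""
    return [
        2.0 * (a - lo) / (hi - lo) - 1.0
        for a, lo, hi in zip(action, low, high)
    ]


# Pendulum-v1 bounds as stated in the notebook text: [-2, 2]
low, high = [-2.0], [2.0]
env_action = rescale_action([0.5], low, high)
agent_action = normalize_action(env_action, low, high)
```

Under this mapping the agent only ever sees actions in [-1, 1], and the rescaling to the environment's real bounds happens just before the action is passed on.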