Recurrent PPO (or any other on-policy loss) is supported. The key points to consider are similar to those in the Q-learning tutorial.
Beyond that, the loss should work out of the box on data sampled from the replay buffer, using RNN-based models. But feel free to open an issue or share code.
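For reference, the per-timestep clipped PPO objective that gets applied to such sampled sequences can be sketched in plain Python. This is an illustrative sketch only (the function name, default `clip_eps`, and test values are assumptions, not any library's API), and it omits the RNN hidden-state bookkeeping that the Q-learning tutorial covers:

```python
import math

# Sketch of the clipped PPO surrogate loss, averaged over the time steps
# of a sampled (flattened) sequence. Inputs are per-step log-probabilities
# from the current and behavior policies, plus per-step advantages.
def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
    assert len(log_probs_new) == len(log_probs_old) == len(advantages)
    total = 0.0
    for lp_new, lp_old, adv in zip(log_probs_new, log_probs_old, advantages):
        # Importance ratio pi_new(a|s) / pi_old(a|s), from log-probs.
        ratio = math.exp(lp_new - lp_old)
        # Clip the ratio to [1 - eps, 1 + eps].
        clipped = max(min(ratio, 1.0 + clip_eps), 1.0 - clip_eps)
        # PPO maximizes the minimum of the two surrogates;
        # the loss is its negative.
        total += -min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With identical old and new log-probs the ratio is 1 everywhere and clipping is inactive; when the ratio moves outside the clip range on a positive-advantage step, the clipped surrogate caps the objective at `(1 + eps) * adv`.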
-
Hi everyone,
I read the tutorial on recurrent Q-learning; my question is whether recurrent PPO is also supported.
Are there any special caveats I should be aware of when doing recurrent PPO, in contrast to recurrent Q-learning?
I'm currently trying it out and receiving an error; if it turns out that it should be supported, I'll open an issue with more details.
Thanks for your input!