a GAN idea #31

Open
alreadydone opened this issue Sep 27, 2018 · 1 comment

Comments

@alreadydone
alreadydone commented Sep 27, 2018

Thank you for the work. I recently started working on reinforcement learning for mathematical research (with the formal language and deduction system of a proof assistant as the environment). It's not straightforward to design a proper reward, but novelty is certainly a good measure of progress, and your work is inspiring.

One idea I have, which I also intend to apply in my project, concerns how prediction error is measured; it seems to me that a GAN-style approach is applicable here. The predictor can be seen as a generator, so how about training a discriminator (conditioned on the current state) with the predicted outcomes as negative samples and the actual outcomes as positive samples? Then you might be able to predict the pixels directly: the discriminator would extract features automatically and ignore essentially unpredictable ones, like the exact locations of tree leaves in a breeze. It would also be unnecessary to distinguish between things that the agent affects or controls and things that it does not.
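
To make the proposal concrete, here is a minimal toy sketch in plain Python. Everything in it is an assumption for illustration and not part of this repository: the scalar environment (`true_outcome`), the deliberately wrong `predictor`, the hand-made `features` standing in for a neural network on pixels, and in particular the choice of `-log D(state, prediction)` as the intrinsic reward, which is only one possible way to turn the discriminator's output into a curiosity signal.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def features(state, outcome):
    # Hypothetical hand-made features of the (state, outcome) pair.
    # A real system would learn these with a network over pixels.
    return [1.0, state * state, state * outcome, outcome * outcome]


def discriminate(weights, state, outcome):
    # Estimated probability that `outcome` is the real successor of `state`.
    z = sum(w * x for w, x in zip(weights, features(state, outcome)))
    return sigmoid(z)


def sgd_step(weights, state, outcome, label, lr):
    # One stochastic gradient step on binary cross-entropy.
    p = discriminate(weights, state, outcome)
    phi = features(state, outcome)
    return [w + lr * (label - p) * x for w, x in zip(weights, phi)]


def true_outcome(state, rng):
    # Toy dynamics (assumed): a predictable part plus unpredictable noise.
    return 2.0 * state + rng.gauss(0.0, 0.05)


def predictor(state):
    # The "generator": a deliberately poor forward model.
    return 0.5 * state


def train_discriminator(steps=5000, lr=0.1, seed=0):
    rng = random.Random(seed)
    weights = [0.0, 0.0, 0.0, 0.0]
    for _ in range(steps):
        s = rng.uniform(-1.0, 1.0)
        # Actual outcomes are positive samples, predictions are negative ones.
        weights = sgd_step(weights, s, true_outcome(s, rng), 1.0, lr)
        weights = sgd_step(weights, s, predictor(s), 0.0, lr)
    return weights


def curiosity_reward(weights, state, predicted_outcome):
    # Assumed reward: large when the discriminator finds the prediction
    # easy to tell apart from a real outcome.
    d = discriminate(weights, state, predicted_outcome)
    return -math.log(max(d, 1e-12))


weights = train_discriminator()
reward_bad = curiosity_reward(weights, 0.8, predictor(0.8))
reward_good = curiosity_reward(weights, 0.8, 2.0 * 0.8)
print("reward for poor prediction:", reward_bad)
print("reward for accurate prediction:", reward_good)
```

In this toy setting the poor prediction receives the larger reward, while a prediction that matches the predictable part of the dynamics is scored as close to real even though it does not reproduce the noise. Whether this carries over to pixel observations and a jointly trained predictor is exactly the open question.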

I am a beginner in reinforcement learning apart from my participation in the Leela Zero project. I haven't looked much into the details of the various algorithms and NN architectures, and just want to get some feedback about whether the general idea is promising. Thank you in advance!

@AdarshMJ
AdarshMJ commented Nov 8, 2018

My initial thoughts were the same. I have read a few papers that outline the similarities between RL algorithms and GANs, for example https://arxiv.org/pdf/1610.01945.pdf
I'm not sure whether we can augment GANs with RL algorithms, or whether that would just complicate the whole thing.
