waa007 t1_ix7yoy7 wrote
Very good point!
It seems that when applying a GAN (generative adversarial network) to NLP, the main problem is how to judge, with high accuracy, how much reward or penalty should be given.
blazejd OP t1_ix8il0p wrote
Can you rephrase the last part of your second sentence? Don't quite get what you mean.
koiRitwikHai t1_ix95poh wrote
It means the same as "how will you define an objective function?"
waa007 t1_ixayg3f wrote
In general RL, the environment returns an accurate reward after the agent takes a step. In NLP, it's hard to give an accurate reward unless a real person is there to teach the agent.
So I think how to give an accurate reward is the main problem.
I'm sorry that this has so little to do with GANs.
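To make the contrast concrete, here is a minimal sketch that is not from the thread. Both function names are hypothetical, and the token-overlap score is just one assumed stand-in for an automatic NLP reward. The first function mimics an environment that knows exactly whether a step succeeded; the second scores generated text against a reference and can badly misjudge a valid paraphrase.

```python
from collections import Counter


def gridworld_reward(position: int, goal: int) -> float:
    """Exact reward: the environment knows whether the goal was reached."""
    return 1.0 if position == goal else 0.0


def overlap_reward(generated: str, reference: str) -> float:
    """Proxy reward (assumed for illustration): clipped unigram precision,
    i.e. the fraction of generated tokens that also occur in the reference."""
    gen_tokens = generated.lower().split()
    if not gen_tokens:
        return 0.0
    ref_counts = Counter(reference.lower().split())
    gen_counts = Counter(gen_tokens)
    # Each generated token is credited at most as often as it appears in the reference.
    matched = sum(min(count, ref_counts[token]) for token, count in gen_counts.items())
    return matched / len(gen_tokens)


if __name__ == "__main__":
    print(gridworld_reward(position=3, goal=3))
    print(overlap_reward("the cat sat", "the cat sat"))
    # A reasonable paraphrase shares no tokens with the reference,
    # so the proxy gives it no credit at all.
    print(overlap_reward("a feline rested", "the cat sat"))
```

The paraphrase case is where a human judge would be needed: the proxy has no way to tell that the meaning is preserved.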