Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning

Julia Kreutzer | Joshua Uyheng | Stefan Riezler

Paper Details:

Month: July
Year: 2018
Location: Melbourne, Australia
Venue: ACL