Imperfect also Deserves Reward: Multi-Level and Sequential Reward Modeling for Better Dialog Management

Zhengxu Hou | Bang Liu | Ruihui Zhao | Zijing Ou | Yafei Liu | Xi Chen | Yefeng Zheng

Paper Details:

Month: June
Year: 2021
Location: Online
Venue: NAACL