MixCE: Training Autoregressive Language Models by Mixing Forward and Reverse Cross-Entropies

Shiyue Zhang | Shijie Wu | Ozan Irsoy | Steven Lu | Mohit Bansal | Mark Dredze | David Rosenberg

Paper Details:

Month: July
Year: 2023
Location: Toronto, Canada
Venue: ACL