MoEfication: Transformer Feed-forward Layers are Mixtures of Experts

Zhengyan Zhang | Yankai Lin | Zhiyuan Liu | Peng Li | Maosong Sun | Jie Zhou

Paper Details:

Month: May
Year: 2022
Location: Dublin, Ireland
Venue: ACL | Findings