Repulsive Attention: Rethinking Multi-head Attention as Bayesian Inference

Bang An | Jie Lyu | Zhenyi Wang | Chunyuan Li | Changwei Hu | Fei Tan | Ruiyi Zhang | Yifan Hu | Changyou Chen

Paper Details:

Month: November
Year: 2020
Location: Online
Venue: EMNLP