On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers

Tianchu Ji | Shraddhan Jain | Michael Ferdman | Peter Milder | H. Andrew Schwartz | Niranjan Balasubramanian

Paper Details:

Month: August
Year: 2021
Location: Online
Venue: Findings