Revisiting Block-based Quantisation: What is Important for Sub-8-bit LLM Inference?

Cheng Zhang | Jianyi Cheng | Ilia Shumailov | George Constantinides | Yiren Zhao

Paper Details:

Month: December
Year: 2023
Location: Singapore
Venue: EMNLP