LLM in a flash: Efficient Large Language Model Inference with Limited Memory

Keivan Alizadeh | Seyed Iman Mirzadeh | Dmitry Belenko | S. Khatamifard | Minsik Cho | Carlo C Del Mundo | Mohammad Rastegari | Mehrdad Farajtabar

Paper Details:

Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL

Field Of Study