Layer-Condensed KV Cache for Efficient Inference of Large Language Models

Haoyi Wu | Kewei Tu

Paper Details:

Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL
