BASS: Batched Attention-optimized Speculative Sampling
Haifeng Qian | Sujan Kumar Gonugondla | Sungsoo Ha | Mingyue Shang | Sanjay Krishna Gouda | Ramesh Nallapati | Sudipta Sengupta | Xiaofei Ma | Anoop Deoras
Paper Details:
Month: August
Year: 2024
Location: Bangkok, Thailand and virtual meeting
Venue: Findings-ACL
Citations: No Citations Yet
URLs:
https://github.com/microsoft/DeepSpeed
https://github.com/vllm-project/vllm
https://github.com/NVIDIA/cutlass
https://openai.com/blog/sparse-transformers
https://lmsys.org/blog/2023-11-21-lookahead-
Field Of Study