Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation

Ta-Chung Chi | Ting-Han Fan | Alexander Rudnicky

Paper Details:

Month: June
Year: 2024
Location: Mexico City, Mexico
Venue: Findings-NAACL