Safety Alignment in NLP Tasks: Weakly Aligned Summarization as an In-Context Attack

Yu Fu | Yufei Li | Wen Xiao | Cong Liu | Yue Dong

Paper Details:

Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL