PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails

Neal Mangaokar | Ashish Hooda | Jihye Choi | Shreyas Chandrashekaran | Kassem Fawaz | Somesh Jha | Atul Prakash

Paper Details:

Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL