Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization

Zhexin Zhang | Junxiao Yang | Pei Ke | Fei Mi | Hongning Wang | Minlie Huang

Paper Details:

Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL
