Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic

Rishabh Bhardwaj | Duc Anh Do | Soujanya Poria

Paper Details:

Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL