Removing RLHF Protections in GPT-4 via Fine-Tuning

Qiusi Zhan | Richard Fang | Rohan Bindu | Akul Gupta | Tatsunori Hashimoto | Daniel Kang

Paper Details:

Month: June
Year: 2024
Location: Mexico City, Mexico
Venue: NAACL
