Aligning Large Language Models with Human Preferences through Representation Engineering

Wenhao Liu | Xiaohua Wang | Muling Wu | Tianlong Li | Changze Lv | Zixuan Ling | Zhu JianHao | Cenyuan Zhang | Xiaoqing Zheng | Xuanjing Huang

Paper Details:

Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL