Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards

Haoxiang Wang | Yong Lin | Wei Xiong | Rui Yang | Shizhe Diao | Shuang Qiu | Han Zhao | Tong Zhang

Paper Details:

Month: August
Year: 2024
Location: Bangkok, Thailand
Venue: ACL