Loose lips sink ships: Mitigating Length Bias in Reinforcement Learning from Human Feedback
Wei Shen | Rui Zheng | Wenyu Zhan | Jun Zhao | Shihan Dou | Tao Gui | Qi Zhang | Xuanjing Huang
Paper Details:
Month: December
Year: 2023
Location: Singapore
Venue: Findings-EMNLP
Citations
No Citations Yet
URL
https://huggingface.co/datasets/Dahoas/
https://github.com/tatsu-lab/stanford_alpaca
https://github.com/cascip/ChatAlpaca
Field Of Study