TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback
Eunseop Yoon | Hee Suk Yoon | SooHwan Eom | Gunsoo Han | Daniel Nam | Daejin Jo | Kyoung-Woon On | Mark Hasegawa-Johnson | Sungwoong Kim | Chang Yoo
Paper Details:
Month: August
Year: 2024
Location: Bangkok, Thailand and virtual meeting
Venue: Findings-ACL
Citations: No Citations Yet
URLs:
https://platform.openai.com/docs/models/gpt-4-and-gpt-
https://huggingface.co/spaces/lmsys/chatbot-arena-
https://github.com/tatsu-lab/alpaca_eval
Field Of Study