Self-Evolution Fine-Tuning for Policy Optimization
Ruijun Chen | Jiehao Liang | Shiping Gao | Fanqi Wan | Xiaojun Quan
Paper Details:
Month: November
Year: 2024
Location: Miami, Florida, USA
Venue: Findings-EMNLP
Citations: No Citations Yet
URL:
https://sharegpt.com/
https://huggingface.co/datasets/openbmb/
https://huggingface.co/datasets/teknium/
https://huggingface.co/microsoft/phi-2
https://github.com/tatsu-lab/alpaca_eval
Field Of Study