A Pretrainer’s Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity

Shayne Longpre | Gregory Yauney | Emily Reif | Katherine Lee | Adam Roberts | Barret Zoph | Denny Zhou | Jason Wei | Kevin Robinson | David Mimno | Daphne Ippolito

Paper Details:

Month: June
Year: 2024
Location: Mexico City, Mexico
Venue: NAACL