Does anyone know if LLMs have been used to augment their own training data?

I wonder what would happen if you trained an LLM on a small amount of data, then had it generate a lot of synthetic text that gets added back to the training data. I think of it as "dreaming". This seems like it would just add noise, but LLMs are able to improve their output by augmenting their own context (by "thinking out loud"), so maybe they can do the same with their own training data?



Yes, a lot of recent research uses LLM outputs as training data, and it's been an extremely successful line of work.


That's effectively what RLHF is: a means for an LLM to self-train on its own output, using only a small human-curated dataset as guidance for what counts as a "good" or "bad" output.
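
Here's a minimal toy sketch of that loop. To be clear, this is closer to rejection-sampling fine-tuning than real RLHF (which trains a reward model on the human preferences and then optimizes the LLM with RL, e.g. PPO), and the bigram "model" and reward heuristic here are just stand-ins:

    # Toy self-training loop guided by a small human-curated signal.
    # Everything here (the bigram "model", the reward heuristic) is
    # illustrative, not a real RLHF implementation.
    import random

    random.seed(0)

    corpus = ["the cat sat", "the dog ran"]   # tiny initial training set
    human_good = {"cat", "dog"}               # stand-in for curated judgments

    def train(data):
        """Fit a trivial bigram table: word -> possible next words."""
        table = {}
        for line in data:
            words = line.split()
            for a, b in zip(words, words[1:]):
                table.setdefault(a, []).append(b)
        return table

    def generate(table, start="the", length=3):
        words = [start]
        for _ in range(length - 1):
            nxt = table.get(words[-1])
            if not nxt:
                break
            words.append(random.choice(nxt))
        return " ".join(words)

    def reward(sample):
        """Proxy for human feedback: contains a curated 'good' token?"""
        return any(w in human_good for w in sample.split())

    data = list(corpus)
    for step in range(3):
        model = train(data)
        candidates = [generate(model) for _ in range(20)]
        accepted = [c for c in candidates if reward(c)]  # keep "good" outputs
        data.extend(accepted)                            # self-train on them
        print(f"step {step}: kept {len(accepted)}/20, dataset size {len(data)}")

The human signal only enters through the filter; everything the model trains on beyond the seed corpus is its own output.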


It's interesting that this conclusion is the exact opposite of a sibling comment, which proposes that a small, human-curated corpus may be more effective than big, synthetic datasets.


I have no "conclusion". I'm just wondering.


If it's training on the same data that it generates, there's no new information being added to the system. You'd be reinforcing everything it already gets right and wrong, which would lead to zero improvement.

That said, it's common to use a large model to generate synthetic training data for a smaller model (knowledge distillation). In this way, we're able to transfer knowledge from one model to another.
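
For example, here's a rough sketch of logit-level knowledge distillation in PyTorch. The tiny teacher/student networks and random inputs are placeholders; in practice you'd distill a real pretrained model, often by training the student directly on teacher-generated text:

    # Minimal knowledge-distillation sketch: a small "student" is trained
    # to match a larger "teacher"'s output distribution.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    torch.manual_seed(0)

    teacher = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 10))
    student = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 10))

    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    T = 2.0  # softmax temperature; softer targets expose more of the teacher

    for step in range(200):
        x = torch.randn(32, 16)                 # stand-in for real inputs
        with torch.no_grad():
            teacher_logits = teacher(x)
        student_logits = student(x)
        # KL divergence between softened distributions (standard distillation loss)
        loss = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)
        opt.zero_grad()
        loss.backward()
        opt.step()
        if step % 50 == 0:
            print(f"step {step}: distillation loss {loss.item():.4f}")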


You can find the answer by trying the following: generate random data from a known process, fit a linear regression (or any other model) to it, sample new points from the fitted model, add them to the training set, and refit. Repeat and see what happens.
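
Something like this (numpy, with a 1-D linear model as the toy case; the specific numbers are arbitrary):

    # Self-training experiment: fit a linear model, sample synthetic points
    # from the fit, fold them back into the training set, refit, repeat.
    import numpy as np

    rng = np.random.default_rng(0)

    # Ground truth: y = 2x + 1 + noise with sigma = 1.0
    x = rng.uniform(-3, 3, size=20)
    y = 2 * x + 1 + rng.normal(0, 1.0, size=20)

    for it in range(10):
        # Fit slope/intercept by least squares, estimate residual std
        a, b = np.polyfit(x, y, deg=1)
        sigma = np.std(y - (a * x + b))
        print(f"round {it}: a={a:.3f} b={b:.3f} sigma={sigma:.3f}")

        # Sample synthetic data from the fitted model and add it back in
        x_new = rng.uniform(-3, 3, size=20)
        y_new = a * x_new + b + rng.normal(0, sigma, size=20)
        x = np.concatenate([x, x_new])
        y = np.concatenate([y, y_new])

No new information about the true process enters after the first fit, so whatever error that fit has gets baked into every later round.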



