Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It appears that a large part of OpenAI's success is the large volume of human feedback reinforcement learning that they've used to tune their models. The more feedback they get, the better the model becomes.

It's a bit like Tesla having an advantage developing self-driving AI models by having hundreds of thousands of their cars on the road ("the fleet") gather data about rare events for training the model.



Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: