Related trick: I've found that training two Natural Intelligence (NI) models in parallel, and having them train each other most of the time, leads to significant leaps in capability. Notably, when one NI picks up a skill, it often results in spontaneous transfer learning - the other NI acquires that skill very quickly, much faster than it would through direct training.
This scales well, too. There are facilities that offer co-hosting and cross-training for up to ~two dozen NI models in a shared environment - in my experience, this provides similar training benefits to running multiple NIs on your own, at a fraction of the cost.
(The facilities exploit some neat economies of scale. Talking to some employees, I learned that transfer learning and co-activation are embarrassingly scalable: once two or three NIs pick something up, all the rest immediately follow.)