Hacker News

I skimmed through the article. It's very accessible and well-written compared to much of the other ML literature I've encountered.

As a bit of an outsider I'd like to ask: how does this compare to the traditional analogue? In what scenarios is this superior to classic backprop-trained NNs?

I understand that the point of the paper is to show that shared-weight training is possible, but the results look promising, hence the question.

There was a mention that one of the test cases achieved results competitive with the current state of the art, but using 10 times fewer connections. That's pretty impressive.



Hi, thanks for the feedback.

We think this approach has the potential to produce networks that are useful in applications where the weights must be trained very quickly so the network can adapt to a given task.

As we note in the discussion section, the ability to quickly fine-tune weights might find uses in few-shot learning and in continual lifelong learning, where agents continually acquire, fine-tune, and transfer skills throughout their lifespan.

The question is, which "super task" do you optimize your WANN for so that it is useful for many subtasks that you did not optimize for to begin with? We think perhaps optimizing a WANN to be good at exploration and curiosity might be a good start for it to develop good priors that are useful for new unseen tasks in its environment. There's a lot more work that can be done from here ...
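For readers unfamiliar with the shared-weight idea being discussed: a WANN fixes the network topology and gives every connection the same single weight value, then scores the architecture over a sweep of shared-weight values rather than training each weight individually. Below is a minimal sketch of that evaluation loop; the connectivity masks, activation, and input values are hypothetical illustrations, not the paper's actual benchmark setup.

```python
import numpy as np

def eval_shared_weight(topology, weight, inputs):
    """Run a feed-forward net in which every connection uses the same
    shared weight value (the core shared-weight idea)."""
    x = inputs
    for mask in topology:  # each mask is a 0/1 connectivity matrix
        # every active connection carries the same scalar `weight`
        x = np.tanh((mask * weight) @ x)
    return x

# Hypothetical 2-layer topology: masks say which connections exist.
topology = [np.array([[1.0, 0.0],
                      [1.0, 1.0]]),
            np.array([[0.0, 1.0]])]
inputs = np.array([0.5, -0.2])

# An architecture is scored by how well it performs across a range of
# shared-weight values, not by training individual weights.
for w in (-2.0, -1.0, -0.5, 0.5, 1.0, 2.0):
    out = eval_shared_weight(topology, w, inputs)
```

A topology that performs decently for many values of `w` is what the search rewards; fine-tuning then only needs to adjust one scalar, which is why adaptation can be fast.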


If I understand correctly, the idea isn't that you'd only use the architecture untrained; rather, the goal is better architecture discovery for a given task or group of tasks.

Is that accurate?




