They are demonstrating a new technique, Gated Linear Networks.
> gives rise to universal learning capabilities in the limit
They claim to show that, given unbounded time and memory (network size / number of parameters), this architecture can be used to learn/approximate any function.
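For a bit of context on what the architecture actually is, here is a rough, unofficial sketch of a single GLN neuron based on the paper's general description: halfspace gating on the side information selects one of several weight vectors, the incoming probabilities are mixed in log-odds space, and each neuron does its own local online logistic-loss update (no backprop). All names and hyperparameters below are my own placeholders, not the authors' code.

```python
import numpy as np

def logit(p):
    return np.log(p / (1 - p))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GLNNeuron:
    """One gated geometric-mixing neuron with halfspace gating (illustrative sketch)."""
    def __init__(self, n_inputs, side_dim, n_context_bits=2, lr=0.01, rng=None):
        rng = rng or np.random.default_rng(0)
        self.lr = lr
        # Random hyperplanes: the signs of the side information's projections pick a context.
        self.hyperplanes = rng.standard_normal((n_context_bits, side_dim))
        # One weight vector per context; init so the neuron starts as a geometric average.
        self.weights = np.full((2 ** n_context_bits, n_inputs), 1.0 / n_inputs)

    def _context(self, z):
        bits = (self.hyperplanes @ z > 0).astype(int)
        return int(bits @ (2 ** np.arange(len(bits))))

    def predict(self, p_in, z):
        c = self._context(z)
        x = logit(np.clip(p_in, 1e-6, 1 - 1e-6))
        return sigmoid(self.weights[c] @ x)

    def update(self, p_in, z, target):
        # Local online gradient step on log loss for the active context's weights only.
        c = self._context(z)
        x = logit(np.clip(p_in, 1e-6, 1 - 1e-6))
        p = sigmoid(self.weights[c] @ x)
        self.weights[c] -= self.lr * (p - target) * x
        return p
```

Stacking layers of such neurons, with each layer's output probabilities feeding the next layer and the raw input broadcast to every neuron as side information, gives the full network; each neuron trains locally against the same target.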
> with effective model capacity increasing as a function of network size
Model capacity here refers to the ability to memorize a mapping between inputs and outputs. They show that a network with more layers/weights will "memorize" more.
> in a manner comparable with deep ReLU networks
"Deep ReLU networks" are referring to commonly used modern deep neural network architectures. ReLU is a popular activation function: https://en.wikipedia.org/wiki/Rectifier_(neural_networks)