dumbmachines

dumbmachines t1_iqx2g66 wrote

>So if we ignore the implementation details to accomplish this for large networks, are there any inherent advantages to using higher-order neurons?

I don't know what those advantages might be, but there is an inherent advantage in stacking layers of act(WX+b), where act is some non-linear function. Instead of guessing which higher-order function each neuron should use, you can learn the higher-order function by stacking many simpler non-linear functions. That way the solution is general and can work across many different datasets and modalities.
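
To make that concrete, here is a minimal sketch of two stacked act(Wx+b) layers reproducing XOR, which a single second-order neuron could write as x1 + x2 - 2*x1*x2. Everything in it is an assumption for illustration: act is taken to be ReLU, the last layer is left linear, and the weights (`W1`, `b1`, `W2`, `b2`) are hand-picked rather than learned. It only shows that the stack can represent the same function on the four binary inputs, not how training would find it.

```python
# Sketch: stacked act(W x + b) layers vs. a hand-written second-order neuron.
# All names and weights here are hypothetical and chosen by hand, not learned.

def relu(v):
    # Element-wise ReLU, standing in for the generic non-linearity "act".
    return [max(0.0, x) for x in v]

def affine(W, x, b):
    # Computes W x + b, with W given as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

# Layer 1: two hidden units, both looking at x1 + x2 with different biases.
W1 = [[1.0, 1.0], [1.0, 1.0]]
b1 = [0.0, -1.0]
# Layer 2: a linear readout combining the two hidden units.
W2 = [[1.0, -2.0]]
b2 = [0.0]

def stacked(x):
    h = relu(affine(W1, x, b1))   # first layer: act(W1 x + b1)
    return affine(W2, h, b2)[0]   # second layer, left linear in this sketch

def second_order(x):
    # A single neuron with an explicit product term, i.e. a guessed
    # higher-order function.
    x1, x2 = x
    return x1 + x2 - 2.0 * x1 * x2

inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
for x in inputs:
    print(x, stacked(x), second_order(x))
```

The two functions agree on these four points only; away from the binary corners they differ, which is fine for the point being made: the product term never had to be built into the neuron.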
