arg_max t1_j5r8qe6 wrote
Reply to comment by gunshoes in [D] are two linear layers better than one? by alex_lite_21
What do you mean by "function represented by a neural network"? If you are hinting at universal approximation, then yes: a single hidden layer with sigmoid activation and unbounded width can approximate any continuous function (on a compact domain) to arbitrary accuracy. But there are complementary results showing the same kind of statement for width-limited, arbitrarily deep networks — the required depth is finite for any given function, but it depends on the function you want to approximate and is afaik unbounded over the space of continuous functions. In practice we are far from either infinite width or infinite depth, so the specific configuration can matter.
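As a toy illustration of the width-based statement (my own sketch, not from the comment), here is a minimal PyTorch example fitting a 1-D continuous function with a single hidden sigmoid layer; the width, learning rate, and target function are arbitrary choices, and the fit improves as the hidden width grows:

```python
# Minimal sketch: universal approximation with one hidden sigmoid layer.
# All hyperparameters here (width=256, lr, steps) are illustrative only.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Target: a continuous function on a compact interval.
x = torch.linspace(-3.0, 3.0, 512).unsqueeze(1)
y = torch.sin(2.0 * x)

# Single hidden layer, sigmoid activation; accuracy improves with width.
model = nn.Sequential(
    nn.Linear(1, 256),   # hidden width is the "knob" in the width-based theorem
    nn.Sigmoid(),
    nn.Linear(256, 1),
)

opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for step in range(2000):
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()

print(f"final MSE: {loss.item():.6f}")  # shrinks further as width grows
```

The depth-based results the comment mentions would swap the roles here: fix the hidden width and stack more layers instead.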