
comradeswitch t1_itmhmqh wrote

> MLPs are universal function approximators

MLPs with a non-polynomial activation function and either arbitrary width or arbitrary depth can approximate any continuous function f: S -> R to within an arbitrarily small specified error, where S is a compact subset of R^n.
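
The quantifier order matters here. A rough sketch of the arbitrary-width, one-hidden-layer form (my paraphrase, not a verbatim statement from any particular paper):

```latex
% Sketch of the arbitrary-width, one-hidden-layer statement as I understand it
% (Cybenko / Leshno et al. style); sigma is the fixed non-polynomial activation.
% Note that the width N is allowed to depend on both f and epsilon.
\[
\forall f \in C(S, \mathbb{R}),\ \forall \varepsilon > 0,\ \exists N,\ \exists \{a_i, w_i, b_i\}_{i=1}^{N}:
\quad \sup_{x \in S} \left| f(x) - \sum_{i=1}^{N} a_i\, \sigma\!\left(w_i^{\top} x + b_i\right) \right| < \varepsilon
\]
```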

Violate any of these assumptions and you lose those guarantees. Any finite MLP can only approximate a subset of the functions on the given domain to an arbitrary error level. Nothing about their ability in practice contradicts this.
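
A minimal sketch of what that looks like in practice (assuming numpy and scikit-learn; the target function, widths, and training settings are just illustrative choices): for each target error there is some width that achieves it, but no single fixed width covers every target.

```python
# Fit MLPs of increasing width to f(x) = sin(2*pi*x) on the compact set [0, 1]
# and watch the worst-case (sup-norm) error typically shrink as width grows.
# The theorem promises a width exists for each target error; it does not promise
# that one fixed finite network hits every target.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=(2000, 1))
y_train = np.sin(2 * np.pi * x_train).ravel()
x_test = np.linspace(0.0, 1.0, 1000).reshape(-1, 1)
y_test = np.sin(2 * np.pi * x_test).ravel()

for width in (2, 8, 32, 128):
    mlp = MLPRegressor(hidden_layer_sizes=(width,), activation="tanh",
                       max_iter=5000, tol=1e-7, random_state=0)
    mlp.fit(x_train, y_train)
    sup_err = np.max(np.abs(mlp.predict(x_test) - y_test))
    print(f"width={width:4d}  sup-norm error ~ {sup_err:.4f}")
```

(Training error here also reflects optimization, not just approximation capacity, so treat the numbers as a rough illustration rather than the theoretical bound.)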

Much like how there exist matrix multiplication algorithms with better than O(n^2.4) running time, yet the naive O(n^3) algorithm outperforms them for all physically realizable inputs, the effects of finite size are very important to consider.
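
A toy back-of-the-envelope of that point (the constant below is made up purely for illustration; I'm not quoting the real hidden constants of the fast algorithms from anywhere): even a modest constant factor pushes the crossover with the schoolbook flop count past any dense matrix that could fit in physical memory.

```python
# Compare the schoolbook ~2*n^3 multiply-add count against a hypothetical
# O(n^2.373) algorithm with an illustrative constant factor c. With c = 1e6 the
# crossover lands around n ~ 1e9, i.e. a dense matrix of ~1e18 entries.
def naive_flops(n):
    return 2 * n**3

def fast_flops(n, c=1e6):  # c is a made-up illustrative constant
    return c * n**2.373

for n in (10**3, 10**6, 10**9, 10**12):
    print(f"n={n:.0e}: naive={naive_flops(n):.2e}  'fast'={fast_flops(n):.2e}")
```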
