
BrotherAmazing t1_iyaux7r wrote

A deep neural network can approximate any function.

A deep recurrent neural network can approximate any algorithm.

These are mathematically proven facts. Can the same be said about "a bunch of decision trees in hyperspace"? If so, then I would say "a bunch of decision trees in hyperspace" are pretty darn powerful, as are deep neural networks. If not, then I would say the author has made a logical error somewhere along the way in his very qualitative reasoning. Plenty of thought experiments in language with "bulletproof" arguments have led to "contradictions" in the past, only for a subtle logical error to be unveiled once we stop using language and start using mathematics.
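The universal approximation idea can be demonstrated in a few lines. This is a minimal sketch (my own toy construction, not from any paper in the thread): a one-hidden-layer ReLU network approximating f(x) = x² on [0, 1]. The hidden breakpoints are fixed on a grid and only the output layer is fit by least squares, which is enough to show the piecewise-linear approximation at work without a full training loop.

```python
import numpy as np

# Toy illustration of universal approximation with ReLU units.
# Hidden weights/biases are fixed on a grid; only output weights are fit.
x = np.linspace(0.0, 1.0, 200)[:, None]          # inputs on [0, 1]
y = (x ** 2).ravel()                              # target function f(x) = x^2

knots = np.linspace(0.0, 1.0, 20)                 # hidden-unit breakpoints b_i
hidden = np.maximum(0.0, x - knots)               # ReLU(x - b_i), shape (200, 20)
features = np.hstack([np.ones_like(x), hidden])   # prepend a bias column

w, *_ = np.linalg.lstsq(features, y, rcond=None)  # fit output-layer weights
approx = features @ w

print(np.max(np.abs(approx - y)))                 # max error well under 0.01
```

With only 20 hidden units the piecewise-linear fit is already tight; adding units shrinks the error further, which is the intuition behind the approximation theorems.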


Difficult-Race-1188 OP t1_iyaxhbe wrote

The argument goes much further: NNs are not exactly learning the data distribution. If they were, the affine transformation problem would already have been taken care of, and there would be no need for data augmentation by rotating or flipping. Also, approximating any algorithm doesn't necessarily mean the underlying data follows a distribution generated by any known algorithm. And neural networks struggle even to learn simple mathematical functions; all they do in the approximation is make piecewise assumptions about the algorithm.

Here's a review of the grokking paper, which reported that an NN couldn't generalize on this equation:

x³ + xy² + y (mod 97)

Article: https://medium.com/p/9dbbec1055ae

Original paper: https://arxiv.org/abs/2201.02177
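For reference, the operation itself is trivial to compute exactly; the paper's question is whether a small network generalizes from a subset of the 97×97 table to the held-out entries. A minimal sketch (function name `op` is my own):

```python
# The binary operation studied in the grokking paper (arXiv:2201.02177),
# tabulated exactly over the prime modulus P = 97.
P = 97

def op(x: int, y: int) -> int:
    """Compute (x^3 + x*y^2 + y) mod 97."""
    return (x ** 3 + x * y ** 2 + y) % P

# Full 97x97 table; grokking experiments train on a fraction of these pairs.
table = [[op(x, y) for y in range(P)] for x in range(P)]

print(op(3, 5))   # 27 + 75 + 5 = 107, and 107 mod 97 = 10
```

Memorizing the table is easy for a network; recovering the modular structure well enough to fill in unseen entries is the hard part.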


BrotherAmazing t1_iyazrs4 wrote

Again, they can approximate any function or algorithm. This is proven mathematically.

Just because people are confounded by examples of DNNs that don't seem to do what they want, and just because people do not yet understand how to construct DNNs that can indeed do these things, doesn't mean DNNs are "dumb" or limited.

Perhaps you are constructing them wrong. Perhaps the engineers are the dumb ones? 🤷🏼

Sometimes people literally argue, in plain English rather than mathematics, that basic mathematically proven concepts are not true.

If you had a mathematical proof showing DNNs were equivalent to decision trees, or incapable of performing certain tasks, neat! But if you argue DNNs can't perform tasks that can be reduced to functions or algorithms, and do it in mere language without mathematical proof, I'm not impressed yet!


Difficult-Race-1188 OP t1_iyc8451 wrote

https://arxiv.org/pdf/2210.05189.pdf

Read this paper: it proves that neural networks are decision trees, not merely approximated by them but exactly equivalent. It's the third line of the abstract.
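The core observation behind that equivalence can be sketched quickly. This is a toy illustration with made-up weights, not code from the paper: within each region where a ReLU network's hidden activation pattern (which units are on or off) is fixed, the network collapses to a single affine map, so evaluating the network amounts to following one root-to-leaf path of on/off "decisions".

```python
import numpy as np

# Toy 2-layer ReLU network with random weights (illustration only).
rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((4, 2)), rng.standard_normal(4)
W2, b2 = rng.standard_normal((1, 4)), rng.standard_normal(1)

def forward(x):
    """Standard forward pass; also return the on/off activation pattern."""
    h = W1 @ x + b1
    pattern = h > 0                      # one binary "decision" per hidden unit
    return W2 @ (h * pattern) + b2, tuple(pattern)

def affine_for_pattern(pattern):
    """The single affine map A x + c the network reduces to in this region."""
    D = np.diag(np.asarray(pattern, dtype=float))   # mask of active units
    A = W2 @ D @ W1
    c = W2 @ D @ b1 + b2
    return A, c

x = np.array([0.3, -0.7])
y, pattern = forward(x)
A, c = affine_for_pattern(pattern)
print(np.allclose(y, A @ x + c))   # True: the "leaf" affine map matches
```

Each distinct activation pattern corresponds to a leaf, and the sequence of sign checks is the tree traversal; the catch, as discussed below, is that the number of such leaves can grow exponentially with network size.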


BrotherAmazing t1_iyeq8zq wrote

Interesting—I will have a read when I have time to read and check the math/logic. Thanks!

I do think I am allowed to remain skeptical for now because this was just posted as a pre-print with a single author a month ago and has not been vetted by the community.

Besides, if there is an equivalence between recurrent neural networks, convolutional neural networks, fully connected networks, and policies learned with deep reinforcement learning, all regardless of the architecture, how the network is trained, and so on, such that there always exists an equivalent decision tree, then I would say:

  1. Very interesting

  2. Decision trees are then more flexible and powerful than we give them credit for, not that NNs are less flexible and less powerful than they have been proven to be.

  3. What is it about decision trees that makes people not use them in practice for anything too complicated, like full motion video? How does one construct the decision tree "from scratch" via training, except by training the NN first and then building a decision tree that represents it? I wouldn't say "they're the same" from an engineering and practical point of view if one can be trained efficiently and the other cannot, but can only be built once the trained NN already exists.
