
Professional_Poet489 t1_j9gk545 wrote

Re: regularization - by using fewer numbers to represent the same output info, you are implicitly reducing the dimensionality of your function approximator.
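To make the "fewer numbers" point concrete, here's a minimal sketch (PyTorch and the layer widths are my own choices, not from the comment): the 8-unit layer forces everything downstream to be computed from just 8 numbers.

```python
# Minimal bottleneck sketch (assumes PyTorch; sizes are invented for illustration).
# The 8-unit layer means the 256-dim input has to be squeezed into 8 numbers
# before anything else can be computed from it.
import torch
import torch.nn as nn

net = nn.Sequential(
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 8),   nn.ReLU(),   # bottleneck: 8 numbers carry all the info
    nn.Linear(8, 64),   nn.ReLU(),
    nn.Linear(64, 256),
)

x = torch.randn(4, 256)
print(net(x).shape)  # torch.Size([4, 256])
```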

Re: (a), (b) Generally in big nets, you want to regularize because you will otherwise overfit. It’s not about the output dimension, it’s that you have a giant approximator (i.e. a billion params) fitting data with much lower intrinsic dimensionality, and you have to do something about that. The output can be “cat or not” and you’ll still have the same problem.
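For illustration, a hedged sketch (PyTorch assumed, sizes invented) of the "giant approximator, tiny output" situation, with the two most common fixes bolted on: dropout in the model and weight decay in the optimizer.

```python
# Over-parameterized "cat or not" classifier: a single output logit, but far more
# parameters than the data warrants, so we add dropout and weight decay (L2).
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(1024, 4096), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(4096, 1),                      # "cat or not" logit
)

# weight_decay penalizes large weights at every optimizer step
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
loss_fn = nn.BCEWithLogitsLoss()

x = torch.randn(32, 1024)                    # toy batch
y = torch.randint(0, 2, (32, 1)).float()

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```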

9

Professional_Poet489 t1_j9gh652 wrote

The theory is that bottlenecks are a compression / regularization mechanism. If the bottleneck has far fewer parameters than the rest of the net, and you still get high-quality results at the output, then the bottleneck layer must be capturing the information required to drive the output to the correct results. The fact that these intermediate layers are often reused as embeddings indicates that this is a real phenomenon.
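A minimal sketch of that reuse (PyTorch assumed; dimensions invented): if a 16-dim code is enough to reconstruct a 784-dim input well, the encoder output has to capture the information that matters, so it can be pulled out directly as an embedding.

```python
# Autoencoder with a 16-dim bottleneck; the bottleneck activations double as embeddings.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 16))
decoder = nn.Sequential(nn.Linear(16, 128), nn.ReLU(), nn.Linear(128, 784))

x = torch.randn(8, 784)
z = encoder(x)                                  # 16-dim bottleneck code
recon = decoder(z)                              # trained to match x
loss = nn.functional.mse_loss(recon, x)         # reconstruction objective

embedding = z.detach()                          # reuse the bottleneck as an embedding
print(embedding.shape)                          # torch.Size([8, 16])
```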

32

Professional_Poet489 t1_j6zsleg wrote

There are smarter people than me out there, so maybe I’m missing something, but RL is for problems where your actions change the environment, and the market doesn’t change trajectory because of any move you make. All finance wants to do is guess what the movement will be (up, down, how much). That’s a classification or regression problem, not RL.
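As a hedged sketch of that framing (scikit-learn and the toy features are my own choices, not the commenter's): "will the price go up next period?" as plain binary classification over lagged returns, with no RL anywhere.

```python
# Supervised framing: features are the last 5 returns, label is whether the next
# return is positive. Everything here is fake data, for illustration only.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
returns = rng.normal(0, 0.01, size=1000)          # fake daily returns

X = np.stack([returns[i:i + 5] for i in range(len(returns) - 5)])
y = (returns[5:] > 0).astype(int)                 # 1 = next return is positive

clf = LogisticRegression().fit(X, y)
print(clf.predict_proba(X[:1]))                   # [P(down), P(up)] for one sample
```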

0

Professional_Poet489 t1_j6zgk6h wrote

You can find good lectures on all of these topics on YouTube, Coursera, etc., but that's also true of Bayesian methods. RL is more fun IMO, but less employable for now. RL is used all over the place for things like recommender engines, ad promotion, etc. The concepts are super valuable. Bayesian methods are a bit more generic and common, and tbh are going out of vogue in most of robotics.

1