No_Lingonberry2565 t1_ivm7za6 wrote

Yeah, you’re right: the loss function for an autoencoder, for X and X’ (reconstructed X), would be the matrix Frobenius norm ||X - X’||_F, which would then be close to 0, and then I think the weights would approach zero -> lower-dimensional embeddings close to 0 (I’m trying to visualize it in my head with the chain rule and weight updates as you backpropagate; I THINK it would be something like that lol)

Considering that, maybe make use of some modified loss function that is higher for values closer to 0?

The only difficulty is that instead of using a nice Keras architecture and training automatically, you would probably need to first define this custom loss function, then update the Keras model weights yourself with GradientTape (rough sketch below), and even then the loss function you choose might have really shitty behavior and your network may not converge well.
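A minimal sketch of what that GradientTape loop might look like, assuming a toy dense autoencoder on flattened 784-pixel inputs; the `custom_loss` here is just a placeholder (plain MSE) standing in for whatever modified penalty you'd actually define:

```python
import tensorflow as tf

def custom_loss(x, x_recon):
    # Placeholder: swap in whatever modified reconstruction penalty you want.
    return tf.reduce_mean(tf.square(x - x_recon))

# Toy autoencoder; 784 assumes flattened 28x28 inputs and is illustrative only.
autoencoder = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),     # encoder -> low-dim embedding
    tf.keras.layers.Dense(784, activation="linear"),  # decoder back to input dim
])
optimizer = tf.keras.optimizers.Adam(1e-3)

@tf.function
def train_step(x):
    with tf.GradientTape() as tape:
        x_recon = autoencoder(x, training=True)
        loss = custom_loss(x, x_recon)
    grads = tape.gradient(loss, autoencoder.trainable_variables)
    optimizer.apply_gradients(zip(grads, autoencoder.trainable_variables))
    return loss

x_batch = tf.random.uniform((64, 784))  # stand-in batch of flattened images
print(train_step(x_batch))
```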

Edit: Ignore my weird comment about making a loss function that is higher for arguments closer to 0.

Maybe try the infinity norm of X - X’ in the autoencoder instead of just ||X - X’||_F; something like the sketch below.
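For what it's worth, here's a minimal sketch of that infinity-norm idea as a Keras-compatible loss, penalizing the single worst-reconstructed pixel per example; this is just my reading of the suggestion, not a tested recipe:

```python
import tensorflow as tf

def linf_loss(x, x_recon):
    # Flatten each example, take the max absolute error (the infinity norm),
    # then average over the batch.
    err = tf.abs(tf.reshape(x - x_recon, [tf.shape(x)[0], -1]))
    return tf.reduce_mean(tf.reduce_max(err, axis=-1))
```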

2

No_Lingonberry2565 t1_ivluuww wrote

Given you’re working with images, maybe you could perform some non-linear dimensionality reduction, such as using an autoencoder; scikit-learn also has functionality to use PCA with a kernel (sketch below), and the resulting reduced images might be less sparse and easier to work with in traditional models?
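Rough sketch of the kernel PCA route with scikit-learn; the shapes, component count, and kernel choice are placeholders, not recommendations:

```python
import numpy as np
from sklearn.decomposition import KernelPCA

X_images = np.random.rand(100, 28, 28)   # stand-in for your image data
X = X_images.reshape(len(X_images), -1)  # flatten to (n_samples, n_pixels)

kpca = KernelPCA(n_components=50, kernel="rbf")
X_reduced = kpca.fit_transform(X)        # dense, lower-dimensional features
```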

2

No_Lingonberry2565 t1_ivluilj wrote

A lot of people, when starting out, want to jump into the fancy and exotic methods and go straight to learning about things like deep neural networks. The thing, though, is that at a fundamental level these more exotic models are compositions of more “classical” models; for example, a neural net can be seen as a series of stacked logistic regressions (see the sketch below).
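To make that concrete, here's a toy illustration of the "stacked logistic regressions" view in Keras (layer sizes are arbitrary): a single Dense unit with a sigmoid is exactly logistic regression, and a deep net just composes such layers:

```python
import tensorflow as tf

# One Dense unit + sigmoid = logistic regression: sigmoid(w . x + b)
logistic_regression = tf.keras.Sequential([
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# A "deep" network is the same building block composed layer after layer.
mlp = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="sigmoid"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
```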

That said, first make sure you have a good math background: linear algebra (matrix multiplication, eigenvectors, some matrix decomposition algorithms), statistics and probability (random variables, joint random variables, density functions for both, conditional probability and conditional distributions), and calculus (single-variable and multivariable, especially gradients and optimization).

Then begin learning some simpler models such as:

linear regression, polynomial regression, decision tree algorithms, etc. Only after you have a strong grasp of those fundamentals, move on to the more exotic models such as RNNs and Transformers.

Especially if you go the self-taught route, you will not just learn topics once. I have found that as I have relearned topics throughout the years, I gain a better understanding of each model (when to use it, what kind of data it suits, its limitations and advantages, etc.) each time.

Good luck! DM or comment if you have more questions.

5