Dylan_TMB

Dylan_TMB t1_j8a0hrj wrote

If you want to be someone who understands it very deeply, get REALLY good at linear algebra and develop a REALLY good understanding of multivariate calculus.

The not-so-deep answer to your question is that your current understanding is right. You have a bunch of functions that take multiple inputs and spit out one output, and that output is combined with other outputs and fed into other functions. Each function has parameters that can vary, which changes the output. When you train, you give the network a bunch of input/output examples that you know (hope) are related in real life, and the model learns parameters that map input to output.

That's all that's happening.
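
If it helps, here's a rough numpy sketch of that picture — two little parameterized functions chained together and nudged to fit examples. Everything (shapes, learning rate, the target relationship) is made up purely for illustration:

```python
import numpy as np

# Toy version: two little functions with adjustable parameters, the second
# taking the first one's outputs as its inputs. All names/shapes are made up.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 2)), np.zeros(4)   # function 1: 2 inputs -> 4 outputs
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)   # function 2: those 4 -> 1 output

X = rng.normal(size=(200, 2))
y = X[:, :1] * X[:, 1:]        # a relationship we "hope" the examples share

lr = 0.01
for _ in range(500):           # training = nudging parameters to fit the examples
    for x, target in zip(X, y):
        h = np.tanh(W1 @ x + b1)       # function 1
        pred = W2 @ h + b2             # function 2, fed by function 1's output
        err = pred - target
        # hand-written gradient descent on squared error (the chain rule in action)
        dW2, db2 = np.outer(err, h), err
        dpre = (W2.T @ err) * (1 - h ** 2)     # derivative through tanh
        dW1, db1 = np.outer(dpre, x), dpre
        W2 -= lr * dW2; b2 -= lr * db2
        W1 -= lr * dW1; b1 -= lr * db1
```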

1

Dylan_TMB t1_j2gc9va wrote

I actually think you are looking for this:

https://arxiv.org/abs/2210.05189

A proof that any neural network can be represented by a decision tree. Navigating a decision tree is an algorithm, so this would be a representation of the "algorithm".

So a question to ask would be whether it's the minimal decision tree.
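
Roughly the intuition (my own toy sketch, not the paper's actual construction): for a ReLU net, each hidden unit being on/off is a branch, and inside any branch the network is just a fixed linear function, so you can unroll it into nested if/else. Weights below are invented for illustration:

```python
import numpy as np

# Toy 2-input, 2-hidden-unit ReLU network (weights made up for illustration).
W1 = np.array([[1.0, -1.0],
               [0.5,  2.0]])
b1 = np.array([0.0, -1.0])
w2 = np.array([1.0, -2.0])
b2 = 0.5

def net(x):
    h = np.maximum(W1 @ x + b1, 0.0)   # ReLU
    return w2 @ h + b2

def as_decision_tree(x):
    # Each ReLU's sign is a branch; within a branch the net is purely linear.
    z = W1 @ x + b1
    if z[0] > 0:
        if z[1] > 0:
            return w2 @ z + b2               # both units active
        return w2[0] * z[0] + b2             # only unit 0 active
    else:
        if z[1] > 0:
            return w2[1] * z[1] + b2         # only unit 1 active
        return b2                            # both units inactive

x = np.array([0.3, -0.7])
assert np.isclose(net(x), as_decision_tree(x))
```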

2

Dylan_TMB t1_j2g6nlc wrote

Good suggestion, but not really applicable to this question. TL;DR: matrix multiplication done the naive way is slow. There is a way to decompose the multiplication and get an algebraic expression that is equivalent but has fewer multiplication steps, making it faster. Finding that decomposition algorithmically is too time-consuming, so they trained a reinforcement learning model that treated the decomposition like a game, producing a model that can find decompositions for a set of matrices quickly. (I am not 100% sure "decomposition" is the right word.)
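
For a concrete feel for the kind of decomposition being searched for, here's the classic textbook one (Strassen's trick for 2x2 matrices, 7 multiplications instead of 8) — not AlphaTensor's output, just an illustration:

```python
import numpy as np

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices with 7 scalar multiplications instead of 8."""
    a11, a12, a21, a22 = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    b11, b12, b21, b22 = B[0, 0], B[0, 1], B[1, 0], B[1, 1]
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4,           m1 - m2 + m3 + m6]])

A, B = np.random.rand(2, 2), np.random.rand(2, 2)
assert np.allclose(strassen_2x2(A, B), A @ B)
```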

I think what OP is really describing is almost a subfield of explainable AI. Making an AI that learns to write an algorithm (let's say a python function) that can solve a problem correctly could be attempted. You would be juggling multiple loss functions here: 1) valid code, 2) solves the problem, 3) is fast, and 4) isn't just hard-coding the training data into a lookup table lol. However, I don't think that just because a model has learned a function, an algorithm can be extracted from it.

You have to remember a model is learning a FUNCTION that approximates the mapping between two spaces, so it isn't exact or always correct. Even if an algorithm could be extracted, it would only be "correct" in that approximate sense, and I would assume it would not be comprehensible.

7

Dylan_TMB t1_j1l8o93 wrote

Maybe, but I think these problems fall into a unique category. Those who know of the problems but don't know ML won't know how to articulate them as ML problems. Those with ML knowledge who don't know of the problems obviously don't know they exist. And those who know both the problem and ML are, most of the time, solving the problem themselves.

I think a list of problems is a nice idea but again I don't think those with knowledge of problems know what is a good ML problem and what isn't.

I will say I think a good blueprint for finding these problems is to look for one whose data looks similar to a well-understood problem's data.

For example, if a problem in a niche can be framed as a seq2seq problem, you can use translation models to try to solve it.
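
Something in this spirit (the task and model name here are just placeholders — in practice you'd fine-tune on your own input/output pairs):

```python
# Sketch only: treat a made-up niche task ("recipe step -> ingredient list")
# as text-to-text with an off-the-shelf seq2seq model.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

inputs = tokenizer("extract ingredients: whisk two eggs with a cup of flour",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```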

Another good one is trying to find problems that can be framed as a game. Reframing problems as games to use reinforcement learning is a good project.
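
The "frame it as a game" part usually just means writing an environment with states, actions, and a reward. A minimal sketch of what that might look like (the scheduling task and reward here are invented for illustration):

```python
import gymnasium as gym
import numpy as np

class SchedulingGame(gym.Env):
    """Made-up example: a toy load-balancing problem framed as a game for RL."""

    def __init__(self):
        self.observation_space = gym.spaces.Box(low=0.0, high=1.0, shape=(4,))
        self.action_space = gym.spaces.Discrete(4)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.load = np.zeros(4, dtype=np.float32)
        self.steps = 0
        return self.load, {}

    def step(self, action):
        self.load[action] += 0.1           # assign a job to machine `action`
        self.steps += 1
        reward = -float(self.load.max())   # reward keeping the load balanced
        done = self.steps >= 20
        return self.load, reward, done, False, {}
```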

2

Dylan_TMB t1_j1l7g1o wrote

That's kind of the secret sauce: you don't know what you don't know. If you could just know what that niche is, then people would already be doing it. You often have to already have domain knowledge in some area.

An example may be GIS stuff (spatial imaging and mapping data). If you know a lot about environmental sciences and geology then maybe there is some interesting problem to be solved in that field and you can be the first to do it. But it requires you to know the problem.

There is no such thing as an easy-to-find problem that will also give quick results; if it existed, it would already be done. If you don't have domain knowledge then you're out of luck, and you'll have to put in the work to gain more cross-discipline knowledge or innovate on the architecture side of things.

4

Dylan_TMB t1_iy7brke wrote

I would disagree. t-SNE takes points in a higher-dimensional space and attempts to find a transformation that places them in a lower-dimensional embedding space while preserving the similarities from the original space. In the end, each point's original vector (more features) is mapped to a lower-dimensional vector (fewer features). The mapping is non-linear, but that is all the operation produces.
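
Concretely, with scikit-learn it's just this (the digits dataset is an arbitrary example — 64-dimensional vectors in, 2-dimensional vectors out):

```python
# Map 64-dimensional points to 2-D with t-SNE.
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X = load_digits().data                       # shape (1797, 64): original high-dim vectors
X_2d = TSNE(n_components=2, random_state=0).fit_transform(X)
print(X_2d.shape)                            # (1797, 2): one low-dim vector per point
```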

1